The drive toward 5G and beyond has been a key factor in the demand for higher bandwidth and more reliable networks. This has, in turn, brought about a greater appetite for intelligent infrastructure supported by automation.
But while network automation is anticipated to benefit operators and customers by supporting, for example, capacity optimisation and the associated cost savings, what will it mean in terms of increased complexity? This was the topic of a workshop at this year’s virtual Optical Fiber Communication Conference and Exhibition (OFC), where speakers from across the optical communications supply chain offered their opinions and expertise on where we are with network automation, where it is going and how it will get there, taking into account the challenges they have witnessed.
Jesse Simsarian from Nokia Bell Labs detailed some of the benefits of automation. ‘We expect increased capacity for optimal operation; reliability by getting warnings for equipment before paths actually fail; dynamicity for rapid service creation and restoration, and also proactive routing to avoid outages,’ he said.
Making sense
Simsarian explained that there has been a lot of progress in these areas at Nokia Bell Labs itself, while also acknowledging the work taking place elsewhere. In particular, he said, progress has been made on advanced sensing, stream processing, capacity optimisation, network learning, modulation formats and SDN network control.
Staying with sensing, he also discussed some of the use cases that can be unlocked by exploiting the movement of the fibre, which induces polarisation rotations on the optical signals, allowing the network to be transformed into ‘the largest motion sensor in the world’. This, said Simsarian, can sense the environment and also help protect the fibre itself, providing early warnings in the case of fibre breaks.
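To make the principle concrete (this is a minimal sketch, not Nokia’s implementation), the rotation rate of the received state of polarisation (SOP) can be tracked and thresholded to flag mechanical disturbance of the fibre. The synthetic data, sampling interval and threshold below are illustrative assumptions:

```python
# Illustrative sketch: flag mechanical disturbance of a fibre from the
# rotation rate of the received state of polarisation (SOP).
# All data, rates and thresholds here are invented for illustration.
import numpy as np

def sop_rotation_rate(stokes, dt):
    """Angular speed (rad/s) between consecutive unit Stokes vectors."""
    s = stokes / np.linalg.norm(stokes, axis=1, keepdims=True)
    cosang = np.clip(np.einsum('ij,ij->i', s[:-1], s[1:]), -1.0, 1.0)
    return np.arccos(cosang) / dt

# Synthetic example: a quiet fibre with a burst of vibration in the middle.
rng = np.random.default_rng(0)
stokes = np.cumsum(rng.normal(0, 1e-3, (2000, 3)), axis=0) + [1.0, 0.0, 0.0]
stokes[900:950] += rng.normal(0, 0.05, (50, 3))       # mechanical event
rate = sop_rotation_rate(stokes, dt=1e-3)

threshold = rate[:800].mean() + 6 * rate[:800].std()  # calibrate on quiet data
events = np.flatnonzero(rate > threshold)
print(f"possible disturbance around samples: {events[:10]}")
```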
‘We began researching this in 2017,’ he said, ‘looking at the span polarisation sensing and also at the coherent receivers. We also have a Bell Labs paper looking at OSDN and backscattering for really high sensitivity and accurate localisation of events. There’s also been some interesting work going on with field trials, where Google has shown that it could detect subsea earthquakes and ocean waves using coherent detectors, and Verizon has a tutorial about using distributed vibration sensing and backscattering, where they’ve shown they can detect vehicle traffic.’
Until recently, explained Simsarian, the focus has been on the optical networking equipment itself doing the sensing. ‘But,’ he said, ‘network automation really requires more awareness of the environment, so we want to expand our field of view to incorporate more IoT sensors outside of the optical transport, outside of the networking equipment. With 5G, we have an end-to-end network that supports a massive number of IoT devices and this sensor data is generated as streams of information, so we need to process this data as streams in a real-time and flexible way.’
To do this, Bell Labs has been developing a cloud-based stream processing platform. ‘We can combine different stream processing pipelines,’ continued Simsarian, ‘so we could have a moving window-based alarm generation, based on the streaming telemetry data, as well as a convolutional neural network-based person detection algorithm that’s running on a video stream, and we can programmatically create correlation alarms on these different sources to get further insight into the network operations.’
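A moving window-based alarm generator of the sort Simsarian mentions can be sketched in a few lines. The window length, threshold and telemetry stream below are illustrative assumptions rather than Bell Labs’ pipeline:

```python
# Minimal sketch of moving-window alarm generation over streaming telemetry.
from collections import deque
from statistics import mean, stdev
import random

def window_alarms(stream, window=60, k=4.0):
    """Yield (index, value) when a sample deviates k sigmas from the window."""
    buf = deque(maxlen=window)
    for i, x in enumerate(stream):
        if len(buf) == window:
            mu, sigma = mean(buf), stdev(buf)
            if sigma > 0 and abs(x - mu) > k * sigma:
                yield i, x
        buf.append(x)

# Example: pre-FEC BER telemetry with one injected excursion.
random.seed(1)
telemetry = [2e-4 + random.gauss(0, 1e-5) for _ in range(500)]
telemetry[400] = 8e-4                     # sudden degradation
for i, v in window_alarms(telemetry):
    print(f"alarm: sample {i}, value {v:.2e}")
```

The same pattern generalises to correlation alarms: run one such generator per source and join events that fall within a common time window.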
Focus on margins
An important application for network automation, explained Simsarian, is optimisation and reducing the margins built into the network. ‘Margins are the difference between quality-of-transmission metrics, such as the bit error rate, and the receiver performance limit,’ he said. ‘So, by lowering the margins we can increase capacity and reduce network costs. The design margin can be reduced by accurate quality-of-transmission prediction through good physical models and learning the devices and the actual state of the network.’
The margin allocated for transponders and ageing can also be reduced, said Simsarian, by using flexible elastic optical transponders. ‘This way,’ he said, ‘you can provide the maximum bit rate for any transmission distance. Our ASN business has shown that, by using the high degree of flexibility in rate and reach of the probabilistic constellation shaping modulation format, we can optimise this rate-reach tradeoff and reduce the allocated margins.’
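The margin idea lends itself to a back-of-envelope calculation: pick the highest-rate transponder mode whose required SNR, plus the design margin, still fits the estimated SNR of the path. The mode table and figures below are invented for illustration; real PCS transponders offer far finer rate granularity:

```python
# Toy rate-versus-margin calculation. Mode table values are illustrative only.
MODES = [  # (bit rate in Gb/s, required SNR in dB at the FEC limit)
    (100, 10.0), (200, 13.5), (300, 17.0), (400, 20.5),
]

def best_mode(estimated_snr_db, design_margin_db):
    """Highest-rate mode that still fits once the design margin is reserved."""
    usable = estimated_snr_db - design_margin_db
    feasible = [m for m in MODES if m[1] <= usable]
    return max(feasible) if feasible else None

# Lowering the design margin from 3dB to 1dB unlocks a higher rate here:
for margin in (3.0, 1.0):
    rate, req = best_mode(estimated_snr_db=18.2, design_margin_db=margin)
    print(f"margin {margin} dB -> {rate} Gb/s (requires {req} dB)")
```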
One of the biggest challenges when it comes to network automation is persuading human beings to accept machines acting on the network autonomously. This is, said Simsarian, perhaps the biggest unknown that we face. He referred to the American folktale of John Henry, whose prowess as a steel driver was measured in a race against a steam-powered rock drilling machine. Henry won against the machine, only to die after the challenge as his heart gave out from the stress.
‘Here we are 150 years later,’ said Simsarian. ‘We have highly trained network engineers competing against AI in these complex network control tasks, and we certainly don’t want the fate of poor John Henry to befall any of us. We think, at some point, AI should outperform a human in terms of dynamic operation, but for now we lack the trust in AI to actually take over. So, we will likely have some kind of a human-machine collaboration for some time, where AI recommends actions and then the humans make them happen and, over time, we expect the AI to gain more autonomy.’
Efficient operation
Offering the operator’s standpoint was Telia Company’s Stefan Melin, a network architect who revealed that the company’s agenda – like many operators – is to increase efficiency and improve reliability in its optical networks. ‘We have some tools to do that,’ he said. ‘We have the open optical networks that will enable us to select the best offer and technology. We have automation and we have online networks that we believe can shorten the lead times and improve reliability of the network.’
Melin detailed some of the most important areas for operators to address with automation in the open, disaggregated optical network. These include efficient use of installed resources, such as managing spectrum fragmentation. ‘Some online analysis and automated defragmentation would help us to be in better control of how the resources are used,’ he said. Also important are time for service delivery, including lead time for planning; time for network deployment; and operation and lifecycle management. ‘To automate fault management, to automate tasks in the multi-supplier environment, is something that is of interest to us,’ he continued.
Something Melin noted with regard to his fellow speakers is that an area of interest shared by all is open optical planning and impairment validation. ‘It was great to hear previous speakers mentioning this, and that it has been acknowledged by the suppliers as well,’ he said. ‘To actually enable the potential of the open disaggregated optical networks, we need an online, open optical planning and impairment validation functionality, and the prerequisite is that this would be supplier-independent.’
For this to work, said Melin, it needs to be agreed what data should be shared for the validation. ‘With different suppliers and vendors, it can be troublesome to share some data,’ he acknowledged, ‘but we need to agree upon a framework here for doing that.’
Another prerequisite, said Melin, is application programming interfaces that support the exposure of relevant data for this online validation, with that online data stored in network elements, domain controllers and inventory systems. ‘Data should be retrieved from the network,’ he said, ‘and independently of whether it’s static or dynamic data, the network should be the master that provides the data.’
Improved accuracy
In addition, he said, the network should be machine learning assisted to improve accuracy. ‘We believe that’s probably the way we need to go in a multi-supplier environment,’ he said, ‘to increase the accuracy of the planning.’
Providing a real-life ‘promising’ example, Melin referred to the Telecom Infra Project, a community project that involves members working together to accelerate the development and deployment of open, disaggregated and standards-based technology solutions. Within the project, the open optical and packet transport (OOPT) project group’s physical simulation environment (PSE) working group is developing a model called the Gaussian noise model in Python (GNPy).
This is an open-source, community-developed library for building route planning and optimisation tools for real-world mesh networks. ‘This is a tool,’ said Melin, ‘that can be used for impairment validation in open optical networks and it may be the way to go. It is, to my knowledge, the only open initiative that opens up for multi-supplier validation.’
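To give a flavour of what such a tool computes, in drastically simplified form, the Gaussian noise model estimates a generalised SNR (GSNR) as signal power over accumulated amplifier (ASE) noise plus nonlinear interference (NLI). The NLI coefficient below is a made-up constant; GNPy derives it rigorously from the fibre parameters:

```python
# Toy GN-model GSNR estimate for a multi-span link. The eta_nli coefficient
# is invented for illustration; GNPy computes the NLI properly.
import math

H = 6.626e-34    # Planck constant, J*s
NU = 193.1e12    # reference frequency, Hz (~1550 nm)
B_REF = 12.5e9   # reference bandwidth, Hz (0.1 nm)

def gsnr_db(p_ch_dbm, n_spans, span_loss_db, noise_figure_db, eta_nli=250.0):
    p_ch = 1e-3 * 10 ** (p_ch_dbm / 10)                 # launch power, W
    gain = 10 ** (span_loss_db / 10)                    # amp restores span loss
    nf = 10 ** (noise_figure_db / 10)
    p_ase = n_spans * nf * H * NU * B_REF * (gain - 1)  # accumulated ASE, W
    p_nli = n_spans * eta_nli * p_ch ** 3               # NLI, cubic in power
    return 10 * math.log10(p_ch / (p_ase + p_nli))

# Sweeping launch power shows the familiar optimum between ASE and NLI:
for p in (-2, 0, 2, 4):
    print(f"{p:+d} dBm launch -> GSNR {gsnr_db(p, 10, 20, 5):.1f} dB")
```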
Addressing the challenges, Melin warned that we need to be cautious when moving complexity from manually managed to automatically managed systems. ‘We know what we’re working with today with our current IT systems,’ he said. ‘They have grown and become very complex, and we can’t allow the same thing to happen in the automation. This is a challenge, and how to manage this automation complexity cost-efficiently is, of course, a part of it as well. It’s a challenge. It’s a new way of working. It is changing competencies, but this can be overcome; it’s just something we have to work on.’
Speaking to Fibre Systems following the event, Jurgen Hatheier, EMEA chief technology officer at Ciena, focused on automation in the orchestration of the network. ‘As automation has progressed,’ he said, ‘we need to look closer at the orchestration of the network. It’s not so much having a trigger in the customer order, and the challenges of executing it, but rather a number of stimuli that are triggering certain behaviours in the network. For example, looking at reactive actions: there might be a failure or breakage in the network, and we want the system to find a way to either recover itself, or call in the truck roll to come and fix certain elements.’
Design for life
According to Hatheier, this goes all the way back to network design. ‘You might have designed a network that only needed that capacity for, say, a couple of hours a week or a couple of days in the year, and the capacity was sitting idle. We strongly believe that the more economic factors or commercial business aspects you bring in, the less likely you are to go out and build for peak performance rather than for peak management.’
Hatheier offered the use case example of a concert in a stadium, for which temporary 5G towers are set up and generate 100G of traffic. ‘That 100G is sitting on that fibre pair all year,’ he explained, ‘but it sits idle for 99 per cent of the time.’ Therefore, believes Hatheier, network automation and orchestration should be used to react to this automatically, based on the configured business principles.
It is in ‘extraordinary cases’, said Hatheier, where the concept of AI or machine learning comes into play. ‘We give the network a mission to go and continuously optimise itself,’ he explained. ‘So, bringing in a cell phone tower temporarily for a concert, that’s predictable. But Covid-19 isn’t predictable, Microsoft pushing updates to millions of computers isn’t predictable, and the announcement of a certain game isn’t predictable, but it will create peaks that a network needs to react to.’
To put this into context, Hatheier used the example of Apple pushing out a big upgrade that sends multiple terabytes to all Apple devices at the same time. This could mean some of the other applications that need a more real-time experience are pushed to the side, which could result in a bad service experience. ‘There could be a rebalancing of traffic required,’ he said, ‘and some tickets could be raised, and in the second- or third-level engineering department, somebody takes care of that issue, but it might be too late already, because you have costs, call receipts or even truck rolls, because the root cause of the slowness was not quickly identified.’
Hatheier believes that network automation using AI or machine-learning-based systems can help to react to such issues as they appear, rather than waiting for somebody to come in and shuffle the traffic. ‘When we combine the network automation and orchestration with the big data we collect from the ecosystem, be it from our own devices or from third-party devices, we apply the business rules. Then we apply the corrective action to a programmable infrastructure. Adaptive, closed-loop, self-learning automation is the vision that we embrace, and that we discuss with and roll out to our customers.’
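The closed loop Hatheier describes boils down to an observe-decide-act skeleton. Everything below, from the rule to the state format, is a placeholder assumption rather than Ciena’s implementation:

```python
# Skeleton of closed-loop automation: observe telemetry, apply business
# rules, push corrective actions to programmable infrastructure.
import time

def closed_loop(collect, rules, actuate, interval_s=60):
    """Observe -> decide -> act, forever."""
    while True:
        state = collect()            # big data gathered from the ecosystem
        for rule in rules:           # the configured business rules
            action = rule(state)
            if action is not None:
                actuate(action)      # programmable infrastructure applies it
        time.sleep(interval_s)

# Example rule: rebalance when a link crosses a utilisation threshold.
def congestion_rule(state, threshold=0.85):
    hot = [l for l, u in state["utilisation"].items() if u > threshold]
    return {"op": "rebalance", "links": hot} if hot else None

state = {"utilisation": {"ams-fra": 0.91, "fra-muc": 0.42}}
print(congestion_rule(state))   # {'op': 'rebalance', 'links': ['ams-fra']}
```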
To conclude, Hatheier revisited the idea of network openness. ‘Over the last 10 years, vendors and operators alike have really put effort into standardising interfaces that allow you to pull data and configure systems, knowing the only way we can deliver a proper ecosystem is to be interoperable, to give people access to be transparent and not have, you know, black box systems where you can only configure with a very proprietary management system. So, this change of mindset, I believe, has built the foundation that operators are using now more than ever, to build those more complex automation scenarios.’
Software-defined access networks (SDAN) have brought a step-change in operational efficiency and agility, with many of the benefits coming from automation. SDAN takes network functions out of physical network assets, virtualises and hosts them in the cloud, along with their associated data sets. This means operators can control network functions centrally and apply network automation for complex and time-consuming tasks while reducing the need for manual intervention.
Challenge
Operators are under pressure from regulators and competitors to activate new high-tier (100Mb/s to 1Gb/s) broadband services quickly, and automation routines can control costs and solve complexities when expanding networks. Operators also often have complex multi-vendor, multi-technology environments to work through as they accelerate fibre deployments.
nbn and NetCologne are both in this position. nbn is the national broadband infrastructure provider in Australia, with more than five million broadband subscribers, over 50 per cent market share and nationwide coverage. NetCologne is the largest regional alternative operator in Germany, with around 500,000 customers.
Solution
Both operators chose Nokia Lightspan access nodes and the Nokia Altiplano Access Controller for their SDN-enabled fibre-to-the-premises (FTTP) projects.
In the nbn and NetCologne network deployments, speed was of the essence: each had tens of thousands of new network access nodes to deploy. The traditional deployment method for a node is a multi-step process in which a technician connects to the hardware to provision it. This must be done for each node – a time-consuming process prone to human error.
In their software-defined environments, provisioning has become ‘zero-touch’. Nodes are pre-configured in cloud software: once installed, powered up and connected to the network, they are automatically identified and Altiplano initiates the relevant provisioning commands.
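Conceptually, the flow looks something like the sketch below. The endpoints and payloads are invented for illustration and are not Altiplano’s actual API; the point is that discovery, lookup of the pre-configured intent and provisioning happen without a technician session:

```python
# Hypothetical zero-touch provisioning flow against an SDN controller.
# Endpoint paths, payloads and the controller URL are assumptions.
import requests

CONTROLLER = "https://controller.example.net/api/v1"   # hypothetical

def on_node_discovered(serial_number: str) -> None:
    # Look up the intent pre-configured in the cloud for this serial number.
    r = requests.get(f"{CONTROLLER}/intents",
                     params={"serial": serial_number}, timeout=30)
    r.raise_for_status()
    intents = r.json()
    if not intents:
        raise LookupError(f"no pre-configuration for node {serial_number}")
    # Push the provisioning commands derived from the matching intent.
    r = requests.post(f"{CONTROLLER}/nodes/{serial_number}/provision",
                      json=intents[0], timeout=30)
    r.raise_for_status()
    print(f"node {serial_number} provisioned zero-touch")
```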
The Altiplano platform also supports these operators’ multi-supplier strategy. For example, nbn uses FTTC distribution point units from three different suppliers. Both operators sought an SDN controller supplier with open APIs to integrate multi-vendor nodes and orchestrate FCAPS operations across all types of assets.
Another advantage is that SDAN doesn’t have to pull data from network assets: instead, state, error and alarm data is continuously streamed into a central data lake, from which the SDAN controller gets the real-time insight needed to execute commands. This is a prerequisite for intent-based automation, which incorporates network awareness and service assurance tasks, saving multiple steps of configuring devices and services. Another good match was that Altiplano integrated easily with both operators’ IT stacks, and Nokia had common IT tools and APIs with which nbn and NetCologne were already familiar, such as MariaDB, REST and Kafka.
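With Kafka named among the familiar integration tools, one can picture an OSS process consuming the streamed state data along these lines; the topic name, broker address and message schema are assumptions for illustration:

```python
# Hypothetical consumer of streamed node state using the kafka-python client.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "access-node-telemetry",                      # assumed topic name
    bootstrap_servers="kafka.example.net:9092",   # assumed broker address
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    auto_offset_reset="latest",
)

for record in consumer:
    event = record.value
    if event.get("severity") == "critical":       # e.g. forward alarms
        print(f"alarm from {event.get('node_id')}: {event.get('message')}")
```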
Benefits
Time is money, as they say. When you’re activating tens of thousands of nodes, any time saving can create significant cost savings, as well as accelerate time to revenue.
Always-on programmability and zero-touch device turn-up lead to fewer errors, fewer retries and faster fulfilment. With zero-touch provisioning, nbn saw both technician-installed nodes and customer-installed CPE achieve a more than 90 per cent first-time-right success rate, while NetCologne required the installation of new access nodes to take no longer than 20 minutes. In both cases, the Nokia zero-touch technology cut rollout costs and configuration time in half.
Horst Schmitz, head of technology at NetCologne, said: ‘Nokia delivered a highly customisable solution that is ideal for the next step in our network plans: bringing gigabit connectivity cost-effectively into buildings. On top, the cloud platform improves our automation capabilities today and for any future devices added to the network over time.’
The Altiplano access controller can manage hundreds of thousands of nodes in a network of millions of subscribers. The open, cloud-native software platform has the high availability and horizontal scalability needed for mass deployment, along with good compliance with standards and the ability to manage all technologies and equipment from all vendors through a unified management interface. The unified view of the network lets nbn and NetCologne smoothly integrate operations via open APIs, get better insights into the increasing volumes of operational data coming from more nodes, and roll out services more quickly.
Ray Owen, chief technology officer at nbn, said: ‘Nokia’s cutting-edge SDAN technology allows us to manage the various G.fast deployments we have across the nbn FTTC network. It also gives us the flexibility to enhance the customer experience management and advance our own operations systems with integration into the Nokia SDAN open environment.’
Future opportunities
nbn and NetCologne now have SDAN environments on which they can build. Traditional network management systems offer only limited customisation, leading to cumbersome development and integration cycles. While some workflows can be automated with the help of scripting, there are clear limitations, since neither data collection nor configuration changes of a traditional EMS can support real-time automation. In addition, data is typically stored in highly proprietary and inaccessible database systems, preventing reusability and leading to data consistency issues. All of these limitations are removed by the new data management paradigm of SDAN.
SDAN brings the following benefits:
- Faster diagnostics and troubleshooting – high-precision telemetry, timely access to data and easy correlation of alarms enable faster execution in operational processes.
- Better insights for proactive action – automated measurements, unified reporting and common analysis of data anomalies and trends reduce the cost of poor quality, and resolve a lot of inefficiencies in the existing processes.
- Faster innovation cycles – development time is rapidly reduced due to more efficient innovation with a common SDN controller, fully open APIs and automated test stages.
- Simplified back-up and restore – dedicated back-up and restore procedures are largely eliminated thanks to continuous versioning in the cloud and the capability to restore to any point in time.
Benefits similar to those seen in the nbn and NetCologne FTTP projects also apply to fibre-to-the-home, fixed wireless access and mobile backhaul deployments. As broadband network operators build their experience and confidence with software-defined access networks, all these opportunities will become apparent, with the resulting benefits for operators and their customers.