Introducing the Agile-First Approach to 5G Orchestration
In the telco industry it can take one to two years for new software to go live; in the world of 5G, this isn’t going to fly. 5G migrations will lead telcos to adopt DevOps principles and toolsets to expand and fully leverage software-defined networks.
Those who are able to go through this transformation are likely to monetize the 5G promise; those who are not may never see a return on this investment. The purpose of this post is to introduce an agile-first approach that will help carriers and vendors significantly speed up their transformation and save significant cost in the process. Indeed, this approach can help a CSP become the next Netflix instead of the next Blockbuster.
What exactly is an agile-first approach? An agile-first approach applies the DevOps best practices used to handle common IT workloads to highly distributed 5G networks. With this approach, handling change management and continuous updates is prioritized as the primary design goal, ahead of creating clean interfaces and APIs.
Before delving into this subject, it is important to review and fully understand the challenges with current approaches using 5G management platforms or DevOps tool chains.
What are the challenges with existing 5G management platforms?
5G management platforms are provided by incumbent 5G core telco vendors as part of an end user solution stack. The main issues with this approach are:
- Change management using DevOps is often treated as an afterthought.
- Closed stack – these platforms are often biased toward a single-vendor solution due to the inherent conflict of interest in supporting competing solutions.
- Limited public cloud support – most of the tools treat the public cloud as a pool of low-level compute, storage and network resources, a.k.a. a VIM (Virtualized Infrastructure Manager). The public cloud includes many managed services that often overlap with many layers of the traditional vendor’s solution.
- Non-agile API interface – not built to support multiple standard APIs or continuous changes to those APIs.
What are the challenges with existing DevOps tools?
Most DevOps tools are built as a set of tool chains that are integrated through CI/CD pipelines. The integration uses a CLI or container based interface to execute infrastructure orchestration tasks directly from a CI/CD pipeline. The primary issues with this approach are as follows:
- Not built to handle highly distributed networks
- Doesn’t support high latency environments
- Doesn’t handle partial failure scenarios.
- Doesn’t support highly secured ‘air gapped’ environments (no internet access)
- Infrastructure focused rather than service focused – this makes it hard to track the state of a distributed service that is deployed on multiple sites. Instead, each service instance is managed separately, making it extremely hard to handle continuous day 2 operations on that service: updates, scaling, healing etc.
- Needs to be continuously updated or modified with any environment / topology change.
Introducing an agile first approach to 5G
In the networking world, the primary design focus has been on creating clean interfaces and APIs and drawing clear boundaries between each layer of the stack. In this approach, continuous updates, as well as handling a mixed reality where not all the components fit into those interfaces, have often been sidelined as an ‘implementation detail’.
‘Agile first’ means that the design goals are set in the opposite order. We first focus on handling continuous updates and changes across a distributed environment as the primary design goal. The design needs to be flexible enough to fit into a continuously evolving environment where almost everything can change over time. APIs and interfaces are also assumed to be subject to change, and therefore the architecture needs to be tolerant enough to adopt new APIs and interfaces without breaking the entire system.
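One way to picture that tolerance is a thin adapter layer between external API versions and a stable internal call, so that supporting a new API shape means registering one more mapping function rather than touching the core. The sketch below is only an illustration under that assumption; `core_create_slice`, the `v1`/`v2` payload shapes and the `handle` entry point are hypothetical names, not part of any real 5G or Cloudify API.

```python
from typing import Callable, Dict

# Hypothetical stable internal contract that the core services expose.
def core_create_slice(name: str, max_latency_ms: int) -> dict:
    return {"slice": name, "max_latency_ms": max_latency_ms, "status": "created"}

# An adapter translates one external payload shape into the internal call.
Adapter = Callable[[dict], dict]
ADAPTERS: Dict[str, Adapter] = {}

def adapter(version: str) -> Callable[[Adapter], Adapter]:
    """Decorator that registers an adapter function for an API version."""
    def register(fn: Adapter) -> Adapter:
        ADAPTERS[version] = fn
        return fn
    return register

@adapter("v1")
def from_v1(payload: dict) -> dict:
    # Hypothetical v1 shape: flat fields.
    return core_create_slice(payload["name"], payload["latency"])

@adapter("v2")
def from_v2(payload: dict) -> dict:
    # Hypothetical v2 shape: SLA fields are nested. Only this adapter knows that.
    return core_create_slice(payload["slice_name"], payload["sla"]["max_latency_ms"])

def handle(version: str, payload: dict) -> dict:
    """Route a request to the adapter registered for its API version."""
    if version not in ADAPTERS:
        raise ValueError(f"unsupported API version: {version}")
    return ADAPTERS[version](payload)
```

Adding a third API version in this sketch is a matter of writing one more decorated function; `core_create_slice` and the existing adapters stay untouched.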
Using Common DevOps best practices to manage 5G network
To achieve an agile first approach we start by adopting common DevOps practices and applying them to the world of 5G networks as the primary design target. This includes:
- Open source stack – open source is key to enabling continuous integration and customization. Any DevOps environment includes many services that need to be integrated to enable an end-to-end automation. An open source based stack allows each DevOps team to be self-sufficient when it comes to these integration tasks.
- Cloud native in both the core and the edge – Kubernetes and microservices are widely considered industry best practices when it comes to handling large scale systems. In the context of 5G we extend that to the edge, and in this way we achieve a consistent application architecture across the entire network. This allows us to significantly simplify the continuous update processes across the entire network.
- Public cloud as a first class citizen – integrating with higher levels of the public cloud stack where it makes sense (such as native cloud orchestration, DevOps tools, function as a service, DBaaS and Monitoring services)
- Integrating with major DevOps toolchains – this provides an out-of-the-box integration with common DevOps toolchains such as Ansible, Terraform, Jenkins, GitHub Actions, and CircleCI
- Developer friendly – the main interface that developers work with is Git to manage versions and updates of their software. Rather than forcing developers to learn a new language and orchestration system, we can abstract a lot of the update processes by tracking and triggering seamless changes on Git resources. In this case we can abstract all interactions with the backend services by monitoring Git actions such as Commit, Merge etc. This model is known as GitOps.
- Distributed update workflow – by abstracting a distributed update workflow through a single command we can easily manage update processes through CI/CD pipelines in the same way that we handle other update operations on a single cluster.
- Adaptive API interfaces – to achieve pluggable northbound and southbound API interfaces, enabling API mapping between multiple standard interfaces without major impact on the underlying core services stack, we’re using a combination of adapters. On the northbound interface we’re using API gateways to map the existing API into another form of API. The CI/CD workflow and serverless functions are also used as adapter interfaces in this regard.
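The GitOps item in the list above can be sketched as a small function that turns Git events into orchestration actions, so that developers only ever interact with Git. This is a minimal illustration, not how any particular tool implements it: the `GitEvent` type, the event names, the branch naming convention and the action strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class GitEvent:
    kind: str       # "commit" or "merge" (hypothetical event names)
    branch: str     # branch the event landed on
    revision: str   # revision identifier, e.g. a commit hash

def plan_actions(event: GitEvent) -> List[str]:
    """Translate a Git event into orchestration actions (illustrative policy)."""
    # A merge into main is treated as a request to roll the change out.
    if event.kind == "merge" and event.branch == "main":
        return [f"update production to {event.revision}"]
    # A commit on a feature branch gets its own throwaway environment.
    if event.kind == "commit" and event.branch.startswith("feature/"):
        return [f"deploy preview for {event.branch} at {event.revision}"]
    # Everything else is ignored by this policy.
    return []
```

In a real pipeline the returned actions would be handed to the orchestrator; here they are plain strings so that the mapping itself stays visible.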
DevOps at 5G Scale – Introducing the Service Orchestration
5G is a highly distributed environment with a set of unique challenges, such as scale and latency, that are not well covered by most of the existing open source and DevOps stacks.
This is where things get more interesting. To apply DevOps principles to 5G networks we have to add another layer: a Service Orchestration layer that aims to abstract the complexity of the underlying distributed environment and integrate all the specific domain orchestrations to enable a fully automated continuous deployment workflow.
The service orchestration adds:
- A service oriented approach to managing a service that is distributed across many clusters and sites.
- Built in workflows to handle distributed deployments and updates of that service through a single command (e.g. batch updates).
- Aggregated logging and monitoring to simplify troubleshooting scenarios.
- Discovery to allow continued synchronization with the edge target environment. This should also allow us to address scenarios such as zero touch provisioning.
- Filters, tagging and grouping to allow dynamic matchmaking between the service requirements and the site capabilities.
- Service oriented view – providing a quick view of the distributed service state by allowing simple mapping of service instances to their discrete locations.
In the case of Cloudify, the service orchestration will connect to the 5G network with existing network infrastructure.
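Three of the capabilities listed above (tag-based matchmaking, a batch update that tolerates partial failure, and a service oriented view) can be sketched in a few lines. This is a simplified model under stated assumptions, not Cloudify’s API: `Site`, `match_sites`, `batch_update` and `service_view` are hypothetical names, and the `apply` callback stands in for whatever actually performs the update on a site.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

@dataclass
class Site:
    name: str
    tags: Set[str]          # capabilities advertised by the site
    version: str = "none"   # service version currently running there

def match_sites(sites: List[Site], required: Set[str]) -> List[Site]:
    """Return the sites whose tags cover every required capability."""
    return [s for s in sites if required <= s.tags]

def batch_update(sites: List[Site], version: str,
                 apply: Callable[[Site, str], None]) -> Dict[str, str]:
    """Apply one update to every site and record a per-site outcome.

    A failure on one site is recorded and does not stop the others.
    """
    results: Dict[str, str] = {}
    for site in sites:
        try:
            apply(site, version)
            site.version = version
            results[site.name] = "ok"
        except Exception as exc:
            results[site.name] = f"failed: {exc}"
    return results

def service_view(sites: List[Site]) -> Dict[str, str]:
    """Map each location to the service version it is running."""
    return {s.name: s.version for s in sites}
```

Recording a per-site result instead of aborting on the first error is the design choice that matters here: in a distributed rollout some sites will be unreachable, and the operator needs to see which ones to retry.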
Enabling DevOps at 5G Scale using Service Orchestration
The diagram below shows how all the pieces fit together in the context of the specific implementation, with AWS services providing the capabilities to run Core and Edge (using AWS Outposts) network functions, Intel and Capgemini providing the vRAN implementation, and Cloudify providing the service orchestration and self-service portal.
The 5G Network Slicing Use Case
Network slicing is one of the most important value-added services within the 5G network.
In a nutshell, it allows service providers to break their network into segments based on specific SLA requirements as described in the diagram below.
To achieve this level of SLA control over the network we need to be able to control all the various elements of the stack from core to edge. This makes the 5G network much more dynamic in nature than previous-generation networks, so having everything fully automated becomes critical. The multi-vendor nature of such an environment (it’s unlikely that a single vendor solution would manage all the network elements in the stack) and its scale make network slicing an excellent use case to illustrate the benefits of the architecture mentioned above.
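As a rough model of what segmenting the network by SLA means, the sketch below describes each slice by the guarantees it offers and picks a slice for a request. The slice names, the two SLA fields and all the numbers are illustrative assumptions; real slice definitions carry many more parameters.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Slice:
    name: str
    max_latency_ms: float      # worst-case latency the slice guarantees
    min_bandwidth_mbps: float  # minimum bandwidth the slice guarantees

def pick_slice(slices: List[Slice], need_latency_ms: float,
               need_bandwidth_mbps: float) -> Optional[Slice]:
    """Return a slice whose guarantees satisfy the request, or None.

    Among qualifying slices, prefer the one with the loosest latency
    guarantee so that the strictest slices stay free for the traffic
    that really needs them.
    """
    qualifying = [
        s for s in slices
        if s.max_latency_ms <= need_latency_ms
        and s.min_bandwidth_mbps >= need_bandwidth_mbps
    ]
    return max(qualifying, key=lambda s: s.max_latency_ms, default=None)
```

A request that no slice can satisfy returns `None`, which in an automated setup would be the trigger to provision or resize a slice across core and edge.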
Not Limited to 5G
One of the key benefits of using a more standard DevOps-led approach is that all the components and the architecture are built on common best practices that are used to build any cloud native or multi-cloud solution. This lowers the barrier to entry for delivering such a solution. It also helps to reduce the time to market significantly, as we can rely on already proven architecture and frameworks.
Another interesting benefit of using this generic architecture is that it can be applied to almost any multi-cluster Kubernetes use case such as multi-cloud or hybrid cloud as described in the diagram below:
The birth, evolution and maturity of the Cloudify agile-first approach
Understanding this evolution is important for understanding the rationale behind each step along the way. This has indeed been a multi-year journey (from Cloudify’s perspective at the very least) which accelerated significantly toward the end of 2020 as we embarked on a partnership with AWS and later with Intel and Capgemini.
The significant acceleration in the speed with which we could iterate through the different phases of the implementation is in itself a testament to the level of maturity and agility that we were able to achieve with this approach.
2018 – The Open Source Phase
Our first attempt to address a network slicing use case started in 2018. Back then our focus was primarily on delivering an open source alternative to handle such a use case, allowing community users (Orange Lab specifically) to demonstrate a full 5G network slicing use case during MWC completely independently, without any support from our side. Enabling the delivery of such a complex use case through open source was a great sign of maturity, and it showed how far an open source based stack can be taken.
The challenge was that it still took months to deliver such a use case.
December 2020 – The birth of the agile first approach
Last year we partnered with AWS to deliver a 5G network slicing use case as a demo for a leading carrier. This time we took a cloud native approach (EKS) on top of the AWS public DevOps (Pipeline) stack and cloud infrastructure. One of the things that hit us immediately as we were going through the implementation phases was the speed with which we could turn around new updates and features with a fairly distributed team across both AWS and Cloudify. This is where all parties felt that we were onto something big, and the term ‘Agile First Approach’ was born.
February 2021 – Adding multi vendor support
A few weeks after the first demo we had to switch 5G core providers from a commercial vendor to an open source alternative based on 5GiCE. This allowed us to demonstrate the architecture on a fully open source based stack.
This case led us to realize that the architecture we used wasn’t just tolerant of continuous updates to its existing services; it was also fairly tolerant of a major change that included a complete replacement of the entire core stack. Indeed, we were able to switch an entire 5G core provider in less than 3 weeks with relatively minimal impact on the rest of the services.
June 2021 – Delivering End-to-End Network Slicing
In the previous phases we were focused mostly on handling the 5G core services. Obviously to deliver end-to-end network slicing we had to add the radio access network to the mix. This is where we joined forces with Intel and Capgemini to deliver a fully open and cloud native 5G network slicing from core to edge.
Our goal was to demonstrate this use case during the 2021 MWC at the AWS virtual village. This was a fairly ambitious goal, as beyond the technical integration we had to collaborate with two large organizations and deliver an end-to-end automation of the entire flow, as well as produce all the needed supporting material, in less than 4 weeks!
I’m happy to say that we exceeded our own expectations with what we were able to accomplish in such a short period of time:
- The solution was built on an open source stack, leveraging Cloudify’s version 6 service orchestration for multiple Kubernetes clusters and Open RAN delivered via Intel FlexRAN (DU) and Capgemini (CU) to deliver end-to-end network slicing on top of the rich AWS cloud infrastructure, cloud native, and DevOps stack.
- Multi vendor ‘best of breed’: the solution was created and designed to support multiple 5G vendors to save the cost and effort often associated with handling complex integration work, as well as to minimize any potential vendor lock-in.
- Cloudify now provides support for on-prem environments based on VMware and OpenStack, allowing a smooth transformation from existing networks to cloud native and public cloud.
- Proven agility: Cloudify, AWS, Intel and Capgemini were able to deliver full 5G network slicing within a few weeks, handle new updates within a few hours, and replace an entire 5G core between two vendors in 3 weeks. The solution was also tested and verified on a leading carrier’s network infrastructure.
- Solution is available for demos and PoC on a dedicated AWS lab and will be demonstrated at MWC – within the AWS Virtual Village: sign up here.
- The project serves as an entry point to features arriving in Cloudify’s version 6 (Q3 2021), which will include service orchestration for multiple Kubernetes clusters, support for EKS and Outposts discovery, and batch updates to simplify deployment and continuous updates to a distributed cluster.
Software defined Networking – Handling low latency workload (Radio Access Network – RAN) on commodity software and hardware
The 5G network, specifically the radio access part of the architecture, has to be fully optimized for low latency and deterministic performance behaviour. This is the main reason that in previous generations this part of the network stack was often highly proprietary and tailored to a specialized hardware stack. Obviously this resulted in high cost and limited agility (i.e. the ability to handle continuous updates of the radio stack).
To address this challenge we used a software defined network (vRAN in this specific case) designed as a standard cloud native stack. To meet the latency requirements we used HW acceleration features that are integrated through standard Kubernetes extensions to optimize the performance. In this way we could meet both the agility and performance requirements of the 5G stack.
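To give a feel for what integrating hardware acceleration through Kubernetes extensions can look like, the sketch below builds a pod manifest that requests an accelerator as an extended resource. It assumes the extension in question is a device plugin, which is the standard Kubernetes mechanism for advertising such resources under a `vendor-domain/resource` name; the post does not say which mechanism was actually used. The resource name, pod name and image are placeholders, not the ones used in this project.

```python
import json

def du_pod_manifest(accel_resource: str, count: int = 1) -> dict:
    """Build a pod manifest that requests `count` units of an extended resource.

    `accel_resource` is a placeholder such as "example.com/fec-accelerator";
    the real name depends on the device plugin installed in the cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "vran-du"},  # hypothetical pod name
        "spec": {
            "containers": [{
                "name": "du",
                "image": "registry.example.com/vran-du:1.0",  # hypothetical image
                # Extended resources are requested under limits, in whole units.
                "resources": {"limits": {accel_resource: count}},
            }]
        },
    }

if __name__ == "__main__":
    print(json.dumps(du_pod_manifest("example.com/fec-accelerator"), indent=2))
```

Because the accelerator is requested like any other resource, the scheduler places the pod only on nodes that advertise it, and the workload is still deployed and updated through the same cloud native tooling as the rest of the stack.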
You can read the full details of how this works in this post: End-to-End 5G Network Slicing with Cloudify, Intel and CapGemini (Altran) on AWS
The MWC 2021 launch marks a significant milestone, as we proved that the agile first approach can solve one of the biggest challenges in managing such a complex distributed network use case: handling continuous change and continuous innovation in a fully automated fashion using an open, multi vendor based stack.
I’m very proud of what we have accomplished so far with the team, but we have no plans to stop here. Enhancements for the next phase of this project will focus on creating a more developer friendly environment as well as optimizing the performance and latency of the vRAN.
Extending our strategic partnership with 5G vendors
In addition to our existing partnerships with AWS, Intel and Capgemini, we are working extensively with leading 5G vendors to provide this open architecture as a fully integrated stack that can support the most demanding 5G carriers. Stay tuned for more updates in this regard.
Making 5G a standard IT Service with a new ServiceNow integration
The integration with ServiceNow is one of the promising areas in which we could finally turn the dream of network orchestration as just another IT service into reality.
Our partnership with ServiceNow has advanced this process significantly, and we aim to deliver our first ServiceNow plugin with the following capabilities:
- allowing IT to govern and control environment changes and update processes.
- allowing IT to create a 5G environment on demand for development and for specific customer tenants.
- activation / deactivation of 5G services such as network slicing or private 5G networks through the ServiceNow interface.
The previous network virtualization transformation was driven by efficiency, i.e. reducing cost by moving from physical networks to virtual networks and from Capex to Opex.
The current transformation wave is driven by necessity. Organizations that are not able to transform themselves will simply end up being the next ‘Blockbuster’, and those that are may end up being the next ‘Netflix’, as noted in our previous post The Public Cloud Effect On Telcos.
In this work we were able to prove that an agile first approach can be applied to even the most complex and distributed use cases, such as 5G network slicing. This simply means that we can adopt proven practices that are used to manage standard IT workloads and apply them to network use cases as well.
Having an independent service orchestration layer is critical for gluing together all the different orchestration domains, abstracting the underlying complexity of the service, and providing a service oriented view that makes it possible to manage this stack as part of a CI/CD pipeline.
AWS serves as a catalyst that enables Fortune 1000 vendors such as Intel and Capgemini, as well as startups such as Cloudify, to collaborate on a joint solution and to turn this work into a reference solution that is available immediately for trials and demos. This reference architecture also has the power to help other organizations accelerate their transformation and achieve their desired agility.
Learn more about Cloudify and AWS 5G solution here: https://cloudify.co/aws-5g-cloudify/
In addition, Cloudify will now be available for free trial within Amazon’s AWS Marketplace.
Nati Shalom, CTO, Cloudify
- End-to-End 5G Network Slicing with Cloudify, Intel and CapGemini (Altran) on AWS
- Implementing 5G Network Slicing with Cloudify on AWS
- The Public Cloud Effect On Telcos
- Intel 5G network Vision
- Implications of the DISH / AWS Announcement
- 5G Network Architecture of Rakuten and Dish: Commons and Differences
- An end-to-end open source mobile network with traffic prioritization mechanism (Orange Lab use case)