What is cloud portability? Essentially, it's the ability to move data and applications from one cloud computing environment to another with minimal service disruption.
The approaches to cloud portability have been numerous, tackling the challenge at various levels of the stack: containers, PaaS, nested virtualization, cloud orchestration, API abstraction, and others. In reality, however, the rapid evolution of cloud technology has only compounded the challenge.
This article explores the pros and cons of these approaches. By the end, you should hopefully have enough insight to form a strategy for combining them into a truly portable solution capable of addressing a variety of use cases and workloads.
Nested Virtualization

Nested virtualization makes it possible to run a VM within a VM.
The primary advantage of the nested virtualization approach is that it can be applied to nearly any application packaged as a VM. That said, the approach also has limitations.
When transforming applications that span multiple virtual machines, you must also adapt how the VMs are configured and managed to suit the new model. Oftentimes the move to the cloud is one piece of a larger puzzle: a component of a transition to an agile model. Moving legacy and existing applications to the cloud while maintaining the old practices for running them is counterproductive.
To sum up the pros and cons of nested virtualization:
Pros:

- Seamless transition – move existing applications between clouds without hiccups
Cons:

- Reliance on the cloud as a common denominator. Porting an application at this level of abstraction limits the available feature and service set to basic compute and networking; the feature-rich stack the cloud provider actually offers cannot be leveraged in this model.
- Inherent lack of self-healing, auto-scaling, and monitoring capabilities.
- Performance overhead and degradation issues.
- Heavy dependence on third-party hypervisor providers for commonality between the clouds.
Containers

While VMs must run through a hypervisor layer, containers run within the OS as a service. This makes containers a much lighter, much more portable packaging model. Most cloud providers support containers, so packaged applications are easy to ship to any of these clouds.
Because containers run on top of the OS, they can also run inside a nested virtualization setup. From my perspective, although this is viable, it doesn't add much value to the approach.
Let’s look at the pros and cons of containerization:
Pros:

- Proven simplicity of implementation; the popularity of containers means they are widely supported and have a broad ecosystem.
Cons:

- Inherent lack of self-healing, auto-scaling, and monitoring capabilities. Docker mitigates some of these gaps, but the mitigations are Docker-specific rather than general container features.
- The transformation of existing applications into containers can often be cumbersome and intensive.
API Compatibility

API compatibility essentially makes it possible to provide a common API that can be used across multiple clouds. There are two families of cloud compatibility APIs to consider.
The first is API compatibility frameworks, such as LibCloud or JClouds. These frameworks map the cloud APIs into a common intermediate layer, exposing the least common denominator of all the clouds. They are often limited to mapping compute semantics; network and storage mapping in these frameworks is limited or nonexistent.
The other family uses one of the existing cloud APIs as the standard for the rest of the clouds. A good example of this is CloudScaling (now part of EMC), which chose to map the AWS API to OpenStack. There are also various attempts to use OpenStack as a compatibility API for other clouds. An interesting example in that regard is VMware Integrated OpenStack (VIO), which uses the OpenStack pluggable architecture to wire in VMware resources as the underlying resources. A few more are:
- IBM, supporting OpenStack on top of their Softlayer API
- Microsoft, providing private cloud appliances compatible with Azure
- VMware (again), whose private and public cloud offerings are vCloud and vCloud Air, respectively
This is a sensible approach when a primary cloud is already in place and extending it to other environments via a compatible API makes more sense. In that case, the same tool sets can be used to manage apps consistently on both public and private clouds.
Pros:

- Compatibility with all ecosystem frameworks within the chosen cloud
Cons:

- Limited compatibility amongst supported clouds
- Confusion and complexity in mapping API semantics
- Risk of leaky abstraction, since the same API call can behave differently on different clouds
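To make the least-common-denominator idea behind the framework family concrete, here is a minimal, hypothetical sketch in Python. The class and method names are illustrative, not LibCloud's or JClouds' actual API: a common driver interface exposes only the operations every provider can support, so portable code can depend on that interface alone, and anything provider-specific is unreachable through it.

```python
from abc import ABC, abstractmethod

class CloudDriver(ABC):
    """Least-common-denominator interface: only the operations
    every supported cloud can provide (hypothetical example)."""
    @abstractmethod
    def create_server(self, name: str) -> str: ...
    @abstractmethod
    def list_servers(self) -> list: ...

class AwsDriver(CloudDriver):
    def __init__(self):
        self._servers = []
    def create_server(self, name):
        # A real driver would call the EC2 API here.
        self._servers.append(name)
        return f"aws-{name}"
    def list_servers(self):
        return list(self._servers)

class OpenStackDriver(CloudDriver):
    def __init__(self):
        self._servers = []
    def create_server(self, name):
        # A real driver would call the Nova API here.
        self._servers.append(name)
        return f"os-{name}"
    def list_servers(self):
        return list(self._servers)

def deploy(driver: CloudDriver, name: str) -> str:
    # Portable code depends only on the common interface,
    # never on a specific provider's feature set.
    return driver.create_server(name)

print(deploy(AwsDriver(), "web"))        # aws-web
print(deploy(OpenStackDriver(), "web"))  # os-web
```

The trade-off described above is visible here: any capability one provider has and another lacks simply cannot appear on `CloudDriver`, which is why these frameworks tend to stop at basic compute semantics.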
Platform as a Service

Platform as a Service (PaaS) enables users to deploy apps to the cloud without being exposed to the cloud infrastructure, by providing a layer of abstraction. It is primarily suitable for web apps, particularly those adhering to the twelve-factor methodology.
OpenShift and CloudFoundry offer an abstraction layer that runs on a variety of cloud infrastructures, so applications deployed on these platforms run consistently on any supported cloud. Because PaaS operates at a high level of abstraction, it makes portable not only the way an application is deployed, managed, and monitored, but also the application binaries themselves.
The primary limitation of the PaaS approach is the barrier its high-level abstraction creates: some classes of applications need direct access to the underlying cloud services. These include real-time analytics, big data, and existing and legacy apps, which narrows the scope of PaaS as a cloud abstraction approach.
Pros:

- Simplicity (for applications that fit the twelve-factor model)
- Consistent experience as it pertains to lifecycle, scalability, security, high availability, and deployment across clouds
Cons:

- Only applicable to a restricted set of stacks and applications
- The abstraction hides advanced cloud features, including networking capabilities and monitoring
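One reason PaaS portability works for twelve-factor applications is that such apps externalize everything environment-specific into configuration the platform injects. A small illustrative sketch in Python (the variable names are my own, though `PORT` and `DATABASE_URL` are conventions many platforms use):

```python
import os

def load_config() -> dict:
    """Twelve-factor style configuration: read all
    environment-specific settings from environment variables,
    so the same application binary runs unchanged on any
    cloud the platform supports."""
    return {
        "database_url": os.environ.get("DATABASE_URL", "sqlite:///local.db"),
        "port": int(os.environ.get("PORT", "8080")),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
    }

config = load_config()
print(config["port"])
```

An app built this way never hard-codes a cloud-specific endpoint, which is exactly what lets the PaaS move it between infrastructures; an app that reaches past the abstraction for provider services loses that property.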
Orchestration & Automation
Essentially, orchestration means modeling the manual processes for deploying and managing a specific application, and then automating them. This automation applies to a wide range of applications, enabling portability at the application management layer. Since the application model is typically kept cloud-neutral, the same deployment can be executed on any cloud the orchestration engine supports.
For a heterogeneous deployment that includes containerized applications, Cloudify and Mesosphere are the better choices. If, on the other hand, containers are your first priority, Docker orchestration or Kubernetes is better suited.
Pros:

- Applicable to a wide range of applications, such as big data, networking, legacy applications, and complex stacks
- Portability of failover, auto-scaling, and continuous deployment processes
- Leverages the full power and capacity of underlying cloud services and APIs
Cons:

- Full portability requires some adjustment
- Modeling can be a tedious process
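The modeling idea above can be illustrated with a toy orchestrator in Python. This is a hypothetical sketch, not any real engine's API: the application topology is declared as a cloud-agnostic model of components and dependencies, and the engine derives the deployment order from it, so the same model can drive deployment on any supported cloud.

```python
# A cloud-agnostic application model: each component lists
# the components it depends on (illustrative names).
model = {
    "network":  [],
    "database": ["network"],
    "app":      ["network", "database"],
    "lb":       ["app"],
}

def deployment_order(model: dict) -> list:
    """Topologically sort the components so that each one is
    deployed only after everything it depends on."""
    order, seen = [], set()
    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for dep in model[node]:
            visit(dep)
        order.append(node)
    for node in model:
        visit(node)
    return order

print(deployment_order(model))
# ['network', 'database', 'app', 'lb']
```

The "modeling can be tedious" con is visible even at this scale: the value of the approach depends entirely on how completely the model captures the application's real deployment and management processes.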
Choosing the Right Combination
Portability discussions often present the above options as competing approaches, but I tend to disagree. Technically speaking, they can be combined in one way or another to get the best of all worlds. Of course this begs the question: which combination will yield the best outcome for portability? See below for a diagram of a layered approach, in this case a combination of containers, PaaS, and orchestration.
Containers enable the easy migration of portable, lightweight packages between cloud environments. PaaS provides the abstraction layer for handling simple apps. Orchestration provides a common automation layer: a portable way to manage the deployment, auto-scaling, self-healing, and general monitoring of big data, legacy, network (NFV), real-time analytics, and other complex apps. Containers can run directly on the cloud infrastructure via the orchestration layer, or indirectly via PaaS.
This layered approach enables the portability of the widest range of applications. It ensures portability not just of the application binary, but also the portability of the application management and maintenance.
Adding TOSCA To The Mix
The aforementioned approach will provide a good solution to ensure application portability. What about the orchestration service, though?
The portability challenge has effectively moved to the orchestration layer, and this is where TOSCA comes in to save the day. TOSCA provides a standard modeling language that can be used across a variety of cloud infrastructures.
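To give a flavor of what that standard modeling language looks like, here is a minimal sketch loosely based on the TOSCA Simple Profile in YAML. Node names and property values are illustrative; the node types shown (`tosca.nodes.Compute`, `tosca.nodes.WebServer`) come from the Simple Profile's normative types.

```yaml
tosca_definitions_version: tosca_simple_yaml_1_0

topology_template:
  node_templates:
    app_server:
      type: tosca.nodes.Compute
      capabilities:
        host:
          properties:
            num_cpus: 2
            mem_size: 4 GB
    web_app:
      type: tosca.nodes.WebServer
      requirements:
        - host: app_server
```

Because the template describes the topology in cloud-neutral terms, any TOSCA-compliant orchestrator can, in principle, deploy it on whichever infrastructure it supports.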
Cloud portability is a non-trivial challenge, and so far many attempts to address it have proven fairly limited. Part of the reason is that there is more than one way to approach the challenge, depending on the layer of the stack at which you target the application, and each approach comes with limitations that make it relevant only to a subset of use cases and application workloads.
Ultimately, my recommended strategy is a layered approach that leverages all of these solutions to enable portability for a wide range of applications and use cases. How to integrate them depends on a few considerations. Will you set a specific cloud as primary? If so, use API compatibility and extend it to the other clouds. If your environment is more heterogeneous, the layered approach outlined above is likely your best bet. Is your primary approach to containerize your entire environment? Then it's wise to consider a similar stack, with Kubernetes or Docker as your main orchestration.