In my previous post, I discussed how to achieve hybrid cloud in the real world for VMware and OpenStack – which was previously an involved undertaking and today, can be done fairly simply.
In this post, I wanted to put my money where my mouth is – and actually show a live demo of this type of hybrid cloud model in action. A use case we’re constantly presented with by Telecom and enterprises alike is the much-needed virtualization of network functions (i.e. VNFs or NFV), which is an ongoing pain point.
When we demonstrate the hybrid cloud model we usually do so through a simple web app based on a NodeJS frontend, and MongoDB backend (we call this our nodecellar app example), which can also be demoed in Docker containers.
The basic process is to install the Docker images, configure the network ports (and any other networking considerations required), and then just run the app. (Yes, it seriously is that simple.) That’s if you’re running locally. If you’d like to run on the cloud, say OpenStack or vCloud Air for the sake of the example, you would simply modify the blueprint to use a vCloud Air VM rather than a local VM. Obviously, when we run on a *real* cloud we would like to use more advanced security, networking and storage resources, so we’ll have to include those details in the blueprint as well. The rest of the process is the same.
You can read more about this VMware/OpenStack example, and also watch the video, in this follow-up post on TOSCA orchestration for VMware.
However, we know that the enterprise and Telecom NFV use case is often times much more complex – and so to demonstrate the kind of complexity involved with orchestrating really intricate networking functions – we built an on-demand “Skype”-like application – based on Metaswitch’s Clearwater, that we then ran on VMware vCloud Air and OpenStack – all using the same blueprint and UI for easy management.
So let’s get going, and talk about how we did this.
Let’s start with the actual topology of what could be considered a complex app. For the sake of demonstration, we chose an application composed of 12 microservices (user provisioning with Ellis, edge proxy using Bono, IP gateway using Sprout, a DNS, Cassandra backend – Homer, Homestead, and billing based on Ralf), and we configured all this using Chef as the CM tool. Here’s a picture of what that looked like:
And this is how we set to work orchestrating a full-blown app.
We started by creating the compute and network services for each of the 12 components listed in the diagram above. Clearwater comes with a set of preconfigured Chef cookbooks, so it made sense to use the Cloudify Chef Plugin to configure and install each of these services on the relevant VM.
On top of that, Cloudify adds a monitoring agent using Collectd, and sets up logging and monitoring using logstash and Elasticsearch.
The result of this TOSCA blueprint looks something like this:
The live demo walks you through this process step by step. To demonstrate that this service actually works, we conducted a real video chat using an open source SIP client named Jitsi.
The blueprint is broken into five main parts.
- Setting up the VMs
- Setting up the network
- Configuring and installing the software using Chef
- Adding logging and monitoring
- Adding the self-healing policies
Now we’re going to zoom in on each part of the blueprint to dive into how to do this.
- Setting up the VMs
In this first part we defined a new type of host and named it clearwater.nodes.Server. This type of host is derived from cloudify.vcloud.nodes.Server which is a specific type that is implemented as part of the Cloudify vCloud Air plugin.
vCloud Air VM definition:
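As a minimal sketch, a derived host type of this kind could be written as follows. The type names come from the text above; the property layout under `server` and every value in it are illustrative assumptions, not the settings used in the demo.

```yaml
node_types:
  # Application-specific host type that inherits from the vCloud Air plugin type
  clearwater.nodes.Server:
    derived_from: cloudify.vcloud.nodes.Server
    properties:
      server:
        default:
          # Placeholder values (assumptions): replace with your own catalog and template
          catalog: example-catalog
          template: example-ubuntu-template
          hardware:
            cpu: 2
            memory: 4096
```

Every node that uses `clearwater.nodes.Server` picks up these defaults unless it overrides them.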
In the rest of the blueprint we will point to this type for every service. For example, this is how the definition of the instance for the homestead service looks:
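A sketch of what such an instance definition could look like. The node name `homestead_vm` is an assumption, modeled on the `ellis_vm` and `bono_vm` names used elsewhere in this post.

```yaml
node_templates:
  # The VM for the homestead service; all VM settings come from the type defaults
  homestead_vm:
    type: clearwater.nodes.Server
```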
Similarly, we define a VM for each of the other services: Ellis, Bono (the edge proxy), Sprout, Homer and Ralf.
Note that the use of application-specific types that are derived from the plugin types is useful in the event that we want to reference a similar definition multiple times in our blueprint.
In this case, rather than specifying the properties of the vCloud Air VM instance per service, we defined a generic VM type that sets default values for the VM properties. This way, we only needed to reference the application type for each instance.
- Setting up the network
In a vCloud Air environment it is customary to use a common gateway service with a single floating IP for the application (vApp), and then use ports that point to all the instances from that common gateway.
We will start by defining the port. Since we will need a port for each of the services we will define a generic type that will set default values for our application and name it clearwater.nodes.Port.
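A hedged sketch of such a generic port type. The base type `cloudify.vcloud.nodes.Port` is assumed by analogy with `cloudify.vcloud.nodes.Server`, and the keys under `port` as well as the network name are assumptions to check against the vCloud plugin documentation.

```yaml
node_types:
  # Generic port type carrying the application-wide defaults
  clearwater.nodes.Port:
    derived_from: cloudify.vcloud.nodes.Port   # assumed plugin type name
    properties:
      port:
        default:
          network: example-app-network   # hypothetical network name
          ip_allocation_mode: dhcp       # assumed property and value
          primary_interface: false       # assumed property
```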
We will use the clearwater.nodes.Port definition to create a specific port instance per service. Here is how the port definition for the homestead service looks:
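For illustration, a per-service port could be declared like this (the node name `homestead_port` is hypothetical):

```yaml
node_templates:
  # Port instance for the homestead service; inherits the defaults set on the type
  homestead_port:
    type: clearwater.nodes.Port
```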
And here is how we connect this homestead port to the homestead VM:
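A minimal sketch of that connection, assuming the hypothetical node names `homestead_vm` and `homestead_port`:

```yaml
node_templates:
  homestead_vm:
    type: clearwater.nodes.Server
    relationships:
      # Attach the VM to the port declared for this service
      - type: cloudify.vcloud.server_connected_to_port
        target: homestead_port
```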
In TOSCA it is common to use the relationships model to connect the definition of one element to another. There are various types of possible relationships. In this case we used cloudify.vcloud.server_connected_to_port to connect the VM to the port.
In the Clearwater application, the Ellis service is the user management service. Since Ellis is a customer-facing service, we want to expose it through a public IP. And since the floating IP is assigned to a single service (Ellis) only, it doesn’t make sense to create an application-specific type as we did previously with the VMs and ports. Here’s how we create a floating IP:
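A sketch of a floating IP node under stated assumptions: the type name `cloudify.vcloud.nodes.FloatingIP`, the `floatingip` property block, and the gateway name are not confirmed by this post and should be checked against the vCloud plugin documentation.

```yaml
node_templates:
  # Public IP used only by the Ellis service, so the plugin type is used directly
  ellis_floatingip:                          # hypothetical node name
    type: cloudify.vcloud.nodes.FloatingIP   # assumed plugin type name
    properties:
      floatingip:
        edge_gateway: example-gateway        # hypothetical gateway name
```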
Now we’ll assign the floating-ip and port to the ellis_vm using the relationship model:
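Roughly, the assignment could look like the sketch below. The relationship name `cloudify.vcloud.server_connected_to_floating_ip` is assumed by analogy with `cloudify.vcloud.server_connected_to_port`, and the target node names are hypothetical.

```yaml
node_templates:
  ellis_vm:
    type: clearwater.nodes.Server
    relationships:
      # Expose Ellis through the public IP (assumed relationship name)
      - type: cloudify.vcloud.server_connected_to_floating_ip
        target: ellis_floatingip
      # Attach Ellis to its port on the application network
      - type: cloudify.vcloud.server_connected_to_port
        target: ellis_port
```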
- Install the software
Now that we created the VM instances, and connected them to the network, the next step is to install the software services in each of those VMs.
There are many ways in which we can install and configure software components, and Cloudify comes with a rich set of plugins that support Chef, Puppet, Docker, SaltStack or even a simple script model.
The Clearwater application happens to be configured using Chef, so in this specific case we will use the Cloudify Chef plugin. Using this plugin, we will call the Chef client and provide the Chef Cookbook and runlist configuration per service.
Here is how we will install the Ellis service by executing the Ellis role in the Clearwater cookbook:
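A hedged sketch of a Chef-managed node. The node type, the keys under `chef_config`, and all values are assumptions about the Chef plugin's schema; only the idea of running the Ellis role from the Clearwater cookbook comes from the text.

```yaml
node_templates:
  ellis:                                          # hypothetical node name
    type: cloudify.chef.nodes.SoftwareComponent   # assumed plugin type name
    properties:
      chef_config:
        # Placeholder connection details (assumptions)
        chef_server_url: https://chef.example.com
        validation_client_name: chef-validator
        environment: _default
        runlists:
          # Run the Ellis role when the node is created
          create: 'role[ellis]'
```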
To assign the Ellis software to the corresponding ellis_vm that we created in the previous step, we will use the contained-in relationship. This relationship tells Cloudify to run the Ellis Chef command as defined above in the ellis_vm.
- type: cloudify.relationships.contained_in
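In context, that relationship sits under the software node and targets the VM. A sketch, with the node name `ellis` and its type as assumptions:

```yaml
node_templates:
  ellis:
    type: cloudify.chef.nodes.SoftwareComponent   # assumed plugin type name
    relationships:
      # Run the Ellis Chef configuration inside ellis_vm
      - type: cloudify.relationships.contained_in
        target: ellis_vm
```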
Similarly we will use the same method to install all other services.
- Add logging and monitoring
Cloudify uses Diamond as a monitoring agent and Grafana and InfluxDB for handling the monitoring information and Elasticsearch for logging.
Most of the setup for logging is handled implicitly. We normally define the monitoring agent to enable the collection of the application metrics. Those metrics will be used as alerts for auto-healing and auto-scaling.
To add monitoring to a service we will need to add the following section under the VM definition section of the service.
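A sketch of such a section, assuming the interface and task names used by the Cloudify Diamond plugin. The collectors and the sampling interval are illustrative choices, not the ones used in the demo.

```yaml
node_templates:
  bono_vm:
    type: clearwater.nodes.Server
    interfaces:
      # Lifecycle of the monitoring agent on the VM
      cloudify.interfaces.monitoring_agent:
        install:
          implementation: diamond.diamond_agent.tasks.install
          inputs:
            diamond_config:
              interval: 10   # seconds between samples (illustrative)
        start: diamond.diamond_agent.tasks.start
        stop: diamond.diamond_agent.tasks.stop
        uninstall: diamond.diamond_agent.tasks.uninstall
      # Which metrics the agent collects
      cloudify.interfaces.monitoring:
        start:
          implementation: diamond.diamond_agent.tasks.add_collectors
          inputs:
            collectors_config:
              CPUCollector: {}
              MemoryCollector: {}
```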
Here, you can read more about how Cloudify monitoring works.
- Add Policies and Workflows
Now that we created the VMs and defined the network, installed our services, and added monitoring and logging we’re ready to take all this to the next step of automation, and that’s where we add the policies and workflows to the mix.
A workflow represents an operation that is executed on all or part of our application. There are a couple of built-in workflows that come out of the box with Cloudify for handling installation, uninstallation, and auto-healing.
Since the install and uninstall workflows operate on the entire application, they don’t require a specific definition in the blueprint. Auto-healing, on the other hand, is more application-specific, and we therefore need to define it as part of the application blueprint.
Policies are usually built out of one or more workflows, and are often triggered automatically by a given metric.
To define an auto-healing workflow we start first by declaring the workflow.
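A hedged sketch of such a declaration. The mapping path and parameter names are assumptions about Cloudify's default workflow implementation and may differ between versions.

```yaml
workflows:
  heal:
    # Points at the default implementation shipped with Cloudify (assumed path)
    mapping: default_workflows.cloudify.plugins.workflows.auto_heal_reinstall_node_subgraph
    parameters:
      node_instance_id:
        description: The failing node instance to heal
      diagnose_value:
        description: Reason for the heal
        default: Not provided
```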
Note that in this definition we’re only referring to the default workflow implementation.
Next we’ll assign the workflow to the relevant group in our application:
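A sketch of a group that triggers the heal workflow for `bono_vm`. The group, policy and trigger names are hypothetical, the metric pattern is a placeholder, and the policy and trigger types are assumptions based on Cloudify's built-in policy types.

```yaml
groups:
  autohealing_group:                 # hypothetical group name
    members: [bono_vm]
    policies:
      simple_autoheal_policy:        # hypothetical policy name
        type: cloudify.policies.types.host_failure
        properties:
          service:
            - example.metric.pattern   # placeholder for the monitored metric
        triggers:
          auto_heal_trigger:
            type: cloudify.policies.triggers.execute_workflow
            parameters:
              workflow: heal
              workflow_parameters:
                # Pass the failing instance to the heal workflow
                node_instance_id: { get_property: [SELF, node_id] }
                diagnose_value: { get_property: [SELF, diagnose] }
```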
Note that in this specific example we only included the bono_vm in this group. Obviously we could have included all the services of this application in the group as well, by configuring them in the same manner.
The previous post focused on the value of adding TOSCA-based orchestration to vCloud Air, whereas this post illustrates how this can be done through real-life working examples. The first example shows a simple use case using Docker, NodeJS and MongoDB. It demonstrates how simple it is to use the same TOSCA model to run on a single machine, as well as on the vCloud Air public cloud.
The second example points to a real life use case of setting up an entire video conferencing service, which is based on an open source project named Clearwater by Metaswitch, which is basically very similar to Skype. In the case of Clearwater it isn’t too hard to imagine what it takes to manage the deployment of such a complex video conference service today.
Now what if we’re a Telecom operator and we want to manage not just a single instance of that service but many instances of that service per customer? The complexity in this case hits the roof!
This example shows how extensively we can automate even a fairly complex stack, and create a production-ready environment that goes beyond the initial installation, and also includes monitoring, logging, self-healing, and all through a single deployment command; all while using TOSCA to build this level of automation. As you can see, the TOSCA model provides a fairly simple and elegant abstraction of how we define many details of this application, but at the same time keeps the level of verbosity to a minimum.
Now that the entire process is fully automated, we can actually, fairly simply, create a production ready Skype on-demand. Nirvana!
With that in our hands, it doesn’t take much of a leap to imagine the cost savings involved just by reducing these labor-intensive deployments. That said, that is not the right way to think about it, IMO. I would say that by making this complex deployment simple, we can now grow and evolve to the next level and create a new class of business services that were too complex to manage before. Being able to create a call center on-demand is just a first step in this thinking.