This post was originally published on InfoQ.com in February 2015.
Mapping the Current Docker Orchestration Landscape
This post follows an interesting post on Docker orchestration and why you need it. Its basic premise is that orchestration times container creation according to application and tier dependencies, and enables the containers to communicate with one another and pass runtime properties to each other. I’d like to take a deeper dive into that idea here.
It’s no news that containers represent a portable unit of deployment. Things get more complex because an application is often built out of multiple containers. What’s more, setting up a cluster of Docker containers can be fairly cumbersome because you need to make each container aware of the others and expose the intimate details they require to communicate. This is not trivial, especially if the containers are not on the same host. For example, think about setting up a Mongo or Cassandra cluster out of Docker containers: you need to know which ports to expose, which volumes to mount, and how to handle the other complexities of a clustered container environment.
These scenarios have instigated the demand for some kind of orchestrator. The list of container orchestrators is growing fairly fast, as noted in this excellent post on Docker containers + microservices, also referenced in this InfoQ article. I’ve listed just some of the known names in this category (obviously there are many more).
- Kubernetes – An open source tool from Google for orchestrating Docker containers, specifically designed around orchestration of microservices.
- Fleet – CoreOS Docker orchestration designed for installing containers on CoreOS.
- Docker – Docker recently announced Docker Swarm and Docker Compose, its own orchestration tools designed to run Docker clusters.
In addition, CoreOS recently announced Rocket, an alternative container format, so it will be interesting to see how this affects the orchestration landscape.
The Need for a Pure-Play Orchestration Specification
We often use shipping containers to describe the role of software containers as a standard packaging unit. Shipping containers are a great example of the benefit of standardization: without standard containers, the world of shipping goods would look completely different than it does today. It would not make sense for the specification of the shape, size and assembly of shipping containers to belong to a single container manufacturer, and the same holds for software containers, where it is very likely that we will have more than one "manufacturer". The recent Rocket announcement by CoreOS is just a first sign in this direction.
TOSCA to the Rescue
TOSCA (Topology and Orchestration Specification for Cloud Applications) is governed by OASIS, the standards body behind specifications such as SAML and OpenDocument. TOSCA orchestration is already fairly mature, with a proven track record and rapid development, and many organizations are betting on and contributing to its success. TOSCA is now beyond its second major revision, has been around for a couple of years, and is gaining traction in both commercial and open source projects such as Juju, Cloudify, IBM Cloud Orchestrator, and OpenStack Heat. It’s also being adopted by leading telecom vendors such as Alcatel-Lucent, Huawei, and Cisco.
The fact that TOSCA is backed by a standards body (OASIS) makes it a great platform for defining a standard container orchestration specification that is portable across various cloud environments and container providers.
In this post I want to provide an example that illustrates how easy it is to map the current Docker API specification into a portable TOSCA specification, and thus to describe complex Docker-based application topologies declaratively and intuitively.
TOSCA Blueprint for Docker
To make the initial mapping process simple, we used the docker-py Python client as the basis for the proposal below.
The following is a snippet from a TOSCA blueprint describing a Docker-based MongoDB server which is part of a larger application topology. The mapping is YAML-based, as per the latest TOSCA specification, and is offered as a suggested way to model Docker containers using TOSCA.
mongod:
  type: docker_container
  properties:
    image: dockerfile/mongodb
    command: mongod --rest --httpinterface --smallfiles
    exported_ports:
      - 27017
      - 28017
    port_bindings:
      - 27017: 27017
      - 28017: 28017
    volumes:
      - ~/my-host-dir, container-dir, rw
The TOSCA format is very similar to other kinds of dependency injection, with the main difference being that it is based on YAML and it’s more specific to the orchestration domain. As such, it also includes definitions of lifecycles, relationships, policies and plans that also describe the operational aspect of the application services.
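To make the mapping idea more concrete, here is a minimal Python sketch of how the properties of the mongod node might be reshaped into argument dictionaries of the kind the docker-py low-level client works with (an image, a command, exposed ports, port bindings, and volume binds). The function names `parse_volume` and `to_docker_args` are hypothetical, the direction of the port mapping is an assumption, and nothing here is taken from the actual Cloudify Docker plugin. The code only reshapes data and never contacts a Docker daemon.

```python
# Hypothetical sketch: reshape the `properties` of a docker_container node
# into dictionaries shaped like docker-py style arguments.
# It does not import docker and does not talk to a Docker daemon.


def parse_volume(entry):
    """Split a 'host-dir, container-dir, mode' string into its three parts."""
    host_dir, container_dir, mode = [part.strip() for part in entry.split(",")]
    return host_dir, container_dir, mode


def to_docker_args(properties):
    """Return (create_args, host_args) built from the blueprint properties."""
    create_args = {
        "image": properties["image"],
        "command": properties.get("command"),
        "ports": list(properties.get("exported_ports", [])),
    }

    # port_bindings is a list of single-entry mappings in the blueprint.
    # Assumption: each entry reads container port -> host port.
    port_bindings = {}
    for binding in properties.get("port_bindings", []):
        port_bindings.update(binding)

    # Each volume entry becomes host_dir -> {"bind": container_dir, "mode": mode}.
    binds = {}
    for entry in properties.get("volumes", []):
        host_dir, container_dir, mode = parse_volume(entry)
        binds[host_dir] = {"bind": container_dir, "mode": mode}

    host_args = {"port_bindings": port_bindings, "binds": binds}
    return create_args, host_args


# The mongod node properties from the blueprint snippet, as a Python dict.
mongod_properties = {
    "image": "dockerfile/mongodb",
    "command": "mongod --rest --httpinterface --smallfiles",
    "exported_ports": [27017, 28017],
    "port_bindings": [{27017: 27017}, {28017: 28017}],
    "volumes": ["~/my-host-dir, container-dir, rw"],
}

create_args, host_args = to_docker_args(mongod_properties)
print(create_args)
print(host_args)
```

Keeping the translation in one small function means the blueprint stays declarative, while the Docker-specific argument shapes live in a single place that could be swapped for another container runtime.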
Running a Simple Docker Example with TOSCA
To get a feel for how TOSCA-based orchestration works, I will use a simple TOSCA command line parser provided through Cloudify, an open source orchestration tool.
We will use the nodecellar application, which uses Node.js for the web app and MongoDB as the database. Each of these services will run in its own Docker container, as illustrated in the diagram below.
The nodecellar TOSCA blueprint will also take care of downloading the Docker images from Docker Hub on the first run and reusing them for subsequent installations and uninstallations. Once the images are cached, this allows us to spawn the entire application in roughly 10 seconds!
Step-by-step Guide for Running the Example
The example was tested on Ubuntu 12.04 (Precise) as the underlying operating system. The example comes with two blueprint implementations: one for single-host deployments and the other for cluster deployments on OpenStack. For the sake of simplicity, in this post I chose to focus on the single-host example.
To run the example, follow these steps:
Setting Up the Environment
Install Docker:
> curl -sSL https://get.docker.com/ubuntu/ | sudo sh
Check the installation with this command:
> sudo docker ps
Install Cloudify:
> pip install cloudify
Check the installation with this command:
> cfy --version
You can also use vagrant-box to get a pre-configured Cloudify image, or use other installation options as outlined here.
Set Up the TOSCA/Docker Example
Download the example:
Download and unzip the example from this link.
Set the current TOSCA blueprint to point to the example blueprint file:
> cfy local init -p blueprint/docker-singlehost-blueprint.yaml -i blueprint/cfy-local-inputs.json --install-plugins
Execute the install workflow:
> sudo cfy local execute -w install
This command will execute the install workflow on the blueprint that was set in the previous step.
The install and uninstall workflows are implicit workflows that are available with every blueprint. The install workflow walks through the TOSCA dependency graph and executes the lifecycle commands of each node in the order specified by the relationship and dependency definitions. The uninstall workflow does the same in reverse.
In this specific blueprint, we will download the Docker images on the first run (this may take a few minutes), run two Docker containers, one for the mongod instance and the other for the nodejs instance, and install the database and the application on those instances respectively.
Check the output:
> cfy local outputs
The output should print the host:port of the nodecellar application.
Open the application:
Open the following link in your browser, based on the hostname and port from the previous step:
http://<output host name>:<output port>
Note that if you’re running on a vagrant-box, you should change the value of the host_ip attribute from its default, ‘localhost’, to the IP address of the Vagrant VM.
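The way the install and uninstall workflows walk the dependency graph can be sketched in a few lines of Python. This is a simplified, sequential illustration under stated assumptions, not how Cloudify actually implements its workflows: the node names in `topology` are made up for the example and are not the names used in the nodecellar blueprint, and `install_order` and `uninstall_order` are hypothetical helpers.

```python
# Hypothetical sketch of dependency-ordered install and uninstall.
# `depends_on` maps each node to the nodes that must be installed before it.


def install_order(depends_on):
    """Return the nodes so that every node appears after its dependencies."""
    remaining = {node: set(deps) for node, deps in depends_on.items()}
    order = []
    while remaining:
        installed = set(order)
        # Nodes whose dependencies have all been installed already.
        ready = sorted(n for n, deps in remaining.items() if deps <= installed)
        if not ready:
            raise ValueError("unresolvable dependency (cycle or unknown node)")
        for node in ready:
            order.append(node)
            del remaining[node]
    return order


def uninstall_order(depends_on):
    """Uninstall in the reverse of the install order."""
    return list(reversed(install_order(depends_on)))


# Illustrative topology: two containers, a database, and a web application.
topology = {
    "mongod_container": [],
    "nodejs_container": [],
    "mongod": ["mongod_container"],
    "nodecellar": ["nodejs_container", "mongod"],
}

print(install_order(topology))
print(uninstall_order(topology))
```

In this sketch the containers come up first, then the database, and finally the web application that depends on both; uninstall simply reverses that list.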
With the current pace of IT infrastructure evolution, we need to assume that we will continue to have more than one container provider and orchestration engine. With this in mind, it only makes sense that the specification for the way we define and manage container clusters be standardized and supported by more than one container or orchestration provider. Otherwise the entire promise of containers as a portable “shipping” unit becomes questionable.
In addition, we need to assume that for many organizations it would be unrealistic to expect all of their workloads to run within containers. It is more likely that they will be running a mixed workload. Having a separate orchestration engine per workload type could easily become an operational nightmare. This is another reason for having a more independent orchestration engine that can support not just different containers, but also mixed environments that include non-containerized services based on Chef, Puppet, SaltStack, et al.
TOSCA is currently fairly advanced in its specification for orchestration, which makes it a strong candidate for becoming the standard blueprint definition for containers.
Having a standard orchestration format isn’t the only consideration for choosing TOSCA. Unlike most of the current container-based orchestrators, which deal almost exclusively with the initial setup and installation stage, TOSCA covers the entire application lifecycle, including post-deployment aspects such as monitoring, additional workflows (continuous deployment, scaling out, remediation), and policies that automatically trigger some of these workflows. It thus provides a more holistic orchestration definition.
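As a rough illustration of what "policies that trigger workflows" could mean in practice, the following Python sketch evaluates a few threshold policies against monitoring metrics and returns the workflows that should run. The policy fields, metric names, and workflow names are all hypothetical and are not taken from the TOSCA specification or from Cloudify.

```python
# Hypothetical sketch: threshold policies that select workflows to trigger.


def evaluate_policies(metrics, policies):
    """Return the names of workflows whose trigger condition currently holds."""
    triggered = []
    for policy in policies:
        value = metrics.get(policy["metric"])
        # A policy fires only when its metric is reported and above threshold.
        if value is not None and value > policy["threshold"]:
            triggered.append(policy["workflow"])
    return triggered


# Illustrative policies: scale out on high CPU, heal on repeated failures.
policies = [
    {"metric": "cpu_percent", "threshold": 80, "workflow": "scale_out"},
    {"metric": "failed_health_checks", "threshold": 2, "workflow": "heal"},
]

current_metrics = {"cpu_percent": 93, "failed_health_checks": 0}
print(evaluate_policies(current_metrics, policies))
```

The point of the sketch is only that the policy, like the topology, is data: the orchestrator decides when to run a workflow, and the blueprint declares under which conditions.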
The purpose of this exercise was to illustrate how simple it is to orchestrate Docker workloads using standard TOSCA YAML. I hope that this post will provide a good basis for discussion and brainstorming with the community.