Auto-Scaling your Apps with TOSCA & Cloudify – An Orchestration Tool Comparison Pt II of II
This is the second part of our Orchestration tool comparison. You can find part I here.
PLEASE NOTE: This blog post does not offer a fully working example of auto-scaling with TOSCA & Cloudify, only a theoretical example of how it would work. We hope to have a working example in the near future.
This section assumes basic knowledge of Cloudify and TOSCA. If you don't happen to have this knowledge, the following should help you understand what they're all about:

TOSCA is an evolving open standard for a templating language to orchestrate applications on the cloud. It is not bound to any specific cloud provider, and is in itself nothing but a templating language, with no engine to back it up. This is inherently different from Heat, which provides both a templating language and the orchestration engine behind it.
So just for context, and for those who don't know much about Cloudify: it is an orchestration engine that understands TOSCA templates and can interact with multiple cloud providers, and that is what we will be demonstrating in this section.
The most important concept to understand about TOSCA templates is that every single entity in your application topology is a node_template, which is of a certain node_type and has a lifecycle interface that describes exactly how to create, manipulate, and delete the entity.
A node_template can be thought of as the equivalent of a Heat resource, and a node_type is equivalent to the various Heat resource types. The lifecycle interface, however, has no corresponding concept in Heat. This fact will prove very important to us later on.
So again, let's dive right into an example. We will try to adapt the same use case from before, auto-scaling a WordPress server that connects to a static, shared MariaDB instance, to be written in TOSCA and consumed by Cloudify.
Let's start with a definition of a node_template for the host on which we want to install the MariaDB instance. In TOSCA, each logical entity in the application should be modeled separately, with relationships tying these entities to one another. In order to install the MariaDB instance on this host, we define a second node_template.
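Here is a minimal sketch of what these two node_templates might look like; the node names, properties, and script paths are illustrative, and the exact type names will vary with your plugin versions:

```yaml
node_templates:

  db_host:
    type: cloudify.nodes.openstack.Server
    properties:
      # Illustrative properties; the real type exposes the OpenStack
      # server's image/flavor configuration.
      image: { get_input: image_id }
      flavor: { get_input: flavor_id }

  mariadb:
    type: cloudify.nodes.DBMS
    interfaces:
      # Each lifecycle hook maps to a script holding the actual logic,
      # instead of cramming everything into user_data.
      cloudify.interfaces.lifecycle:
        create: scripts/mariadb/install.sh
        configure: scripts/mariadb/configure.sh
        start: scripts/mariadb/start.sh
    relationships:
      # Install the database on db_host specifically.
      - type: cloudify.relationships.contained_in
        target: db_host
```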
With Cloudify you don't have to use user_data to install the software; instead, you break the installation process into parts, and the lifecycle interface hooks each part to a script that holds the actual logic. Notice the relationships section, which tells the db to be installed specifically on the db_host. Let's move on to the scaling-related part.
So, like in the previous post, I'll recap: any auto-scaling process implementation should always answer three basic questions:

1. Which resource to scale?
2. What does the scaling process do?
3. When should the scaling process be triggered?
Q1: The Which
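Below is a sketch of what the host's node_template might look like. The Diamond plugin task paths and collector names follow the conventions of Cloudify's Diamond plugin, but treat the details as illustrative rather than as a verbatim, working blueprint:

```yaml
node_templates:
  # ...db_host and mariadb as defined earlier...

  wordpress_host:
    type: cloudify.nodes.openstack.Server
    interfaces:
      # Install and manage the Diamond monitoring agent on the host.
      cloudify.interfaces.monitoring_agent:
        install:
          implementation: diamond.diamond_agent.tasks.install
          inputs:
            diamond_config:
              interval: 10   # seconds between metric collections
        start: diamond.diamond_agent.tasks.start
        stop: diamond.diamond_agent.tasks.stop
        uninstall: diamond.diamond_agent.tasks.uninstall
      # Configure the agent with collectors that gather host metrics.
      cloudify.interfaces.monitoring:
        start:
          implementation: diamond.diamond_agent.tasks.add_collectors
          inputs:
            collectors_config:
              CPUCollector: {}
              MemoryCollector: {}
              DiskUsageCollector: {}
```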
Here we defined a pretty cool node_template. It is of type cloudify.nodes.openstack.Server, and it has some additional interfaces that give it monitoring capabilities. We can see that cloudify.interfaces.monitoring_agent takes care of installing a monitoring agent called Diamond, and that cloudify.interfaces.monitoring configures the agent with various collectors that gather data from the host. Remember that with OpenStack Heat all of this configuration exists as well; it is just hidden from the user.
Now let’s define the actual WordPress application.
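Again as a sketch (the HttpdCollector configuration follows Diamond's conventions, but the URL, type names, and script paths are illustrative):

```yaml
  wordpress:
    type: cloudify.nodes.WebServer
    interfaces:
      cloudify.interfaces.lifecycle:
        create: scripts/wordpress/install.sh
        configure: scripts/wordpress/configure.sh
        start: scripts/wordpress/start.sh
      # Application-level monitoring: collect Apache request metrics.
      cloudify.interfaces.monitoring:
        start:
          implementation: diamond.diamond_agent.tasks.add_collectors
          inputs:
            collectors_config:
              HttpdCollector:
                config:
                  urls: "wordpress http://localhost/server-status?auto"
    relationships:
      - type: cloudify.relationships.contained_in
        target: wordpress_host
      - type: cloudify.relationships.connected_to
        target: mariadb
```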
Notice that we are defining cloudify.interfaces.monitoring on this node_template as well, which tells the monitoring agent on the host to add an HttpdCollector, one of the many built-in collectors in Diamond. This is awesome, because we can later use these metrics to do some intelligent, application-specific scaling.
Q2: The What
Now we want to understand exactly what the scaling process will do. In Cloudify, every process is thought of as a workflow (to learn more about this, you can read the post "What is a Workflow?"), which is essentially an automation process algorithm. Each workflow is composed of invocations of the interface operations associated with a specific node_type. The workflow definition itself is part of the template:
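In the template, that declaration looks roughly like this; the mapping path follows the convention of Cloudify's built-in workflows, and the parameter names match the scale workflow used later in this post:

```yaml
workflows:
  scale:
    # Points at the Python method implementing the workflow.
    mapping: default_workflows.cloudify.plugins.workflows.scale
    parameters:
      node_id:
        description: Which node to scale
      delta:
        description: How many instances to add (or remove, if negative)
        default: 1
```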
And the workflow code is a Python method written with the Cloudify Workflow API. Let's see a small snippet:
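Something along these lines, simplified for readability; the deployment-modification calls mirror the general shape of the Cloudify Workflow API, but exact signatures differ between versions, and the real built-in workflow handles many more edge cases:

```python
# Simplified, illustrative sketch of a scale workflow written
# against the Cloudify Workflow API.
from cloudify.decorators import workflow


@workflow
def scale(ctx, node_id, delta=1, **kwargs):
    node = ctx.get_node(node_id)
    # Ask the deployment for `delta` additional instances of the node.
    modification = ctx.deployment.start_modification(
        {node_id: {'instances': node.number_of_instances + delta}})
    try:
        graph = ctx.graph_mode()
        for instance in modification.added.node_instances:
            # Run the standard lifecycle chain on each new instance;
            # this is where scaling a MariaDB node and scaling a
            # WordPress node diverge, since each node implements the
            # lifecycle operations differently.
            sequence = graph.sequence()
            sequence.add(
                instance.execute_operation(
                    'cloudify.interfaces.lifecycle.create'),
                instance.execute_operation(
                    'cloudify.interfaces.lifecycle.configure'),
                instance.execute_operation(
                    'cloudify.interfaces.lifecycle.start'))
        graph.execute()
        modification.finish()
    except Exception:
        # Undo the partial scale-out if anything failed.
        modification.rollback()
        raise
```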
Remember that every node_template implements the lifecycle operations differently, and these implementations are what embody the difference between scaling a MariaDB instance and scaling a WordPress instance; the process itself remains the same. Also, because this is all part of the template itself, and not of the workflow engine, a user can customize this process however they see fit.
Q3: The When
This part is what TOSCA defines as Groups and Policies. However, as of this writing, policies haven't really been fully designed yet. Cloudify has its own version, which might find its way into the official spec eventually, but for now, this is how it looks:
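A sketch of these groups and policies, using type names in the style of Cloudify's threshold policy (the exact property names may differ between versions):

```yaml
groups:
  autoscaling_group:
    # Only metrics reported for the wordpress node are examined.
    members: [wordpress]
    policies:
      scaleup_one_instance:
        type: cloudify.policies.types.threshold
        properties:
          service: ReqPerSec     # the metric to watch
          threshold: 1000        # trigger once the value exceeds this
          stability_time: 60     # ...for at least 60 seconds
        triggers:
          scale_up:
            type: cloudify.policies.triggers.execute_workflow
            parameters:
              workflow: scale
              workflow_parameters:
                node_id: wordpress_host   # add another host instance
                delta: 1
```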
Let's break down the elements to understand what's going on.
For the most part, the terms used here are relatively familiar. We have a group called autoscaling_group that defines in its members which nodes will be considered for examination. Notice that we are specifying the wordpress node, and not wordpress_host, since we are interested in metrics that are specific to the application, not the host.
We also define the scaleup_one_instance policy, which instructs the Cloudify engine to trigger some action once the ReqPerSec metric value stays above the 1000 threshold for a measurement period of at least 60 seconds. The action the engine will take is encapsulated in the triggers section of the scaleup_one_instance policy, which declares that the scale workflow should be executed with the given parameters; in our case, these parameters tell the engine to add one more wordpress_host instance.
On the backend side, Cloudify stores all of the metrics in a time series database called InfluxDB, which can be easily queried. Cloudify also provides a very elegant Python REST client. With this client you can very easily trigger any kind of workflow you like:
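For example, assuming a deployment named wordpress_deployment (an illustrative name); the CloudifyClient and its executions.start call come from the cloudify-rest-client package:

```python
from cloudify_rest_client import CloudifyClient

# Point the client at your Cloudify manager.
client = CloudifyClient('MANAGER_IP')

# Trigger the scale workflow with the same parameters the
# policy above would pass.
client.executions.start(
    deployment_id='wordpress_deployment',
    workflow_id='scale',
    parameters={'node_id': 'wordpress_host', 'delta': 1})
```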
This is of course also available via the REST API. This means that if a scaling decision depends on a certain calculation that is not exposed by the policies, it should be fairly easy to implement it separately.
Summary
So, what did we learn from all of this? The purpose of this post series was to understand how one can perform automatic scaling on OpenStack, as well as understand the current gaps in the implementations and see what can be improved.
We saw two different methods, and I think it is a fair bet to say that you won't find any good tools out there that do this in a completely different manner. So it's safe to say that what we just saw covers pretty much all there currently is.
The first method used the native OpenStack orchestration tool, aka Heat. I think that Heat does a very good job at what it was initially built for, which is orchestrating OpenStack infrastructure resources. It still feels like its creators don't really live and breathe application orchestration, which makes it difficult to manage and scale applications rather than just infrastructure. The fact that the scaling process is hard-coded inside the engine might prove to be a serious limitation for complex use cases, but it is actually an advantage for the most common ones.
The second way was by using an open source orchestration engine called Cloudify, which adheres to the TOSCA open standard. Using this tool gives you more native integration with applications of various sorts, but less integration with OpenStack resources. You might find that some resource types are not yet supported, but overall it has very good coverage of the OpenStack API. Using Cloudify, you have full control over the scaling process: you can extend and modify the built-in workflow to suit your specific needs, which is a great ability to have.
Granted, this will require some Python coding at the moment, but plans are being made to make this workflow composition completely declarative. Another thing worth mentioning is the fact that Cloudify is completely cloud agnostic: it does not rely on the built-in type definitions or monitoring capabilities of a specific cloud provider. This is a direct result of using a standard like TOSCA, which makes all of this possible through the concept of a lifecycle interface. It means that you can take your Cloudify installation and templates, and migrate or extend your workload to a different cloud provider with no changes to the scaling-related aspects of the template.
I hope this gives you fairly extensive coverage of the options out there. Eventually, you will have to think carefully about your use case and choose the approach that best fits what you're looking to achieve. That said, regardless of what that is, you will most likely find a solution, which is very good news.