Nodecellar Gets Real (Distributed)

In a previous post titled “Deployment Composition In Cloudify”, I described using a custom type/plugin to compose multiple blueprints. This post describes an expansion of that effort to make nodecellar into a real distributed app. In essence, it is a concrete implementation of the blueprint described in Nati Shalom’s recent post on various approaches to application orchestration. The same deployment proxy is used as before, but the focus here is on the growth of the composed blueprints: mongo and nodejs in an OpenStack environment.

Requirements
For the MongoDB blueprint to be more realistic, it needs to be a sharded, replicated, distributed cluster, not a single mongod instance. To quickly review the MongoDB architecture, there are a few runtime components/services:

  • mongod – holds the data
  • mongoc – the config server; holds the cluster configuration
  • mongos – the shard router; directs queries to the appropriate shard

A minimal replicated scenario requires a couple of mongod servers and three mongoc servers. The mongos router is a lightweight service that ideally runs near the client; in our case the client is nodejs, so the basic layout places a mongos alongside each nodejs host, with dedicated hosts for the mongod and mongoc services.



Since we’re moving to OpenStack, there is more complexity in the blueprints related to setting up security groups and floating IPs, but otherwise they are straightforward to understand. The mongoc and mongod hosts need a security group so they can be reached by the mongos nodes in the nodejs blueprint. On the nodejs side, the hosts running nodejs need floating IPs so they can be accessed from the internet.

Orchestration Details – mongo blueprint

The orchestration should proceed as follows:

  • Start mongod hosts and mongoc hosts (in no particular order)
  • Install and start mongod and mongoc services as hosts become available
  • Wait for the above steps to complete, publish outputs, and configure replica sets

The first two steps are fairly unremarkable: standard Cloudify precedence defined by relationships. The final step exploits Cloudify relationships by defining a node that doesn’t inhabit a VM, but exists only to coordinate and publicize the rest of the action. In the blueprint, this node is called joiner.

joiner uses the Cloudify “connected_to” relationship both to force ordering and to get information from the mongod and mongoc nodes. The node itself is just a “cloudify.nodes.Compute” node with the “install_agent” property set to false. This has the effect of creating an environment on the Cloudify management server where the coordination can take place, without the need to start an actual VM. When using this technique, it is imperative that any interface implementations use the “central_deployment_agent” executor.

Besides waiting for everything else to complete, joiner has the goal of publicizing the IP addresses and ports of all the mongo services in the runtime properties “cfghosts” and “dbhosts”, both comma separated lists. It does this by using the relationship scripts to put each host’s IP and port in its own runtime property on joiner, keyed by a prefix string and the instance id of the target node.
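
Here is a minimal sketch of what such a relationship-side script can look like, assuming the target publishes its address in an “ip” runtime property and its port in a “port” node property (the real blueprint’s key names and file layout may differ):

```python
# Sketch of a relationship script running on the joiner side of a
# "connected_to" relationship, under the central_deployment_agent executor.
# Key names ("ip", "port", "dbhost_") are illustrative assumptions.
from cloudify import ctx

# Address and port of the mongod (or mongoc) instance this relationship
# points at.
target_ip = ctx.target.instance.runtime_properties.get('ip')
target_port = ctx.target.node.properties.get('port', 27017)

# Store one runtime property per connected instance, keyed by a prefix plus
# the target's instance id, so parallel relationship operations never write
# to the same key.
key = 'dbhost_{0}'.format(ctx.target.instance.id)
ctx.source.instance.runtime_properties[key] = '{0}:{1}'.format(target_ip, target_port)
```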

After all the relationship scripts have fired, joiner has a runtime property for each mongoc and mongod host in the deployment. The “start” lifecycle script then iterates through these runtime properties using the REST API, since the ctx object provides no equivalent capability.
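
A minimal sketch of that start script, assuming it runs on the manager (central_deployment_agent) and that the per-host properties use prefixes like those above (the prefixes and manager address are illustrative):

```python
# Sketch of joiner's start script: read this instance's runtime properties
# back through the REST API and aggregate them into comma separated lists.
from cloudify import ctx
from cloudify_rest_client import CloudifyClient

# Running on the management server, so the REST API is reachable locally.
client = CloudifyClient('localhost')
instance = client.node_instances.get(ctx.instance.id)

dbhosts = [v for k, v in instance.runtime_properties.items()
           if k.startswith('dbhost_')]
cfghosts = [v for k, v in instance.runtime_properties.items()
            if k.startswith('cfghost_')]

# Publish the aggregated lists; the blueprint's "outputs" section exposes them.
ctx.instance.runtime_properties['dbhosts'] = ','.join(dbhosts)
ctx.instance.runtime_properties['cfghosts'] = ','.join(cfghosts)
```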

Note that this technique can be used in any situation where you need to aggregate information from connected nodes while avoiding concurrency issues. Once the required values are aggregated, they are set on runtime properties that are exposed by the blueprint’s “outputs” section.

Orchestration Details – nodejs blueprint

As discussed in my previous post, the nodejs blueprint uses the deployment proxy plugin to wait for, and then acquire, values from the mongo blueprint/deployment. These output values are copied to runtime properties on the proxy node for use elsewhere in the deployment; in this case they are needed to initialize and load the database.
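
The proxy’s internals were covered in the previous post; conceptually it boils down to something like the following sketch, which polls the mongo deployment’s outputs through the REST client and copies them into runtime properties (the deployment id, timeout, and client setup are assumptions for illustration):

```python
# Conceptual sketch of the deployment proxy's wait-and-copy behavior; the
# actual plugin's implementation and configuration differ.
import time
from cloudify import ctx
from cloudify_rest_client import CloudifyClient

client = CloudifyClient('localhost')

def wait_for_outputs(deployment_id, keys, timeout=900, interval=10):
    # Poll the target deployment until all requested outputs are populated.
    deadline = time.time() + timeout
    while time.time() < deadline:
        outputs = client.deployments.outputs.get(deployment_id).outputs
        if all(outputs.get(k) for k in keys):
            return outputs
        time.sleep(interval)
    raise RuntimeError('timed out waiting for outputs of ' + deployment_id)

# "mongo" is a hypothetical deployment id; dbhosts/cfghosts are the outputs
# published by the joiner node in the mongo blueprint.
outputs = wait_for_outputs('mongo', ['dbhosts', 'cfghosts'])
for key, value in outputs.items():
    ctx.instance.runtime_properties[key] = value
```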

In the simple nodecellar blueprints, the database is loaded from the webapp itself, which of course is slightly unrealistic. Since the database load must occur after the mongos nodes are started, we face a situation similar to the one joiner was created to solve. Here a node called winedb serves the same purpose: it is basically a container for scripting, implemented as an agentless “cloudify.nodes.Compute” node as discussed above. winedb depends on the other nodes in the blueprint, so its scripts are invoked last. It performs the following tasks (sketched in code after the list):

  • it collects the various mongos hosts via the technique described in the previous section
  • it gets the mongoc and mongod hosts from the proxy
  • it adds a shard for each replica set found
  • it loads the wine database
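
A minimal sketch of the shard registration and database load, assuming the mongos and mongod host lists were stashed in winedb’s runtime properties as described above (property keys, the replica set name, and file paths are illustrative, not the project’s actual values):

```python
# Sketch of winedb's script: point the mongo shell at a mongos router,
# register the replica set as a shard, and load the wine data.
import subprocess
from cloudify import ctx

# mongos hosts gathered via relationships, and the mongod hosts copied from
# the proxy (comma separated, as published by joiner). Key names are assumed.
mongos_host = ctx.instance.runtime_properties['mongos_hosts'].split(',')[0]
dbhosts = ctx.instance.runtime_properties['dbhosts']  # e.g. "10.0.0.5:27017,10.0.0.6:27017"

def mongo_eval(js):
    # Run a javascript snippet against the chosen mongos router.
    subprocess.check_call(['mongo', '--host', mongos_host, '--eval', js])

# Register the replica set as a shard; "rs0" is a hypothetical replica set
# name, and a real deployment might loop over several replica sets here.
mongo_eval('sh.addShard("rs0/{0}")'.format(dbhosts))

# Load the wine data through mongos; database, collection, and file names
# are assumptions for illustration.
subprocess.check_call(['mongoimport', '--host', mongos_host,
                       '--db', 'nodecellar', '--collection', 'wines',
                       '--file', '/tmp/winedb.json', '--jsonArray'])
```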

Conclusion

This project took the stock nodecellar example and made it more realistic, but it is still oriented toward a one-time demo. A real implementation would no doubt separate the creation, destruction, and loading of the database into separate workflows, and would add checks to avoid clobbering an existing database (if desired). On the other hand, there is no “right answer” for a blueprint; it is very much a product of a specific environment and management policies. Besides taking the next step with “NodeCellar”, this post illustrated a few techniques that can be useful in many common automation scenarios with Cloudify. The source for both projects is located here (https://github.com/dfilppi/cfy3). Comments and suggestions are welcome, as always.
