Load Balancing with HAProxy on the Cloud
The interwebs is basically our fantasy world. Here we can develop, program, automate, and hard-code all of the things, in a way that we’re not really able to with real stuff. I may not be able to juggle 6 bowling pins, but I can load balance nodes in a web application.
HAProxy is one of the most popular open source applications out there for distributing load across multiple servers.
“But I hate configuring”, you say, “and sometimes things need tweaking, small things, like IP addresses and hostnames. Why isn’t there something that can just do it for me?”
Cloudify makes it easier to set up load balanced environments with minimal hard labor. And it actually helps you get a better idea of the big picture, because you can put your entire application in a single blueprint – the networks, the security groups, the web servers, the application, the databases, AND – the load balancer.
No hopping between servers, just one HAProxy blueprint.
Let’s use the Nodecellar application as a basic example. It uses a NodeJS application server and a MongoDB backend. It’s a great example, because these two technologies are designed to scale, and adding a load balancer puts them to work.
In the real world, you probably wouldn’t have a single HAProxy load balancer and a single MongoDB. You might have multiple shards, a pair of HAProxy servers, and an additional service tracking their availability. But for the sake of an example, this keeps things simple.
This example uses OpenStack as the deployment environment. We’ll update the repo soon with other cloud providers. Let’s look at what resources are needed for OpenStack:
First, our orchestration topology. There are four virtual machines, spread across three host nodes:
- Two virtual machines that host the NodeJS apps (nodejs_host).
- The virtual machine that hosts the MongoDB (mongod_host).
- A virtual machine that hosts the HAProxy load balancer (haproxy_frontend_host).
On the application layer, we have:
- Two nodejs server nodes, each installed on its own nodejs_host virtual machine.
- One mongodb database server, hosted on mongod_host.
- One http_in haproxy load balancer hosted on haproxy_frontend_host.
Suppose you want three nodejs servers instead of two. Or seven. You just need to set the deploy count under instances on the nodejs_host in the blueprint:
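A minimal sketch of what that looks like, assuming the host uses the standard OpenStack server type (the blueprint’s actual type name may differ):

  nodejs_host:
    type: cloudify.openstack.nodes.Server    # actual type name may differ in the blueprint
    instances:
      deploy: 3    # deploys three nodejs_host VMs instead of the default two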
There are also three security groups and a floating public IP address for the load balancer. All of these components are needed to make this application load balance. It’s not a lot, but try keeping it all in your head. The Cloudify blueprint lets you document all of it in one standardized place, following the TOSCA guidelines.
Let’s take a look at the node_template for the http_in load balancer. First, notice that it inherits from the haproxy.nodes.Proxy node type. This node type is defined in the types/haproxy.yaml file, which is included in the repository that you will download.
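It looks something like this (the property values shown here are illustrative, not necessarily the defaults in the repo):

  http_in:
    type: haproxy.nodes.Proxy
    properties:
      default_backend: servers    # must match a backend declared in haproxy.cfg.template
      mode: http
      port: 80
      timeout_connect: 5000
      timeout_client: 50000
      timeout_server: 50000
    relationships:
      - type: cloudify.relationships.contained_in
        target: haproxy_frontend_host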
Let’s review the definitions of these properties:
- default_backend: The name of the backend that you declare in your haproxy.cfg template.
- mode: The protocol the proxy operates in, such as http or tcp.
- port: The port that the load balancer should listen on.
- timeout_connect: The maximum time to wait for a server connection to succeed.
- timeout_client: The maximum inactivity time allowed on the client side.
- timeout_server: The maximum inactivity time allowed on the server side.
These are the essential settings required by the configuration file template (haproxy.cfg.template). Since this is open source, you are free to modify the code to suit whatever environment you want to build.
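To make that concrete, here’s a rough sketch of what a rendered haproxy.cfg could look like; the backend name and server addresses are placeholders:

# illustrative rendering, not the actual template from the repo
defaults
    mode http                  # the 'mode' property
    timeout connect 5000ms     # 'timeout_connect'
    timeout client 50000ms     # 'timeout_client'
    timeout server 50000ms     # 'timeout_server'

frontend http_in
    bind *:80                  # the 'port' property
    default_backend servers    # the 'default_backend' property

backend servers
    server nodejs_1 10.0.0.5:8080    # placeholder addresses, one per NodeJS host
    server nodejs_2 10.0.0.6:8080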
You’ll notice that there’s a bit more to the node type than these properties. That part describes monitoring, which we’ll discuss shortly.
First, let’s get this blueprint running.
I assume here that you have installed Cloudify in a virtual environment and have a manager running in OpenStack. If not, get started with OpenStack.
First verify that you are using the right version of Cloudify. Both the CLI and your manager should be running the same version.
(3.1)$ cfy --version
Cloudify CLI 3.1.0 (build: 85, date: )
Cloudify Manager 3.1.0 (build: 85, date: ) [ip=the-ip-address-of-your-cloudify-manager]
Initialize the environment if you haven’t already:
(3.1)$ cfy init
(3.1)$ cfy use -t [the-ip-address-of-your-cloudify-manager]
Now, clone the GitHub repo or download the archive:
(3.1)$ git clone https://github.com/EarthmanT/cloudify-haproxy-blueprint.git
(3.1)$ wget https://github.com/EarthmanT/cloudify-haproxy-blueprint/archive/3.1.zip
If you downloaded the archive, unzip it first. Then change into the directory and upload the blueprint:
(3.1)$ cd cloudify-haproxy-blueprint-3.1
(3.1)$ cfy blueprints upload -p openstack-nodecellar-example-blueprint.yaml -b haproxy
Copy the inputs.json.template file and, if desired, customize it:
(3.1)$ cp inputs.json.template inputs.json
Create the deployment:
(3.1)$ cfy deployments create -b haproxy -d haproxy -i inputs.json
(3.1)$ cfy executions start -w install -d haproxy
While that runs, let’s reopen the blueprint and look at the rest of the http_in node_template.
An interface is a way to map operations in our blueprints to tasks in our plugins. In this example, we’ve included the diamond_plugin, which can be used to log monitoring metrics to Cloudify’s internal messaging.
Using a special HAProxy Diamond collector, we can track the performance of our HAProxy server. The only required parameters (see the sketch after this list) are:
- enabled: Enables collection of these metrics.
- url: The URL of the HAProxy stats page in CSV format. This URL is configured in the haproxy.cfg.template file.
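Wired into the http_in node_template, the collector configuration looks roughly like this; the stats port and path here are assumptions and must match whatever stats URL you set in haproxy.cfg.template:

    interfaces:
      cloudify.interfaces.monitoring:
        start:
          implementation: diamond.diamond_agent.tasks.add_collectors
          inputs:
            collectors_config:
              HAProxyCollector:
                config:
                  enabled: true
                  # assumed stats endpoint; must match haproxy.cfg.template
                  url: http://127.0.0.1:9000/haproxy_stats;csv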
Run a stress test. There are plenty of tools to choose from on the web.
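For example, with ApacheBench, assuming you have it installed (substitute your load balancer’s floating IP):

$ ab -n 10000 -c 50 http://[the-floating-ip-of-your-load-balancer]/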
Open up your manager and go to the deployment. Cloudify’s manager UI has a dashboard that you can configure. Add the monitoring metrics for the HAProxy frontend and for the NodeJS backends. Now use your stress test tool to send real traffic to the application, and you can watch how HAProxy balances the load between the two servers. Nifty.
Some of the interesting metrics are bout (bytes out) and req_tot (total requests). Make sure to select both the frontend HAProxy and the backend NodeJS servers.
Now, watch your monitoring in action: various graphs display your application’s performance data.
Maybe you want to run your own stress test now.
With minimal effort, you just created a load balanced environment, and now you have metrics you can use to benchmark performance!
Are you so excited?
So what you’ve just done in five minutes is create a load balanced environment with Cloudify and HAProxy. You even monitored it to see that it’s actually performing as it should be.
In five minutes.
So excited.
Play around with the example here, and be sure to give us feedback in the comments.