Mongo DB Sharding in the Pet Clinic Application recipe
Can you shard on a cloud platform in a galaxy far far away?
In general, managing a “sharded Mongo” setup is not an easy task but it’s much more difficult to do it on any cloud.
When you deploy a sharded Mongo on a cloud (any cloud…), you have quite a few barriers that you need to overcome :
* How to ensure that the Mongo Routing Server finds the location of the shards
* Does this router know where the Mongo Config Server resides ?
* How can you verify that your sharding actually works?
* Will you be able to monitor your application’s performance?
* Can you scale out or in ?
Cloudify enables you to achieve all of the above and more.
So, in order to understand how to get this done, I will give you a more thorough walk-through into the MongoDB recipe in the Pet Clinic Application recipe that we described in the “Quick Start Guide”:/guide/2.7/qsg/qsg.html.
What is Database Sharding?
Database sharding can be simply defined as a “shared-nothing”: partitioning scheme for large databases across a number of servers, making new levels of database performance and scalability achievable.
Another way of “looking” at it, is :
Breaking your database down into smaller chunks called “shards” and spreading those shards across a number of distributed servers.
MongoDB Sharding
MongoDB supports an automated sharding/partitioning architecture, enabling horizontal scaling across multiple nodes.
A MongoDB shard cluster consists of two or more shards, one or more configuration servers, and any number of routing processes to which the application servers connect.
Each shard consists of one or more servers and stores data using MongoD processes (MongoD is the core MongoDB database process).
In a production environment, each shard will consist of multiple servers to ensure availability and automated failover.
About the Pet Clinic Application
The Pet Clinic Application presented in the Quick Start Guide is a port to Grails of the standard “Java based Spring framework PetClinic sample app”:http://static.springsource.org/docs/petclinic.html.
It uses MongoDB instead of MySQL as the backend store, and leverages the Grails GORM bindings to MongoDB. The users of the application are employees of the clinic who need to view and manage information regarding veterinarians, pet owners, and their pets.
The application’s production environment is based on a two-tier architecture:
# Web and Business Logic Tier – Comprised of the Grails application deployed on Apache Tomcat.
# The Data Tier – Comprised of “a sharded MongoDB cluster(Sharded MongoDB)”, which is comprised itself of the following components:
## “Configuration Server(MongoDB Configuration Server)” Stores the cluster’s metadata, which includes basic information on each shard server and the chunks contained therein.
## Routing Processes (MongoS): A routing and coordination process that mediates between client requests and the actual mongod instances. When receiving client requests, the MongoS process routes the request to the appropriate MongoD instance(s) and merges any results sent back to the client.
## MongoD instances (AKA shards): In our case, there are two instances of MongoD to store the application data.
Obviously, the amount of instances for each of the above components can easily be changed, by modifying the numInstances attribute of the relevant Cloudify recipe file (i.e: <service-name>-service.groovy).
*Note*: When you install MongoDB on Windows 7, you need to make sure that you login as an administrator. See “Installing MongoDB on Windows 7”
Our Pet Clinic Topology
As mentioned above, in our recipe we’re going to use: two shards (MongoD), one configuration server (MongoConfig), one routing server (MongoS) and one Tomcat.
This is our Pet Clinic Application’s Topology:
h2. Our Pet Clinic Recipe Anatomy
The following diagram depicts the application recipe file and folder structure:
Mongo services in the Pet Clinic Recipe
Let’s take a closer look at the Mongo entities in the PetClinic application recipe.
# The Shards (mongod) recipe folder
# The Configuration Server (mongoConfig) recipe folder
# The Routing Processes (mongos) recipe folder
In each of the above folders you will find all the files required to run the corresponding service.
A “service” in this context is an application tier – e.g. web container, database, etc.
The Mongo Data Shards (mongod) Recipe
The Mongo Data Shards (mongod) Recipe does the following:
* Calculates its port according to its instance ID and properties file and notifies the serviceContext, so that the other services will be able to access it.
* Downloads, unzips and installs the Mongo binaries on the VM
* Starts the Shard Server process.
* Detects a successful startup using the MongoLivenessDetector plugin
* Looks for metrics (Active Read Clients, Active Write Client etc.) to be displayed in the Web UI.
The mongod service file is located at <cloudify root>/recipes/apps/petclinic/mongod/mongod-service.groovy.
The following is the installation event handler of the mongod service:
And here’s the start event handler of the mongod service:
The Mongo Config Server (mongoConfig) Service Recipe
The Mongo Config Server (mongoConfig) Server recipe does the following:
* Calculates its port according to its instance ID and properties file and notifies the serviceContext, so that the other services will be able to access it.
* Downloads, unzips and installs the Mongo binaries on the VM.
* Starts the Configuration Server process.
* Detects a successful startup using the MongoLivenessDetector plugin.
* Looks for metrics (Active Read Clients, Active Write Client etc.) to be displayed in the Web UI.
The mongo-config service file is located at <cloudify root>/recipes/apps/petclinic/mongoConfig/mongoConfig-service.groovy.
The installation event handler of the mongoConfig service is similar to the mongod service.
Here’s the start event handler of the mongoConfig service:
The Mongo Sharding Server (mongos) Recipe
The Mongo Router (mongos) recipe does the following:
* Calculates its port according to its instance ID and properties file and notifies the serviceContext, so that the other services will be able to access it.
* Downloads, unzips and installs the Mongo binaries on the VM.
* Starts the Router Server process.
* Detects a successful startup using the MongoLivenessDetector plugin.
* Looks for host address and port of the mongoConfig service (using the serviceContext API) and the Mongo data shards (mongod), and configures sharding.
* Looks for metrics (Active Read Clients, Active Write Client etc.) to be displayed in the Web UI.
The mongos service file is located at <cloudify root>/recipes/apps/petclinic/mongos/mongos-service.groovy.
The installation event handler of the mongos service is similar to the mongod service.
Here’s the start event handler of the mongos service:
And here’s the post start event handler of the mongos service:
Now, the tomcat service can retrieve the data in <cloudify root>/recipes/apps/petclinic/tomcat/tomcat_start.groovy
From this point, the tomcat service instance knows the host address and port of the mongos service instance.
As you can see, by using Cloudify, you are now able to quickly shard and manage MongoDB on the cloud of your choice.
For more information on Mongo sharding with Cloudify visit “Cloudify In Action”:/cloudifysourcetv.html online demo.
We also recommend that you register to the “Manage your Sharded MongoDB Setup”:/participate.htm webinar which will be held on April 24th, 2012.