Jan 22 2013

Chef and automated service discovery

Chef — the short story

Chef is a cloud-oriented, open-source integration framework. You describe, in a platform-independent way, how a specific component of your service architecture should be deployed (a cookbook), and Chef takes care of automating the deployment and configuration of your infrastructure. More information about Chef is available on the official website.

Services and interdependencies

A common scenario is an architecture composed of different services that communicate with each other through APIs. Your system may also be fully replicated in two or more environments (acceptance, production, and so on) running in one or more cloud systems (AWS, Eucalyptus, etc.).
In that case, every service may need a configuration file that specifies an endpoint for each of the services it needs to access.

Chef lets you create the needed configuration files dynamically through a simple and powerful templating system. A template describes how a configuration file (or any other file your service needs to run) has to be structured, and the values that populate it are injected from attributes, which can depend on your environment or on other conditions (for instance, the role given to the node).
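Chef templates are ERB files, so the idea can be sketched outside Chef with Ruby's standard ERB library. This is only an illustration: the attribute names, the values and the properties-style file layout are assumptions, not something taken from a real cookbook.

```ruby
require "erb"

# Hypothetical template for a properties-style configuration file.
TEMPLATE = <<~CONFIG
  # Generated file, do not edit by hand
  environment=<%= attributes["environment"] %>
  zookeeper.endpoint=<%= attributes["zookeeper_endpoint"] %>
CONFIG

# Renders the template; `attributes` is exposed to it as a local variable.
def render_config(template, attributes)
  ERB.new(template).result_with_hash(attributes: attributes)
end

# Hypothetical attributes, standing in for values that Chef would take
# from the node, its role or its environment.
attributes = {
  "environment" => "acceptance",
  "zookeeper_endpoint" => "10.0.0.5:2181"
}

puts render_config(TEMPLATE, attributes)
```

In a real cookbook the same ERB file would live under the cookbook's templates directory and be rendered by a template resource, with the values coming from node attributes.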

The limit of this approach is that Chef scripts basically handle a local, per-service deployment, while your components' configurations very frequently depend on the deployment of other components (i.e. they need the other components' endpoints to communicate successfully).
For instance, you may need to deploy a Java component that relies on an Apache Zookeeper server to work correctly. That server is itself deployed dynamically (through Chef), so you need a way to specify its endpoint in the Java component's configuration.
Of course, you can simply deploy Apache Zookeeper, note down its address and write it manually into the Java component's configuration (or into its Chef cookbook), but this approach doesn't scale and doesn't provide full automation.

Chef Search and Node attributes

Service discovery can be automated through a Chef feature called Search. Search lets you query the Chef server, and queries can take many different forms to retrieve information about your cloud architecture deployment:

  • the number of nodes of the system
  • the nodes in which a specific service is deployed
  • the services of the system

and many others.
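As a rough sketch, queries like these are plain strings of key:value terms joined with AND. The helper below is hypothetical (it is not part of Chef) and only shows how such strings could be assembled; `*:*` is the query that matches every node.

```ruby
# Hypothetical helper: builds a Chef search query restricted to one
# environment. `extra_terms` are additional "key:value" terms.
def node_query(environment, *extra_terms)
  (["chef_environment:#{environment}"] + extra_terms).join(" AND ")
end

# Matches every node known to the Chef server.
all_nodes = "*:*"

# Matches the production nodes where the "zookeeper" recipe is deployed.
zookeeper_nodes = node_query("production", "recipes:zookeeper")

puts all_nodes
puts zookeeper_nodes
```

Counting the results of the first query gives the number of nodes in the system, while the second one returns the nodes on which a specific service is deployed.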

Going back to our example, what we need to do is query Chef for an instance of Zookeeper. Assuming that Zookeeper has been installed through a cookbook named “zookeeper”, this can easily be done within the cookbook recipe of our Java component:

# Search for any zookeeper instance to attach to the server
search(:node, "chef_environment:#{node.chef_environment} AND recipes:zookeeper") do |matching_node|
  configuration["zookeeper_endpoint"] = matching_node["ipaddress"] + ":" + matching_node["zookeeper"]["port"].to_s
end

This statement loops through the nodes where the “zookeeper” cookbook has been run (in the same environment as the current node). Each iteration overwrites the endpoint, so the last matching node is the one that ends up in the configuration.
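The "last match wins" behaviour can be sketched in plain Ruby, with hashes standing in for the nodes returned by the search. The addresses and helper names are made up for the example.

```ruby
# Builds "ip:port" for a node-like hash with the attribute layout used above.
def endpoint_for(node)
  "#{node["ipaddress"]}:#{node["zookeeper"]["port"]}"
end

# Mirrors the loop above: each match overwrites the previous one, so the
# last node returned by the search wins.
def last_endpoint(matching_nodes)
  endpoint = nil
  matching_nodes.each { |node| endpoint = endpoint_for(node) }
  endpoint
end

# Hypothetical search results.
nodes = [
  { "ipaddress" => "10.0.0.5", "zookeeper" => { "port" => 2181 } },
  { "ipaddress" => "10.0.0.6", "zookeeper" => { "port" => 2181 } }
]

puts last_endpoint(nodes)                          # prints 10.0.0.6:2181
puts nodes.map { |n| endpoint_for(n) }.join(",")   # prints 10.0.0.5:2181,10.0.0.6:2181
```

The second form may be preferable when several Zookeeper nodes exist, since a Zookeeper client connection string can list several host:port pairs separated by commas.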

Every Chef node is described by a JSON document, and new attributes can be added to a node whenever a cookbook recipe is run. In our Zookeeper recipe we need to add:

# Load the Zookeeper settings from the "subsystems" data bag
config = data_bag_item("subsystems", "zookeeper")
# Register a Node attribute for the port, so that other services can get it
node.default["zookeeper"]["port"] = config["port"]

With this in place, once the Zookeeper node has completed a Chef run, every Search result that includes it exposes a ["zookeeper"]["port"] attribute, which lets the other components discover the Zookeeper port dynamically at deployment time.

Conclusions

To fully understand this post, I suggest you take a look at the introductory material on Chef (otherwise, the terms used here may be a bit confusing).
This approach is pretty simple, and there are probably more elegant ways to achieve the same result. Either way, you can plug different behaviours into it (for example, falling back to a default, statically configured endpoint if the search returns nothing) and handle different kinds of situations.
The approach described above also doesn't take load balancers into account, but that situation can be handled with a combination of attributes and data bag items.
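The fallback mentioned above could look roughly like the following sketch. It is plain Ruby: `matching_nodes` stands in for the array that a Chef search returns when called without a block, and the default endpoint is an assumption made for the example.

```ruby
# Hypothetical default, used when the search finds no Zookeeper node.
DEFAULT_ZOOKEEPER_ENDPOINT = "localhost:2181"

# Returns the endpoint of the last matching node, or the default when the
# search returned nothing. `matching_nodes` stands in for search results.
def zookeeper_endpoint(matching_nodes, default = DEFAULT_ZOOKEEPER_ENDPOINT)
  node = matching_nodes.last
  return default if node.nil?

  "#{node["ipaddress"]}:#{node["zookeeper"]["port"]}"
end

found = [{ "ipaddress" => "10.0.0.5", "zookeeper" => { "port" => 2181 } }]

puts zookeeper_endpoint([])     # prints localhost:2181
puts zookeeper_endpoint(found)  # prints 10.0.0.5:2181
```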
