TL;DR: I've created a chef cookbook for acceptance testing using cucumber and aruba. Get it at https://github.com/iafonov/simple_cuke
The main idea behind this article is to demonstrate an approach to testing infrastructure setup. I'm not trying to convince you to use the tool I've created; rather, I want to show how easy it is to create your own tool that helps you maintain the quality of your infrastructure and avoid regressions.
I'm using Opscode chef to manage and automate the infrastructure of a rails application. Currently we have 5 application shards and several auxiliary nodes for staging/CI purposes. As our infrastructure keeps growing, we have started to experience regression problems. When you have one node and something goes wrong you can quickly ssh to it and fix it manually - but when you deploy to 5 nodes simultaneously and something goes wrong, that can be a huge problem.
I've never considered myself a system administrator and have never worked in such a role - but projecting my developer's experience onto the problem, the first and most obvious solution that came to mind was: we need tests.
I've reviewed several existing tools but haven't found anything that would satisfy me. The common problem with these tools, in my opinion, is their sometimes excessive complexity and their limitations. Chef itself is a complex tool, and piling more complexity on top of it makes it practically unusable.
The only option I found was to come up with something myself. Here is the list of things that I wanted from my testing tool:
After thinking about it for several days I decided it was time to do something. It took several hours to build a simple tool that completely satisfied me. This tool is a chef cookbook that does two main things: it copies a test suite to the target node and sets up a chef handler that fires after the chef-client run and executes this test suite.
Such an approach is perfect for regression and acceptance testing. After each chef-client run I can be sure that all systems are working and that I haven't broken anything.
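The handler side of this can be sketched roughly as follows. This is a minimal illustration, not the cookbook's actual code: `run_suite`, `SuiteReportSketch`, the suite path and the `bundle exec cucumber` command are assumptions made for the example. The only Chef API relied on is the `Chef::Handler` base class with its `report` method and `run_status`. The class is defined only when Chef is loaded, so the helper can be tried in plain Ruby.

```ruby
require "open3"

# Hypothetical helper: run a command inside the suite directory and
# return its combined stdout/stderr plus a success flag.
def run_suite(suite_dir, command)
  output, status = Open3.capture2e(command, chdir: suite_dir)
  [output, status.success?]
end

# Define the handler only when Chef is loaded.
if defined?(Chef::Handler)
  class SuiteReportSketch < Chef::Handler
    # Assumed location of the copied test suite on the node.
    SUITE_DIR = "/var/chef/handlers/suite"

    # Chef calls #report once the chef-client run has finished.
    def report
      return unless run_status.success? # skip the tests after a failed run

      output, passed = run_suite(SUITE_DIR, "bundle exec cucumber")
      puts output
      Chef::Log.warn("Acceptance tests failed") unless passed
    end
  end
end
```

Keeping the command execution in a small helper makes it easy to swap cucumber for another test runner later.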
This thing is open sourced and you can find it on github - https://github.com/iafonov/simple_cuke
The main idea behind the implementation is to keep it as simple as possible. So here are all three steps that are taken to run the tests:
files/default/suite cookbook's folder with remote node via calling
The test suite is basically a set of cucumber features. In these features you can test whatever you want - for example, whether a daemon is running or whether the uploads directory is writable by the user responsible for running the application.
The cookbook will automatically install and link the aruba gem for you. Aruba is a set of handy cucumber steps intended for testing CLI applications and file system manipulation. This is exactly what is needed when verifying infrastructure setup. You can see the full list of steps here
There are no limitations on using custom steps - you can define your own. Put step definitions into a features/step_definitions/<you_name_it>_steps.rb file and they will be loaded automatically.
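As an illustration, a custom step for the uploads directory check mentioned earlier might look like the sketch below. The file name, the step wording and the `writable_directory?` helper are made up for this example; only the `Then` step-definition DSL comes from cucumber. Note that it checks writability for the user running the test suite, which is an assumption about how the suite is executed. The guard around `Then` just lets the file load outside of cucumber as well.

```ruby
# features/step_definitions/filesystem_steps.rb (hypothetical file name)

# Hypothetical helper: true when path is a directory that the user
# running the test suite can write to.
def writable_directory?(path)
  File.directory?(path) && File.writable?(path)
end

# Register the step only when cucumber is loaded.
if defined?(Cucumber)
  Then(/^the directory "([^"]*)" should be writable$/) do |path|
    raise "#{path} is not a writable directory" unless writable_directory?(path)
  end
end
```

In a feature this would be used as: Then the directory "/var/www/uploads" should be writable (the path is only an example).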
With cucumber tags you can control on which nodes a feature or scenario is run. If a scenario has no tags, it is executed on all nodes.
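One way to get this behaviour is to exclude the tags of every role the node does not have, so that untagged scenarios still run everywhere. The sketch below assumes that approach; it is not taken from the cookbook's source. `cucumber_tag_args` and the role names are hypothetical, and `not @tag` is cucumber's tag-expression syntax (older cucumber releases wrote this as `~@tag`).

```ruby
# Build cucumber --tags arguments for a node.
# all_roles:  every role name used as a tag in the suite (assumed to be known)
# node_roles: roles present in this node's run list
def cucumber_tag_args(all_roles, node_roles)
  excluded = all_roles - node_roles
  return [] if excluded.empty? # nothing to exclude: run every scenario

  expression = excluded.map { |role| "not @#{role}" }.join(" and ")
  ["--tags", expression]
end

args = cucumber_tag_args(%w[appserver db worker], %w[appserver])
p(["cucumber"] + args)
# prints ["cucumber", "--tags", "not @db and not @worker"]
```

Excluding foreign roles rather than including the node's own roles is what keeps untagged scenarios in the run.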
Trivial example - here we check that the apache process appears in the ps output. Both steps are standard aruba steps, so you don't have to write your own step definitions. This feature will be run only on nodes that have the appserver role in their run list.
    @appserver
    Feature: Application server

      Scenario: Apache configuration check
        When I successfully run `ps aux`
        Then the output should contain "apache"
Slightly more advanced example - let's check that services are running, are bound to their ports and aren't blocked by the firewall:
    Feature: Services

      Scenario Outline: Service should be running and bind to port
        When I run `lsof -i :<port>`
        Then the output should match /<service>.*<user>/

        Examples:
          | service | user     | port |
          | master  | root     | 25   |
          | apache2 | www-data | 80   |
          | dovecot | root     | 110  |
          | mysqld  | mysql    | 3306 |

      Scenario Outline: Service should not be blocked by firewall
        When I run `ufw status`
        Then the output should match /<service>.*<action>/

        Examples:
          | service | action |
          | OpenSSH | ALLOW  |
          | Apache  | ALLOW  |
          | Postfix | ALLOW  |
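If you prefer not to depend on lsof, a similar check can be written as a custom step in plain Ruby. The following is a sketch only: `port_open?` and the step wording are invented for this example, and it verifies that something accepts TCP connections on the port, not which service owns it.

```ruby
require "socket"

# Hypothetical helper: true when something accepts TCP connections on host:port.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

# Register the step only when cucumber is loaded.
if defined?(Cucumber)
  Then(/^port (\d+) should accept connections$/) do |port|
    raise "nothing is listening on port #{port}" unless port_open?("127.0.0.1", port.to_i)
  end
end
```

A scenario could then say: Then port 80 should accept connections.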
To install the cookbook, clone it into your chef repository:

    git clone git://github.com/iafonov/simple_cuke.git cookbooks/simple_cuke
Chef is in full control of the application's infrastructure, including deployment. Every time we deploy, chef-client converges the node and deploys a new version of the application. I don't run chef-client periodically in the background; a run can only be triggered manually. I use a combination of rake & knife scripts to do a deployment. All I need to do is run rake deploy:production from the chef repo. With this setup the test suite is run after each successful chef-client run and I'm presented with its results.
Here is a pretty self-explanatory list of the files that are in charge of testing your node setup:
    verify_handler.rb
    suite/
      - Gemfile
      - Gemfile.lock
      - features/
        - env.rb
        - step_definitions/
The cookbook uses bundler to set up the test environment on the remote node. If you open the cookbook's files/default folder you'll see Gemfile and Gemfile.lock files. You can add your own gems to the Gemfile and use them during testing. You can even get rid of cucumber and use rspec or your favorite testing framework.
You can edit the command that triggers the test run in
For now the test results go to stdout, which makes the tool practically unusable if you're running chef-client periodically in the background. I'm thinking about adding the ability to define custom reporters that could, for example, send test results by email or accumulate them in a file.
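Such a reporter could be as small as the sketch below. Nothing like this exists in the cookbook yet; `FileReporter` and its `report` method are purely hypothetical names, shown only to illustrate accumulating results in a file.

```ruby
require "time"

# Hypothetical reporter: appends the outcome of each test run to a log file.
class FileReporter
  def initialize(path)
    @path = path
  end

  # output: text printed by the test suite
  # passed: whether the suite succeeded
  def report(output, passed)
    File.open(@path, "a") do |file|
      file.puts "[#{Time.now.utc.iso8601}] #{passed ? 'PASSED' : 'FAILED'}"
      file.puts output
    end
  end
end
```

An email reporter could expose the same `report(output, passed)` method, so the handler would not need to know where the results end up.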