TL;DR: I've created a Chef cookbook for acceptance testing using cucumber and aruba. Get it here.
The main idea behind this article is to demonstrate an approach to testing infrastructure setup. I'm not trying to convince you to use the tool I've created; rather, I want to show how easy it is to create your own tool that will help you maintain the quality of your infrastructure and avoid regression problems.
I'm using Opscode Chef to manage and automate the infrastructure of a Rails application. Currently we have 5 application shards and several auxiliary nodes for staging/CI purposes. As our infrastructure kept growing, we started to experience regression problems. When you have one node and something goes wrong, you can quickly ssh to it and fix it manually. But when you deploy to 5 nodes simultaneously and something goes wrong, that can be a huge problem.
I've never considered myself a system administrator and have never worked in such a role, but projecting my developer's experience onto the problem, the first and most obvious solution that came to mind was: we need tests.
I've reviewed several existing tools but haven't found anything that satisfied me. The common problem with these tools, in my opinion, is that they are often overly complex and limited. Chef itself is a complex tool, and adding more complexity on top makes it practically unusable.
The only option I found was to build something myself. Here is the list of things that I wanted to get from my testing tool:
After thinking about it for several days I decided it was time to do something. It took several hours to build a simple tool that completely satisfied me. This tool is a Chef cookbook that does two main things: it copies the test suite to a target node and sets up a Chef handler that fires after the chef-client run and executes this test suite.
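To make the handler half of this concrete, here is a minimal sketch of what running a suite after a chef-client run could look like. It is not the cookbook's actual verify_handler.rb: the helper name run_suite, the class name SuiteHandlerSketch, and the SUITE_DIR path are all assumptions made for illustration. Only Chef::Handler, its report method, run_status, and Chef::Log come from Chef itself.

```ruby
require 'open3'

# Hypothetical location of the copied suite on the node (an assumption).
SUITE_DIR = '/var/chef/simple_cuke/suite'

# Hypothetical helper: run a command inside the suite directory and
# return [success?, combined stdout and stderr].
def run_suite(suite_dir, command = %w[bundle exec cucumber])
  output, status = Open3.capture2e(*command, chdir: suite_dir)
  [status.success?, output]
end

# Define the handler only when Chef is loaded, so the helper above can
# also be tried out in plain Ruby.
if defined?(Chef::Handler)
  class SuiteHandlerSketch < Chef::Handler
    # Chef calls #report on registered handlers at the end of a run.
    def report
      return unless run_status.success? # skip the suite after a failed run

      passed, output = run_suite(SUITE_DIR)
      puts output
      Chef::Log.error('Acceptance suite failed') unless passed
    end
  end
end
```

Keeping the command as a default argument means it can be swapped for another test runner without touching the handler class.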
This approach is perfect for regression and acceptance testing. After each chef-client run I can be sure that all systems are working and that I haven't broken anything.
It is open source and you can find it on GitHub: https://github.com/iafonov/simple_cuke
The main idea behind the implementation is to keep it as simple as possible. So here are all three steps that are taken to run tests:
- Sync the files/default/suite cookbook folder with the remote node by calling the remote_directory LWRP

The test suite is basically a set of cucumber features. In these features you can test whatever you want: for example, whether a daemon is running or whether the uploads directory is writable by the user responsible for running the application.
The cookbook will automatically install and link the aruba gem for you. Aruba is a set of handy cucumber steps intended for testing CLI applications and file system manipulation. This is exactly what is needed to verify infrastructure setup. You can see the full list of steps here.
There are no limitations on custom steps: you can define and use your own. Put step definitions in a features/step_definitions/<you_name_it>_steps.rb file and they will be loaded automatically.
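As an illustration of a custom step, the sketch below checks that something is listening on a TCP port. The file name, the port_open? helper, and the step wording are all hypothetical. The step is registered only when the file is loaded by cucumber, so the helper can also be exercised in plain Ruby.

```ruby
# features/step_definitions/network_steps.rb (hypothetical file name)
require 'socket'

# Hypothetical helper: true if a TCP connection to host:port succeeds.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

# Register the step only under cucumber, where the Then DSL is available.
if defined?(Cucumber)
  Then(/^port (\d+) should be open$/) do |port|
    raise "nothing is listening on port #{port}" unless port_open?('127.0.0.1', port.to_i)
  end
end
```

A scenario could then use it as `Then port 80 should be open`.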
With cucumber tags you can control which nodes a feature or scenario runs on. If a scenario has no tags, it is executed on all nodes.
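One way to get this behaviour (untagged scenarios run everywhere, tagged ones only on matching nodes) is to exclude the tags of roles the node does not have. The sketch below is an assumption about how this could be done, not the cookbook's actual logic. The function names and the list of known roles are hypothetical, and it relies on the `not @tag` tag-expression syntax of recent cucumber versions.

```ruby
# Hypothetical helper: build a cucumber tag expression that skips scenarios
# tagged with roles this node does not have. Untagged scenarios still run.
def tag_expression(node_roles, known_roles)
  excluded = known_roles - node_roles
  excluded.map { |role| "not @#{role}" }.join(' and ')
end

# Hypothetical helper: full cucumber command line for a node.
def cucumber_command(node_roles, known_roles)
  expression = tag_expression(node_roles, known_roles)
  command = %w[bundle exec cucumber]
  command += ['--tags', expression] unless expression.empty?
  command
end
```

Note that with this scheme a scenario carrying two role tags would be skipped on a node that has only one of those roles, so it suits suites where each scenario is tied to at most one role.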
A trivial example: here we check that the apache process appears in the ps output. Both steps are standard aruba steps, so you don't have to write your own step definitions. This feature is run only on nodes that have the appserver role in their run list.
@appserver
Feature: Application server
  Scenario: Apache configuration check
    When I successfully run `ps aux`
    Then the output should contain "apache"
A slightly more advanced example: let's check that services are running, bound to their ports, and not blocked by the firewall:
Feature: Services
  Scenario Outline: Service should be running and bind to port
    When I run `lsof -i :<port>`
    Then the output should match /<service>.*<user>/

    Examples:
      | service | user     | port |
      | master  | root     | 25   |
      | apache2 | www-data | 80   |
      | dovecot | root     | 110  |
      | mysqld  | mysql    | 3306 |

  Scenario Outline: Service should not be blocked by firewall
    When I run `ufw status`
    Then the output should match /<service>.*<action>/

    Examples:
      | service | action |
      | OpenSSH | ALLOW  |
      | Apache  | ALLOW  |
      | Postfix | ALLOW  |
- Clone the cookbook into your chef repo (git clone git://github.com/iafonov/simple_cuke.git cookbooks/simple_cuke)
- Add recipe[simple_cuke] to the node's run_list
- Put your cucumber features in the files/default/suite/features folder
- Run chef-client and enjoy

Chef is in full control of the application's infrastructure, including deployment. Every time we deploy, chef-client converges the node and deploys a new version of the application. I don't run chef-client periodically in the background; a run can only be triggered manually. I use a combination of rake and knife scripts to do a deployment. All I need to do is run rake deploy:production from the chef repo. With this setup, the test suite runs after each successful run and I'm presented with its results.
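A deployment task along these lines might look like the following sketch. The knife search query, the remote command, and the helper name are assumptions, since the actual scripts are not shown here. knife ssh itself is the standard knife subcommand for running a command on all nodes matching a search.

```ruby
require 'rake'

# Hypothetical helper: knife invocation that runs chef-client on every
# node matching the search query.
def knife_ssh_command(query, remote_command = 'sudo chef-client')
  ['knife', 'ssh', query, remote_command]
end

namespace :deploy do
  desc 'Converge production nodes; the test suite runs after chef-client'
  task :production do
    # The query below is an assumed example, not taken from the real setup.
    sh(*knife_ssh_command('chef_environment:production'))
  end
end
```

Because the handler runs at the end of each chef-client run, the suite output appears in the knife ssh output for every node without any extra step in the task.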
Here is a pretty self-explanatory list of the files that are in charge of testing your node setup:
verify_handler.rb
suite/
  - Gemfile
  - Gemfile.lock
  - features/
    - env.rb
    - step_definitions/
The cookbook uses bundler to set up the test environment on the remote node. If you open the cookbook's files/default folder you'll see Gemfile and Gemfile.lock files. You can add your own gems to the Gemfile and use them during testing. You can even get rid of cucumber and use rspec or your favorite testing framework.
You can edit the command that triggers the test run in the verify_handler.rb file.
For now the test results go to stdout, which makes the tool practically unusable if you're running chef-client periodically in the background. I'm thinking about adding the ability to define custom reporters that could, for example, send test results by email or accumulate them in a file.