Rolling restart of thin instances using monit

Published on 04/27/12

With Net-at-hand’s recent move to AWS EC2, I had a chance to build the new server from scratch.

In doing so I switched my application server from mongrel to thin, and I switched from god to monit for monitoring my application servers. Memory usage was the main reason for both changes: monit uses much less memory than god and is more stable, and my Rails app on thin uses only around 70MB instead of 110MB+.

One thing that I’ve always wanted is an easy way to do a rolling restart of my application servers. A “rolling restart” restarts only a portion of the running application servers at any given time, so there is never a moment when no application server is available.

When configuring monit, you can assign a group to each monitored instance, and it turns out that you can assign more than one group by putting multiple “group” directives in the configuration for the same instance. So all the thin instances that are listening on even-numbered ports have the following lines in their configuration:

group thin
group thin-even

And then all the thin instances that are listening on the odd-numbered ports have the following lines in their configuration:

group thin
group thin-odd
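
If the monit configuration is generated by a script rather than written by hand, the even/odd rule above can be expressed directly. The following is only a minimal Ruby sketch, not the configuration used on this server: the helper names monit_groups_for and monit_group_lines and the example ports 3000 and 3001 are made up for illustration.

```ruby
# Hypothetical helper: returns the monit groups for a thin instance,
# chosen by whether its port number is even or odd.
def monit_groups_for(port)
  parity = port.even? ? "even" : "odd"
  ["thin", "thin-#{parity}"]
end

# Hypothetical helper: renders the "group" lines for one instance,
# ready to be placed inside that instance's monit check.
def monit_group_lines(port)
  monit_groups_for(port).map { |group| "group #{group}" }.join("\n")
end

# Example ports (assumed, not taken from the real setup).
puts monit_group_lines(3000)
puts monit_group_lines(3001)
```

Every instance lands in the shared thin group plus exactly one of the two parity groups, which is what makes the two-step restart possible.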

So if I need to restart all of them at once, I would issue this command:

sudo monit restart -g thin

However, to do the rolling restart I just restart the “thin-even” group first, wait for a bit, then restart the “thin-odd” group. This is done automatically in my capistrano deploy.rb file by issuing a “sleep 10” in between the two restart commands. The restart task in my deploy.rb file looks like this:

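task :restart, :roles => :app do
  # Restart the even-port instances first; the odd-port ones keep serving.
  sudo '/path/to/monit -c /path/to/monit.conf restart -g thin-even'
  # Give the restarted instances time to come back up before touching the rest.
  sleep 10
  sudo '/path/to/monit -c /path/to/monit.conf restart -g thin-odd'
end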
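
The same idea extends to any number of groups. Here is a rough sketch of how that might look, assuming a hypothetical helper called rolling_restart that builds the same monit command as above for each group and hands it to a block. The helper, its parameters, and the dry-run call are illustrative only and are not part of my deploy.rb.

```ruby
# Hypothetical helper: restarts each monit group in turn, pausing between
# groups so that the groups not being restarted keep serving requests.
# The block receives each command string and is responsible for running it.
def rolling_restart(groups, pause_seconds = 10, monit = "/path/to/monit -c /path/to/monit.conf")
  groups.each_with_index do |group, index|
    yield "#{monit} restart -g #{group}"
    # No need to wait after the last group has been restarted.
    sleep pause_seconds unless index == groups.size - 1
  end
end

# Dry run: print the commands instead of executing them (pause set to 0).
rolling_restart(["thin-even", "thin-odd"], 0) { |command| puts command }
```

Inside a Capistrano task the block could presumably pass each command to sudo instead of printing it. Keeping the pause as a parameter makes it easy to lengthen if the app takes more than a few seconds to boot.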

So now, I push code changes to Net-at-hand in the middle of the day with no hesitation because I know it isn’t going to cause an outage (of course, I do it all on the staging server first just to make sure nothing blows up :).
