The vcweb framework needs to support flexible scheduling of experiments that may run over the course of a month, or within an hour in a controlled computer lab environment. In the latter case a typical experiment run involves a combination of timed rounds, where participants make decisions via the web interface, and untimed rounds, where participants read instructions or debriefings or answer survey / quiz questions and only move on when the experimenter has made sure that everyone is on the same page. In order to support timed rounds in controlled settings AND long-running experiments, we need some way to signal our web application that X amount of time has elapsed, or that a given long-running round (say, a 24-hour round) has completed, so that we can execute our custom experiment-specific logic: calculate results, prep the data needed as input for the next round, or determine that the experiment is now over.
Implementation details: Celery and RabbitMQ scheduling and heartbeat
In order to meet these requirements, a few options were assessed:
- go with a custom cron-based solution
- use python's threading library
- go with some kind of scheduling / event queue mechanism
After some research and reading of tea leaves and tortoise shells, I decided to integrate Celery and RabbitMQ. Celery is our scheduling library, and RabbitMQ is a high-performance AMQP message broker used by Celery. This gives us quite a bit of flexibility, allowing us to schedule periodic tasks such as a persistent heartbeat with one-second granularity that can also dispatch messages to the Django signals specified in core/signals.py. RabbitMQ has a lot of functionality and appears to be a very mature and robust piece of software implemented in Erlang; we may use it to handle real-time chat within groups in the future. For now, though, we're going to punt on orbited + stomp + rabbitmq integration, but we will eventually follow up on implementing real-time browser interaction (i.e., server push).
In order to set up the scheduling you'll need to start three additional services:
- celerybeat, executed via `python manage.py celerybeat`. celerybeat provides the "heartbeat" for the system, allowing a single core periodic task that runs every second.
- celeryd, executed via `python manage.py celeryd`. celeryd is the worker daemon that pulls off the periodic tasks generated by celerybeat and actually executes them.
- rabbitmq, executed via `/etc/init.d/rabbitmq-server start`. Acts as the underlying message queue / broker used by Celery.
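Wiring the one-second task into celerybeat happens in Django settings. The fragment below is a hedged sketch using the django-celery-era `CELERYBEAT_SCHEDULE` setting; the schedule key and the dotted task path are assumed names for illustration, not necessarily vcweb's actual module layout:

```python
# settings.py fragment (sketch): the schedule key and the dotted task path
# are assumptions for illustration, not vcweb's actual configuration.
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'heartbeat-every-second': {
        'task': 'vcweb.core.tasks.heartbeat',  # hypothetical task location
        'schedule': timedelta(seconds=1),      # one-second granularity
    },
}
```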
We may want to revisit these design decisions in the future, as I'm starting to become concerned about the number of external services that need to be installed and configured in order to run the software. At this point, short of switching to a J2EE / BlazeDS or GraniteDS / Flex solution, I don't know what else we could go with. Apparently liftweb has some Comet support as well, but the learning curve for Scala + liftweb might be too esoteric, at least more so than Django + everything we're already using.
Next up is adding real-time server push support so that we can implement group chat that is private to a particular experiment's group. Another remaining issue is that a one-second heartbeat generates a lot of log messages in rabbitmq, but it doesn't appear that there is a way to tune rabbitmq's logs without disabling them entirely by piping them to