Monday 28 September 2009

Simpler long polling with Django and gevent

The recently released Tornado web server includes an example chat application. This post describes a modification of that example that runs on Django and the gevent WSGI server. The modified version achieves the same goal while staying within a familiar web framework, Django, and using a simpler concurrency model.
The example implements a simple web chat room with instant notifications, and does so by using Ajax with long polling. A dynamic web application of this kind is usually considered a better fit for an asynchronous framework like Tornado or Twisted than for a traditional thread-pool and/or process-pool based setup like Apache + mod_wsgi. This is because each user participating in the chat (or simply not closing their browser window) keeps a connection to the server open for message updates, and the amount of memory an open connection takes on the server differs significantly depending on the server setup: a few KB for async versus a few MB for thread/process.

However, using an asynchronous framework requires twisting your code somewhat: because everything runs in the same thread, you cannot simply wait in a view handler until a new message is available, then construct a response and return it to the framework. Instead, you usually return a callback that will be called when a new message is posted, to generate the response and thus complete the long polling request. Not a huge obstacle, but it does obscure the code flow.
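To make the callback style concrete, here is a minimal, framework-free sketch. It is not Tornado's API: the class `CallbackChatRoom` and its `poll` and `post` methods are hypothetical names invented for illustration. The point is only that the handler returns without a response and the response is produced later, from a different place in the code.

```python
# Hypothetical, framework-free sketch of callback-style long polling.
class CallbackChatRoom:
    def __init__(self):
        self.messages = []  # every message posted so far
        self.waiters = []   # (cursor, callback) pairs of parked poll requests

    def poll(self, cursor, callback):
        """Handle a poll; cursor is the number of messages the client has seen."""
        if cursor < len(self.messages):
            # The client is behind: respond right away.
            callback(self.messages[cursor:])
        else:
            # Nothing new: park the callback and return without responding.
            self.waiters.append((cursor, callback))

    def post(self, text):
        """Store a new message and complete every parked poll request."""
        self.messages.append(text)
        waiters, self.waiters = self.waiters, []
        for cursor, callback in waiters:
            callback(self.messages[cursor:])


room = CallbackChatRoom()
received = []
room.poll(0, received.append)  # no messages yet, so the callback is parked
room.post("hello")             # posting runs the parked callback
print(received)                # [['hello']]
```

Note how the code that answers the poll lives in `post`, not in `poll`. That split is the obscured code flow mentioned above.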

If only we had more lightweight units of execution than threads and processes, implementing Ajax apps like this chat would be a lot simpler. Turns out we do, and there are options: Stackless Python and greenlet. The latter is an extension module that runs on stock Python, and that's what gevent currently supports.

The WSGI server bundled with gevent creates a new greenlet for each incoming connection, making it possible in a request handler to sleep, wait for an event and even access the network without blocking anyone. Greenlets are cheap, memory-wise, so it's about as scalable as a callback-based or Deferred-based solution. The logic, however, becomes much more transparent:
  • When a new message is posted, set the event
  • When a client requests the updates (and it already has the latest message), wait for the event
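The two steps above can be sketched roughly as follows. This is not the code from views.py. To keep the sketch runnable with the standard library alone, it uses `threading.Event` and a thread as stand-ins for gevent's `Event` and a greenlet, and the `ChatRoom` class with its `post` and `updates` methods is hypothetical. Handing out a fresh event for every message is also my own choice, made so that the threaded stand-in cannot miss a wake-up.

```python
import threading


class ChatRoom:
    """Hypothetical chat room; threading.Event stands in for gevent's Event."""

    def __init__(self):
        self.messages = []
        self.new_message_event = threading.Event()

    def post(self, text):
        """A new message is posted: store it and set the event."""
        self.messages.append(text)
        # Swap in a fresh event for later waiters, then wake the current ones.
        old_event = self.new_message_event
        self.new_message_event = threading.Event()
        old_event.set()

    def updates(self, cursor, timeout=None):
        """A client asks for messages after `cursor` (how many it has seen)."""
        event = self.new_message_event  # grab the event before checking
        if cursor >= len(self.messages):
            # The client already has the latest message: wait for the event.
            event.wait(timeout)
        return self.messages[cursor:]


room = ChatRoom()
results = []
waiter = threading.Thread(
    target=lambda: results.append(room.updates(0, timeout=5))
)
waiter.start()      # the "request handler" may block inside updates()
room.post("hello")  # posting releases it
waiter.join()
print(results)      # [['hello']]
```

The handler reads top to bottom: check, wait, return. Under gevent each handler would run in its own greenlet rather than a thread, so waiting this way stays cheap.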
See the actual code: views.py

7 comments:

  1. hi Denis, nice work! here is my port of your version to eventlet: http://github.com/ckreutzer/eventlet-django-webchat

  2. This comment has been removed by the author.

  3. Christian, in the code you posted you should monkey patch threading module, in particular, threading.local. Otherwise all the active handlers will share the same database connection, which is not what Django expects.

    This is not as straightforward with eventlet as with gevent though. Although there's eventlet.util.wrap_threading_local_with_coro_local(), I doubt it can be a proper drop-in replacement for threading.local, because it doesn't rerun __init__ for each new thread/greenlet like threading.local does.

    There is more stuff in threading that should be monkey patched too.

    With gevent it's all taken care of by
    from gevent import monkey; monkey.patch_all()

  4. hi Denis, thanks for your feedback. Is monkey patching threading.local also needed when running behind spawning with --threads=0? As far as I understand, spawning does some monkey patching of its own, right?

  5. Christian, I'm not sure about the options, I haven't used spawning myself. Looking at the source suggests that it uses eventlet.util for monkey patching.

  6. Cool! Add MySQL database connectivity, and you can count me in too. :)

  7. Ole, if your database connector is pure Python it's already gevent-enabled via monkey patching.
