Multifoo Architecture

Aaron Stone
February 5, 2007

Update: the multifoo architecture has been part of DBMail since 2.3.3.

Up through DBMail 2.2.x, we have a simple multiprocess architecture, with one process per connection. It scales up to several hundred active connections at a time, but falls flat anywhere past that point.

For DBMail 2.3 and beyond, we need to have a multifoo architecture. This means multiplexed IO, multithreading, and multiprocessing all together to maximize our ability to respond to ready commands and accept new connections and commands as they arrive.

So we're looking at something like this:

manager process (runs as root, restarts middle process when needed)
 \--> middle process (runs as nobody, acts as the main thread, listens on all sockets)
      \--> handler thread pool (does the real work, responds on the sockets, each thread has its own database connection)

The latest design that I've come up with is to have a single main thread that listens for socket events using libevent, and does one of three things:

  1. On accept(), creates a new client structure.
  2. On initial read(), creates a new command structure.
    1. At the end of the read(), calls a protocol-specific callback function to see if the command is done.
    2. If not, the command remains the active command for the client; on each subsequent read(), the new data is appended to it and the protocol-specific callback is called again to check for completeness.
    3. When the callback reports that the command is complete, the command is moved onto the all-threads work queue.
  3. On close(), cleans up the client structure.
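
The three steps above can be sketched without any real sockets. This is only an illustration in Python, not DBMail code: the names Client, line_complete, on_accept, on_read and on_close are made up for the sketch, and a plain queue.Queue stands in for the all-threads work queue. In the real thing the three functions would be driven by libevent callbacks.

```python
import queue
from dataclasses import dataclass, field
from typing import Callable


def line_complete(buf: bytes) -> int:
    """Protocol-specific completeness callback for line-based commands.

    Returns the length of the first complete command in buf (CRLF included),
    or 0 if no complete command has arrived yet.
    """
    end = buf.find(b"\r\n")
    return end + 2 if end >= 0 else 0


@dataclass
class Client:
    # Hypothetical client structure: a callback plus the command being built.
    is_complete: Callable[[bytes], int] = line_complete
    active: bytearray = field(default_factory=bytearray)


def on_accept(clients: dict, fd: int) -> Client:
    """Step 1: a new connection gets a new client structure."""
    clients[fd] = Client()
    return clients[fd]


def on_read(client: Client, data: bytes, work_queue: queue.Queue) -> None:
    """Step 2: append to the active command, enqueue whatever is complete."""
    client.active.extend(data)
    while True:
        n = client.is_complete(bytes(client.active))
        if n == 0:
            break  # incomplete: stays the active command until the next read
        work_queue.put((client, bytes(client.active[:n])))
        del client.active[:n]


def on_close(clients: dict, fd: int) -> None:
    """Step 3: clean up the client structure."""
    clients.pop(fd, None)
```

Feeding on_read a partial line leaves it as the active command; feeding it a chunk that finishes that line and contains a second full line puts two commands on the queue and keeps any trailing partial line as the new active command.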

Part and parcel with the event loop is the handler thread dequeueing. Or perhaps this should be managed by a second dedicated thread that waits on the work queue and, whenever it sees something, dequeues it and hands it off to the next available handler thread. We'll have to see what shakes out. The key issue here is how to wake up the thread: some of the Unix APIs for doing this are very clumsy and have well-known pitfalls that we have to watch out for.
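
One way to picture the wake-up problem is a blocking queue, where each handler thread sleeps inside the dequeue call and an enqueue wakes exactly one of them. The sketch below uses Python's queue.Queue, which does this with a lock and condition variables internally; in C the equivalent would have to be built or borrowed. The names handler and run_pool, the None shutdown sentinel and the canned "250 OK" reply are all assumptions made for the example.

```python
import queue
import threading


def handler(work_queue: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Handler thread: sleep in get() until a complete command shows up."""
    while True:
        command = work_queue.get()   # blocks; no polling and no signals
        if command is None:          # hypothetical shutdown sentinel
            break
        reply = b"250 OK " + command.strip()   # stand-in for the real work
        with lock:
            results.append(reply)


def run_pool(commands: list, n_threads: int = 3) -> list:
    """Start a pool, feed it commands, shut it down, return the replies."""
    work_queue, results, lock = queue.Queue(), [], threading.Lock()
    threads = [
        threading.Thread(target=handler, args=(work_queue, results, lock))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for command in commands:
        work_queue.put(command)      # put() wakes one sleeping handler
    for _ in threads:
        work_queue.put(None)         # one sentinel per thread, after the work
    for t in threads:
        t.join()
    return results
```

With this shape there is no separate dispatcher thread: whichever handler is idle picks up the next command. Replies may come back in any order, which is exactly the ordering question that state-changing commands raise.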

Ok, you ask, what next? Well, the handler thread has been given a complete command. It no longer needs to read anything from the socket. The handler thread simply executes the command and returns the result to the tx side of the same socket connection. Ok, smarty pants, what about pipelined commands? They just work. In SMTP/LMTP, for example, most commands are considered to be completed when \r\n arrives. As each line hits its \r\n, that command structure is enqueued for work. If five are put on the queue quickly while only one thread is handy to service the replies, no problem! Same for IMAP, in fact.

There's a hairy spot here: DATA\r\n signifies that the next 'command' ends at \r\n.\r\n, rather than just \r\n. By the time the DATA command is processed and the client's completeness callback is changed to look for \r\n.\r\n, some single-line pieces of the message may already have been spun off as their own commands. I think we can resolve this corner case while processing the results of the DATA command, by looking for any such pieces and folding them back into the message.
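
Here is a rough sketch of that corner case, assuming (purely for illustration) that commands already split off for a client sit in a per-client pending list rather than on the shared work queue. The names data_complete, split_commands and handle_data are hypothetical. handle_data is what would run while processing DATA: it reclaims the lines that were split off too early, switches the callback, and re-splits.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


def line_complete(buf: bytes) -> int:
    """Ordinary commands end at the first CRLF."""
    end = buf.find(b"\r\n")
    return end + 2 if end >= 0 else 0


def data_complete(buf: bytes) -> int:
    """In DATA mode the 'command' is the whole message, ending at CRLF.CRLF."""
    if buf.startswith(b".\r\n"):
        return 3                      # empty message: terminator comes first
    end = buf.find(b"\r\n.\r\n")
    return end + 5 if end >= 0 else 0


@dataclass
class Client:
    is_complete: Callable[[bytes], int] = line_complete
    active: bytearray = field(default_factory=bytearray)
    pending: deque = field(default_factory=deque)   # split off, not yet handled


def split_commands(client: Client) -> None:
    """Move every complete command from the active buffer to pending."""
    while (n := client.is_complete(bytes(client.active))) > 0:
        client.pending.append(bytes(client.active[:n]))
        del client.active[:n]
        if client.is_complete is data_complete:
            client.is_complete = line_complete   # message done; back to lines


def on_read(client: Client, data: bytes) -> None:
    client.active.extend(data)
    split_commands(client)


def handle_data(client: Client) -> None:
    """Run while processing DATA: fold spun-off lines back into the message."""
    spun_off = b"".join(client.pending)
    client.pending.clear()
    client.active[:0] = spun_off      # back in front of the buffer, in order
    client.is_complete = data_complete
    split_commands(client)
```

If a client pipelines DATA, a short message and QUIT in one write, the first split yields six single-line commands. Once DATA has been taken off the list and handle_data has run, what remains is one message command followed by QUIT.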

There's a doubly hairy race condition here with any command that changes the application state. State-changing commands will have to set some kind of strict-ordering flag on the client structure to indicate that the next command may not be dequeued until the current command has finished. There is perhaps only a small number of cases where a subsequent command is allowed to return before a previous one; IMAP has a couple. I bet that those will turn out to be the special cases, rather than the well-ordered cases being the special ones. If we assume well-ordered dequeueing, then there is no race between the change of state from the DATA command itself and a BAD response to the first few lines of headers following SMTP DATA: at worst, those command structs would be flagged as part of the DATA command, but they would never get a BAD response from a thread that got a timeslice before another thread processed the DATA command.
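
A minimal sketch of well-ordered dequeueing, under the assumption that every command is treated as strictly ordered: the client structure carries a busy flag, and a client's next command is only released to the shared work queue once the previous one has finished. OrderedClient, submit, finish and run_demo are invented names, and the handler just logs each command instead of doing real work.

```python
import queue
import threading
from collections import deque


class OrderedClient:
    """Hypothetical client structure carrying the strict-ordering flag."""

    def __init__(self):
        self.lock = threading.Lock()
        self.waiting = deque()   # complete commands not yet released
        self.busy = False        # True while one command is in flight


def submit(client, command, work_queue):
    """Event-loop side: release the command only if nothing is in flight."""
    with client.lock:
        if client.busy:
            client.waiting.append(command)
            return
        client.busy = True
    work_queue.put((client, command))


def finish(client, work_queue):
    """Handler side: called once a command's reply has been written."""
    with client.lock:
        if not client.waiting:
            client.busy = False
            return
        nxt = client.waiting.popleft()
    work_queue.put((client, nxt))


def handler(work_queue, log):
    while True:
        item = work_queue.get()
        if item is None:                 # shutdown sentinel
            work_queue.task_done()
            break
        client, command = item
        log.append(command)              # stand-in for executing the command
        finish(client, work_queue)       # release the client's next command
        work_queue.task_done()


def run_demo(commands, n_threads=4):
    """Push one client's commands through a pool; return processing order."""
    work_queue, log, client = queue.Queue(), [], OrderedClient()
    threads = [threading.Thread(target=handler, args=(work_queue, log))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for command in commands:
        submit(client, command, work_queue)
    work_queue.join()                    # wait until the chain has drained
    for _ in threads:
        work_queue.put(None)
    for t in threads:
        t.join()
    return log
```

Even with four threads competing, run_demo returns the commands in the order they were submitted, because at most one command per client is ever on the work queue. The IMAP cases that may legitimately complete out of order would need a way to bypass the flag, which this sketch does not attempt.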

On the issue of a database connection pool, I think this architecture makes it moot. Relatively few commands require no database interaction, and those are so fast to process that it isn't worth adding another abstraction just to keep them from tying up a database handle for their 0.05ms of processing time.
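
To illustrate the point, here is a small sketch in which each handler thread opens one connection when it starts and keeps it until it exits, so there is nothing to check out of or return to a pool. An in-memory SQLite database from Python's standard library stands in for the real database, and the handler and run names, the NOOP special case and the reply strings are assumptions made for the example.

```python
import sqlite3
import threading


def handler(commands, replies, lock):
    """Handler thread: one connection for the thread's whole lifetime."""
    conn = sqlite3.connect(":memory:")   # stand-in for the real database
    try:
        for command in commands:
            if command == b"NOOP\r\n":
                reply = b"250 OK"        # no database work; the handle idles
            else:
                (one,) = conn.execute("SELECT 1").fetchone()
                reply = b"250 OK rows=%d" % one
            with lock:
                replies.append(reply)
    finally:
        conn.close()


def run(per_thread_commands):
    """Start one handler per command list and collect all replies."""
    replies, lock = [], threading.Lock()
    threads = [threading.Thread(target=handler, args=(cmds, replies, lock))
               for cmds in per_thread_commands]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return replies
```

The NOOP branch never touches the connection, but the handle it leaves idle is only idle for the instant it takes to build the reply.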

multifoo_architecture.txt · Last modified: 2012/02/27 21:48 by bas