Sunday, February 5, 2012

A Solution to CPU-intensive Tasks in IO Loops

Back in October 2011, Ted Dziuba infamously said that Node.js is Cancer.  A provocative title to a provocative article.  The only thing it didn’t really provoke in the commentary was much thought ;)  Zing.

My interpretation of the article is that Ted holds up the classic blocking-IO process-per-request (or  thread per request; same difference) model as superior.  Yet we all remember where the blocking-IO forking model got Apache in the early days.  So I’m a tad uncomfortable with his solution from the good old days.  But fundamentally you can’t get away from the fact that uncooperative tasks can stall an event loop architecture.

I’ve written a few event loops in my day, including hellepoll and contributions to Deft's inner loop.  And I've used and hacked on a few more, including Python's twisted and tornado.  So I'm heavily into the event loop architecture that I was using in Turbo Pascal long before I met VB and Symbian's ActiveObject.  But I've been caught-out by stalling event loops with much the same frequency throughout the years despite growing familiarity and experience.

Its worth clarifying that there are two types of blocking tasks from an event loop perspective; the CPU-intensive working is obvious, but less obvious is the thread waiting.  When you make a blocking IO call (classically this slips in as a DNS name lookup) your thread stalls waiting for a result and your event loop isn’t looping.

The normal way that an event loop deals with blocking tasks is by avoiding them.  Some programmers might split the thinking into several steps and spread them out over several loops.  I prefer to fire off processes or threads for the CPU-intensive tasks and have them report back to the main loop when done.

Of course the fatal flaw in this is programmer awareness.  If the programmer isn’t alert to the possibilities and problems, these blocking tasks slip into production unnoticed until fate puts a smattering of them into a tight space and people wonder about the sudden intermittent slowdowns on the site…

I have thought a lot about how an event loop architecture can be built defensively to automatically move between threading to mitigate blocking tasks.  At most it will be mitigation; threading won’t help you if you have more work to do than CPU to do it with.  But it can go a very long way to removing the fear of blocking tasks stalling an event loop, and it can mean that programmers stop having to be alert to the dangers; the dangers recede; they get the best of both worlds.

Some languages put concurrency at a language level and I admire it of them.  My (uniformed) understanding of Go is that multiplexes its tasks over several OS threads so that if one should block (on blocking IO or be CPU intensive) others can be running.  Stackless Python uses green threading to be able to switch between tasks within a thread.  I imagine Erlang is a bit of both.  If you own the language, you can do it seamlessly.

Sadly in, say, C++, you can at best hope that the programmer is using the framework in the correct way and not accidentally using any libraries or code that breaks any contracts.  That’s the kind of rubbish framework writers in general purpose languages have to face.  But anyway, onwards with a C++ limitations without green threads ( ← great blog post, must read, but tangential!) approach:

Most frameworks say that there is a single main event loop; that there is one handler per IO handle, that the callbacks it issues may fire in any order.  Often databases are talked to via queues and there are guarantees in execution order there too, but that’s a side subject.

The single event loop is often used to avoid locking and to alleviate programmer hassle in that direction.  That’s a very big win.  They say you should spawn as many servers as there are CPU cores on the machine, and load-balance between them (when the runtime is slow enough that that makes sense, of course; you wouldn’t want a load balancer in front of hellepoll slowing it down for example).

When you control the runtime you can put all the necessary locks in at compile time (STM in Clojure sounds a super cool way to go).  But in this hypothetical C++ server we want to transparently mix blocking and non-blocking tasks which means threading and means programmers have to be aware that shared state between requests must be synchronised.  Once that constraint is added suddenly a way forward becomes clearer:

A nice property of epoll and the other selector syscalls is that many threads can all select on the same descriptors.  Its horrid state management, although edge triggering does tidy it up a little bit.  Its generally madness to try and multiplex normal handlers between threads for fear that one is executing some read in one thread whilst more read turns up and gets taken by another thread, meaning the handler is executing simultaneously in two or more threads and is unaware of this!  But listeners are a classic thing to select in multiple threads.

So the server has multiple threads.  If a handler blocks in one thread, another thread can pick up incoming requests.  So far, so good.  In return for needing to carefully synchronise access to shared state, we get to efficiently share that state (even if its just a hot cache of secure session cookies - things you don’t want to be validating every incoming request etc) between many threads and multiplex incoming requests between them.

Lets take our transparent blocking task safety valve a step further:

A watchdog thread can be running every n milliseconds.  This is very low load on the system.  This watchdog can inspect and store the progress of each event loop thread at that snapshot time i.e. the number of loop iterations and perhaps the address of the current task in it.  If the loop has not moved onwards since the last sample or two it can be deemed stalled.

Now you’ve identified the stalled thread but you can’t actually interrupt and stop it.  There’s no green threading.  This is C++ and EINTR being an InterruptedException will only work on syscalls, not CPU-intensive tasks.

But you can move the other events in the affected loop to a fresh thread; you can go sideways when you’ve detected a blocking task.

It is routine for tasks to fire off other asynchronous IO and add timers and callbacks and it’d be too much of a burden on the framework and programmer to suddenly have those fire in other threads.  Its important that these sub-tasks are not migrated.

Hellepoll has the concept of a task tree - tasks can be subtasks of others, and this simplifies tidy-up when one aborts.  This explicit linking of tasks and callbacks can be used to determine what gets migrated when a task blocks, to ensure that the cascade of events associated with a request do not themselves get split, but fire in the originating thread and in the right order even if at some point the thread triggers the blocking watchdog.

So, you have the lean mean event loop architecture that has so much more mechanical sympathy than multi-threading/processing but you transparently migrate to multi-threading every time there’s a mess up and that’s what fits better.

I am not a node.js user but I wonder if this approach could be transparent in node and not actually break any of the API contract there?

If you really want to stay away from the added programmer complexity of juggling asynchronous events, and your load is not worth the hassle or you can purchase CPU to get out of your problems then multi-threading/processing model is a winner.

If all you have is a handful of CPU-intensive tasks you’d likely be better off both performance and sanity -wise going with multi-threading/processing.

But for everyone building asynchronous event loops you might find this multi-threading approach very appealing.  It forces you to juggle multi-threading and synchronising access to shared state too, but beyond that bump you get even better performance and you have recourse to the kind of transparent sideways scaling when you get blocking tasks as outlined in this article.

Read all the way to the bottom?  That is a vote of confidence; now go back to the link-farm you came from and upvote! ;)

Notes

  1. evandrix reblogged this from williamedwardscoder
  2. syslevel reblogged this from williamedwardscoder
  3. williamedwardscoder posted this

 ↓ click the "share" button below!