Changes:

* This version includes a backport of Greg Price's patch for faster startup (http://bugs.ruby-lang.org/issues/7158). ruby-core prefers his approach, so I have abandoned my own cached-lp and sorted-lf patches.
* This version integrates the 'array as queue' patch, which improves performance when the push/shift pattern is used heavily on Array. The patch has been accepted into trunk for Ruby 2.0, and what is probably its last bug was found by Yui Naruse. It ran in production\* for a couple of months without issues, even with that bug present.
* This version integrates a speedup of method lookup. It has been used in production\* for a couple of months, so I consider it stable.
* This gist contains two separate patches:
  * [falcon.diff](https://raw.github.com/gist/4136373/falcon.diff) does not integrate the `backport-gc` patch, because it does not seem to bring much benefit (in my measurements).
  * [falcon-gc.diff](https://raw.github.com/gist/4136373/falcon-gc.diff) integrates the `backport-gc` patch, for those who believe it is worth it.

The separate per-feature patches can [be found here](https://gist.github.com/4136519).

\*"Production" here means a few small but heavily used services based on EventMachine.

---------------

PS. Don't forget to tune the GC with environment variables:

    export RUBY_GC_MALLOC_LIMIT=60000000
    export RUBY_FREE_MIN=200000

The application will use more memory but will run much faster (unless you are working with heavy C extensions such as RMagick). For both memory and performance, it is almost always better to run ten workers with a tuned GC than twenty workers without one.

PPS. Use a high-performance allocator such as jemalloc or tcmalloc. For example, tcmalloc on Ubuntu:

    apt-get install libtcmalloc-minimal0
    export LD_PRELOAD=/usr/lib64/libtcmalloc_minimal.so.0.1.0

It will give you roughly 8-10% more performance for free. Memory consumption may increase or decrease; it is hard to predict.
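
To see whether the GC settings or a preloaded allocator help on your machine, a rough micro-benchmark can be run once with and once without them. The sketch below is only an illustration and is not part of the patches: `measure_queue_workload` and its `window` parameter are hypothetical names, and the workload (short strings passed through an Array used as a push/shift queue) is an assumption chosen to stress allocation. Real services will behave differently, so treat the numbers as a hint at most.

```ruby
require 'benchmark'

# Hypothetical workload: allocate `n` short strings and pass them through an
# Array used as a FIFO queue (push at the tail, shift from the head).
# Returns the elapsed time, the number of GC runs, and the items processed.
def measure_queue_workload(n, window = 1_000)
  queue = []
  processed = 0
  gc_before = GC.count
  elapsed = Benchmark.realtime do
    n.times do |i|
      queue.push("item-#{i}")
      # Keep at most `window` items in flight before consuming from the head.
      next if queue.size <= window
      queue.shift
      processed += 1
    end
    # Count whatever is still queued at the end, then drop it.
    processed += queue.size
    queue.clear
  end
  { elapsed: elapsed, gc_runs: GC.count - gc_before, processed: processed }
end

result = measure_queue_workload(1_000_000)
puts format('%d items in %.3fs, %d GC runs',
            result[:processed], result[:elapsed], result[:gc_runs])
```

Run it in a plain shell first, then again after exporting the variables above (and, separately, with `LD_PRELOAD` set), and compare the elapsed time and the GC run count.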