Poor man’s tail recursion in node.js

If you find yourself in the situation of doing recursion over a large-enough input in node.js, you may encounter this:

node.js:201
        throw e; // process.nextTick error, or 'error' event on first tick
              ^
RangeError: Maximum call stack size exceeded

Oops, I smashed the stack. You may reproduce it with something like this:

var foo = []
 
for (var i = 0; i < 1000000; i++) {
    foo.push(i)
}
 
var recur = function (bar) {
    if (bar.length > 0) {
        var baz = bar.pop()
        // do something with baz
        recur(bar)
    } else {
        // end of recursion, do your stuff
    }
}
 
recur(foo)

“Thanks, that’s very thoughtful. But you’re not helping.” Bear with me. The solution is the obvious tail call elimination. But JavaScript doesn’t have that optimization.

However, you may wrap the tail call in order to call the above recur() function in a new stack. The proper recur() implementation is:

var recur = function (bar) {
    if (bar.length > 0) {
        var baz = bar.pop()
        // do something with baz
        process.nextTick(function () {
            recur(bar)
        })
    } else {
        // end of recursion, do your stuff
    }
}

Warning: please read this carefully. I gave you the solution for recurring over such a large input, but the performance is poor. Using process.nextTick (or a timer function such as setTimeout for that matter, slower BTW) is an expensive operation. Didn’t test where’s the actual bottleneck (epoll itself under Linux, libuv | libev, etc).

time node recur.js
node recur.js  1.36s user 0.28s system 101% cpu 1.610 total

The cost of this method is high. Therefore, don’t attempt this in a web application. It kills the event loop. For instance, I don’t use node for writing web applications. It is a difficult task, while the cost of the event loop itself isn’t that negligible as you may think. It useful as long as the CPU time is negligible compared to the time spent doing IO. Therefore, please, don’t include me in the group of people that thinks about node as the hammer for all the problems you throw at it.

If you’re wondering why I won’t just simply iterate the object, the answer is simple: because that “do something with baz” involves some async IO that would kill the second data provider. Sequential calls ensure that everybody in the architecture stays happy. Besides, I don’t actually use bar.pop(), but something like bar.splice(0, 5000) for packing more data in less remote calls and less events. bar.shift() in a situation like this is as slow as molasses in January. In an async framework, the order of the items from a TODO list is not relevant, therefore use the fastest way.

If you’re still wondering why I still use a solution like this, the above technique is part of the cost associated with the start-up cost. The application fetches all the required data in RAM. Having the application to kill the event loop for 20-30 seconds before hitting the Internet pipe is negligible for a process that runs for hours or days. After the application hits the Internet, only then I can say that node is in use for the stuff where it shines. I know, before this, I listed all the wrong reasons for using node as a tool.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.