I encountered a rather crazy-making problem today…
If you read the documentation for listen(2) on Linux or listen() in Perl, it’ll say that the listener will queue pending (not yet accepted) connections up to a maximum specified by a “backlog” integer.
But that’s a lie.
When I tried it today on Linux 3.16.7, I found that it actually accepted new connections up to N + 1.
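If you want to check this on your own kernel, here is a minimal sketch of the kind of test I mean, written in Python rather than Perl. The function name count_accepted_connections, its timeout and max_attempts parameters, and the use of the loopback address are my own choices for illustration. It assumes that a full queue shows up as a connect() that times out or fails, and the count can also be capped by system settings such as net.core.somaxconn, so treat the result as an observation about your machine, not a guarantee.

```python
import socket


def count_accepted_connections(backlog, timeout=1.0, max_attempts=64):
    """Count TCP connections that complete against a listener that never calls accept()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(backlog)
    address = listener.getsockname()

    clients = []
    try:
        for _ in range(max_attempts):
            client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            client.settimeout(timeout)
            try:
                # The listener never accepts, so completed connections pile up in its queue.
                client.connect(address)
            except OSError:
                # Timeout or refusal: assume the queue is full and stop counting.
                client.close()
                break
            clients.append(client)
        return len(clients)
    finally:
        for client in clients:
            client.close()
        listener.close()


if __name__ == "__main__":
    n = 5
    print(f"backlog={n}: {count_accepted_connections(n)} connections completed")
```

Run it with a few different values of N and compare the count against N, N + 1, and N + 3 to see which rule your kernel follows.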
This is in contrast to most of the information online, which says that Linux uses a “fudge factor” of N + 3.
People get this idea from the book “UNIX Network Programming” by Richard Stevens, which specified that all sorts of different systems had different fudge factors, and that Linux 2.4.7 had a fudge factor of N + 3.
Well, we’re not using 2.4.7 anymore, but the majority of the information online, even more modern sources like forums and blogs, continues to say N + 3.
So I Googled like a demon, and I found someone else talking about this exact same topic: http://marc.info/?l=linux-netdev&m=135033662527359&w=2
Indeed, starting from Linux 2.6.21, it seems that it became N + 1 (https://lkml.org/lkml/2007/3/6/565). You can even see it in the stable Linux git repository: http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=64a146513f8f12ba204b7bf5cb7e9505594ead42
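My reading of that commit (an assumption on my part, since my C is shaky) is that the kernel treats the accept queue as full only when the number of queued connections is strictly greater than the backlog, rather than greater than or equal to it. The toy model below is not kernel code; accept_queue_is_full and max_queued_connections are hypothetical names I made up to show how that one comparison would produce N + 1.

```python
def accept_queue_is_full(queued, backlog):
    """Toy model: the queue counts as full only once it EXCEEDS the backlog."""
    return queued > backlog


def max_queued_connections(backlog):
    """Keep queueing connections until the toy 'full' check says stop."""
    queued = 0
    while not accept_queue_is_full(queued, backlog):
        queued += 1
    return queued


if __name__ == "__main__":
    for n in (0, 1, 5):
        print(n, max_queued_connections(n))
```

With a strict greater-than check, a backlog of 5 admits a sixth connection before the queue reports itself as full; with a greater-than-or-equal check, it would stop at exactly N.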
Before I found that info, though, I started wondering if Perl was tampering with my listen() backlog… so I started looking at https://github.com/Perl/perl5/blob/blead/pp_sys.c#L2614 and https://github.com/Perl/perl5/blob/blead/iperlsys.h#L1323, but quickly found my knowledge of C and Perl’s internals to be wanting.
At the end of the day, the behaviour doesn’t matter too much for my project, but it’s nice to know the reason why it’s N + 1 and not N + 3.
It also makes you think a bit about truth, documentation, and the Internet.
Programmers should know that you can’t always trust documentation. Sometimes, you have to go to the source code to actually figure out what’s going on. In this case, I probably could have shrugged my shoulders and made a comment saying “the backlog is N + 1 rather than N or N + 3 because reasons”, but that’s not very helpful to the next person who comes along and experiences N or N + 3 when they’re using a different kernel.
Anyway, that’s my last post for 2015. Hopefully it helps someone out there in the wild who is tearing their hair out wondering why the listen(2) backlog is N + 1 and not N or N + 3. Of course, by the time you’re reading this, the kernel may have changed yet again!