The lie that is listen(2)

I encountered a rather crazy-making problem today…

If you read the documentation for listen(2) on Linux or listen() in Perl, it’ll say that the listener will accept new connections up to a maximum specified by a “backlog” integer.

But that’s a lie.

When I tried it today on Linux 3.16.7, I found that it actually accepted new connections up to N + 1, where N is the backlog I passed in.
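For the curious, the test was essentially the following (a rough Perl sketch from memory rather than my exact script): create a listener with a small backlog, never call accept(), and count how many clients manage to complete a connection before one fails.

use strict;
use warnings;
use IO::Socket::INET;

my $backlog = 5;

# Listener that never calls accept(), so completed connections pile up
# in the kernel's accept queue.
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,            # let the kernel pick a free port
    Listen    => $backlog,
    Proto     => 'tcp',
) or die "listen failed: $!";

my $port = $server->sockport();

# Keep connecting until a connect() fails or times out (which happens
# once the accept queue is full and the kernel starts dropping SYNs).
my @clients;
for ( 1 .. $backlog + 5 ) {
    my $client = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 2,
    ) or last;
    push @clients, $client;
}

print "backlog $backlog, clients that connected: ", scalar @clients, "\n";

With the N + 1 behaviour, that prints one more than the backlog you asked for.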

This is in contrast to most of the information online, which says that Linux uses a “fudge factor” of N + 3.

People get this idea from the book “UNIX Network Programming” by Richard Stevens, which specified that all sorts of different systems had different fudge factors, and that Linux 2.4.7 had a fudge factor of N + 3.

Well, we’re not using 2.4.7 anymore, but the majority of the information online – even more modern sources like forums and blogs – continues to say N + 3.

So I Googled like a demon, and I found someone else talking about this exact same topic: http://marc.info/?l=linux-netdev&m=135033662527359&w=2

Indeed, starting from Linux 2.6.21, it seems that it became N + 1 (https://lkml.org/lkml/2007/3/6/565). You can even see it in the stable Linux git repository: http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=64a146513f8f12ba204b7bf5cb7e9505594ead42

Although before I found that info, I started wondering if Perl was tampering with my listen() backlog… so I started looking at https://github.com/Perl/perl5/blob/blead/pp_sys.c#L2614 and https://github.com/Perl/perl5/blob/blead/iperlsys.h#L1323, but quickly found my knowledge of C and Perl’s internals to be wanting.

At the end of the day, the behaviour doesn’t matter too much for my project, but it’s nice to know the reason why it’s N + 1 and not N + 3.

It also makes you think a bit about truth, documentation, and the Internet.

Programmers should know that you can’t always trust documentation. Sometimes, you have to go to the source code to actually figure out what’s going on. In this case, I probably could have shrugged my shoulders and made a comment saying “the backlog is N + 1 rather than N or N + 3 because reasons”, but that’s not very helpful to the next person who comes along and experiences N or N + 3 when they’re using a different kernel.

Anyway, that’s my last post for 2015. Hopefully it helps someone out there in the wild who is tearing their hair out wondering why the listen(2) backlog is N + 1 and not N or N + 3. Of course, by the time you’re reading this, the kernel may have changed yet again!

In the wake of the Debian Jessie upgrade

As I wrote earlier, I spent a fair chunk of this weekend upgrading Debian from Wheezy to Jessie.

Well, as part of that upgrade, some packages were removed unexpectedly. Some of them had replacements, but there was no replacement for “gedit” or “seahorse”. While I don’t rely on either on a day-to-day basis, they’re nice to have when you want a graphical text editor or a graphical manager for assorted encryption keys.

I also noticed that while the packages for PostgreSQL 9.4 had been installed, they’d actually been installed alongside PostgreSQL 9.1. So no actual upgrade had occurred there per se. It was up to me.

Well, I don’t use PostgreSQL very often. In my day to day work, I use MySQL/MariaDB with Koha. While I sometimes use PostgreSQL with DSpace and other projects, I’m more familiar with MySQL. So I did some Googling and found this link: http://nixmash.com/postgresql/upgrading-postgresql-9-1-to-9-3-in-ubuntu/.

In essence, I ran the following:

“service postgresql stop”

“pg_dropcluster --stop 9.4 main”

“pg_upgradecluster 9.1 main”

“service postgresql start 9.4”

Then I tried “psql”, which reported that both the client and server were 9.4, whereas before it had said that the client was 9.4 but the server was 9.1.

I also poked around in my “dspace” database to make sure I could still access all the data stored therein.
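If you’d rather script that sanity check than poke around by hand, a quick DBI one-off works too. This is only a sketch: it assumes DBD::Pg is installed, that the database is reachable on localhost, and that the user, password, and “item” table match a stock DSpace setup.

use strict;
use warnings;
use DBI;

# Hypothetical credentials -- substitute whatever your dspace database actually uses.
my $dbh = DBI->connect(
    'dbi:Pg:dbname=dspace;host=localhost;port=5432',
    'dspace',
    'secret',
    { RaiseError => 1, AutoCommit => 1 },
);

# Two quick checks: which server version answered, and a row count
# from one of the core DSpace tables.
my ($version) = $dbh->selectrow_array('SELECT version()');
my ($items)   = $dbh->selectrow_array('SELECT COUNT(*) FROM item');

print "$version\n";
print "item rows: $items\n";

$dbh->disconnect;

Once I was happy that everything was still there, I dropped the old 9.1 cluster and removed its packages: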

“service postgresql stop”

“pg_dropcluster --stop 9.1 main”

“service postgresql start”

“apt-get remove postgresql-9.1”

“apt-get remove postgresql-client-9.1”

There were still some remnants of 9.1 around so I had to do the following:

“apt-get purge postgresql-9.1”

“dpkg -l | grep postgres”

That last command just shows me what packages are installed with “postgres” in their name. I can now see that 9.1 is totally gone and only 9.4 remains. I can also verify that by doing:

“ls /var/lib/postgresql”

And noting that the 9.1 directory is gone and only 9.4 remains. Same with “/etc/postgresql”.

Actually, it’s interesting to run the following to list all packages flagged “rc”, that is, packages that have been “removed” but which still have “configuration” files present:

“dpkg -l | egrep '^rc'”

I see several hundred packages that have been removed but still have configuration files present.

It’s interesting reading through these, as I spotted another package that Debian removed which I actually wanted to keep: “swat”, a web-based tool for managing Samba shares.

Interestingly enough… while it was easy to re-install “seahorse” and “gedit”, it looks like “swat” doesn’t have a package in Jessie. The latest package I see for it is in Wheezy.

After a quick Google, I can see that’s because “swat” had too many security vulnerabilities and no one was willing to maintain it. Honestly, I rarely used it in the past, and it actually created problems in the long run because it would overwrite the configuration file, which you may have edited by hand or with Ansible.

In fact, during the upgrade, I noticed that /etc/samba/smb.conf had an update. At the time, I’d backed up the existing file (for the millionth time) and let the package version replace it. I just spent a few minutes comparing the two files with “vimdiff”, and added back the few changes that were still required (i.e. a couple of shares and some global options).

Anyway, most of these “rc” entries don’t really bother me, so I won’t bother with them for now.

After a “service smbd restart” and a check that the Samba shares are working as expected across the network, I think it’s time to take a well-deserved break.

Upgrading Debian from Wheezy to Jessie

Recently, I decided that it was time to upgrade my desktop.

Wheezy was originally released on May 4th 2013, and while it’s still receiving updates, it’s getting long in the tooth. If I recall correctly, the versions of PHP and PostgreSQL shipped with Wheezy are both either EOL or about to be EOL. I’ve also noticed that I want or need more recent packages for sync and backup tools like git-annex and obnam. While I got git-annex via wheezy-backports, the time had come to upgrade.

Unfortunately, it’s been a bit of a long process.

I updated Wheezy on Friday morning, so that it was as up-to-date as possible. Friday night, I started downloading the Jessie updates… but due to inexplicably slow download speeds, it took all day Saturday to download them. On Saturday night, I kicked off the upgrade and went to sleep, planning to resume it in the morning.

But when I went to log into my desktop this morning, I found no login box. The screen was black. Each key press or mouse movement caused a flicker of the Debian Wheezy background, but that was it.

So I hit Ctrl + Alt + F1 to get to the command line. Apt was locked due to the apt-get upgrade I had started in that X session, which I couldn’t resume anymore. It had probably hung at a prompt asking about updating a configuration file, and there was nothing I could do about it. So I rebooted, still couldn’t log in, so Ctrl + Alt + F1, apt-get upgrade -f, and wait.

Lots of dependency errors, a few prompts to replace old configuration files, another apt-get update, another apt-get dist-upgrade, and then an apt-get update… which failed because there was no Internet.

Another reboot.

I was able to finally log in! But the background was black and there were a few other graphical issues. Still no Internet. So continue on with the apt-get upgrade, apt-get dist-upgrade, and another reboot.

Finally, I was able to see the Debian desktop background. All the graphics worked and looked beautiful, so I ran all the apt commands again to make sure everything was good and started doing application-level checks.

I was annoyed that some of my GNOME preferences for the terminal seemed to be gone or ignored. That surprised me, but since I’ve started using Ansible to control those sorts of things, I just made a note to deal with it later.

I anticipated problems with Apache 2.4, and sure enough… I couldn’t contact Koha or DSpace with Apache.

I was able to access DSpace via Tomcat, although I noted that the way I deployed DSpace was rather suboptimal. I originally installed Koha and DSpace on my home desktop when I first started working in “Library IT”. At work, we used a lot of alternative practices, which weren’t documented anywhere, so I decided that I would follow the official documentation at home, and reverse engineer the process to understand our work practices. It must have worked, as I am now the lead Koha developer at work, and I completely overhauled our development and deployment practices. I’ve also done quite a few DSpace installs for clients along the way, especially for those who weren’t using us for cloud hosting. I wanted to make sure their installations were absolutely standard, so that they could get help down the track from anyone and not just us.

Anyway, I could access DSpace with Tomcat, but not Apache. The configuration file looked fine. I’ve written a lot of Apache config files over the past few years, so I was confident that I knew what I was doing.

I decided I would try disabling the site and re-enabling it, but Apache complained. It said the site didn’t exist. How could it not exist? That’s when I remembered that I had followed an alternative convention in the past which left off “.conf” at the end of the filename, and that Apache 2.4 is stricter about this than Apache 2.2. I removed the existing symlink in sites-enabled, added “.conf” to the end of the filename, enabled the site, and voila! It was back up!

Koha was a simpler fix after that. While it had been enabled before, it hadn’t been re-enabled after the latest upgrade. The Koha upgrade had already fixed the filename problem, so all I did was re-enable the site, and there it was!

I had actually thought about removing DSpace and Koha from this desktop ahead of the upgrade. I don’t really need either of them anymore for learning purposes. Yet, I kind of like having them there as a reminder if nothing else. A reminder that I wasn’t always this knowledgeable or this skilled; I developed both over time by experimenting at work and at home. They take up space on the hard drive, but I like having the reminder.

I also tell myself that one day I’ll catalogue all the books in the house and put them into Koha. That I’ll create borrower accounts for my friends, and that I’ll keep track of the books I lend out using Koha! Not that I lend many books. I also have a great memory. You better believe that I remember who has borrowed what from me and for how long they’ve had it. The one I think about most often has had a book for a year now. How long are loan periods for friends? 😉

In any case, Debian Jessie is slick! I had originally thought about installing Ubuntu on this desktop, but I had some issues recently with a minor 14.04 upgrade on my laptop, so I’ve decided to stick with Debian on my main computer for the sake of stability.

Sure, I had a few issues upgrading, but I imagine that was just a coincidence. I imagine gdm was either uninstalled or in the process of being uninstalled when I wanted to use it to log back in to finish the upgrade. I was able to get it sorted without too much trouble. The main pain was actually the slow download on Saturday!

In any case, upgrade from Wheezy to Jessie complete! Time to tweak the Ansible playbooks and restore order to the force.

 

Using the Z39.50 client “yaz-client”

If you have technical experience in libraries, you’ve probably heard of Z39.50 (https://en.wikipedia.org/wiki/Z39.50). It’s an old standard, but it’s something libraries still use.

Koha, the open source ILS/LMS, uses Indexdata’s Zebra as its search engine, and it communicates with it using Z39.50 to fetch records in MARC format (another old standard still in use).

MarcEdit is a program used for editing MARC records, and it integrates with some of Koha’s HTTP APIs for adding and updating bibliographic MARC records. It can also search Koha’s database of MARC records by querying Zebra over Z39.50.

I was reading Koha’s listserv when I encountered someone having trouble connecting to Zebra from MarcEdit. I had the same trouble years ago, so I decided to give it another shot now that I’m a lot more experienced with Koha, Zebra, Z39.50, MARC, networking, and really all things IT.

I followed the instructions in the Koha file for exposing Zebra over TCP, but MarcEdit was still failing to connect. Unfortunately, it didn’t really give me any useful information about why it was failing. So I decided to download a Z39.50 client and see if I could get down lower and see what was happening under the hood.

I’ve been using yaz-client (http://www.indexdata.com/yaz/doc/yaz-client.html) for years, as it comes bundled with the YAZ libraries used by Zebra. However, I’ve always been using it on the Linux servers running Koha. In this case, I wanted to connect from my Windows desktop…

Fortunately, you can download YAZ for Windows! I visited Indexdata’s page about installing YAZ on Windows at http://www.indexdata.com/yaz/doc/installation.win32.html, and downloaded the latest version of YAZ.

I still couldn’t connect to Zebra using yaz-client.exe, but now it was clear that it was a connection problem.

I tried using yaz-client on the Linux server that hosted Zebra, and it worked fine. It was clearly a networking issue… the default TCP port for Zebra (i.e. 9998) was probably being blocked by my network router. So I looked for a free port using ‘netstat -ln | grep "3000"’. Port 3000 wasn’t being used by anything else, so I changed the Zebra configuration, restarted my Zebra server, and tried to connect to Zebra using yaz-client.exe on Windows again.

This time the client connected successfully and I had anonymous read-only access to Zebra!

In hindsight, I should have tried port 210, which is the official port for Z39.50 communications, or port 7090, which is also sometimes used for Z39.50 servers. But this was good enough for my temporary experiment. I tried again in MarcEdit, and it worked! I was now able to search for records in Koha, download them as MARC21, edit them in MarcEdit, and then update them in Koha using one of Koha’s HTTP APIs!

Now, I was connecting over a private local network. If you were to forward that port on your router so that it was Internet accessible, I would highly recommend changing the default username and password for Zebra!

Anyway, even if you’re not using Koha or MarcEdit or Zebra, you can still use yaz-client on Linux and yaz-client.exe on Windows to connect to all sorts of other Z39.50 servers like the ones at the Library of Congress, LibrariesAustralia, the National Library of Australia, and so on.

This can be useful when you’re trying to troubleshoot Z39.50 connection issues or if you just want to have more control over your Z39.50 requests than what you have baked into your ILS/LMS. Maybe you’re trying to fetch a record from the Library of Congress and you know it’s there… but your ILS/LMS doesn’t seem able to find it. Whip out yaz-client and check it out yourself!
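And if you’d rather script those lookups than type them into yaz-client interactively, Index Data also publish Perl bindings (the ZOOM module, from the Net::Z3950::ZOOM distribution). Here’s a rough sketch of a title search against the Library of Congress server; the host, port, and database name are just the commonly published ones, so double-check them before relying on this.

use strict;
use warnings;
use ZOOM;

eval {
    # Commonly published connection details for the Library of Congress;
    # point this at your own Zebra server instead if that's what you're testing.
    my $conn = ZOOM::Connection->new('z3950.loc.gov:7090/Voyager');
    $conn->option( preferredRecordSyntax => 'usmarc' );

    # A PQF query: attribute 1=4 means "search the title index".
    my $results = $conn->search_pq('@attr 1=4 "unix network programming"');
    print "Hits: ", $results->size(), "\n";

    # Dump the first record, if there is one.
    print $results->record(0)->render(), "\n" if $results->size() > 0;

    $conn->destroy();
};
if ( my $error = $@ ) {
    if ( ref $error && $error->isa('ZOOM::Exception') ) {
        print "Error ", $error->code(), ": ", $error->message(), "\n";
    }
    else {
        die $error;
    }
}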

Questions about Linked Data (in libraries)

In the library world, there is a lot of buzz about “Linked Data” and “BIBFRAME”, but I haven’t really found much information about practical uses for either.

There’s information about BIBFRAME (http://www.loc.gov/bibframe/faqs/) and people who are looking at implementing it (http://www.loc.gov/bibframe/implementation/register.html), but it all seems rather vague.

The Oslo Public Library and the National Library of Sweden are both working on new library software systems that will rely on data stored in RDF, but those are still in development and there’s no concise summary of their efforts that I’ve found yet.

While I haven’t explored it extensively, DSpace 5.x provides methods for converting its internal metadata into RDF which is stored in a triple store and made accessible via SPARQL endpoints. Here’s an example record: http://demo.dspace.org/data/handle/10673/6/xml?text. However, that record seems quite basic. If you look at its links, they’re to real files or to HTML pages. They’re not linked to other records or resources described in RDF.

So, it seems to me that it’s straightforward to publish data as Linked Data. You just serialize it using RDF. You store it in a triple store, and you provide a SPARQL endpoint. Your data is open, accessible, linkable.
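Machine access to that endpoint is simple enough. Here’s a rough sketch of what a client query looks like from Perl; the endpoint URL is hypothetical, and I’m assuming the endpoint takes the standard “query” parameter and can return JSON results:

use strict;
use warnings;
use LWP::UserAgent;
use URI;

# Hypothetical endpoint -- substitute your repository's real SPARQL URL.
my $endpoint = 'http://example.org/sparql';
my $query    = 'SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10';

my $uri = URI->new($endpoint);
$uri->query_form( query => $query );

my $ua       = LWP::UserAgent->new( timeout => 30 );
my $response = $ua->get( $uri, 'Accept' => 'application/sparql-results+json' );

die 'SPARQL query failed: ' . $response->status_line unless $response->is_success;
print $response->decoded_content;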

However, how do you actually make “use” of Linked Data in a way that is useful to humans?

In the case of the DSpace record, it doesn’t contain many links, so you could just run it through an XSLT and get usable, human-readable information. However, look at a BIBFRAME record like this one: http://bibframe.org/resources/sample-lc-1/148862. It has a lot of links. How could this be made usable to a human?

My first guess is that you have a triple store with your Linked Data, and then you have some other sort of storage system to contain the dereferenced data. That is, once in a while, your server follows all of those links and stitches together a cached human-readable copy of the record, which is then shown to the human. I can’t imagine this is done for every single web request, as it would be a lot of work for the server…
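To make that concrete, dereferencing a single link is really just an HTTP request with content negotiation. Here’s a minimal sketch using LWP::UserAgent against the BIBFRAME example above, assuming the server can return RDF/XML when asked for it:

use strict;
use warnings;
use LWP::UserAgent;

my $uri = 'http://bibframe.org/resources/sample-lc-1/148862';
my $ua  = LWP::UserAgent->new( timeout => 30 );

# Ask for a machine-readable serialization via content negotiation.
my $response = $ua->get( $uri, 'Accept' => 'application/rdf+xml' );

if ( $response->is_success ) {
    print $response->decoded_content;
}
else {
    warn 'Could not dereference ', $uri, ': ', $response->status_line, "\n";
}

A caching or dereferencing job would do something like this for every link in the record, then merge the results into something indexable and human readable.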

Plus, how would you index Linked Data? You need to store human readable/writeable terms in your indexes so that you can retrieve the correct record(s) for the terms users are searching. Do you index the cached human readable copy that you generate periodically?

In the case of the BIBFRAME record, let’s say that the Library of Congress is storing that linked data record, dereferencing it, indexing it, and showing it to users. What happens when someone else wants to use that BIBFRAME record to describe a resource that they have stored locally at their library? Do they download it using OAI-PMH, store the reference copy, dereference it in order to index it and show it to users, and maybe add a local item record in their own library system?

Linked Data seems strange for libraries. Let’s look at the dbpedia entry for Tim Berners-Lee: http://dbpedia.org/page/Tim_Berners-Lee. In this case, dbpedia isn’t claiming to store Tim Berners-Lee somewhere. They just have a metadata record describing him. dbpedia serves as a central repository for records.

So with libraries… wouldn’t it make sense for the Library of Congress (or some other significant library entity) to be a central repository of records, and libraries themselves would only need to keep simple records that point to these central repositories? I suppose it couldn’t be that simple as not all central repositories have #alltherecords, so you might need to do original cataloguing and in that case you’re creating your own Linked Data record… although why would it be significant for you to create a Linked Data record at some small library in the middle of nowhere?

Also, when you download BIBFRAME records from the Library of Congress, they would wind up being re-published via your SPARQL endpoints, no? Or, when downloading the record, would you re-write the main IRI for the record to be an IRI for your particular library and its database? Otherwise, aren’t you just hosting a copy of the original record? What happens if someone links to your copy of the BIBFRAME record… and you’re updating your copy from the original Library of Congress BIBFRAME record? Doesn’t that set up a really inefficient and silly chain of links?

I think that’s most of my questions about Linked Data… it mostly boils down to:

1. While Linked Data records are machine readable, how do you dereference them to create human readable and indexable records?

2. How do you copy catalogue with Linked Data? All the links point to the original, so do you need to re-write these links to point to your local storage? Or are you just downloading a copy, dereferencing it to index and show humans, and then adding local data separately to refer to physical items?

Forcing a stack trace in Perl

Have you ever tried to debug someone else’s Perl code and found it frustrating when it dies and the only error message is for a general function in a module that isn’t explicitly referenced at all in the code?

Do you curse the name of the person who wrote that code and wish that they had used Carp.pm, so that you could have something… anything… on which to base your debugging efforts?

Well, good news! You can trap the DIE signal and use Carp::confess() to force a stack trace. Now you can trace that “die” back to the actual code at hand, so you don’t have to waste time dumping variables, printing your own error messages at different points of logic, or slowly tracing the class lineage of your objects back to that low level API in some XS code which isn’t stored on CPAN or your local system anyway.

Here’s the code I added to the script I was debugging:

use Carp;
$SIG{ __DIE__ } = sub { Carp::confess( @_ ) };

That’s it. I got my stack trace, and I was able to quickly find the problematic line of code and start troubleshooting the actual problem.
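Here’s a contrived, self-contained example of the difference it makes. Without the handler, all you get is the one-line die message; with it, you get the full chain of calls that led there:

#!/usr/bin/perl
use strict;
use warnings;
use Carp;

# Turn any die into a full stack trace.
$SIG{ __DIE__ } = sub { Carp::confess( @_ ) };

sub read_config { process( @_ ) }
sub process     { validate( @_ ) }
sub validate    { die "missing required option\n" }

read_config();

When run, confess() reports the die message plus each stack frame (validate called from process, called from read_config), which is exactly what you need when the die is buried in somebody else’s module.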

While the problem was what I had suspected from the moment I got the first support call, this way I had evidence to prove it and a suggestion for how the original developer could improve their work.

Using multiple network adapters on Windows

I imagine many of us don’t think too much about networking our computers. We care that we have Internet access, that we can connect our phone to the WiFi, and maybe that we can access files on a NAS. But after that, we sort of bury our heads in the sand a bit.

So my workstation has built-in network adapters for Ethernet and wireless access. For the most part, I just use the wired Ethernet connection. It’s fast and it connects to a gateway that gets me where I need to go. I rarely use the WiFi, but sometimes it’s useful for connecting to other devices wirelessly. Sometimes, I have active connections over both adapters.

That seems a bit weird though… how does that work? Surely I just need one connection to the Internet, yeah? How does my computer know which connection to use?

Well, the default is to use the Ethernet adapter. If I do “tracert koha-community.org” on my Windows Command Prompt, I’ll notice the first hop is through the gateway I connect to via the Ethernet adapter. Every time.

But if I want to SSH into another device on the wireless network, I can do that too, because my computer’s network routing table has an entry that says all requests to a certain IP address range should go through the wireless gateway instead of the Ethernet gateway.

That’s pretty cool.

Well, sometimes, I need to use a third network adapter. There’s a 3G modem I use to connect to another network. Unfortunately, I’ve noticed that it doesn’t add an entry to my routing table telling my computer to send requests for a certain IP address range through it instead of the default gateway. If I type in “route print”, I see that the Ethernet and wireless interfaces have routes for their own network ranges, but there’s nothing for this modem. I’m not sure why, but it doesn’t matter too much, because the address I want to reach wouldn’t be in that typical IP address range anyway.

The problem is that if I try to connect to the IP address I want, it’ll go through my Ethernet adapter, and it won’t get where it needs to go. So I need to add a route to tell my computer to send requests for that IP address through the 3G modem.

So I look up the interface number for the modem using the following command:

“netsh int ipv4 show interfaces”

I then type out something like:

“route add XX.XX.XXX.X mask 255.255.255.0 YY.YYY.Y.YY if Z”

In this case, the Xs are the destination range, and the Ys are the gateway I want to use. Z is the interface number that I found using that command starting with “netsh”.

And that’s it! Now when I try to reach XX.XX.XXX.X in my browser or through a SSH client, it connects using the 3G modem. However, if I try to SSH into someone else’s server or visit http://koha-community.org in my web browser, it’ll use the default Ethernet adapter.

Anyway, there are far better resources out there that explain everything, but this is just a little reminder to myself about how to do this (on Windows) and why it might be necessary.

Edit: If you want this route to persist after rebooting, you can add the “-p” flag between “route” and “add”.