Sometimes, swap space still matters


Most modern, general-purpose operating systems (OS) come with a full-fledged virtual memory (VM) system that creates the illusion of having more memory than is really installed in the machine. Whether this _virtual_ memory is backed by real RAM, by disk swap, or by no physical device at all is up to the OS.

This way, you can _reserve_ 2 GB of memory on a machine with only 1 GB of RAM and no swap. (The man page of malloc() says _allocate_, but once you know how it works behind the scenes, _reserve_ sounds better.) Of course, if you try to actually use that much memory within your application, the OS will have a hard time making space for it (probably by reaping file system caches and buffers). At some point it will decide either to kill the process (as the _OOM Killer_ of Linux does) or to refuse to provide more pages to the address space of that process, causing it to die with a segmentation fault.

This ability to let processes reserve more memory than is physically available is possible thanks to on-demand paging. This means that physical resources (in this case, memory pages) are not consumed by a process until it really accesses them:

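[c]
#include <stdio.h>
#include <stdlib.h>

#define SIZE (1024 * 1024 * 100) /* 100 MB */

int main(void)
{
    size_t i;
    char *buf;

    /* Reserve 100 MB of virtual memory; no pages are touched yet */
    buf = malloc(SIZE);
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }

    /* At this point, memory usage on the OS shouldn't have changed considerably */

    /* Write to every byte, forcing the OS to back each page with real memory */
    for (i = 0; i < SIZE; ++i) {
        buf[i] = 'A';
    }

    /* Now, free memory in the OS should have decreased by 100 MB */

    free(buf);
    return 0;
}
[/c]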

Many applications take advantage of this feature and reserve more memory than they ever use during their whole lifecycle (whether this is proper behavior is beyond the scope of this article). You can check this on your own UNIX OS by comparing the _VIRT_ and _RSS_ (or _RES_) columns of the _top_ utility.
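The same two numbers that _top_ shows can also be read from inside a process. What follows is only a minimal sketch, and it assumes a Linux system, where /proc/self/status exposes the fields VmSize (roughly _VIRT_) and VmRSS (roughly _RSS_) in kB. The helper name status_kb() is made up for this illustration, and other UNIX systems expose this information differently.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (Linux-only assumption): return the value in kB of a
 * field such as "VmSize:" or "VmRSS:" from /proc/self/status, or -1 if the
 * file or the field cannot be read.
 */
long status_kb(const char *field)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    long value = -1;
    size_t len = strlen(field);

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Each line looks like "VmRSS:      1234 kB" */
        if (strncmp(line, field, len) == 0) {
            if (sscanf(line + len, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }

    fclose(fp);
    return value;
}
```

Calling status_kb("VmSize:") and status_kb("VmRSS:") right after the malloc() in the example above, and again after the loop, should show the virtual size growing first and the resident size catching up only once the pages have been written.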

So, how much memory can be reserved?

This depends entirely on the OS you’re running, but most of them calculate a “safe” value from the amount of RAM and the size of the disk swap. This is the case for Solaris, which goes even further by strictly limiting the amount of memory that can be reserved to the sum of the RAM and the swap, minus 1/8 of the RAM.
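As a rough illustration of that rule, here is a tiny sketch of the arithmetic. It assumes the share held back by Solaris is one eighth of the RAM; the function name reservable_mb() is invented for this example, and the real kernel accounting is more detailed than a single formula.

```c
#include <stdint.h>

/*
 * Hypothetical helper: approximate amount of memory (in MB) that can be
 * reserved, assuming the limit is RAM + swap - RAM/8.
 */
uint64_t reservable_mb(uint64_t ram_mb, uint64_t swap_mb)
{
    return ram_mb + swap_mb - ram_mb / 8;
}
```

For a machine with 32 GB of RAM and 512 MB of swap, reservable_mb(32768, 512) gives 29184 MB, which is 28.5 GB.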

What could happen if you configure too little swap space? (A real-world example, and the motivation for this article)

This morning, on one of our OpenSolaris servers, we started to receive messages like this one: _“WARNING: Sorry, no swap space to grow stack for pid …”_ In vmstat, the _swap_ column (a misleading name, since it shows the amount of virtual memory still available to be reserved, not something strictly related to physical swap space) showed pretty low numbers (under 50 MB), while the _free_ column was telling us that there were over 14 GB of physical RAM available. How can this be possible?

This machine has 32 GB of RAM and only 512 MB of swap (the default size in OpenSolaris, my mistake). This means that the total amount of virtual memory available to be reserved is somewhere around 30 GB. It provides CIFS service to the network with Samba, so there are lots of “smbd” processes running on it, and each process usually has a _VIRT_ size of 40 MB and an _RSS_ of 20 MB. With 1000 processes, the required amount of physical memory would be a little under 20 GB (it fits in RAM), but the amount of virtual memory would be close to 40 GB.
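The numbers above can be checked with a few lines of arithmetic. This is just a sketch using the round figures from this article (40 MB reserved and 20 MB resident per process, a limit of roughly 30 GB); the struct and function names are invented for the example.

```c
/* Hypothetical types and helpers, using the round numbers from the text. */
struct demand {
    unsigned long virt_mb; /* total memory reserved by all processes */
    unsigned long rss_mb;  /* total memory actually resident in RAM */
};

/* Total reserved and resident memory for nprocs identical processes. */
struct demand total_demand(unsigned long nprocs,
                           unsigned long virt_mb,
                           unsigned long rss_mb)
{
    struct demand d;

    d.virt_mb = nprocs * virt_mb;
    d.rss_mb = nprocs * rss_mb;
    return d;
}

/* How many identical processes fit under a given limit (integer division). */
unsigned long max_processes(unsigned long limit_mb, unsigned long per_proc_mb)
{
    return limit_mb / per_proc_mb;
}
```

With these figures, total_demand(1000, 40, 20) gives 40000 MB reserved against 20000 MB resident. Seen the other way around, max_processes(30 * 1024, 40) is 768, while max_processes(32 * 1024, 20) is 1638: the reservation limit is hit with less than half the number of processes that the physical RAM alone could hold.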

This way, when our system reached the limit of virtual memory (over 30 GB), it still had more than 10 GB of real RAM available, but the OS refused to allow more reservations. And since there are still plenty of free pages, the kernel doesn’t try to rebalance the size of its buffers, leaving you with an exhausted system that has lots of free memory.

So, keep in mind: Even with lots of RAM installed in your server, sometimes, swap space still matters.