While most operating systems support some sort of virtual memory, if the system starts paging memory out to disk, performance will take a nose dive. But performance will typically be heavily degraded even before the system runs out of memory, as applications start stealing memory otherwise used for I/O caching. Hence setting an appropriate value for ServerLimit in Apache (or the equivalent for any multi-threaded/multi-process server) is good practice. For the remainder of the document I will be focussing specifically on Linux, but the theory and practice apply to all flavours of Unix, and to MS Windows too.
Tracking resource usage of the system
as a whole is also good practice – but beyond the scope of what
I'll be talking about today.
The immediate problem is determining
what an appropriate limit is.
For prefork Apache 2.x, the number of processes is constrained by the ServerLimit setting.
For most systems the limit will be driven primarily by the amount of memory available. But trying to work out how much memory a process uses is actually surprisingly difficult. The executable code lives in memory-mapped files – these are typically read-only and shared between processes.
Running 'strace /usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf' shows over 4000 files being “loaded” on my local Linux machine. In fact few of these are read from disk – most are shared object files already in memory, which the kernel simply presents at an address accessible to the httpd process. Code is typically loaded into such shared, read-only pages. Linux has a further way of conserving memory: when it needs to copy memory which might be written to (for instance when a process forks), the copy is deferred until a process actually attempts to write to it – a technique known as copy-on-write.
The net result is that the actual footprint in physical memory is much, much less than the size of the address space the process has access to.
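You can see this on a live system by comparing a process's address space with what it actually pins in physical memory. The following is only a sketch – it assumes the httpd2-prefork process name used above, and a kernel recent enough (2.6.25+) to report Pss in /proc/&lt;pid&gt;/smaps:

    # Pick one httpd child and show address space vs resident set (in kB)
    PID=$(pgrep httpd2-prefork | tail -1)
    ps -o vsz=,rss= -p "$PID"
    # Sum the pages private to this process, plus its proportional share
    # (Pss) of the pages shared with its siblings
    awk '/^Private_(Clean|Dirty):/ { priv += $2 }
         /^Pss:/                   { pss  += $2 }
         END { printf "private: %d kB  pss: %d kB\n", priv, pss }' \
        /proc/$PID/smaps

The private and Pss totals are typically a small fraction of the VSZ figure reported by ps.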
Different URLs will have different
footprints, and even different clients can affect the memory usage.
Here is a typical distribution of memory usage per httpd process:
This is further complicated by the fact
that our webserver might be doing other things – running PHP, MySQL
and a mailserver being obvious cases – which may or may not be
linked to the volume of HTTP traffic being processed.
In short, trying to synthetically work
out how much memory you will need to support (say) 200 concurrent
requests is not practical.
The most effective solution is to start with an optimistic guess for ServerLimit, and set MaxSpareServers to around 5% of this value. (Note that after the data capture exercise, you should increase MaxSpareServers to around 10% of ServerLimit + 3.)
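As a starting point, the corresponding prefork directives might look something like this – the numbers are purely illustrative, not recommendations:

    <IfModule prefork.c>
        StartServers          5
        MinSpareServers       5
        MaxSpareServers      10    # ~5% of ServerLimit during data capture
        ServerLimit         200    # the optimistic first guess
        MaxClients          200    # MaxRequestWorkers in Apache 2.4
    </IfModule>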
Then measure how much memory is unused. To do that you'll need to set up a simple script, running periodically as a daemon or from cron, capturing the output of the 'free' command and the number of httpd processes.
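A minimal sketch of such a script, assuming the httpd2-prefork process name used earlier and a log path of my own invention:

    #!/bin/sh
    # Append one CSV line per run: epoch seconds, httpd count, used kB
    # Run from cron, e.g.:  * * * * *  /usr/local/bin/memlog.sh
    NPROC=$(pgrep -c httpd2-prefork)
    # Older procps prints used/free net of cache on the '-/+ buffers/cache'
    # line; on newer versions read the 'available' column of 'free' instead
    USED=$(free -k | awk '/buffers\/cache/ { print $3 }')
    echo "$(date +%s),$NPROC,$USED" >> /var/log/memlog.csv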
Here I've plotted the total memory used
(less buffers and cache) against the number of httpd processes:
This system has 1Gb of memory. Without any Apache instances running, the usage would be less than the projected 290Mb – but that is outwith the bounds we expect to be operating in. From 2 httpd processes upwards, both the average size and the variation in size of each httpd process are very consistent – but because each process varies independently, the total usage envelope widens as the number of processes increases. The dashed red line is 2 standard deviations above the average usage, and hence there is roughly a 97.5% probability that memory usage will be below the dashed line.
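The statistics behind the plot fall straight out of the capture log. A sketch, assuming the three-column CSV produced by the script above:

    # For each observed httpd count print the count, the mean used kB,
    # and the mean + 2 standard deviations (the dashed line)
    awk -F, '{ n[$2]++; s[$2] += $3; ss[$2] += $3 * $3 }
         END { for (c in n) {
                   m  = s[c] / n[c]
                   sd = sqrt(ss[c] / n[c] - m * m)
                   printf "%d %.0f %.0f\n", c, m, m + 2 * sd
               } }' /var/log/memlog.csv | sort -n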
I want to have around 200Mb available for the VFS, so here, reading across to where the dashed line meets that limit, my ServerLimit is around 175.
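To make that read-off concrete: only the 1Gb total, the ~290Mb baseline and the 200Mb reserve come from the measurements above – the 3Mb-per-process slope below is an invented figure for illustration:

    # (total RAM - VFS reserve - baseline) / per-process increment
    echo $(( (1024 - 200 - 290) / 3 ))    # => 178, i.e. ServerLimit of ~175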
Of course the story doesn't end there.
How do you protect the server and manage the traffic effectively as it approaches the ServerLimit? How do you reduce the memory usage per
httpd process to get more capacity? How do you turn around requests
faster and therefore reduce concurrency? And how do you know how much
memory to set aside for the VFS?
For help with finding the answers, for the code run here, and for more information on capacity and performance tuning for Linux, Apache, MySQL and PHP... buy the book!
If you would like to learn more about how Linux memory management works, then this (731-page) document is a very good guide: