swap space??
Paul Rockwell
yellowdog-general@lists.terrasoftsolutions.com
Fri Aug 23 09:37:01 2002
Good discussion here. I'd like to throw in a couple more points.
On Monday 19 August 2002 03:28, Rick Thomas wrote:
> As others have said, how much space to allocate to swap is very
> dependent on your configuration and workload. So it tends to be a
> religious issue. Here are some facts based on 25 years of UNIX system
> administration experience.
Well, I don't agree that it's a religious issue. It's a technical issue that
depends on OS behavior and your workload.
One needs to differentiate between swap and page-out behavior. "Swapping", in
the traditional sense, has tended to mean the wholesale copy of a program's
data pages to the swap/page device. (You don't need to copy out shared code
pages as they are read only and/or may be in use by another process).
Paging, on the other hand, is the copying out of selected pages (usually chosen
by a least-recently-used algorithm). Conceptually, they're similar, but swapping
is the most extreme behavior.
Not being a Linux kernel guru, I can only speak from my experiences in
operating systems (UNIX and others). Most modern systems, when there's a
memory shortfall, will attempt to "steal" least recently used pages from
other processes - it's less I/O than forcing out an entire process image. If
this doesn't work, then a process is selected for swap-out. A little paging
is OK; it just means that memory is being used optimally. Continual memory
shortfalls, which force processes to be swapped out to disk and back in as they
need to run, cause the poor performance known as "thrashing".
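
To make the page-stealing idea concrete, here's a toy sketch in Python. It is
not how any real kernel does it: the names LruMemory and count_faults are made
up for illustration, and it models strict LRU, which real kernels generally
only approximate.

```python
from collections import OrderedDict


class LruMemory:
    """Toy model of physical RAM with a fixed number of page frames."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()  # (pid, page) keys, least recently used first
        self.page_outs = 0

    def touch(self, pid, page):
        """Reference a page; return True if this caused a page fault."""
        key = (pid, page)
        if key in self.resident:
            self.resident.move_to_end(key)  # now the most recently used
            return False
        if len(self.resident) >= self.frames:
            self.resident.popitem(last=False)  # "steal" the least recently used page
            self.page_outs += 1
        self.resident[key] = None
        return True


def count_faults(frames, refs, pid=1):
    """Count page faults for one process walking the reference string refs."""
    mem = LruMemory(frames)
    return sum(mem.touch(pid, page) for page in refs)


refs = [0, 1, 2, 3] * 5          # loop over a 4-page working set, 5 times
print(count_faults(4, refs))     # 4: working set fits, only the first touches fault
print(count_faults(3, refs))     # 20: one frame short, every reference faults
```

The second case is thrashing in miniature: being short by a single frame turns
every memory reference into disk I/O.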
The question now becomes how you allocate space on the swap devices
for non-resident pages. As Rick has explained, you can:
1) Do it up front as the operating system expands the process's address space.
In UNIX this happens predominantly when an sbrk or brk call is made (initial
stack/data allocation, or in response to no free blocks being available
during a malloc call). The OS then allocates the requested memory
chunk in the swap space. So if you need to swap out, you're guaranteed that
the space is there. If you make a request for more memory and the OS can't
allocate it out of the swap file, an out-of-memory indication will be
returned to the requesting program.
2) Allocate swap on page-out or swap-out. I've heard this called "lazy swap
allocation". When the brk/sbrk call is made above, the OS simply gives the
process a memory area, but no swap is allocated. What happens when you get
into a situation where a page needs to be paged in, and the OS has to free up
memory? The first step is to select a process, then write the page out to
disk. If there's no space allocated in the swap file for that page, the OS
gets some. If this swap space allocation fails, some process will get an
out-of-swap error. Some systems can't guarantee which process will get the
error - the process that wants the page-in, or the one that's selected
for page-out. Not very nice if the process that's selected is an important
service.
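
The difference between the two policies is mostly a question of when the
failure shows up. Here is a rough Python model of that, under the assumption
that swap is just a counter of pages. SwapDevice, grow_eager, grow_lazy and
page_out_lazy are hypothetical names for illustration, not any real OS
interface.

```python
class OutOfSwap(Exception):
    """Raised when the swap device can't hold the requested pages."""


class SwapDevice:
    def __init__(self, total_pages):
        self.total = total_pages
        self.used = 0

    def reserve(self, pages):
        if self.used + pages > self.total:
            raise OutOfSwap(f"need {pages}, only {self.total - self.used} free")
        self.used += pages


def grow_eager(swap, pages):
    """Up-front policy: growing the address space reserves swap right away.
    Returns False (think: a failed brk) if swap can't back the request."""
    try:
        swap.reserve(pages)
        return True
    except OutOfSwap:
        return False


def grow_lazy(swap, pages):
    """Lazy policy: growth always succeeds, no swap is touched yet."""
    return True


def page_out_lazy(swap, pages=1):
    """Lazy policy: swap is reserved only when a page is actually written out,
    so OutOfSwap surfaces here, long after the allocation was granted."""
    swap.reserve(pages)


eager = SwapDevice(total_pages=100)
print(grow_eager(eager, 80))   # True
print(grow_eager(eager, 40))   # False: the requester is told immediately

lazy = SwapDevice(total_pages=100)
print(grow_lazy(lazy, 80), grow_lazy(lazy, 40))   # True True: overcommitted
page_out_lazy(lazy, 100)       # fills swap
try:
    page_out_lazy(lazy, 1)     # fails at page-out time instead
except OutOfSwap as err:
    print("out of swap:", err)
```

In the eager case the program that asked for too much gets the error. In the
lazy case the error lands on whoever happens to be involved in the page-out.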
So, the safe swap allocation mechanism is up-front swap allocation.
But it's wasteful of disk space.
For most workstations (general use) the "lazy" allocation works well. You only
allocate what you estimate will be the shortfall between memory demands and
physical RAM. But for servers, I'd want up-front allocation to get a bit more
deterministic behavior.
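
As a back-of-the-envelope sizing sketch in Python: the function name
swap_needed_mb and the 768 MB / 512 MB figures are made-up illustrations, not
measurements from any system, and the up-front case assumes every allocated
page must have swap behind it, as described above.

```python
def swap_needed_mb(peak_demand_mb, ram_mb, upfront=False):
    """Rough swap size estimate for the two allocation policies."""
    if upfront:
        # Up-front allocation: all allocated memory is backed by swap.
        return peak_demand_mb
    # Lazy allocation: only the shortfall beyond physical RAM needs swap.
    return max(0, peak_demand_mb - ram_mb)


print(swap_needed_mb(768, 512))                # 256
print(swap_needed_mb(768, 512, upfront=True))  # 768
print(swap_needed_mb(256, 512))                # 0
```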
Some OSes will allow you to select which behavior you want.
> 2) Today, all those factors have changed. So modern UNIX systems don't
> allocate swap space until they need it -- on the expectation that they
> will never need it. Cheap RAM means you don't want to swap at all if
> you can avoid it. Cheap, fast, disk means that even if you do swap
> occasionally, it won't be a big deal.
>
> Frequent swapping is still going to be a problem and should be addressed
> by system tuning (such as buying more RAM, or spreading the swap space
> out onto multiple disks) or tuning the workload. (There is usually much
> more to be gained by tuning the workload than by throwing resources at
> an algorithm that scales poorly!)
Reminds me of a saying... nothing helps virtual like real. If you're concerned
about performance, you do NOT page. Period. Spreading the swap space onto
multiple disks only masks the problem - your memory problem has now just
become an I/O problem, and I/O is nowhere near as fast as memory access.
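
To put a rough number on that, the usual effective-access-time estimate is
(1 - p) * memory time + p * fault service time, where p is the fraction of
references that page-fault. A small Python sketch follows. The 100 ns and
10 ms figures are assumed round numbers for illustration only, not
measurements, and effective_access_ns is a made-up name.

```python
def effective_access_ns(fault_rate, mem_ns=100.0, fault_service_ns=10_000_000.0):
    """Average cost of one memory reference, in nanoseconds.

    mem_ns and fault_service_ns are assumed illustrative values
    (about 100 ns for RAM, about 10 ms to service a fault from disk).
    """
    return (1.0 - fault_rate) * mem_ns + fault_rate * fault_service_ns


for p in (0.0, 0.0001, 0.001):
    print(p, round(effective_access_ns(p), 1))
```

Under those assumptions, one fault per thousand references makes the average
access roughly a hundred times slower than never paging at all. A faster or
additional swap disk shrinks the fault service time somewhat, but it doesn't
change the shape of the problem.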
- Paul
---------
Paul E. Rockwell
paulrockwell@mac.com