MandrakeUser.Org - Your Mandrake-Linux Knowledge Base!


 
 


Managing Processes II

* Process Accounting
* Estimating Process Resource Usage
* Setting Resource Limits
* Dealing With Bad Processes

Related Resources:

man top
man free

Outline of the Linux Memory Management System
The Linux-PAM System Administrators' Guide, 6.12
man kill
man killall

Revision / Modified: Jan 17, 2002
Author: Tom Berger

 

* Process Accounting

Process accounting enables you to keep detailed accounting information for the system resources used, their allocation among users, and system monitoring.
(Enabling Process Accounting on Linux HOWTO)

You will need to install the 'psacct' package from the first Mandrake Linux CD, which will in turn install a system service of the same name. Documentation (and an amusing bit of Unix lore) can then be found in 'info accounting'.
Both 'Webmin' and 'Linuxconf' offer graphical modules to configure and monitor process accounting.

Notice that the 'psacct' daemon script in Mandrake Linux 8.1 has a small bug.


* Estimating Process Resource Usage

Estimating resource usage, especially the memory consumption of processes, is far more complicated than it looks at first glance.
How many resources a process needs depends on many factors, most of them variable. One is the amount of physical RAM in the system. If there is a lot of free RAM available, more caching will be performed and thus more memory consumed. However, this doesn't really count as resource usage, since the cached memory is handed over as soon as some other process needs it. A large part of system memory (in the example further down, about 200 of 252 MB) can be used for buffering and caching. This dramatically speeds up disk access, so you don't 'lose' anything when memory is assigned to this task. Only unused memory is wasted ;-).

Graphical applications depend on certain collections of code libraries (toolkits) for displaying their interface (buttons, menus etc). Common toolkits are GTK+ (GIMP, GNOME), Qt (KDE), Tcl/Tk and others. If you use applications built on many different toolkits, all these libraries have to be loaded into system memory, thus using more resources. An application written with Qt for KDE will use more system resources in GNOME than in KDE. If you start a second Qt application, however, it will use about as many resources as in KDE, since the first application has already loaded the needed interface libraries. If you close both applications, not all resources will be freed again; instead, the loaded libraries stay cached (if the amount of free RAM permits it).

The numbers that process monitors present for each process tend to mislead and are easily misinterpreted. On the previous page you've already read about threads. But even if a process doesn't spawn threads, it is in most cases almost impossible to determine how much system memory an application really consumes.

Let me give a simple example:

  1. top: Mem: 900040K av, 411864K used, 488176K free, 1048K shrd, 58592K buff, 166816K cac
  2. Starting the 'bluefish' HTML editor.
  3. top entry for bluefish: 4716 (size) 4716 (real) 3580 (shared).
    top entry for system: Mem: 900040K av, 414268K used, 485772K free, 1432K shrd, 58620K buff, 167592K cac

According to the process table entry, 'bluefish' uses 4716 KB of real memory and 3580 KB of shared memory. The system, however, only reports an increase of 2404 KB in general memory usage, along with roughly 400 KB more in shared memory, 30 KB more in buffers and 780 KB more in cache.
Regardless of how you add or subtract these numbers, you will never get the process entry and the system numbers in sync.
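
The differences between the two 'Mem:' lines are easy to recompute. The following is only a minimal sketch in shell: the helper name 'mem_field' is made up for this example, and the two lines are copied from the 'top' output quoted above (including the cut-off 'cac' column).

```shell
before='Mem: 900040K av, 411864K used, 488176K free, 1048K shrd, 58592K buff, 166816K cac'
after='Mem: 900040K av, 414268K used, 485772K free, 1432K shrd, 58620K buff, 167592K cac'

# Print the number (in KB) that precedes the given label in a 'Mem:' line.
# mem_field is a hypothetical helper, not part of 'top'.
mem_field() {
    printf '%s\n' "$1" | awk -v label="$2" '{
        for (i = 2; i <= NF; i++) {
            name = $i
            sub(/,$/, "", name)            # drop the trailing comma
            if (name == label) {
                value = $(i - 1)
                sub(/K$/, "", value)       # drop the unit suffix
                print value
            }
        }
    }'
}

# Show how much each figure changed after starting the application.
for label in used free shrd buff cac; do
    delta=$(( $(mem_field "$after" "$label") - $(mem_field "$before" "$label") ))
    echo "$label: $delta KB"
done
```

A negative value for 'free' simply mirrors the increase in 'used'.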

I tend to trust the system numbers, but even they can be misinterpreted quite easily:

42 processes: 41 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 0.9% user, 1.3% system, 0.0% nice, 97.6% idle
Mem: 257676K av, 252940K used, 4736K free, 202852K shrd, 7464K buff
Swap: 130748K av, 256K used, 130492K free, 197620K cached

What usually confuses people here is the cache and buffer management of Linux. Let's have a look at it:

Mem: 257676K av, 252940K used, 4736K free, 202852K shrd, 7464K buff
Swap: 130748K av, 256K used, 130492K free, 197620K cached

'Mem' shows some 252 MB RAM installed. In fact it's 256 MB. What happened to the remaining 4 MB? Simple: they are used for 'shadowing' the system's BIOS and contain the GNU/Linux kernel.
The next fields seem to indicate that the system is on the brink of collapse: almost all of the memory seems to be used up. Is that so? No, it isn't. Look at the last entries of those lines. These tell you that about 200 MB of system memory are used for caching and buffering. This memory is available for every application which needs it.
The 'free' command line tool gives a more comprehensible overview:

$ free
             total       used       free [...]
Mem:        257676     253624       4052 [...]
-/+ buffers/cache:      50360     207316
Swap:       130748        256     130492
                  

Have a look at the third line:

-/+ buffers/cache: 50360 207316
                  

This line shows again the actual amount of memory in use by applications (about 50 MB) and the amount effectively free (about 202 MB). The rest is used for caching and buffering, i.e. to make your system faster.
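
The relation between the raw figures and the '-/+ buffers/cache' line can be checked by hand. Here is a small sketch using the numbers from the 'top' output further up; the function name 'real_usage' is invented for illustration.

```shell
# Split memory into "used by applications" and "effectively free",
# treating buffers and cache as reclaimable. All values are in KB.
# Arguments: used free buffers cached
real_usage() {
    echo "used by applications: $(( $1 - $3 - $4 )) KB"
    echo "effectively free: $(( $2 + $3 + $4 )) KB"
}

# Figures from the 'top' example: 252940K used, 4736K free,
# 7464K buff, 197620K cached.
real_usage 252940 4736 7464 197620
```

The result is close to, but not identical with, the 'free' output, which was presumably taken a moment later. Newer versions of 'free' may lay out their output differently, so treat the column positions as specific to this example.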

To estimate the resource consumption of a process, it's easiest to run 'free' before, during and after running the process. The figures will vary for the reasons already mentioned, so you should do this several times to get an average.
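
One way to automate this is sketched below. It assumes a Linux system whose /proc/meminfo contains MemTotal, MemFree, Buffers and Cached entries; the function names 'apps_used_kb' and 'measure' are made up, and the five second pause is an arbitrary guess at a program's start-up time. The result is no more reliable than the manual method.

```shell
# Print the memory used by applications in KB (total minus free,
# buffers and cache). Reads /proc/meminfo unless another file in
# the same format is given as the first argument.
apps_used_kb() {
    awk '$1 == "MemTotal:" { total = $2 }
         $1 == "MemFree:"  { free = $2 }
         $1 == "Buffers:"  { buffers = $2 }
         $1 == "Cached:"   { cached = $2 }
         END { print total - free - buffers - cached }' "${1:-/proc/meminfo}"
}

# Run a command and report the figure before, during and after.
# Usage: measure some-command [arguments]
measure() {
    before=$(apps_used_kb)
    "$@" &
    sleep 5                      # let the program finish starting up
    during=$(apps_used_kb)
    wait                         # returns once the program has exited
    after=$(apps_used_kb)
    echo "before: $before KB, during: $during KB, after: $after KB"
}
```

Calling, for example, 'measure bluefish' several times and averaging the difference between 'during' and 'before' gives a rough estimate.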


* Setting Resource Limits

PAM (Pluggable Authentication Modules, man 8 pam) offers a mechanism to set limits on resource usage. Setting limits via shell mechanisms ('ulimit' in bash, 'limit' in csh etc) is deprecated.
You configure these limits as 'root' in '/etc/security/limits.conf'. Among other things you can set the maximum number of processes per user or group, process priorities, maximum CPU time and more.
Parameters and examples are supplied in the configuration file.
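
To give an idea of the format, here is a sketch with a few sample entries and a small helper that lists only the active lines of such a file. The group name 'students', all values and the helper name 'active_limits' are made up for illustration; check the comments in your own '/etc/security/limits.conf' for the items your version supports.

```shell
# Sample entries in the format of /etc/security/limits.conf:
#   <domain> <type> <item> <value>
# Written to a temporary file so that nothing on the system is changed.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# domain     type  item   value
@students    hard  nproc  20
@students    hard  cpu    10
*            soft  core   0
EOF

# Print only the active entries (skip blank lines and comments).
# Without an argument, the real configuration file is read.
active_limits() {
    awk 'NF && $1 !~ /^#/ { print $1, $2, $3, $4 }' "${1:-/etc/security/limits.conf}"
}

active_limits "$sample"
rm -f "$sample"
```

In this sample, members of the group 'students' may run at most 20 processes and use at most 10 minutes of CPU time each, and nobody gets core files by default.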

More information can be found in the appropriate chapter of the Linux-PAM System Administrators' Guide (L-PAMSAG).


* Dealing With Bad Processes

If a process behaves badly, the kernel sends it a certain signal (man 7 signal) to terminate it.
The most infamous signal is signal 11 (alias SIGSEGV, alias 'segmentation fault'). It means the process has tried to access a memory segment which hasn't been allocated to it.
Reasons for a segfault might be a programming error, a wrong library version or a hardware failure. The default action of a process on receiving signal 11 is to terminate and write the contents of its own memory to a 'core' file (in Unix slang, the process 'dumps core'). This core file can be useful to a programmer for debugging. Notice that Mandrake Linux by default disables the creation of core files.
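
You can watch the default action at work without a buggy program. The following sketch starts a throwaway child shell that sends signal 11 to itself; nothing here is specific to Mandrake Linux, and the variable name 'status' is just a choice for this example.

```shell
# Keep this demonstration from leaving a core file behind.
ulimit -c 0

# Start a child shell that sends signal 11 (SIGSEGV) to itself.
# It installs no handler, so the default action terminates it.
sh -c 'kill -s SEGV $$' 2>/dev/null
status=$?

# A shell reports "killed by signal N" as exit status 128 + N.
echo "exit status: $status (signal $((status - 128)))"
```

The exit status should come out as 139, i.e. 128 plus 11. Depending on your shell, you may also see a 'Segmentation fault' message.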

Things get a bit hairy, however, when a process is so bad that it ignores this signal and starts to hoard all the available processor time on the system, thus slowing down ('starving out') all the other processes. Such a process stays in status 'R' (running) and shows up with a very high CPU percentage in the process table. A related troublemaker is the process the kernel marks as 'uninterruptible': its status switches from 'R' or 'S' to 'D', which usually means it is stuck waiting for I/O. If you see a process entry that hogs the CPU or stays in status 'D' for a long time, it's definitely time for you to do something.

Almost all signals can be handled by processes, which means that they can also be ignored by them. There is, however, one signal which can't be ignored by any process, and that's signal 9 (SIGKILL), the appropriately named 'kill signal'. If you send that signal to a process, you will quite literally kill it: the kernel takes away all its resources and removes it from the process table. (A process in status 'D' will only die once it leaves that state.) You can then safely go on with your daily work, although all unsaved data of that process will be lost.
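
The difference between a signal that can be ignored and the kill signal can be tried out safely on a dummy process. This is a sketch only: the 'sleep 30' stand-in and the one second pauses are arbitrary choices, and SIGTERM (signal 15) serves as the example of an ignorable signal.

```shell
# Start a stand-in for a stubborn process: it ignores SIGTERM
# and then does nothing for 30 seconds.
sh -c 'trap "" TERM; exec sleep 30' &
pid=$!
sleep 1                          # give it time to set up the trap

kill -s TERM "$pid"              # polite request, ignored by the process
sleep 1
if kill -0 "$pid" 2>/dev/null; then   # signal 0 only tests for existence
    survived_term=yes
else
    survived_term=no
fi

kill -s KILL "$pid"              # signal 9 cannot be caught or ignored
wait "$pid" 2>/dev/null
status=$?                        # 128 + 9 when the process was killed

echo "survived SIGTERM: $survived_term, exit status after SIGKILL: $status"
```

The process should survive SIGTERM and end with exit status 137 after SIGKILL.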

Most system monitors offer you a possibility to kill a process via the right mouse click context menu (in console 'top', press the <k> key). There is one problem though: processes run with the permissions of the user who started them. A user can only kill processes he himself has started; only 'root' can kill all processes. Your system monitoring program will and should usually run under your user account, which means you can't kill processes from other users or 'root' via it.

If a process started by root or another user misbehaves, you have to switch to the 'root' account on a console using the su command. You can then either kill the process by its process ID (PID) or by its name.
If you don't know the PID of the process yet, you can find out with

# ps aux --sort %cpu

which lists the most CPU hungry process last, the PID being the leftmost number. You can then go on to actually kill the process with the 'kill' command:

# kill -s 9 PID
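
If you want to pick the PID out of the 'ps aux' listing in a script rather than by eye, a small filter will do. This is a sketch under the assumption that the PID is in the second column and %CPU in the third, as in the usual 'ps aux' layout; 'busiest_pid' is a made-up helper name.

```shell
# Read 'ps aux' output on stdin and print the PID (second column) of
# the process with the highest %CPU value (third column).
# The header line is skipped.
busiest_pid() {
    awk 'NR > 1 && $3 + 0 >= max { max = $3 + 0; pid = $2 } END { print pid }'
}

# Typical use, as root:
#   pid=$(ps aux | busiest_pid)
#   ps -p "$pid"              # check that this really is the culprit
#   kill -s 9 "$pid"
```

Looking at the process before killing it is a good habit, since the busiest process is not necessarily a misbehaving one.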

You can also kill a process by name. Notice that this can be dangerous, since you might accidentally kill processes you didn't intend to kill, or none at all. The command is

# killall -s 9 name

If you ever happen to come across a non-Linux Unix system, make sure to read the man page for 'killall' on that system. Some Unixes take this command literally, which can get very ugly ...

A rather harmless type of bad process is the so-called 'zombie' (STATUS: Z, 'defunct'). If you've read the previous page, you know that processes can have 'children', i.e. spawn new processes. When a child terminates, the kernel keeps its entry in the process table until the parent has collected the child's exit status. If the parent never gets around to doing that, for example because it is buggy or hung, the child turns into one of the 'living dead': it still appears in the process table, but it doesn't use any resources apart from that entry, and since it is already dead, there is no way of 'reaching' it with signals. Unlike their namesakes, process zombies don't do any harm. They vanish from the process table as soon as the parent collects them or terminates itself, and at the latest upon the next reboot.


 
Legal: All texts on this site are covered by the GNU Free Documentation License. Standard disclaimers of warranty apply. Copyright LSTB (Tom Berger) and Mandrakesoft 1999-2002.