[SlugBug] server health

Bill Best bill at commedia.org.uk
Thu Mar 18 14:33:17 GMT 2004


cheers, chris, for such a complete answer.

Chris J wrote:

<snippetty>

> How much % of CPU were the top processes using, and what are they?

not much - here you go sorted by CPU (shift+p)

   3:07pm  up 266 days,  6:18,  1 user,  load average: 0.00, 0.00, 0.00
56 processes: 55 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 18.0% user, 81.0% system,  0.0% nice,  0.0% idle
CPU1 states: 23.0% user, 76.0% system,  0.0% nice,  0.0% idle
Mem:  1036048K av, 550412K used, 485636K free, 21516K shrd, 185188K buff
Swap: 2048248K av,    184K used, 2048064K free            296616K cached

   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  1336 root       8   0  1056 1056   764 R     1.9  0.1   0:01 top
23567 helix      7   0 40072  39M  7520 S     0.9  3.8 129:51 rmserver
23562 helix      0   0 40072  39M  7520 S     0.5  3.8  31:59 rmserver
     1 root       0   0   208  208   180 S     0.0  0.0   4:45 init
     2 root       0   0     0    0     0 SW    0.0  0.0   0:06 kflushd
     3 root       0   0     0    0     0 SW    0.0  0.0   3:56 kupdate
     4 root       0   0     0    0     0 SW    0.0  0.0   0:15 kswapd
     5 root       0   0     0    0     0 SW    0.0  0.0   0:00 keventd
     6 root     -20 -20     0    0     0 SW<   0.0  0.0   0:00 mdrecoveryd
     9 root       0   0     0    0     0 SW    0.0  0.0   0:00 scsi_eh_0
    10 root       0   0     0    0     0 SW    0.0  0.0   0:00 scsi_eh_1
   645 syslog     0   0   636  636   504 S     0.0  0.0   3:05 syslogd
   653 syslog     0   0   788  776   328 S     0.0  0.0   0:21 klogd
   666 root       0   0  5504 5504   452 S     0.0  0.5  26:02 initlog
   667 helix      0   0  1252 1252  1228 S     0.0  0.1   0:00 rmserver
   677 root       0   0   856  748   616 S     0.0  0.0   0:08 sshd
   731 root       0   0   568  568   456 S     0.0  0.0   0:37 crond
   810 root       0   0  1028 1028   788 S     0.0  0.0   2:04 master
   815 postfix    0   0  1156 1156   872 S     0.0  0.1   5:26 qmgr
   816 postfix    0   0  1100 1100   860 S     0.0  0.1   5:29 tlsmgr
   827 root       0   0   684  684   556 S     0.0  0.0   0:01 portsentry
   829 root       0   0   396  396   332 S     0.0  0.0   0:00 mingetty
   830 root       0   0   396  396   332 S     0.0  0.0   0:00 mingetty
   831 root       0   0   396  396   332 S     0.0  0.0   0:00 mingetty
   832 root       0   0   396  396   332 S     0.0  0.0   0:00 mingetty
   833 root       0   0   396  396   332 S     0.0  0.0   0:00 mingetty
   834 root       0   0   396  396   332 S     0.0  0.0   0:00 mingetty
23557 helix      0   0 40072  39M  7520 S     0.0  3.8   2:40 rmserver

CPU usage per process looks minimal.

RSS (resident set size) is, i believe, the amount of each process's
memory that is actually held in RAM. system memory isn't under pressure
(about 485MB free and almost no swap in use), so the large values in
that column are probably alright.
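
as a rough check on the %MEM column, here is the arithmetic for rmserver
as a small python sketch. it assumes top's "39M" means 39 * 1024 K and
that %MEM is simply RSS divided by the "av" figure on the Mem: line;
mem_percent is just a name made up for this note.

```python
def mem_percent(rss_kb, total_kb):
    """Resident memory as a percentage of total RAM (both in K)."""
    return 100.0 * rss_kb / total_kb

# Figures from the top output above: rmserver RSS and the Mem: "av" total.
rmserver_rss_kb = 39 * 1024      # assumption: top's "39M" is 39 * 1024 K
total_kb = 1036048

print(round(mem_percent(rmserver_rss_kb, total_kb), 2))
```

that comes out a little under 4%, in line with the 3.8 top reports.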

so, this lends support to your view that the kernel seems to be really 
busy doing something.
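
one way to cross-check the CPU states lines without relying on top is
to sample /proc/stat twice and look at the difference. a minimal sketch,
assuming a linux /proc/stat whose cpu lines start with user, nice,
system and idle counters in that order (newer kernels add more columns,
which this ignores). read_cpu_times and split_percent are names invented
for this note.

```python
import os
import time

def read_cpu_times(path="/proc/stat"):
    """Return {cpu_name: (user, nice, system, idle)} in clock ticks."""
    times = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            # Lines of interest look like "cpu0 1234 0 5678 91011 ..."
            if fields and fields[0].startswith("cpu"):
                times[fields[0]] = tuple(int(v) for v in fields[1:5])
    return times

def split_percent(before, after):
    """Percentage of the interval spent in user, nice, system, idle."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return (0.0, 0.0, 0.0, 0.0)
    return tuple(100.0 * d / total for d in deltas)

if __name__ == "__main__" and os.path.exists("/proc/stat"):
    first = read_cpu_times()
    time.sleep(1)                      # sampling interval, chosen arbitrarily
    second = read_cpu_times()
    for name in sorted(first):
        if name not in second:
            continue
        user, nice, system, idle = split_percent(first[name], second[name])
        print("%-5s %5.1f%% user %5.1f%% system %5.1f%% nice %5.1f%% idle"
              % (name, user, system, nice, idle))
```

if the system figure here stays high as well, that would back up what
top is showing; if not, top's own numbers may be the thing to doubt.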

> Unless the system seemed unreasonably slow, I'd leave it as is, but keep
> an eye on it. If you can get away with killing the CPU-intensive process, 
> it could be worth doing - it may have got stuck in a loop or is locked
> waiting for the kernel to respond (in which case, the process will be in 
> state 'D', and will be unkillable without a reboot as it's a kernel
> lock).

can't see any under top.  apologies, how else can i list these - ps aux?
they're all S, SW or R under STAT - or is this the wrong place to look?
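
for what it's worth, a minimal sketch of one way to pick out processes
in state 'D' without top, assuming a linux /proc filesystem where the
state letter is the first field after the bracketed command name in
/proc/<pid>/stat. parse_stat and blocked_processes are names invented
for this note, not part of any tool.

```python
import glob

def parse_stat(text):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the last ')' rather than on whitespace.
    head, _, tail = text.rpartition(")")
    pid, _, comm = head.partition(" (")
    return int(pid), comm, tail.split()[0]

def blocked_processes():
    """List (pid, command) for every process currently in state 'D'."""
    found = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process went away or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)

if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

as far as i know the STAT column of ps aux reports the same state
letter, so an empty list here should agree with seeing only S, SW and R
there.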

> Thinking about it, getting a list of processes that are in the
> 'D' state may help; they won't use any CPU themselves as they're waiting,
> so killing the top processes might not do owt. But they may point in a 
> suitable direction for more poking.

right.

> But in short, the kernel is busy doing something. Very busy.

so, is it possible to list what the kernel is doing?

> Chris...

many thanks

bill

