SLES8/DB2, application server, performance problems

Fri Oct 14 08:48:08 CEST 2005

On Thu, Oct 13, 2005 at 02:21:39PM +0200, Peter Selzner wrote:
> Hi all,
> 
> I'd appreciate help interpreting the following data:

What exactly would you like interpreted? Should I tell you the name of
the motherboard's burn-in tester?

> # ps aux | head -17

> USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
> root         1  0.0  0.0   448   76 ?        S    Sep19   1:43 init
> root         2  0.0  0.0     0    0 ?        SW   Sep19   0:00 [migration_CPU0]
> root         3  0.0  0.0     0    0 ?        SW   Sep19   0:00 [migration_CPU1]
> root         4  0.0  0.0     0    0 ?        SW   Sep19   0:00 [migration_CPU2]
> root         5  0.0  0.0     0    0 ?        SW   Sep19   0:00 [migration_CPU3]
> root         6  0.0  0.0     0    0 ?        SW   Sep19   0:00 [keventd]
> root         7  0.0  0.0     0    0 ?        SWN  Sep19   0:41 [ksoftirqd_CPU0]
> root         8  0.0  0.0     0    0 ?        SWN  Sep19   0:49 [ksoftirqd_CPU1]
> root         9  0.0  0.0     0    0 ?        SWN  Sep19   0:50 [ksoftirqd_CPU2]
> root        10  0.0  0.0     0    0 ?        SWN  Sep19   0:50 [ksoftirqd_CPU3]
> root        11  1.2  0.0     0    0 ?        SW   Sep19 447:10 [kswapd]
> root        12  0.0  0.0     0    0 ?        SW   Sep19  19:14 [bdflush]
> root        13  0.3  0.0     0    0 ?        SW   Sep19 117:05 [kupdated]
> root        14  0.0  0.0     0    0 ?        SW   Sep19   1:09 [kinoded]
> root        15  0.0  0.0     0    0 ?        SW   Sep19   0:00 [mdrecoveryd]
> root        25  1.3  0.0     0    0 ?        SW   Sep19 457:21 [kreiserfsd]

Square brackets -> kernel threads -> forget about them.
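
If you want to apply that bracket rule mechanically, something like the
following might do. It is only a sketch: it assumes the `ps aux` column
layout quoted above, and the helper name `userland_processes` is made up
for illustration.

```python
def userland_processes(ps_output: str) -> list:
    """Return the ps lines whose COMMAND is not a bracketed kernel thread."""
    lines = ps_output.strip().splitlines()
    kept = []
    for line in lines[1:]:  # skip the header row
        # ps aux prints 10 fixed columns; the 11th field is the full command
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10].strip()
        if command.startswith("[") and command.endswith("]"):
            continue  # kernel thread, not interesting here
        kept.append(line)
    return kept


# A few rows taken from the listing quoted above
PS_SAMPLE = """\
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   448   76 ?        S    Sep19   1:43 init
root         2  0.0  0.0     0    0 ?        SW   Sep19   0:00 [migration_CPU0]
root        11  1.2  0.0     0    0 ?        SW   Sep19 447:10 [kswapd]
"""

for line in userland_processes(PS_SAMPLE):
    print(line)
```

Run against the full `ps aux` output rather than `head -17`, this leaves
only the userland processes, which is where DB2 and the application
server would show up.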

> # vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
> 14  6 171808   9784 302172 12214796    1    1     2     4    0     3 22  9 69  0
> 16  0 171808  10004 302172 12213316    0    0  1012 11152 2540 25395 66 34  0  0
>  6  5 171808  51348 302176 12199144    0    0     0 18480 3563 21199 60 34  6  0
>  5  1 171808  40332 302200 12199212    0    0   472 23440 4548 17897 51 37 12  0
>  2  3 171808  35252 302204 12200104    0    0    76 21036 4270 14013 61 30  9  0
>  6  0 171808  10344 302204 12202192    0    0   100 24612 4793 17698 59 23 19  0
>  7  1 171808  10544 302208 12186640    0    0   168 30880 5524 22150 62 15 23  0
>  1  2 171808  42952 302216 12182404    0    0   304 36412 6625 26934 51 13 35  0
>  2  4 171808 120944 302220 12179020    0    0   392 37988 6928 23771 33 35 32  0
>  2  2 171808 119396 302220 12178408    0    0    16 31564 5822 19956 24 49 27  0
> 12  3 171808  67980 302220 12182916    0    0    24  9236 2441 10227 53 42  5  0
>  9  1 171808  33780 302228 12185168    0    0   368 22644 4487 20859 59 34  7  0
>  3  9 171808  53920 302232 12182584    0    0     0 32552 5851 40907 49 36 15  0
>  5  4 171808  88908 302232 12181412    0    0     0 32612 6246 36260 38 40 21  0
> 14  2 171808  36296 302236 12186856    0    0     0 21136 4982 21652 68 22 10  0
> 27  0 171872   8560 302264 12165756    0    0   804 24596 5100 25358 69 20 12  0

Well, since you have DB2 on the box, I assume it is in production and
has work to do. I would read the context switches as DB2 running with
several threads, all of which are busy. You have a lot of block out and
almost no block in. My guess is that your database is small enough to
fit completely into memory, but that you have many transactions running,
so a lot is being written to the transaction logs.
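
To put numbers on the block-in versus block-out comparison, here is a
rough sketch that averages the bi, bo and cs columns of `vmstat 1`
output. The function name `summarize_vmstat` is hypothetical. It skips
the first data row, because vmstat reports averages since boot there
rather than a live sample.

```python
def summarize_vmstat(text: str) -> dict:
    """Average bi, bo and cs over the vmstat samples in text."""
    columns = None
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if "bi" in fields and "bo" in fields:
            columns = fields  # the header row that names each column
            continue
        if columns and len(fields) == len(columns) and fields[0].isdigit():
            rows.append(dict(zip(columns, map(int, fields))))
    samples = rows[1:]  # first row is the since-boot average, drop it
    if not samples:
        raise ValueError("need at least two data rows")
    count = len(samples)
    return {key: sum(row[key] for row in samples) / count
            for key in ("bi", "bo", "cs")}


# First four data rows of the vmstat output quoted above
VMSTAT_SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
14  6 171808   9784 302172 12214796    1    1     2     4    0     3 22  9 69  0
16  0 171808  10004 302172 12213316    0    0  1012 11152 2540 25395 66 34  0  0
 6  5 171808  51348 302176 12199144    0    0     0 18480 3563 21199 60 34  6  0
 5  1 171808  40332 302200 12199212    0    0   472 23440 4548 17897 51 37 12  0
"""

stats = summarize_vmstat(VMSTAT_SAMPLE)
print("avg bi %.0f, avg bo %.0f, avg cs %.0f"
      % (stats["bi"], stats["bo"], stats["cs"]))
```

A bo average that is far above bi is what a write-heavy workload, such
as transaction logging, would look like from the outside. It does not
prove that the logs are the cause.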

> 
> # sar -U ALL 5 0
> Linux 2.4.21-295-smp 10/13/05
> 
> 14:06:20          CPU     %user     %nice   %system     %idle
> 14:06:25            0      1.80      0.00     45.20     53.00
> 14:06:25            1      9.40      0.00      7.80     82.80
> 14:06:25            2      4.40      0.00      7.20     88.40
> 14:06:25            3     12.00      0.00      6.00     82.00
> 
> 14:06:25          CPU     %user     %nice   %system     %idle
> 14:06:30            0      3.00      0.00      9.60     87.40
> 14:06:30            1      6.00      0.00     10.80     83.20
> 14:06:30            2      4.40      0.00     11.40     84.20
> 14:06:30            3      1.60      0.00     55.80     42.60
> 
> 14:06:30          CPU     %user     %nice   %system     %idle
> 14:06:35            0     13.80      0.00      4.00     82.20
> 14:06:35            1      3.80      0.00      5.20     91.00
> 14:06:35            2      1.80      0.00     61.60     36.60
> 14:06:35            3      4.00      0.00     26.00     70.00
> 
> 14:06:35          CPU     %user     %nice   %system     %idle
> 14:06:40            0     17.60      0.00     17.80     64.60
> 14:06:40            1     31.80      0.00     10.00     58.20
> 14:06:40            2     20.00      0.00     38.20     41.80
> 14:06:40            3     25.20      0.00     15.20     59.60
> 
> 14:06:40          CPU     %user     %nice   %system     %idle
> 14:06:45            0      0.60      0.00     84.20     15.20
> 14:06:45            1     18.40      0.00     11.60     70.00
> 14:06:45            2     25.20      0.00      6.60     68.20
> 14:06:45            3     23.20      0.00     10.20     66.60

The box is bored. I assume CPU0 has the network interrupt or something
similar bound to it (see /proc/interrupts), so that a lot of system time
shows up on CPU0.
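
One way to check that guess is to look at how each interrupt's count is
spread across the CPUs in /proc/interrupts. The sketch below assumes the
usual layout (a CPU header row, then one row per IRQ with one counter
per CPU). The function name `interrupt_shares` is hypothetical, and the
sample text is invented for illustration, not taken from your machine.

```python
def interrupt_shares(text: str) -> dict:
    """Map each IRQ row to the fraction of its interrupts seen per CPU."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shares = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:ncpus]
        # Rows such as ERR carry a single total instead of per-CPU counts
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        total = sum(int(c) for c in counts)
        if total == 0:
            continue  # nothing fired on this line, no share to compute
        description = " ".join(fields[ncpus:])
        key = (label.strip() + " " + description).strip()
        shares[key] = [int(c) / total for c in counts]
    return shares


# INVENTED sample data, only here to show the expected layout
INTERRUPTS_SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:    1000000          0          0          0    IO-APIC-edge  timer
 24:     900000     100000          0          0   IO-APIC-level  eth0
NMI:          0          0          0          0
ERR:          0
"""

for name, per_cpu in interrupt_shares(INTERRUPTS_SAMPLE).items():
    print(name, ["%.0f%%" % (100 * share) for share in per_cpu])
```

On the real machine you would pass in `open("/proc/interrupts").read()`
instead of the sample. If one CPU carries nearly all of the network or
RAID controller interrupts, that would fit the high %system on a single
CPU.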

> HP DL530, 4*Xeon 3.0GHz, 16 GB memory, SmartArray RAID controller, RAID10, 14*146GB/10K/SCSI320,
> SLES8, Kernel 2.4.21-295-smp, DB2/8.1.6

Flo
-- 
Florian Lohoff                  flo at rfc822.org             +49-171-2280134
                        Heisenberg may have been here.