The following is a demonstration of the cpuwalk.d script.

cpuwalk.d is not that useful on a single-CPU server:

   # cpuwalk.d
   Sampling... Hit Ctrl-C to end.
   ^C

        PID: 18843    CMD: bash

              value  ------------- Distribution ------------- count
                < 0 |                                         0
                  0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 30
                  1 |                                         0

        PID: 8079     CMD: mozilla-bin

              value  ------------- Distribution ------------- count
                < 0 |                                         0
                  0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 10
                  1 |                                         0

The output above shows that PID 18843, "bash", was sampled on CPU 0 a total
of 30 times (we sample at 1000 Hz).
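
As a rough guide to what a sample count means: at 1000 samples per second,
each sample stands for about one millisecond on a CPU. The Python sketch
below converts a count into approximate on-CPU time. The helper name
samples_to_ms is made up for this illustration and is not part of cpuwalk.d,
and the result is a sampling estimate, not an exact measurement.

```python
SAMPLE_HZ = 1000  # sampling rate stated in the text above


def samples_to_ms(count, hz=SAMPLE_HZ):
    """Approximate on-CPU time in milliseconds for a sample count.

    Hypothetical helper: assumes each sample stands for 1/hz seconds.
    """
    return count * 1000 / hz


# Counts copied from the cpuwalk.d output above.
for name, count in (("bash", 30), ("mozilla-bin", 10)):
    print(f"{name}: about {samples_to_ms(count):.0f} ms on CPU 0")
```
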
The following is a demonstration of running cpuwalk.d with a 5-second
duration. This is on a 4-CPU server running a multithreaded, CPU-bound
application called "cputhread":

   # cpuwalk.d 5
   Sampling...

        PID: 3        CMD: fsflush

              value  ------------- Distribution ------------- count
                  1 |                                         0
                  2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 30
                  3 |                                         0

        PID: 12186    CMD: cputhread

              value  ------------- Distribution ------------- count
                < 0 |                                         0
                  0 |@@@@@@@@@@                               4900
                  1 |@@@@@@@@@@                               4900
                  2 |@@@@@@@@@@                               4860
                  3 |@@@@@@@@@@                               4890
                  4 |                                         0

We are sampling at 1000 Hz, so a 5-second run provides around 5000 samples
per CPU. We measured the application on CPU 0 a total of 4900 times, on
CPU 1 a total of 4900 times, and so on. The application cputhread is
therefore running concurrently across all available CPUs, and it is using
almost all of the CPU capacity in this server.
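
The arithmetic behind this reading can be sketched in a few lines of Python.
The sample counts are copied from the cputhread output above; the names
SAMPLE_HZ, DURATION_S and utilization are illustrative only and do not come
from cpuwalk.d.

```python
SAMPLE_HZ = 1000   # sampling rate stated in the text
DURATION_S = 5     # duration argument given to cpuwalk.d

# Samples per CPU for cputhread, copied from the output above.
cputhread = {0: 4900, 1: 4900, 2: 4860, 3: 4890}


def utilization(samples, hz=SAMPLE_HZ, seconds=DURATION_S):
    """Return {cpu: fraction of the available samples} for one process."""
    available = hz * seconds  # about 5000 samples per CPU in this run
    return {cpu: count / available for cpu, count in samples.items()}


for cpu, frac in utilization(cputhread).items():
    print(f"CPU {cpu}: {frac:.1%} of available samples")
```

Every CPU comes out at 97% or higher, which is what "almost all the CPU
capacity" means here.
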
The following is a similar demonstration, this time running a multithreaded,
CPU-bound application called "cpuserial" that makes poor use of locking,
such that the threads "serialise":

   # cpuwalk.d 5
   Sampling...

        PID: 12194    CMD: cpuserial

              value  ------------- Distribution ------------- count
                < 0 |                                         0
                  0 |@@@                                      470
                  1 |@@@@@@                                   920
                  2 |@@@@@@@@@@@@@@@@@@@@@@@@@                3840
                  3 |@@@@@@                                   850
                  4 |                                         0

In the above, we can see that this CPU-bound application is not making
efficient use of the available CPU resources: it reaches only 3840 samples
on CPU 2 out of a potential 5000, and far fewer on the other CPUs. This
problem was caused by poor use of locks.
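
One way to put a number on the serialisation is to add up the samples across
all CPUs and divide by what a single fully busy CPU would yield. The Python
sketch below does this for the cpuserial counts shown above; the function
name effective_cpus is hypothetical, and the figure is only an estimate
drawn from the sampled data.

```python
SAMPLE_HZ = 1000   # sampling rate stated in the text
DURATION_S = 5     # duration argument given to cpuwalk.d

# Samples per CPU for cpuserial, copied from the output above.
cpuserial = {0: 470, 1: 920, 2: 3840, 3: 850}


def effective_cpus(samples, hz=SAMPLE_HZ, seconds=DURATION_S):
    """Total samples divided by the samples one fully busy CPU would yield."""
    return sum(samples.values()) / (hz * seconds)


busy = effective_cpus(cpuserial)
share = busy / len(cpuserial)
print(f"cpuserial kept about {busy:.2f} of {len(cpuserial)} CPUs busy ({share:.0%})")
```

With these counts the total is 6080 samples, or roughly 1.2 CPUs' worth of
work on a 4-CPU server, which fits threads that mostly take turns instead of
running in parallel.
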