Commit | Line | Data |
---|---|---|
8e82b2f4 MJ |
1 | ************** |
2 | LTTng-analyses | |
3 | ************** | |
4 | ||
5 | This repository contains various scripts to extract monitoring data and metrics | |
6 | from LTTng kernel traces. | |
7 | ||
8 | As opposed to other diagnostic or monitoring solutions, this approach is designed | |
9 | to allow users to record their system's activity with a low overhead, wait | |
10 | for a problem to occur and then diagnose its cause offline. | |
11 | ||
12 | This solution allows the user to target hard to find problems and dig until the | |
13 | root cause is found. | |
14 | ||
15 | This README describes the implemented analyses as well as how to use them. | |
16 | ||
17 | |pypi| | |
18 | ||
19 | .. contents:: | |
20 | :local: | |
21 | :depth: 2 | |
22 | :backlinks: none | |
23 | ||
24 | ============ | |
25 | Requirements | |
26 | ============ | |
27 | ||
28 | * LTTng >= 2.5 | |
29 | * Babeltrace >= 1.2 (with python bindings built) | |
30 | * Python >= 3.4 | |
31 | ||
32 | ||
33 | ============ | |
34 | Installation | |
35 | ============ | |
36 | ||
37 | --------------- | |
38 | Release version | |
39 | --------------- | |
40 | ||
41 | On **Ubuntu** (12.04 and up) using the LTTng ppa: | |
42 | ||
43 | .. code-block:: bash | |
44 | ||
45 | apt-get install -y software-properties-common (or python-software-properties on 12.04) | |
46 | apt-add-repository -y ppa:lttng/ppa | |
47 | apt-get update | |
b9eb3ebd | 48 | apt-get -y install lttng-tools babeltrace lttng-modules-dkms python3-babeltrace python3-progressbar python3-lttnganalyses |
8e82b2f4 MJ |
49 | |
50 | On **Debian Sid**: | |
51 | ||
52 | .. code-block:: bash | |
53 | ||
b9eb3ebd | 54 | apt-get -y install lttng-tools babeltrace lttng-modules-dkms python3-babeltrace python3-progressbar python3-lttnganalyses |
8e82b2f4 MJ |
55 | |
56 | On other distributions: | |
57 | ||
58 | Please refer to the `LTTng documentation <http://lttng.org/download/>`_ to | |
59 | install LTTng and the `Babeltrace README | |
60 | <http://git.efficios.com/?p=babeltrace.git;a=blob_plain;f=README;hb=HEAD>`_ | |
61 | to install ``babeltrace`` with the python bindings. Optionally install the | |
62 | ``progressbar`` python module, and then: | |
63 | ||
64 | .. code-block:: bash | |
65 | ||
66 | pip3 install lttnganalyses | |
67 | ||
68 | ||
69 | ------------------- | |
70 | Development version | |
71 | ------------------- | |
72 | ||
73 | The **latest development version** can be installed directly from GitHub: | |
74 | ||
75 | .. code-block:: bash | |
76 | ||
77 | pip3 install --upgrade https://github.com/lttng/lttng-analyses/tarball/master | |
78 | ||
79 | ||
80 | ============== | |
81 | Trace creation | |
82 | ============== | |
83 | ||
84 | Here are the basic commands to create a trace, for more information on the | |
85 | LTTng setup, please refer to the `LTTng documentation Getting started guide | |
86 | <http://lttng.org/docs/#doc-getting-started>`_. | |
87 | ||
88 | --------- | |
89 | Automatic | |
90 | --------- | |
91 | ||
92 | From the cloned git tree: | |
93 | ||
94 | .. code-block:: bash | |
95 | ||
96 | ./lttng-analyses-record | |
97 | ||
98 | ||
99 | ------ | |
100 | Manual | |
101 | ------ | |
102 | ||
103 | .. code-block:: bash | |
104 | ||
105 | lttng create | |
106 | lttng enable-channel -k bla --subbuf-size=4M | |
107 | lttng enable-event -k sched_switch,block_rq_complete,block_rq_issue,block_bio_remap,block_bio_backmerge,netif_receive_skb,net_dev_xmit,sched_process_fork,sched_process_exec,lttng_statedump_process_state,lttng_statedump_file_descriptor,lttng_statedump_block_device,writeback_pages_written,mm_vmscan_wakeup_kswapd,mm_page_free,mm_page_alloc,block_dirty_buffer,irq_handler_entry,irq_handler_exit,softirq_entry,softirq_exit,softirq_raise -c bla | |
108 | lttng enable-event -k --syscall -a -c bla | |
109 | lttng start | |
110 | ..do stuff... | |
111 | lttng stop | |
112 | lttng destroy | |
113 | ||
114 | ||
115 | ------ | |
116 | Remote | |
117 | ------ | |
118 | ||
119 | You can also create a trace on a server and send it to a remote host. The | |
120 | remote host only needs to run ``lttng-relayd -d`` and be reachable over the network. | |
121 | The only difference with the above commands is the tracing session's creation: | |
122 | ||
123 | .. code-block:: bash | |
124 | ||
125 | lttng create -U net://<remote-host> | |
126 | ||
127 | ||
128 | ==================== | |
129 | Implemented analyses | |
130 | ==================== | |
131 | ||
132 | * CPU usage for the whole system | |
133 | * CPU usage per-process | |
134 | * Process CPU migration count | |
135 | * Memory usage per-process (as seen by the kernel) | |
136 | * Memory usage system-wide (as seen by the kernel) | |
137 | * I/O usage (syscalls, disk, network) | |
138 | * I/O operations log (with latency and usage) | |
139 | * I/O latency statistics (open, read, write, sync operations) | |
140 | * I/O latency frequency distribution | |
141 | * Interrupt handler duration statistics (count, min, max, average stdev) | |
142 | * Interrupt handler duration top | |
143 | * Interrupt handler duration log | |
144 | * Interrupt handler duration frequency distribution | |
145 | * SoftIRQ handler latency statistics | |
146 | * Syscalls usage statistics | |
147 | ||
148 | All of the analyses share the same code architecture making it possible | |
149 | to filter by timerange, process name, PID, min and max values using the | |
150 | same command-line options. Also note that reported timestamps can | |
151 | optionally be expressed in the GMT timezone to allow easy sharing between | |
152 | teams. | |
153 | ||
154 | The project's architecture makes it easy to add new analyses or to reuse | |
155 | the analysis backend in external tools which may then present the results | |
156 | in their own format (as opposed to text). | |
157 | ||
158 | ||
159 | ======== | |
160 | Examples | |
161 | ======== | |
162 | ||
163 | After having collected your trace, any script contained in this repository | |
164 | can be used to run an analysis. Read on for some examples! | |
165 | ||
166 | --- | |
167 | I/O | |
168 | --- | |
169 | ||
170 | ^^^^^^^^^^^^^^^^^ | |
171 | I/O latency stats | |
172 | ^^^^^^^^^^^^^^^^^ | |
173 | ||
174 | .. code-block:: bash | |
175 | ||
176 | $ ./lttng-iolatencystats mytrace/ | |
177 | Timerange: [2015-01-06 10:58:26.140545481, 2015-01-06 10:58:27.229358936] | |
178 | Syscalls latency statistics (usec): | |
179 | Type Count Min Average Max Stdev | |
180 | ----------------------------------------------------------------------------------------- | |
181 | Open 45 5.562 13.835 77.683 15.263 | |
182 | Read 109 0.316 5.774 62.569 9.277 | |
183 | Write 101 0.256 7.060 48.531 8.555 | |
184 | Sync 207 19.384 40.664 160.188 21.201 | |
185 | ||
186 | Disk latency statistics (usec): | |
187 | Name Count Min Average Max Stdev | |
188 | ----------------------------------------------------------------------------------------- | |
189 | dm-0 108 0.001 0.004 0.007 1.306 | |
190 | ||
191 | ||
192 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
193 | I/O latency frequency distribution | |
194 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
195 | ||
196 | .. code-block:: bash | |
197 | ||
198 | $ ./lttng-iolatencyfreq mytrace/ | |
199 | Timerange: [2015-01-06 10:58:26.140545481, 2015-01-06 10:58:27.229358936] | |
200 | Open latency distribution (usec) | |
201 | ############################################################################### | |
202 | 5.562 ███████████████████████████████████████████████████████████████████ 25 | |
203 | 9.168 ██████████ 4 | |
204 | 12.774 █████████████████████ 8 | |
205 | 16.380 ████████ 3 | |
206 | 19.986 █████ 2 | |
207 | 23.592 0 | |
208 | 27.198 0 | |
209 | 30.804 0 | |
210 | 34.410 ██ 1 | |
211 | 38.016 0 | |
212 | 41.623 0 | |
213 | 45.229 0 | |
214 | 48.835 0 | |
215 | 52.441 0 | |
216 | 56.047 0 | |
217 | 59.653 0 | |
218 | 63.259 0 | |
219 | 66.865 0 | |
220 | 70.471 0 | |
221 | 74.077 █████ 2 | |
222 | ||
223 | ||
224 | ^^^^^^^^^^^^^^^ | |
225 | I/O latency top | |
226 | ^^^^^^^^^^^^^^^ | |
227 | ||
228 | .. code-block:: bash | |
229 | ||
230 | $ ./lttng-iolatencytop analysis-20150115-120942/ --limit 3 --minsize 2 | |
231 | Checking the trace for lost events... | |
232 | Timerange: [2015-01-15 12:18:37.216484041, 2015-01-15 12:18:53.821580313] | |
233 | Top open syscall latencies (usec) | |
234 | Begin End Name Duration (usec) Size Proc PID Filename | |
235 | [12:18:50.432950815,12:18:50.870648568] open 437697.753 N/A apache2 31517 /var/lib/php5/sess_0ifir2hangm8ggaljdphl9o5b5 (fd=13) | |
236 | [12:18:52.946080165,12:18:52.946132278] open 52.113 N/A apache2 31588 /var/lib/php5/sess_mr9045p1k55vin1h0vg7rhgd63 (fd=13) | |
237 | [12:18:46.800846035,12:18:46.800874916] open 28.881 N/A apache2 31591 /var/lib/php5/sess_r7c12pccfvjtas15g3j69u14h0 (fd=13) | |
238 | [12:18:51.389797604,12:18:51.389824426] open 26.822 N/A apache2 31520 /var/lib/php5/sess_4sdb1rtjkhb78sabnoj8gpbl00 (fd=13) | |
239 | ||
240 | Top read syscall latencies (usec) | |
241 | Begin End Name Duration (usec) Size Proc PID Filename | |
242 | [12:18:37.256073107,12:18:37.256555967] read 482.860 7.00 B bash 10237 unknown (origin not found) (fd=3) | |
243 | [12:18:52.000209798,12:18:52.000252304] read 42.506 1.00 KB irqbalance 1337 /proc/interrupts (fd=3) | |
244 | [12:18:37.256559439,12:18:37.256601615] read 42.176 5.00 B bash 10237 unknown (origin not found) (fd=3) | |
245 | [12:18:42.000281918,12:18:42.000320016] read 38.098 1.00 KB irqbalance 1337 /proc/interrupts (fd=3) | |
246 | ||
247 | Top write syscall latencies (usec) | |
248 | Begin End Name Duration (usec) Size Proc PID Filename | |
249 | [12:18:49.913241516,12:18:49.915908862] write 2667.346 95.00 B apache2 31584 /var/log/apache2/access.log (fd=8) | |
250 | [12:18:37.472823631,12:18:37.472859836] writev 36.205 21.97 KB apache2 31544 unknown (origin not found) (fd=12) | |
251 | [12:18:37.991578372,12:18:37.991612724] writev 34.352 21.97 KB apache2 31589 unknown (origin not found) (fd=12) | |
252 | [12:18:39.547778549,12:18:39.547812515] writev 33.966 21.97 KB apache2 31584 unknown (origin not found) (fd=12) | |
253 | ||
254 | Top sync syscall latencies (usec) | |
255 | Begin End Name Duration (usec) Size Proc PID Filename | |
256 | [12:18:50.162776739,12:18:51.157522361] sync 994745.622 N/A sync 22791 None (fd=None) | |
257 | [12:18:37.227867532,12:18:37.232289687] sync_file_range 4422.155 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) | |
258 | [12:18:37.238076585,12:18:37.239012027] sync_file_range 935.442 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) | |
259 | [12:18:37.220974711,12:18:37.221647124] sync_file_range 672.413 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) | |
260 | ||
261 | ||
262 | ^^^^^^^^^^^^^^^^^^ | |
263 | I/O operations log | |
264 | ^^^^^^^^^^^^^^^^^^ | |
265 | ||
266 | .. code-block:: bash | |
267 | ||
268 | $ ./lttng-iolog mytrace/ | |
269 | [10:58:26.221618530,10:58:26.221620659] write 2.129 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) | |
270 | [10:58:26.221623609,10:58:26.221628055] read 4.446 50.00 B /usr/bin/x-term 11793 /dev/ptmx (fd=24) | |
271 | [10:58:26.221638929,10:58:26.221640008] write 1.079 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) | |
272 | [10:58:26.221676232,10:58:26.221677385] read 1.153 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) | |
273 | [10:58:26.223401804,10:58:26.223411683] open 9.879 N/A sleep 12420 /etc/ld.so.cache (fd=3) | |
274 | [10:58:26.223448060,10:58:26.223455577] open 7.517 N/A sleep 12420 /lib/x86_64-linux-gnu/libc.so.6 (fd=3) | |
275 | [10:58:26.223456522,10:58:26.223458898] read 2.376 832.00 B sleep 12420 /lib/x86_64-linux-gnu/libc.so.6 (fd=3) | |
276 | [10:58:26.223918068,10:58:26.223929316] open 11.248 N/A sleep 12420 (fd=3) | |
277 | [10:58:26.231881565,10:58:26.231895970] writev 14.405 16.00 B /usr/bin/x-term 11793 socket:[45650] (fd=4) | |
278 | [10:58:26.231979636,10:58:26.231988446] recvmsg 8.810 16.00 B Xorg 1827 socket:[47480] (fd=38) | |
279 | ||
280 | ||
281 | ^^^^^^^^^^^^^ | |
282 | I/O usage top | |
283 | ^^^^^^^^^^^^^ | |
284 | ||
285 | .. code-block:: bash | |
286 | ||
287 | $ ./lttng-iousagetop traces/pgread-writes | |
288 | Timerange: [2014-10-07 16:36:00.733214969, 2014-10-07 16:36:18.804584183] | |
289 | Per-process I/O Read | |
290 | ############################################################################### | |
291 | ██████████████████████████████████████████████████ 16.00 MB lttng-consumerd (2619) 0 B file 4.00 B net 16.00 MB unknown | |
292 | █████ 1.72 MB lttng-consumerd (2619) 0 B file 0 B net 1.72 MB unknown | |
293 | █ 398.13 KB postgres (4219) 121.05 KB file 277.07 KB net 8.00 B unknown | |
294 | 256.09 KB postgres (1348) 0 B file 255.97 KB net 117.00 B unknown | |
295 | 204.81 KB postgres (4218) 204.81 KB file 0 B net 0 B unknown | |
296 | 123.77 KB postgres (4220) 117.50 KB file 6.26 KB net 8.00 B unknown | |
297 | Per-process I/O Write | |
298 | ############################################################################### | |
299 | ██████████████████████████████████████████████████ 16.00 MB lttng-consumerd (2619) 0 B file 8.00 MB net 8.00 MB unknown | |
300 | ██████ 2.20 MB postgres (4219) 2.00 MB file 202.23 KB net 0 B unknown | |
301 | █████ 1.73 MB lttng-consumerd (2619) 0 B file 887.73 KB net 882.58 KB unknown | |
302 | ██ 726.33 KB postgres (1165) 8.00 KB file 6.33 KB net 712.00 KB unknown | |
303 | 158.69 KB postgres (1168) 158.69 KB file 0 B net 0 B unknown | |
304 | 80.66 KB postgres (1348) 0 B file 80.66 KB net 0 B unknown | |
305 | Files Read | |
306 | ############################################################################### | |
307 | ██████████████████████████████████████████████████ 8.00 MB anon_inode:[lttng_stream] (lttng-consumerd) 'fd 32 in lttng-consumerd (2619)' | |
308 | █████ 834.41 KB base/16384/pg_internal.init 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' | |
309 | █ 256.09 KB socket:[8893] (postgres) 'fd 9 in postgres (1348)' | |
310 | █ 174.69 KB pg_stat_tmp/pgstat.stat 'fd 9 in postgres (4218)', 'fd 9 in postgres (1167)' | |
311 | 109.48 KB global/pg_internal.init 'fd 7 in postgres (4218)', 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' | |
312 | 104.30 KB base/11951/pg_internal.init 'fd 7 in postgres (4218)' | |
313 | 12.85 KB socket (lttng-sessiond) 'fd 30 in lttng-sessiond (384)' | |
314 | 4.50 KB global/pg_filenode.map 'fd 7 in postgres (4218)', 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' | |
315 | 4.16 KB socket (postgres) 'fd 9 in postgres (4226)' | |
316 | 4.00 KB /proc/interrupts 'fd 3 in irqbalance (1104)' | |
317 | Files Write | |
318 | ############################################################################### | |
319 | ██████████████████████████████████████████████████ 8.00 MB socket:[56371] (lttng-consumerd) 'fd 30 in lttng-consumerd (2619)' | |
320 | █████████████████████████████████████████████████ 8.00 MB pipe:[53306] (lttng-consumerd) 'fd 12 in lttng-consumerd (2619)' | |
321 | ██████████ 1.76 MB pg_xlog/00000001000000000000000B 'fd 31 in postgres (4219)' | |
322 | █████ 887.82 KB socket:[56369] (lttng-consumerd) 'fd 26 in lttng-consumerd (2619)' | |
323 | █████ 882.58 KB pipe:[53309] (lttng-consumerd) 'fd 18 in lttng-consumerd (2619)' | |
324 | 160.00 KB /var/lib/postgresql/9.1/main/base/16384/16602 'fd 14 in postgres (1165)' | |
325 | 158.69 KB pg_stat_tmp/pgstat.tmp 'fd 3 in postgres (1168)' | |
326 | 144.00 KB /var/lib/postgresql/9.1/main/base/16384/16613 'fd 12 in postgres (1165)' | |
327 | 88.00 KB /var/lib/postgresql/9.1/main/base/16384/16609 'fd 11 in postgres (1165)' | |
328 | 78.28 KB socket:[8893] (postgres) 'fd 9 in postgres (1348)' | |
329 | Block I/O Read | |
330 | ############################################################################### | |
331 | Block I/O Write | |
332 | ############################################################################### | |
333 | ██████████████████████████████████████████████████ 1.76 MB postgres (pid=4219) | |
334 | ████ 160.00 KB postgres (pid=1168) | |
335 | ██ 100.00 KB kworker/u8:0 (pid=1540) | |
336 | ██ 96.00 KB jbd2/vda1-8 (pid=257) | |
337 | █ 40.00 KB postgres (pid=1166) | |
338 | 8.00 KB kworker/u9:0 (pid=4197) | |
339 | 4.00 KB kworker/u9:2 (pid=1381) | |
340 | Disk nr_sector | |
341 | ############################################################################### | |
342 | ███████████████████████████████████████████████████████████████████ 4416.00 sectors vda1 | |
343 | Disk nr_requests | |
344 | ############################################################################### | |
345 | ████████████████████████████████████████████████████████████████████ 177.00 requests vda1 | |
346 | Disk request time/sector | |
347 | ############################################################################### | |
348 | ██████████████████████████████████████████████████████████████████ 0.01 ms vda1 | |
349 | Network recv_bytes | |
350 | ############################################################################### | |
351 | ███████████████████████████████████████████████████████ 739.50 KB eth0 | |
352 | █████ 80.27 KB lo | |
353 | Network sent_bytes | |
354 | ############################################################################### | |
355 | ████████████████████████████████████████████████████████ 9.36 MB eth0 | |
356 | ||
357 | ||
358 | -------- | |
359 | Syscalls | |
360 | -------- | |
361 | ||
362 | ^^^^^^^^^^ | |
363 | Statistics | |
364 | ^^^^^^^^^^ | |
365 | ||
366 | .. code-block:: bash | |
367 | ||
368 | $ ./lttng-syscallstats mytrace/ | |
369 | Timerange: [2015-01-15 12:18:37.216484041, 2015-01-15 12:18:53.821580313] | |
370 | Per-TID syscalls statistics (usec) | |
371 | find (22785) Count Min Average Max Stdev Return values | |
372 | - getdents 14240 0.380 364.301 43372.450 1629.390 {'success': 14240} | |
373 | - close 14236 0.233 0.506 4.932 0.217 {'success': 14236} | |
374 | - fchdir 14231 0.252 0.407 5.769 0.117 {'success': 14231} | |
375 | - open 7123 0.779 2.321 12.697 0.936 {'success': 7119, 'ENOENT': 4} | |
376 | - newfstatat 7118 1.457 143.562 28103.532 1410.281 {'success': 7118} | |
377 | - openat 7118 1.525 2.411 9.107 0.771 {'success': 7118} | |
378 | - newfstat 7117 0.272 0.654 8.707 0.248 {'success': 7117} | |
379 | - write 573 0.298 0.715 8.584 0.391 {'success': 573} | |
380 | - brk 27 0.615 5.768 30.792 7.830 {'success': 27} | |
381 | - rt_sigaction 22 0.227 0.283 0.589 0.098 {'success': 22} | |
382 | - mmap 12 1.116 2.116 3.597 0.762 {'success': 12} | |
383 | - mprotect 6 1.185 2.235 3.923 1.148 {'success': 6} | |
384 | - read 5 0.925 2.101 6.300 2.351 {'success': 5} | |
385 | - ioctl 4 0.342 1.151 2.280 0.873 {'success': 2, 'ENOTTY': 2} | |
386 | - access 4 1.166 2.530 4.202 1.527 {'ENOENT': 4} | |
387 | - rt_sigprocmask 3 0.325 0.570 0.979 0.357 {'success': 3} | |
388 | - dup2 2 0.250 0.562 0.874 ? {'success': 2} | |
389 | - munmap 2 3.006 5.399 7.792 ? {'success': 2} | |
390 | - execve 1 7277.974 7277.974 7277.974 ? {'success': 1} | |
391 | - setpgid 1 0.945 0.945 0.945 ? {'success': 1} | |
392 | - fcntl 1 ? 0.000 0.000 ? {} | |
393 | - newuname 1 1.240 1.240 1.240 ? {'success': 1} | |
394 | Total: 71847 | |
395 | ----------------------------------------------------------------------------------------------------------------- | |
396 | apache2 (31517) Count Min Average Max Stdev Return values | |
397 | - fcntl 192 ? 0.000 0.000 ? {} | |
398 | - newfstat 156 0.237 0.484 1.102 0.222 {'success': 156} | |
399 | - read 144 0.307 1.602 16.307 1.698 {'success': 117, 'EAGAIN': 27} | |
400 | - access 96 0.705 1.580 3.364 0.670 {'success': 12, 'ENOENT': 84} | |
401 | - newlstat 84 0.459 0.738 1.456 0.186 {'success': 63, 'ENOENT': 21} | |
402 | - newstat 74 0.735 2.266 11.212 1.772 {'success': 50, 'ENOENT': 24} | |
403 | - lseek 72 0.317 0.522 0.915 0.112 {'success': 72} | |
404 | - close 39 0.471 0.615 0.867 0.069 {'success': 39} | |
405 | - open 36 2.219 12162.689 437697.753 72948.868 {'success': 36} | |
406 | - getcwd 28 0.287 0.701 1.331 0.277 {'success': 28} | |
407 | - poll 27 1.080 1139.669 2851.163 856.723 {'success': 27} | |
408 | - times 24 0.765 0.956 1.327 0.107 {'success': 24} | |
409 | - setitimer 24 0.499 5.848 16.668 4.041 {'success': 24} | |
410 | - write 24 5.467 6.784 16.827 2.459 {'success': 24} | |
411 | - writev 24 10.241 17.645 29.817 5.116 {'success': 24} | |
412 | - mmap 15 3.060 3.482 4.406 0.317 {'success': 15} | |
413 | - munmap 15 2.944 3.502 4.154 0.427 {'success': 15} | |
414 | - brk 12 0.738 4.579 13.795 4.437 {'success': 12} | |
415 | - chdir 12 0.989 1.600 2.353 0.385 {'success': 12} | |
416 | - flock 6 0.906 1.282 2.043 0.423 {'success': 6} | |
417 | - rt_sigaction 6 0.530 0.725 1.123 0.217 {'success': 6} | |
418 | - pwrite64 6 1.262 1.430 1.692 0.143 {'success': 6} | |
419 | - rt_sigprocmask 6 0.539 0.650 0.976 0.162 {'success': 6} | |
420 | - shutdown 3 7.323 8.487 10.281 1.576 {'success': 3} | |
421 | - getsockname 3 1.015 1.228 1.585 0.311 {'success': 3} | |
422 | - accept4 3 5174453.611 3450157.282 5176018.235 ? {'success': 2} | |
423 | Total: 1131 | |
424 | ||
425 | ||
426 | --- | |
427 | IRQ | |
428 | --- | |
429 | ||
430 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
431 | Handler duration and raise latency statistics | |
432 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
433 | ||
434 | .. code-block:: bash | |
435 | ||
436 | $ ./lttng-irqstats mytrace/ | |
437 | Timerange: [2014-03-11 16:05:41.314824752, 2014-03-11 16:05:45.041994298] | |
438 | Hard IRQ Duration (us) | |
439 | count min avg max stdev | |
440 | ----------------------------------------------------------------------------------| | |
441 | 1: <i8042> 30 10.901 45.500 64.510 18.447 | | |
442 | 42: <ahci> 259 3.203 7.863 21.426 3.183 | | |
443 | 43: <eth0> 2 3.859 3.976 4.093 0.165 | | |
444 | 44: <iwlwifi> 92 0.300 3.995 6.542 2.181 | | |
445 | ||
446 | Soft IRQ Duration (us) Raise latency (us) | |
447 | count min avg max stdev | count min avg max stdev | |
448 | ----------------------------------------------------------------------------------|------------------------------------------------------------ | |
449 | 1: <TIMER_SOFTIRQ> 495 0.202 21.058 51.060 11.047 | 53 2.141 11.217 20.005 7.233 | |
450 | 3: <NET_RX_SOFTIRQ> 14 0.133 9.177 32.774 10.483 | 14 0.763 3.703 10.902 3.448 | |
451 | 4: <BLOCK_SOFTIRQ> 257 5.981 29.064 125.862 15.891 | 257 0.891 3.104 15.054 2.046 | |
452 | 6: <TASKLET_SOFTIRQ> 26 0.309 1.198 1.748 0.329 | 26 9.636 39.222 51.430 11.246 | |
453 | 7: <SCHED_SOFTIRQ> 299 1.185 14.768 90.465 15.992 | 298 1.286 31.387 61.700 11.866 | |
454 | 9: <RCU_SOFTIRQ> 338 0.592 3.387 13.745 1.356 | 147 2.480 29.299 64.453 14.286 | |
455 | ||
456 | ||
457 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
458 | Handler duration frequency distribution | |
459 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
460 | ||
461 | .. code-block:: bash | |
462 | ||
463 | $ ./lttng-irqfreq --timerange [16:05:42,16:05:45] --irq 44 --stats mytrace/ | |
464 | Timerange: [2014-03-11 16:05:42.042034570, 2014-03-11 16:05:44.998914297] | |
465 | Hard IRQ Duration (us) | |
466 | count min avg max stdev | |
467 | ----------------------------------------------------------------------------------| | |
468 | 44: <iwlwifi> 72 0.300 4.018 6.542 2.164 | | |
469 | Frequency distribution iwlwifi (44) | |
470 | ############################################################################### | |
471 | 0.300 █████ 1.00 | |
472 | 0.612 ██████████████████████████████████████████████████████████████ 12.00 | |
473 | 0.924 ████████████████████ 4.00 | |
474 | 1.236 ██████████ 2.00 | |
475 | 1.548 0.00 | |
476 | 1.861 █████ 1.00 | |
477 | 2.173 0.00 | |
478 | 2.485 █████ 1.00 | |
479 | 2.797 ██████████████████████████ 5.00 | |
480 | 3.109 █████ 1.00 | |
481 | 3.421 ███████████████ 3.00 | |
482 | 3.733 0.00 | |
483 | 4.045 █████ 1.00 | |
484 | 4.357 █████ 1.00 | |
485 | 4.669 ██████████ 2.00 | |
486 | 4.981 ██████████ 2.00 | |
487 | 5.294 █████████████████████████████████████████ 8.00 | |
488 | 5.606 ████████████████████████████████████████████████████████████████████ 13.00 | |
489 | 5.918 ██████████████████████████████████████████████████████████████ 12.00 | |
490 | 6.230 ███████████████ 3.00 | |
491 | ||
492 | ||
493 | ------ | |
494 | Others | |
495 | ------ | |
496 | ||
497 | There are a lot of other scripts, we encourage you to try them and read the | |
498 | ``--help`` to see all the available options. | |
499 | ||
500 | ||
501 | ================ | |
502 | Work in progress | |
503 | ================ | |
504 | ||
505 | Track the page cache and extract the latencies associated with pages flush to disk. | |
506 | In order to do that, we rely on the assumption that the pages are flushed in a FIFO | |
507 | order. It might not be 100% accurate, but it already gives great results : | |
508 | ||
509 | An example here when saving a file in vim:: | |
510 | ||
511 | [19:57:51.173332284 - 19:57:51.177794657] vim (31517) syscall_entry_fsync(fd = 4 <blabla>) = 0, 4.462 ms | |
512 | 1 dirty page(s) were flushed (assuming FIFO): | |
513 | vim (31517): 1 pages | |
514 | - blabla : 1 pages | |
515 | 13 active dirty filesystem page(s) (known): | |
516 | redis-server (2092): 2 pages | |
517 | - /var/log/redis/redis-server.log : 2 pages | |
518 | vim (31517): 2 pages | |
519 | - .blabla.swp : 2 pages | |
520 | lttng-consumerd (6750): 9 pages | |
521 | - unknown (origin not found) : 9 pages | |
522 | ||
523 | ||
524 | An other example when running the 'sync' command:: | |
525 | ||
526 | [19:57:53.046840755 - 19:57:53.072809609] sync (31554) syscall_entry_sync(fd = <unknown>) = 0, 25.969 ms | |
527 | 23 dirty page(s) were flushed (assuming FIFO): | |
528 | redis-server (2092): 2 pages | |
529 | - /var/log/redis/redis-server.log : 2 pages | |
530 | vim (31517): 9 pages | |
531 | - /home/julien/.viminfo.tmp : 6 pages | |
532 | - .blabla.swp : 3 pages | |
533 | lttng-consumerd (6750): 12 pages | |
534 | - unknown (origin not found) : 12 pages | |
535 | ||
536 | ||
537 | PostgreSQL with 'sys_fdatasync':: | |
538 | ||
539 | [13:49:39.908599447 - 13:49:39.915930730] postgres (1137) sys_fdatasync(fd = 7 </var/lib/postgresql/9.1/main/pg_xlog/000000010000000000000008>) = 0, 7.331 ms | |
540 | 2 pages allocated during the period | |
541 | 88 dirty page(s) were flushed (assuming FIFO): | |
542 | postgres (1137): 88 pages | |
543 | - /var/lib/postgresql/9.1/main/pg_xlog/000000010000000000000008 : 88 pages | |
544 | 68 last dirtied filesystem page(s): | |
545 | postgres (2419): 68 pages | |
546 | - base/11951/18410 : 46 pages | |
547 | - base/11951/18407 : 10 pages | |
548 | - base/11951/18407_fsm : 6 pages | |
549 | - base/11951/18410_fsm : 6 pages | |
550 | ||
551 | ||
552 | Detecting a fight for the I/O between a huge write and postgresql:: | |
553 | ||
554 | [13:49:47.242730583 - 13:49:47.442835037] python (2353) sys_write(fd = 3 </root/bla>, count = 102395904) = 102395904, 200.104 ms | |
555 | 34760 pages allocated during the period | |
556 | woke up kswapd during the period | |
557 | 10046 pages written on disk | |
558 | freed 33753 pages from the cache during the period | |
559 | 1397 last dirtied filesystem page(s): | |
560 | python (2353): 1325 pages | |
561 | - /root/bla : 1325 pages | |
562 | postgres (2419): 72 pages | |
563 | - base/11951/18419 : 72 pages | |
564 | ||
565 | ||
566 | =========== | |
567 | Limitations | |
568 | =========== | |
569 | ||
570 | The main limitation of this project is the fact that it can be quite slow to | |
571 | process a large trace. This project is a work in progress and we focus on the | |
572 | problem-solving aspect. Therefore, features have been prioritized over | |
573 | performance for now. | |
574 | ||
575 | One other aspect is the fact that the state is not persistent; the trace has | |
576 | to be re-processed if another analysis script is to be used on the same trace. | |
577 | Some scripts belonging to the same category allow the combination of multiple | |
578 | analyses into a single pass (see ``--freq``, ``--log``, ``--usage``, | |
579 | ``--latencystats``, etc). We are planning to add a way to save the state | |
580 | and/or create an interactive environment to allow the user to run multiple | |
581 | analyses on the same trace without having to process the trace every time. | |
582 | ||
583 | ||
584 | ========== | |
585 | Conclusion | |
586 | ========== | |
587 | ||
588 | We hope you have fun trying this project and please remember it is a work in | |
589 | progress; feedback, bug reports and improvement ideas are always welcome! | |
590 | ||
591 | ||
592 | .. _pip: http://www.pip-installer.org/en/latest/index.html | |
593 | ||
594 | ||
595 | .. |pypi| image:: https://img.shields.io/pypi/v/lttnganalyses.svg?style=flat-square&label=latest%20version | |
596 | :target: https://pypi.python.org/pypi/lttnganalyses | |
597 | :alt: Latest version released on PyPi |