| 1 | ************** |
| 2 | LTTng-analyses |
| 3 | ************** |
| 4 | |
| 5 | This repository contains various scripts to extract monitoring data and metrics |
| 6 | from LTTng kernel traces. |
| 7 | |
| 8 | As opposed to other diagnostic or monitoring solutions, this approach is designed |
| 9 | to allow users to record their system's activity with a low overhead, wait |
| 10 | for a problem to occur and then diagnose its cause offline. |
| 11 | |
| 12 | This solution allows the user to target hard to find problems and dig until the |
| 13 | root cause is found. |
| 14 | |
| 15 | This README describes the implemented analyses as well as how to use them. |
| 16 | |
| 17 | |pypi| |
| 18 | |
| 19 | .. contents:: |
| 20 | :local: |
| 21 | :depth: 2 |
| 22 | :backlinks: none |
| 23 | |
| 24 | ============ |
| 25 | Requirements |
| 26 | ============ |
| 27 | |
| 28 | * LTTng >= 2.5 |
| 29 | * Babeltrace >= 1.2 (with python bindings built) |
| 30 | * Python >= 3.4 |
| 31 | |
| 32 | |
| 33 | ============ |
| 34 | Installation |
| 35 | ============ |
| 36 | |
| 37 | --------------- |
| 38 | Release version |
| 39 | --------------- |
| 40 | |
| 41 | On **Ubuntu** (12.04 and up) using the LTTng ppa: |
| 42 | |
| 43 | .. code-block:: bash |
| 44 | |
| 45 | apt-get install -y software-properties-common (or python-software-properties on 12.04) |
| 46 | apt-add-repository -y ppa:lttng/ppa |
| 47 | apt-get update |
| 48 | apt-get -y install lttng-tools babeltrace lttng-modules-dkms python3-babeltrace python3-progressbar python3-lttnganalyses |
| 49 | |
| 50 | On **Debian Sid**: |
| 51 | |
| 52 | .. code-block:: bash |
| 53 | |
| 54 | apt-get -y install lttng-tools babeltrace lttng-modules-dkms python3-babeltrace python3-progressbar python3-lttnganalyses |
| 55 | |
| 56 | On other distributions: |
| 57 | |
| 58 | Please refer to the `LTTng documentation <http://lttng.org/download/>`_ to |
| 59 | install LTTng and the `Babeltrace README |
| 60 | <http://git.efficios.com/?p=babeltrace.git;a=blob_plain;f=README;hb=HEAD>`_ |
| 61 | to install ``babeltrace`` with the python bindings. Optionally install the |
| 62 | ``progressbar`` python module, and then: |
| 63 | |
| 64 | .. code-block:: bash |
| 65 | |
| 66 | pip3 install lttnganalyses |
| 67 | |
| 68 | |
| 69 | ------------------- |
| 70 | Development version |
| 71 | ------------------- |
| 72 | |
| 73 | The **latest development version** can be installed directly from GitHub: |
| 74 | |
| 75 | .. code-block:: bash |
| 76 | |
| 77 | pip3 install --upgrade git+git://github.com/lttng/lttng-analyses.git |
| 78 | |
| 79 | |
| 80 | ============== |
| 81 | Trace creation |
| 82 | ============== |
| 83 | |
| 84 | Here are the basic commands to create a trace, for more information on the |
| 85 | LTTng setup, please refer to the `LTTng documentation Getting started guide |
| 86 | <http://lttng.org/docs/#doc-getting-started>`_. |
| 87 | |
| 88 | --------- |
| 89 | Automatic |
| 90 | --------- |
| 91 | |
| 92 | From the cloned git tree: |
| 93 | |
| 94 | .. code-block:: bash |
| 95 | |
| 96 | ./lttng-analyses-record |
| 97 | |
| 98 | |
| 99 | ------ |
| 100 | Manual |
| 101 | ------ |
| 102 | |
| 103 | .. code-block:: bash |
| 104 | |
| 105 | lttng create |
| 106 | lttng enable-channel -k bla --subbuf-size=4M |
| 107 | lttng enable-event -k sched_switch,block_rq_complete,block_rq_issue,block_bio_remap,block_bio_backmerge,netif_receive_skb,net_dev_xmit,sched_process_fork,sched_process_exec,lttng_statedump_process_state,lttng_statedump_file_descriptor,lttng_statedump_block_device,writeback_pages_written,mm_vmscan_wakeup_kswapd,mm_page_free,mm_page_alloc,block_dirty_buffer,irq_handler_entry,irq_handler_exit,softirq_entry,softirq_exit,softirq_raise -c bla |
| 108 | lttng enable-event -k --syscall -a -c bla |
| 109 | lttng start |
| 110 | ..do stuff... |
| 111 | lttng stop |
| 112 | lttng destroy |
| 113 | |
| 114 | |
| 115 | ------ |
| 116 | Remote |
| 117 | ------ |
| 118 | |
| 119 | You can also create a trace on a server and send it to a remote host. The |
| 120 | remote host only needs to run ``lttng-relayd -d`` and be reachable over the network. |
| 121 | The only difference with the above commands is the tracing session's creation: |
| 122 | |
| 123 | .. code-block:: bash |
| 124 | |
| 125 | lttng create -U net://<remote-host> |
| 126 | |
| 127 | |
| 128 | ==================== |
| 129 | Implemented analyses |
| 130 | ==================== |
| 131 | |
| 132 | * CPU usage for the whole system |
| 133 | * CPU usage per-process |
| 134 | * Process CPU migration count |
| 135 | * Memory usage per-process (as seen by the kernel) |
| 136 | * Memory usage system-wide (as seen by the kernel) |
| 137 | * I/O usage (syscalls, disk, network) |
| 138 | * I/O operations log (with latency and usage) |
| 139 | * I/O latency statistics (open, read, write, sync operations) |
| 140 | * I/O latency frequency distribution |
| 141 | * Interrupt handler duration statistics (count, min, max, average stdev) |
| 142 | * Interrupt handler duration top |
| 143 | * Interrupt handler duration log |
| 144 | * Interrupt handler duration frequency distribution |
| 145 | * SoftIRQ handler latency statistics |
| 146 | * Syscalls usage statistics |
| 147 | |
| 148 | All of the analyses share the same code architecture making it possible |
| 149 | to filter by timerange, process name, PID, min and max values using the |
| 150 | same command-line options. Also note that reported timestamps can |
| 151 | optionally be expressed in the GMT timezone to allow easy sharing between |
| 152 | teams. |
| 153 | |
| 154 | The project's architecture makes it easy to add new analyses or to reuse |
| 155 | the analysis backend in external tools which may then present the results |
| 156 | in their own format (as opposed to text). |
| 157 | |
| 158 | |
| 159 | ======== |
| 160 | Examples |
| 161 | ======== |
| 162 | |
| 163 | After having collected your trace, any script contained in this repository |
| 164 | can be used to run an analysis. Read on for some examples! |
| 165 | |
| 166 | --- |
| 167 | I/O |
| 168 | --- |
| 169 | |
| 170 | ^^^^^^^^^^^^^^^^^ |
| 171 | I/O latency stats |
| 172 | ^^^^^^^^^^^^^^^^^ |
| 173 | |
| 174 | .. code-block:: bash |
| 175 | |
| 176 | $ ./lttng-iolatencystats mytrace/ |
| 177 | Timerange: [2015-01-06 10:58:26.140545481, 2015-01-06 10:58:27.229358936] |
| 178 | Syscalls latency statistics (usec): |
| 179 | Type Count Min Average Max Stdev |
| 180 | ----------------------------------------------------------------------------------------- |
| 181 | Open 45 5.562 13.835 77.683 15.263 |
| 182 | Read 109 0.316 5.774 62.569 9.277 |
| 183 | Write 101 0.256 7.060 48.531 8.555 |
| 184 | Sync 207 19.384 40.664 160.188 21.201 |
| 185 | |
| 186 | Disk latency statistics (usec): |
| 187 | Name Count Min Average Max Stdev |
| 188 | ----------------------------------------------------------------------------------------- |
| 189 | dm-0 108 0.001 0.004 0.007 1.306 |
| 190 | |
| 191 | |
| 192 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 193 | I/O latency frequency distribution |
| 194 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 195 | |
| 196 | .. code-block:: bash |
| 197 | |
| 198 | $ ./lttng-iolatencyfreq mytrace/ |
| 199 | Timerange: [2015-01-06 10:58:26.140545481, 2015-01-06 10:58:27.229358936] |
| 200 | Open latency distribution (usec) |
| 201 | ############################################################################### |
| 202 | 5.562 ███████████████████████████████████████████████████████████████████ 25 |
| 203 | 9.168 ██████████ 4 |
| 204 | 12.774 █████████████████████ 8 |
| 205 | 16.380 ████████ 3 |
| 206 | 19.986 █████ 2 |
| 207 | 23.592 0 |
| 208 | 27.198 0 |
| 209 | 30.804 0 |
| 210 | 34.410 ██ 1 |
| 211 | 38.016 0 |
| 212 | 41.623 0 |
| 213 | 45.229 0 |
| 214 | 48.835 0 |
| 215 | 52.441 0 |
| 216 | 56.047 0 |
| 217 | 59.653 0 |
| 218 | 63.259 0 |
| 219 | 66.865 0 |
| 220 | 70.471 0 |
| 221 | 74.077 █████ 2 |
| 222 | |
| 223 | |
| 224 | ^^^^^^^^^^^^^^^ |
| 225 | I/O latency top |
| 226 | ^^^^^^^^^^^^^^^ |
| 227 | |
| 228 | .. code-block:: bash |
| 229 | |
| 230 | $ ./lttng-iolatencytop analysis-20150115-120942/ --limit 3 --minsize 2 |
| 231 | Checking the trace for lost events... |
| 232 | Timerange: [2015-01-15 12:18:37.216484041, 2015-01-15 12:18:53.821580313] |
| 233 | Top open syscall latencies (usec) |
| 234 | Begin End Name Duration (usec) Size Proc PID Filename |
| 235 | [12:18:50.432950815,12:18:50.870648568] open 437697.753 N/A apache2 31517 /var/lib/php5/sess_0ifir2hangm8ggaljdphl9o5b5 (fd=13) |
| 236 | [12:18:52.946080165,12:18:52.946132278] open 52.113 N/A apache2 31588 /var/lib/php5/sess_mr9045p1k55vin1h0vg7rhgd63 (fd=13) |
| 237 | [12:18:46.800846035,12:18:46.800874916] open 28.881 N/A apache2 31591 /var/lib/php5/sess_r7c12pccfvjtas15g3j69u14h0 (fd=13) |
| 238 | [12:18:51.389797604,12:18:51.389824426] open 26.822 N/A apache2 31520 /var/lib/php5/sess_4sdb1rtjkhb78sabnoj8gpbl00 (fd=13) |
| 239 | |
| 240 | Top read syscall latencies (usec) |
| 241 | Begin End Name Duration (usec) Size Proc PID Filename |
| 242 | [12:18:37.256073107,12:18:37.256555967] read 482.860 7.00 B bash 10237 unknown (origin not found) (fd=3) |
| 243 | [12:18:52.000209798,12:18:52.000252304] read 42.506 1.00 KB irqbalance 1337 /proc/interrupts (fd=3) |
| 244 | [12:18:37.256559439,12:18:37.256601615] read 42.176 5.00 B bash 10237 unknown (origin not found) (fd=3) |
| 245 | [12:18:42.000281918,12:18:42.000320016] read 38.098 1.00 KB irqbalance 1337 /proc/interrupts (fd=3) |
| 246 | |
| 247 | Top write syscall latencies (usec) |
| 248 | Begin End Name Duration (usec) Size Proc PID Filename |
| 249 | [12:18:49.913241516,12:18:49.915908862] write 2667.346 95.00 B apache2 31584 /var/log/apache2/access.log (fd=8) |
| 250 | [12:18:37.472823631,12:18:37.472859836] writev 36.205 21.97 KB apache2 31544 unknown (origin not found) (fd=12) |
| 251 | [12:18:37.991578372,12:18:37.991612724] writev 34.352 21.97 KB apache2 31589 unknown (origin not found) (fd=12) |
| 252 | [12:18:39.547778549,12:18:39.547812515] writev 33.966 21.97 KB apache2 31584 unknown (origin not found) (fd=12) |
| 253 | |
| 254 | Top sync syscall latencies (usec) |
| 255 | Begin End Name Duration (usec) Size Proc PID Filename |
| 256 | [12:18:50.162776739,12:18:51.157522361] sync 994745.622 N/A sync 22791 None (fd=None) |
| 257 | [12:18:37.227867532,12:18:37.232289687] sync_file_range 4422.155 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) |
| 258 | [12:18:37.238076585,12:18:37.239012027] sync_file_range 935.442 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) |
| 259 | [12:18:37.220974711,12:18:37.221647124] sync_file_range 672.413 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) |
| 260 | |
| 261 | |
| 262 | ^^^^^^^^^^^^^^^^^^ |
| 263 | I/O operations log |
| 264 | ^^^^^^^^^^^^^^^^^^ |
| 265 | |
| 266 | .. code-block:: bash |
| 267 | |
| 268 | $ ./lttng-iolog mytrace/ |
| 269 | [10:58:26.221618530,10:58:26.221620659] write 2.129 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) |
| 270 | [10:58:26.221623609,10:58:26.221628055] read 4.446 50.00 B /usr/bin/x-term 11793 /dev/ptmx (fd=24) |
| 271 | [10:58:26.221638929,10:58:26.221640008] write 1.079 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) |
| 272 | [10:58:26.221676232,10:58:26.221677385] read 1.153 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) |
| 273 | [10:58:26.223401804,10:58:26.223411683] open 9.879 N/A sleep 12420 /etc/ld.so.cache (fd=3) |
| 274 | [10:58:26.223448060,10:58:26.223455577] open 7.517 N/A sleep 12420 /lib/x86_64-linux-gnu/libc.so.6 (fd=3) |
| 275 | [10:58:26.223456522,10:58:26.223458898] read 2.376 832.00 B sleep 12420 /lib/x86_64-linux-gnu/libc.so.6 (fd=3) |
| 276 | [10:58:26.223918068,10:58:26.223929316] open 11.248 N/A sleep 12420 (fd=3) |
| 277 | [10:58:26.231881565,10:58:26.231895970] writev 14.405 16.00 B /usr/bin/x-term 11793 socket:[45650] (fd=4) |
| 278 | [10:58:26.231979636,10:58:26.231988446] recvmsg 8.810 16.00 B Xorg 1827 socket:[47480] (fd=38) |
| 279 | |
| 280 | |
| 281 | ^^^^^^^^^^^^^ |
| 282 | I/O usage top |
| 283 | ^^^^^^^^^^^^^ |
| 284 | |
| 285 | .. code-block:: bash |
| 286 | |
| 287 | $ ./lttng-iousagetop traces/pgread-writes |
| 288 | Timerange: [2014-10-07 16:36:00.733214969, 2014-10-07 16:36:18.804584183] |
| 289 | Per-process I/O Read |
| 290 | ############################################################################### |
| 291 | ██████████████████████████████████████████████████ 16.00 MB lttng-consumerd (2619) 0 B file 4.00 B net 16.00 MB unknown |
| 292 | █████ 1.72 MB lttng-consumerd (2619) 0 B file 0 B net 1.72 MB unknown |
| 293 | █ 398.13 KB postgres (4219) 121.05 KB file 277.07 KB net 8.00 B unknown |
| 294 | 256.09 KB postgres (1348) 0 B file 255.97 KB net 117.00 B unknown |
| 295 | 204.81 KB postgres (4218) 204.81 KB file 0 B net 0 B unknown |
| 296 | 123.77 KB postgres (4220) 117.50 KB file 6.26 KB net 8.00 B unknown |
| 297 | Per-process I/O Write |
| 298 | ############################################################################### |
| 299 | ██████████████████████████████████████████████████ 16.00 MB lttng-consumerd (2619) 0 B file 8.00 MB net 8.00 MB unknown |
| 300 | ██████ 2.20 MB postgres (4219) 2.00 MB file 202.23 KB net 0 B unknown |
| 301 | █████ 1.73 MB lttng-consumerd (2619) 0 B file 887.73 KB net 882.58 KB unknown |
| 302 | ██ 726.33 KB postgres (1165) 8.00 KB file 6.33 KB net 712.00 KB unknown |
| 303 | 158.69 KB postgres (1168) 158.69 KB file 0 B net 0 B unknown |
| 304 | 80.66 KB postgres (1348) 0 B file 80.66 KB net 0 B unknown |
| 305 | Files Read |
| 306 | ############################################################################### |
| 307 | ██████████████████████████████████████████████████ 8.00 MB anon_inode:[lttng_stream] (lttng-consumerd) 'fd 32 in lttng-consumerd (2619)' |
| 308 | █████ 834.41 KB base/16384/pg_internal.init 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' |
| 309 | █ 256.09 KB socket:[8893] (postgres) 'fd 9 in postgres (1348)' |
| 310 | █ 174.69 KB pg_stat_tmp/pgstat.stat 'fd 9 in postgres (4218)', 'fd 9 in postgres (1167)' |
| 311 | 109.48 KB global/pg_internal.init 'fd 7 in postgres (4218)', 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' |
| 312 | 104.30 KB base/11951/pg_internal.init 'fd 7 in postgres (4218)' |
| 313 | 12.85 KB socket (lttng-sessiond) 'fd 30 in lttng-sessiond (384)' |
| 314 | 4.50 KB global/pg_filenode.map 'fd 7 in postgres (4218)', 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' |
| 315 | 4.16 KB socket (postgres) 'fd 9 in postgres (4226)' |
| 316 | 4.00 KB /proc/interrupts 'fd 3 in irqbalance (1104)' |
| 317 | Files Write |
| 318 | ############################################################################### |
| 319 | ██████████████████████████████████████████████████ 8.00 MB socket:[56371] (lttng-consumerd) 'fd 30 in lttng-consumerd (2619)' |
| 320 | █████████████████████████████████████████████████ 8.00 MB pipe:[53306] (lttng-consumerd) 'fd 12 in lttng-consumerd (2619)' |
| 321 | ██████████ 1.76 MB pg_xlog/00000001000000000000000B 'fd 31 in postgres (4219)' |
| 322 | █████ 887.82 KB socket:[56369] (lttng-consumerd) 'fd 26 in lttng-consumerd (2619)' |
| 323 | █████ 882.58 KB pipe:[53309] (lttng-consumerd) 'fd 18 in lttng-consumerd (2619)' |
| 324 | 160.00 KB /var/lib/postgresql/9.1/main/base/16384/16602 'fd 14 in postgres (1165)' |
| 325 | 158.69 KB pg_stat_tmp/pgstat.tmp 'fd 3 in postgres (1168)' |
| 326 | 144.00 KB /var/lib/postgresql/9.1/main/base/16384/16613 'fd 12 in postgres (1165)' |
| 327 | 88.00 KB /var/lib/postgresql/9.1/main/base/16384/16609 'fd 11 in postgres (1165)' |
| 328 | 78.28 KB socket:[8893] (postgres) 'fd 9 in postgres (1348)' |
| 329 | Block I/O Read |
| 330 | ############################################################################### |
| 331 | Block I/O Write |
| 332 | ############################################################################### |
| 333 | ██████████████████████████████████████████████████ 1.76 MB postgres (pid=4219) |
| 334 | ████ 160.00 KB postgres (pid=1168) |
| 335 | ██ 100.00 KB kworker/u8:0 (pid=1540) |
| 336 | ██ 96.00 KB jbd2/vda1-8 (pid=257) |
| 337 | █ 40.00 KB postgres (pid=1166) |
| 338 | 8.00 KB kworker/u9:0 (pid=4197) |
| 339 | 4.00 KB kworker/u9:2 (pid=1381) |
| 340 | Disk nr_sector |
| 341 | ############################################################################### |
| 342 | ███████████████████████████████████████████████████████████████████ 4416.00 sectors vda1 |
| 343 | Disk nr_requests |
| 344 | ############################################################################### |
| 345 | ████████████████████████████████████████████████████████████████████ 177.00 requests vda1 |
| 346 | Disk request time/sector |
| 347 | ############################################################################### |
| 348 | ██████████████████████████████████████████████████████████████████ 0.01 ms vda1 |
| 349 | Network recv_bytes |
| 350 | ############################################################################### |
| 351 | ███████████████████████████████████████████████████████ 739.50 KB eth0 |
| 352 | █████ 80.27 KB lo |
| 353 | Network sent_bytes |
| 354 | ############################################################################### |
| 355 | ████████████████████████████████████████████████████████ 9.36 MB eth0 |
| 356 | |
| 357 | |
| 358 | -------- |
| 359 | Syscalls |
| 360 | -------- |
| 361 | |
| 362 | ^^^^^^^^^^ |
| 363 | Statistics |
| 364 | ^^^^^^^^^^ |
| 365 | |
| 366 | .. code-block:: bash |
| 367 | |
| 368 | $ ./lttng-syscallstats mytrace/ |
| 369 | Timerange: [2015-01-15 12:18:37.216484041, 2015-01-15 12:18:53.821580313] |
| 370 | Per-TID syscalls statistics (usec) |
| 371 | find (22785) Count Min Average Max Stdev Return values |
| 372 | - getdents 14240 0.380 364.301 43372.450 1629.390 {'success': 14240} |
| 373 | - close 14236 0.233 0.506 4.932 0.217 {'success': 14236} |
| 374 | - fchdir 14231 0.252 0.407 5.769 0.117 {'success': 14231} |
| 375 | - open 7123 0.779 2.321 12.697 0.936 {'success': 7119, 'ENOENT': 4} |
| 376 | - newfstatat 7118 1.457 143.562 28103.532 1410.281 {'success': 7118} |
| 377 | - openat 7118 1.525 2.411 9.107 0.771 {'success': 7118} |
| 378 | - newfstat 7117 0.272 0.654 8.707 0.248 {'success': 7117} |
| 379 | - write 573 0.298 0.715 8.584 0.391 {'success': 573} |
| 380 | - brk 27 0.615 5.768 30.792 7.830 {'success': 27} |
| 381 | - rt_sigaction 22 0.227 0.283 0.589 0.098 {'success': 22} |
| 382 | - mmap 12 1.116 2.116 3.597 0.762 {'success': 12} |
| 383 | - mprotect 6 1.185 2.235 3.923 1.148 {'success': 6} |
| 384 | - read 5 0.925 2.101 6.300 2.351 {'success': 5} |
| 385 | - ioctl 4 0.342 1.151 2.280 0.873 {'success': 2, 'ENOTTY': 2} |
| 386 | - access 4 1.166 2.530 4.202 1.527 {'ENOENT': 4} |
| 387 | - rt_sigprocmask 3 0.325 0.570 0.979 0.357 {'success': 3} |
| 388 | - dup2 2 0.250 0.562 0.874 ? {'success': 2} |
| 389 | - munmap 2 3.006 5.399 7.792 ? {'success': 2} |
| 390 | - execve 1 7277.974 7277.974 7277.974 ? {'success': 1} |
| 391 | - setpgid 1 0.945 0.945 0.945 ? {'success': 1} |
| 392 | - fcntl 1 ? 0.000 0.000 ? {} |
| 393 | - newuname 1 1.240 1.240 1.240 ? {'success': 1} |
| 394 | Total: 71847 |
| 395 | ----------------------------------------------------------------------------------------------------------------- |
| 396 | apache2 (31517) Count Min Average Max Stdev Return values |
| 397 | - fcntl 192 ? 0.000 0.000 ? {} |
| 398 | - newfstat 156 0.237 0.484 1.102 0.222 {'success': 156} |
| 399 | - read 144 0.307 1.602 16.307 1.698 {'success': 117, 'EAGAIN': 27} |
| 400 | - access 96 0.705 1.580 3.364 0.670 {'success': 12, 'ENOENT': 84} |
| 401 | - newlstat 84 0.459 0.738 1.456 0.186 {'success': 63, 'ENOENT': 21} |
| 402 | - newstat 74 0.735 2.266 11.212 1.772 {'success': 50, 'ENOENT': 24} |
| 403 | - lseek 72 0.317 0.522 0.915 0.112 {'success': 72} |
| 404 | - close 39 0.471 0.615 0.867 0.069 {'success': 39} |
| 405 | - open 36 2.219 12162.689 437697.753 72948.868 {'success': 36} |
| 406 | - getcwd 28 0.287 0.701 1.331 0.277 {'success': 28} |
| 407 | - poll 27 1.080 1139.669 2851.163 856.723 {'success': 27} |
| 408 | - times 24 0.765 0.956 1.327 0.107 {'success': 24} |
| 409 | - setitimer 24 0.499 5.848 16.668 4.041 {'success': 24} |
| 410 | - write 24 5.467 6.784 16.827 2.459 {'success': 24} |
| 411 | - writev 24 10.241 17.645 29.817 5.116 {'success': 24} |
| 412 | - mmap 15 3.060 3.482 4.406 0.317 {'success': 15} |
| 413 | - munmap 15 2.944 3.502 4.154 0.427 {'success': 15} |
| 414 | - brk 12 0.738 4.579 13.795 4.437 {'success': 12} |
| 415 | - chdir 12 0.989 1.600 2.353 0.385 {'success': 12} |
| 416 | - flock 6 0.906 1.282 2.043 0.423 {'success': 6} |
| 417 | - rt_sigaction 6 0.530 0.725 1.123 0.217 {'success': 6} |
| 418 | - pwrite64 6 1.262 1.430 1.692 0.143 {'success': 6} |
| 419 | - rt_sigprocmask 6 0.539 0.650 0.976 0.162 {'success': 6} |
| 420 | - shutdown 3 7.323 8.487 10.281 1.576 {'success': 3} |
| 421 | - getsockname 3 1.015 1.228 1.585 0.311 {'success': 3} |
| 422 | - accept4 3 5174453.611 3450157.282 5176018.235 ? {'success': 2} |
| 423 | Total: 1131 |
| 424 | |
| 425 | |
| 426 | --- |
| 427 | IRQ |
| 428 | --- |
| 429 | |
| 430 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 431 | Handler duration and raise latency statistics |
| 432 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 433 | |
| 434 | .. code-block:: bash |
| 435 | |
| 436 | $ ./lttng-irqstats mytrace/ |
| 437 | Timerange: [2014-03-11 16:05:41.314824752, 2014-03-11 16:05:45.041994298] |
| 438 | Hard IRQ Duration (us) |
| 439 | count min avg max stdev |
| 440 | ----------------------------------------------------------------------------------| |
| 441 | 1: <i8042> 30 10.901 45.500 64.510 18.447 | |
| 442 | 42: <ahci> 259 3.203 7.863 21.426 3.183 | |
| 443 | 43: <eth0> 2 3.859 3.976 4.093 0.165 | |
| 444 | 44: <iwlwifi> 92 0.300 3.995 6.542 2.181 | |
| 445 | |
| 446 | Soft IRQ Duration (us) Raise latency (us) |
| 447 | count min avg max stdev | count min avg max stdev |
| 448 | ----------------------------------------------------------------------------------|------------------------------------------------------------ |
| 449 | 1: <TIMER_SOFTIRQ> 495 0.202 21.058 51.060 11.047 | 53 2.141 11.217 20.005 7.233 |
| 450 | 3: <NET_RX_SOFTIRQ> 14 0.133 9.177 32.774 10.483 | 14 0.763 3.703 10.902 3.448 |
| 451 | 4: <BLOCK_SOFTIRQ> 257 5.981 29.064 125.862 15.891 | 257 0.891 3.104 15.054 2.046 |
| 452 | 6: <TASKLET_SOFTIRQ> 26 0.309 1.198 1.748 0.329 | 26 9.636 39.222 51.430 11.246 |
| 453 | 7: <SCHED_SOFTIRQ> 299 1.185 14.768 90.465 15.992 | 298 1.286 31.387 61.700 11.866 |
| 454 | 9: <RCU_SOFTIRQ> 338 0.592 3.387 13.745 1.356 | 147 2.480 29.299 64.453 14.286 |
| 455 | |
| 456 | |
| 457 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 458 | Handler duration frequency distribution |
| 459 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 460 | |
| 461 | .. code-block:: bash |
| 462 | |
| 463 | $ ./lttng-irqfreq --timerange [16:05:42,16:05:45] --irq 44 --stats mytrace/ |
| 464 | Timerange: [2014-03-11 16:05:42.042034570, 2014-03-11 16:05:44.998914297] |
| 465 | Hard IRQ Duration (us) |
| 466 | count min avg max stdev |
| 467 | ----------------------------------------------------------------------------------| |
| 468 | 44: <iwlwifi> 72 0.300 4.018 6.542 2.164 | |
| 469 | Frequency distribution iwlwifi (44) |
| 470 | ############################################################################### |
| 471 | 0.300 █████ 1.00 |
| 472 | 0.612 ██████████████████████████████████████████████████████████████ 12.00 |
| 473 | 0.924 ████████████████████ 4.00 |
| 474 | 1.236 ██████████ 2.00 |
| 475 | 1.548 0.00 |
| 476 | 1.861 █████ 1.00 |
| 477 | 2.173 0.00 |
| 478 | 2.485 █████ 1.00 |
| 479 | 2.797 ██████████████████████████ 5.00 |
| 480 | 3.109 █████ 1.00 |
| 481 | 3.421 ███████████████ 3.00 |
| 482 | 3.733 0.00 |
| 483 | 4.045 █████ 1.00 |
| 484 | 4.357 █████ 1.00 |
| 485 | 4.669 ██████████ 2.00 |
| 486 | 4.981 ██████████ 2.00 |
| 487 | 5.294 █████████████████████████████████████████ 8.00 |
| 488 | 5.606 ████████████████████████████████████████████████████████████████████ 13.00 |
| 489 | 5.918 ██████████████████████████████████████████████████████████████ 12.00 |
| 490 | 6.230 ███████████████ 3.00 |
| 491 | |
| 492 | |
| 493 | ------ |
| 494 | Others |
| 495 | ------ |
| 496 | |
| 497 | There are a lot of other scripts, we encourage you to try them and read the |
| 498 | ``--help`` to see all the available options. |
| 499 | |
| 500 | |
| 501 | ================ |
| 502 | Work in progress |
| 503 | ================ |
| 504 | |
| 505 | Track the page cache and extract the latencies associated with pages flush to disk. |
| 506 | In order to do that, we rely on the assumption that the pages are flushed in a FIFO |
| 507 | order. It might not be 100% accurate, but it already gives great results : |
| 508 | |
| 509 | An example here when saving a file in vim:: |
| 510 | |
| 511 | [19:57:51.173332284 - 19:57:51.177794657] vim (31517) syscall_entry_fsync(fd = 4 <blabla>) = 0, 4.462 ms |
| 512 | 1 dirty page(s) were flushed (assuming FIFO): |
| 513 | vim (31517): 1 pages |
| 514 | - blabla : 1 pages |
| 515 | 13 active dirty filesystem page(s) (known): |
| 516 | redis-server (2092): 2 pages |
| 517 | - /var/log/redis/redis-server.log : 2 pages |
| 518 | vim (31517): 2 pages |
| 519 | - .blabla.swp : 2 pages |
| 520 | lttng-consumerd (6750): 9 pages |
| 521 | - unknown (origin not found) : 9 pages |
| 522 | |
| 523 | |
| 524 | An other example when running the 'sync' command:: |
| 525 | |
| 526 | [19:57:53.046840755 - 19:57:53.072809609] sync (31554) syscall_entry_sync(fd = <unknown>) = 0, 25.969 ms |
| 527 | 23 dirty page(s) were flushed (assuming FIFO): |
| 528 | redis-server (2092): 2 pages |
| 529 | - /var/log/redis/redis-server.log : 2 pages |
| 530 | vim (31517): 9 pages |
| 531 | - /home/julien/.viminfo.tmp : 6 pages |
| 532 | - .blabla.swp : 3 pages |
| 533 | lttng-consumerd (6750): 12 pages |
| 534 | - unknown (origin not found) : 12 pages |
| 535 | |
| 536 | |
| 537 | PostgreSQL with 'sys_fdatasync':: |
| 538 | |
| 539 | [13:49:39.908599447 - 13:49:39.915930730] postgres (1137) sys_fdatasync(fd = 7 </var/lib/postgresql/9.1/main/pg_xlog/000000010000000000000008>) = 0, 7.331 ms |
| 540 | 2 pages allocated during the period |
| 541 | 88 dirty page(s) were flushed (assuming FIFO): |
| 542 | postgres (1137): 88 pages |
| 543 | - /var/lib/postgresql/9.1/main/pg_xlog/000000010000000000000008 : 88 pages |
| 544 | 68 last dirtied filesystem page(s): |
| 545 | postgres (2419): 68 pages |
| 546 | - base/11951/18410 : 46 pages |
| 547 | - base/11951/18407 : 10 pages |
| 548 | - base/11951/18407_fsm : 6 pages |
| 549 | - base/11951/18410_fsm : 6 pages |
| 550 | |
| 551 | |
| 552 | Detecting a fight for the I/O between a huge write and postgresql:: |
| 553 | |
| 554 | [13:49:47.242730583 - 13:49:47.442835037] python (2353) sys_write(fd = 3 </root/bla>, count = 102395904) = 102395904, 200.104 ms |
| 555 | 34760 pages allocated during the period |
| 556 | woke up kswapd during the period |
| 557 | 10046 pages written on disk |
| 558 | freed 33753 pages from the cache during the period |
| 559 | 1397 last dirtied filesystem page(s): |
| 560 | python (2353): 1325 pages |
| 561 | - /root/bla : 1325 pages |
| 562 | postgres (2419): 72 pages |
| 563 | - base/11951/18419 : 72 pages |
| 564 | |
| 565 | |
| 566 | =========== |
| 567 | Limitations |
| 568 | =========== |
| 569 | |
| 570 | The main limitation of this project is the fact that it can be quite slow to |
| 571 | process a large trace. This project is a work in progress and we focus on the |
| 572 | problem-solving aspect. Therefore, features have been prioritized over |
| 573 | performance for now. |
| 574 | |
| 575 | One other aspect is the fact that the state is not persistent; the trace has |
| 576 | to be re-processed if another analysis script is to be used on the same trace. |
| 577 | Some scripts belonging to the same category allow the combination of multiple |
| 578 | analyses into a single pass (see ``--freq``, ``--log``, ``--usage``, |
| 579 | ``--latencystats``, etc). We are planning to add a way to save the state |
| 580 | and/or create an interactive environment to allow the user to run multiple |
| 581 | analyses on the same trace without having to process the trace every time. |
| 582 | |
| 583 | |
| 584 | ========== |
| 585 | Conclusion |
| 586 | ========== |
| 587 | |
| 588 | We hope you have fun trying this project and please remember it is a work in |
| 589 | progress; feedback, bug reports and improvement ideas are always welcome! |
| 590 | |
| 591 | |
| 592 | .. _pip: http://www.pip-installer.org/en/latest/index.html |
| 593 | |
| 594 | |
| 595 | .. |pypi| image:: https://img.shields.io/pypi/v/lttnganalyses.svg?style=flat-square&label=latest%20version |
| 596 | :target: https://pypi.python.org/pypi/lttnganalyses |
| 597 | :alt: Latest version released on PyPi |