Commit | Line | Data |
---|---|---|
28ccfe49 JD |
1 | # lttng-analyses |
2 | ||
3 | This repository contains various scripts to extract monitoring data and metrics | |
4 | from LTTng kernel traces. | |
5 | ||
0f582ce8 | 6 | As opposed to other diagnostic or monitoring solutions, this approach is designed |
898ff628 JG |
7 | to allow users to record their system's activity with a low overhead, wait |
8 | for a problem to occur and then diagnose its cause offline. | |
28ccfe49 | 9 | |
0f582ce8 AB |
10 | This solution allows the user to target hard to find problems and dig until the |
11 | root cause is found. | |
28ccfe49 | 12 | |
0f582ce8 | 13 | This README describes the implemented analyses as well as how to use them. |
28ccfe49 JD |
14 | |
15 | ## Requirements | |
e3de48f7 | 16 | * LTTng >= 2.5 |
898ff628 | 17 | * Babeltrace >= 1.2 (with python bindings built) |
e3de48f7 | 18 | * Python >= 3.4 |
28ccfe49 JD |
19 | |
20 | ## Install on Ubuntu (12.04 and 14.04 at least) | |
e3de48f7 | 21 | ```bash |
851bf235 JD |
22 | apt-get install -y software-properties-common (or python-software-properties on 12.04) |
23 | apt-add-repository -y ppa:lttng/ppa | |
24 | apt-get update | |
25 | apt-get -y install lttng-tools babeltrace lttng-modules-dkms python3-babeltrace python3-progressbar | |
898ff628 JG |
26 | ``` |
27 | ||
898ff628 | 28 | ```bash |
e3de48f7 JD |
29 | git clone https://github.com/lttng/lttng-analyses.git |
30 | ``` | |
28ccfe49 JD |
31 | |
32 | ## Trace creation | |
b5019297 JD |
33 | Here are the basic commands to create a trace, for more information on the |
34 | LTTng setup, please refer to the [LTTng | |
35 | documentation](http://lttng.org/docs/#doc-tracing-the-linux-kernel) | |
36 | ||
e3de48f7 JD |
37 | ### Automatic |
38 | From the cloned git tree: | |
39 | ```bash | |
40 | ./lttng-analyses-record | |
28ccfe49 | 41 | ``` |
e3de48f7 JD |
42 | |
43 | ### Manual | |
44 | ```bash | |
851bf235 JD |
45 | lttng create |
46 | lttng enable-channel -k bla --subbuf-size=4M | |
0e634170 | 47 | lttng enable-event -k sched_switch,block_rq_complete,block_rq_issue,block_bio_remap,block_bio_backmerge,netif_receive_skb,net_dev_xmit,sched_process_fork,sched_process_exec,lttng_statedump_process_state,lttng_statedump_file_descriptor,lttng_statedump_block_device,writeback_pages_written,mm_vmscan_wakeup_kswapd,mm_page_free,mm_page_alloc,block_dirty_buffer,irq_handler_entry,irq_handler_exit,softirq_entry,softirq_exit,softirq_raise -c bla |
851bf235 JD |
48 | lttng enable-event -k --syscall -a -c bla |
49 | lttng start | |
50 | ..do stuff... | |
51 | lttng stop | |
52 | lttng destroy | |
28ccfe49 JD |
53 | ``` |
54 | ||
e3de48f7 | 55 | ### Remote trace creation |
28ccfe49 | 56 | You can also create a trace on a server and send it to a remote host. The |
0f582ce8 AB |
57 | remote host only needs to run `lttng-relayd -d` and be reachable over the network. |
58 | The only difference with the above commands is the tracing session's creation: | |
e3de48f7 | 59 | ```bash |
851bf235 | 60 | lttng create -U net://<remote-host> |
3d184c69 | 61 | ``` |
28ccfe49 | 62 | |
e3de48f7 JD |
63 | ## Implemented analyses |
64 | ||
65 | * CPU usage for the whole system | |
66 | * CPU usage per-process | |
67 | * Process CPU migration count | |
68 | * Memory usage per-process (as seen by the kernel) | |
69 | * Memory usage system-wide (as seen by the kernel) | |
70 | * I/O usage (syscalls, disk, network) | |
71 | * I/O operations log (with latency and usage) | |
72 | * I/O latency statistics (open, read, write, sync operations) | |
73 | * I/O latency frequency distribution | |
74 | * Interrupt handler duration statistics (count, min, max, average stdev) | |
75 | * Interrupt handler duration top | |
76 | * Interrupt handler duration log | |
77 | * Interrupt handler duration frequency distribution | |
78 | * SoftIRQ handler latency statistics | |
79 | * Syscalls usage statistics | |
80 | ||
898ff628 JG |
81 | All of the analyses share the same code architecture making it possible |
82 | to filter by timerange, process name, PID, min and max values using the | |
83 | same command-line options. Also note that reported timestamps can | |
84 | optionally be expressed in the GMT timezone to allow easy sharing between | |
85 | teams. | |
e3de48f7 | 86 | |
898ff628 JG |
87 | The project's architecture makes it easy to add new analyses or to reuse |
88 | the analysis backend in external tools which may then present the results | |
89 | in their own format (as opposed to text). | |
e3de48f7 | 90 | |
fe109453 | 91 | ## Examples |
898ff628 JG |
92 | After having collected your trace, any script contained in this repository |
93 | can be used to run an analysis. Read on for some examples! | |
e3de48f7 JD |
94 | |
95 | ### I/O | |
96 | #### I/O latency stats | |
97 | ```bash | |
fe109453 | 98 | $ ./lttng-iolatencystats mytrace/ |
e3de48f7 JD |
99 | Timerange: [2015-01-06 10:58:26.140545481, 2015-01-06 10:58:27.229358936] |
100 | Syscalls latency statistics (usec): | |
101 | Type Count Min Average Max Stdev | |
102 | ----------------------------------------------------------------------------------------- | |
103 | Open 45 5.562 13.835 77.683 15.263 | |
104 | Read 109 0.316 5.774 62.569 9.277 | |
105 | Write 101 0.256 7.060 48.531 8.555 | |
106 | Sync 207 19.384 40.664 160.188 21.201 | |
107 | ||
108 | Disk latency statistics (usec): | |
109 | Name Count Min Average Max Stdev | |
110 | ----------------------------------------------------------------------------------------- | |
111 | dm-0 108 0.001 0.004 0.007 1.306 | |
112 | ``` | |
113 | ||
114 | #### I/O latency frequency distribution | |
115 | ```bash | |
fe109453 | 116 | $ ./lttng-iolatencyfreq mytrace/ |
e3de48f7 JD |
117 | Timerange: [2015-01-06 10:58:26.140545481, 2015-01-06 10:58:27.229358936] |
118 | Open latency distribution (usec) | |
28ccfe49 | 119 | ############################################################################### |
0f582ce8 AB |
120 | 5.562 ███████████████████████████████████████████████████████████████████ 25 |
121 | 9.168 ██████████ 4 | |
122 | 12.774 █████████████████████ 8 | |
123 | 16.380 ████████ 3 | |
124 | 19.986 █████ 2 | |
125 | 23.592 0 | |
126 | 27.198 0 | |
127 | 30.804 0 | |
128 | 34.410 ██ 1 | |
129 | 38.016 0 | |
130 | 41.623 0 | |
131 | 45.229 0 | |
132 | 48.835 0 | |
133 | 52.441 0 | |
134 | 56.047 0 | |
135 | 59.653 0 | |
136 | 63.259 0 | |
137 | 66.865 0 | |
138 | 70.471 0 | |
139 | 74.077 █████ 2 | |
e3de48f7 JD |
140 | ``` |
141 | ||
bb0e7459 JD |
142 | #### I/O latency top |
143 | ```bash | |
144 | $ ./lttng-iolatencytop analysis-20150115-120942/ --limit 3 --minsize 2 | |
145 | Checking the trace for lost events... | |
146 | Timerange: [2015-01-15 12:18:37.216484041, 2015-01-15 12:18:53.821580313] | |
147 | Top open syscall latencies (usec) | |
148 | Begin End Name Duration (usec) Size Proc PID Filename | |
149 | [12:18:50.432950815,12:18:50.870648568] open 437697.753 N/A apache2 31517 /var/lib/php5/sess_0ifir2hangm8ggaljdphl9o5b5 (fd=13) | |
150 | [12:18:52.946080165,12:18:52.946132278] open 52.113 N/A apache2 31588 /var/lib/php5/sess_mr9045p1k55vin1h0vg7rhgd63 (fd=13) | |
151 | [12:18:46.800846035,12:18:46.800874916] open 28.881 N/A apache2 31591 /var/lib/php5/sess_r7c12pccfvjtas15g3j69u14h0 (fd=13) | |
152 | [12:18:51.389797604,12:18:51.389824426] open 26.822 N/A apache2 31520 /var/lib/php5/sess_4sdb1rtjkhb78sabnoj8gpbl00 (fd=13) | |
153 | ||
154 | Top read syscall latencies (usec) | |
155 | Begin End Name Duration (usec) Size Proc PID Filename | |
156 | [12:18:37.256073107,12:18:37.256555967] read 482.860 7.00 B bash 10237 unknown (origin not found) (fd=3) | |
157 | [12:18:52.000209798,12:18:52.000252304] read 42.506 1.00 KB irqbalance 1337 /proc/interrupts (fd=3) | |
158 | [12:18:37.256559439,12:18:37.256601615] read 42.176 5.00 B bash 10237 unknown (origin not found) (fd=3) | |
159 | [12:18:42.000281918,12:18:42.000320016] read 38.098 1.00 KB irqbalance 1337 /proc/interrupts (fd=3) | |
160 | ||
161 | Top write syscall latencies (usec) | |
162 | Begin End Name Duration (usec) Size Proc PID Filename | |
163 | [12:18:49.913241516,12:18:49.915908862] write 2667.346 95.00 B apache2 31584 /var/log/apache2/access.log (fd=8) | |
164 | [12:18:37.472823631,12:18:37.472859836] writev 36.205 21.97 KB apache2 31544 unknown (origin not found) (fd=12) | |
165 | [12:18:37.991578372,12:18:37.991612724] writev 34.352 21.97 KB apache2 31589 unknown (origin not found) (fd=12) | |
166 | [12:18:39.547778549,12:18:39.547812515] writev 33.966 21.97 KB apache2 31584 unknown (origin not found) (fd=12) | |
167 | ||
168 | Top sync syscall latencies (usec) | |
169 | Begin End Name Duration (usec) Size Proc PID Filename | |
170 | [12:18:50.162776739,12:18:51.157522361] sync 994745.622 N/A sync 22791 None (fd=None) | |
171 | [12:18:37.227867532,12:18:37.232289687] sync_file_range 4422.155 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) | |
172 | [12:18:37.238076585,12:18:37.239012027] sync_file_range 935.442 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) | |
173 | [12:18:37.220974711,12:18:37.221647124] sync_file_range 672.413 N/A lttng-consumerd 19964 /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32) | |
174 | ``` | |
175 | ||
e3de48f7 JD |
176 | #### I/O operations log |
177 | ```bash | |
fe109453 | 178 | $ ./lttng-iolog mytrace/ |
e3de48f7 JD |
179 | [10:58:26.221618530,10:58:26.221620659] write 2.129 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) |
180 | [10:58:26.221623609,10:58:26.221628055] read 4.446 50.00 B /usr/bin/x-term 11793 /dev/ptmx (fd=24) | |
181 | [10:58:26.221638929,10:58:26.221640008] write 1.079 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) | |
182 | [10:58:26.221676232,10:58:26.221677385] read 1.153 8.00 B /usr/bin/x-term 11793 anon_inode:[eventfd] (fd=5) | |
183 | [10:58:26.223401804,10:58:26.223411683] open 9.879 N/A sleep 12420 /etc/ld.so.cache (fd=3) | |
184 | [10:58:26.223448060,10:58:26.223455577] open 7.517 N/A sleep 12420 /lib/x86_64-linux-gnu/libc.so.6 (fd=3) | |
185 | [10:58:26.223456522,10:58:26.223458898] read 2.376 832.00 B sleep 12420 /lib/x86_64-linux-gnu/libc.so.6 (fd=3) | |
0f582ce8 | 186 | [10:58:26.223918068,10:58:26.223929316] open 11.248 N/A sleep 12420 (fd=3) |
e3de48f7 JD |
187 | [10:58:26.231881565,10:58:26.231895970] writev 14.405 16.00 B /usr/bin/x-term 11793 socket:[45650] (fd=4) |
188 | [10:58:26.231979636,10:58:26.231988446] recvmsg 8.810 16.00 B Xorg 1827 socket:[47480] (fd=38) | |
189 | ``` | |
190 | ||
191 | ||
192 | #### I/O usage top | |
193 | ```bash | |
fe109453 | 194 | $ ./lttng-iousagetop traces/pgread-writes |
e3de48f7 JD |
195 | Timerange: [2014-10-07 16:36:00.733214969, 2014-10-07 16:36:18.804584183] |
196 | Per-process I/O Read | |
197 | ############################################################################### | |
198 | ██████████████████████████████████████████████████ 16.00 MB lttng-consumerd (2619) 0 B file 4.00 B net 16.00 MB unknown | |
199 | █████ 1.72 MB lttng-consumerd (2619) 0 B file 0 B net 1.72 MB unknown | |
200 | █ 398.13 KB postgres (4219) 121.05 KB file 277.07 KB net 8.00 B unknown | |
201 | 256.09 KB postgres (1348) 0 B file 255.97 KB net 117.00 B unknown | |
202 | 204.81 KB postgres (4218) 204.81 KB file 0 B net 0 B unknown | |
203 | 123.77 KB postgres (4220) 117.50 KB file 6.26 KB net 8.00 B unknown | |
204 | Per-process I/O Write | |
205 | ############################################################################### | |
0f582ce8 AB |
206 | ██████████████████████████████████████████████████ 16.00 MB lttng-consumerd (2619) 0 B file 8.00 MB net 8.00 MB unknown |
207 | ██████ 2.20 MB postgres (4219) 2.00 MB file 202.23 KB net 0 B unknown | |
208 | █████ 1.73 MB lttng-consumerd (2619) 0 B file 887.73 KB net 882.58 KB unknown | |
209 | ██ 726.33 KB postgres (1165) 8.00 KB file 6.33 KB net 712.00 KB unknown | |
210 | 158.69 KB postgres (1168) 158.69 KB file 0 B net 0 B unknown | |
211 | 80.66 KB postgres (1348) 0 B file 80.66 KB net 0 B unknown | |
28ccfe49 | 212 | Files Read |
e3de48f7 JD |
213 | ############################################################################### |
214 | ██████████████████████████████████████████████████ 8.00 MB anon_inode:[lttng_stream] (lttng-consumerd) 'fd 32 in lttng-consumerd (2619)' | |
215 | █████ 834.41 KB base/16384/pg_internal.init 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' | |
216 | █ 256.09 KB socket:[8893] (postgres) 'fd 9 in postgres (1348)' | |
217 | █ 174.69 KB pg_stat_tmp/pgstat.stat 'fd 9 in postgres (4218)', 'fd 9 in postgres (1167)' | |
218 | 109.48 KB global/pg_internal.init 'fd 7 in postgres (4218)', 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' | |
219 | 104.30 KB base/11951/pg_internal.init 'fd 7 in postgres (4218)' | |
220 | 12.85 KB socket (lttng-sessiond) 'fd 30 in lttng-sessiond (384)' | |
221 | 4.50 KB global/pg_filenode.map 'fd 7 in postgres (4218)', 'fd 7 in postgres (4219)', 'fd 7 in postgres (4220)', 'fd 7 in postgres (4221)', 'fd 7 in postgres (4222)', 'fd 7 in postgres (4223)', 'fd 7 in postgres (4224)', 'fd 7 in postgres (4225)', 'fd 7 in postgres (4226)' | |
222 | 4.16 KB socket (postgres) 'fd 9 in postgres (4226)' | |
223 | 4.00 KB /proc/interrupts 'fd 3 in irqbalance (1104)' | |
19952e0f | 224 | Files Write |
e3de48f7 JD |
225 | ############################################################################### |
226 | ██████████████████████████████████████████████████ 8.00 MB socket:[56371] (lttng-consumerd) 'fd 30 in lttng-consumerd (2619)' | |
227 | █████████████████████████████████████████████████ 8.00 MB pipe:[53306] (lttng-consumerd) 'fd 12 in lttng-consumerd (2619)' | |
228 | ██████████ 1.76 MB pg_xlog/00000001000000000000000B 'fd 31 in postgres (4219)' | |
229 | █████ 887.82 KB socket:[56369] (lttng-consumerd) 'fd 26 in lttng-consumerd (2619)' | |
230 | █████ 882.58 KB pipe:[53309] (lttng-consumerd) 'fd 18 in lttng-consumerd (2619)' | |
231 | 160.00 KB /var/lib/postgresql/9.1/main/base/16384/16602 'fd 14 in postgres (1165)' | |
232 | 158.69 KB pg_stat_tmp/pgstat.tmp 'fd 3 in postgres (1168)' | |
233 | 144.00 KB /var/lib/postgresql/9.1/main/base/16384/16613 'fd 12 in postgres (1165)' | |
234 | 88.00 KB /var/lib/postgresql/9.1/main/base/16384/16609 'fd 11 in postgres (1165)' | |
235 | 78.28 KB socket:[8893] (postgres) 'fd 9 in postgres (1348)' | |
236 | Block I/O Read | |
237 | ############################################################################### | |
238 | Block I/O Write | |
239 | ############################################################################### | |
0f582ce8 AB |
240 | ██████████████████████████████████████████████████ 1.76 MB postgres (pid=4219) |
241 | ████ 160.00 KB postgres (pid=1168) | |
e3de48f7 | 242 | ██ 100.00 KB kworker/u8:0 (pid=1540) |
0f582ce8 AB |
243 | ██ 96.00 KB jbd2/vda1-8 (pid=257) |
244 | █ 40.00 KB postgres (pid=1166) | |
e3de48f7 JD |
245 | 8.00 KB kworker/u9:0 (pid=4197) |
246 | 4.00 KB kworker/u9:2 (pid=1381) | |
28ccfe49 JD |
247 | Disk nr_sector |
248 | ############################################################################### | |
e3de48f7 | 249 | ███████████████████████████████████████████████████████████████████ 4416.00 sectors vda1 |
28ccfe49 JD |
250 | Disk nr_requests |
251 | ############################################################################### | |
e3de48f7 | 252 | ████████████████████████████████████████████████████████████████████ 177.00 requests vda1 |
28ccfe49 JD |
253 | Disk request time/sector |
254 | ############################################################################### | |
e3de48f7 | 255 | ██████████████████████████████████████████████████████████████████ 0.01 ms vda1 |
28ccfe49 JD |
256 | Network recv_bytes |
257 | ############################################################################### | |
e3de48f7 | 258 | ███████████████████████████████████████████████████████ 739.50 KB eth0 |
0f582ce8 | 259 | █████ 80.27 KB lo |
28ccfe49 JD |
260 | Network sent_bytes |
261 | ############################################################################### | |
e3de48f7 | 262 | ████████████████████████████████████████████████████████ 9.36 MB eth0 |
28ccfe49 | 263 | ``` |
19952e0f | 264 | |
e3de48f7 JD |
265 | ### IRQ |
266 | #### Handler duration and raise latency statistics | |
267 | ```bash | |
fe109453 | 268 | $ ./lttng-irqstats mytrace/ |
e3de48f7 JD |
269 | Timerange: [2014-03-11 16:05:41.314824752, 2014-03-11 16:05:45.041994298] |
270 | Hard IRQ Duration (us) | |
0f582ce8 | 271 | count min avg max stdev |
e3de48f7 | 272 | ----------------------------------------------------------------------------------| |
0f582ce8 AB |
273 | 1: <i8042> 30 10.901 45.500 64.510 18.447 | |
274 | 42: <ahci> 259 3.203 7.863 21.426 3.183 | | |
275 | 43: <eth0> 2 3.859 3.976 4.093 0.165 | | |
276 | 44: <iwlwifi> 92 0.300 3.995 6.542 2.181 | | |
e3de48f7 JD |
277 | |
278 | Soft IRQ Duration (us) Raise latency (us) | |
0f582ce8 | 279 | count min avg max stdev | count min avg max stdev |
e3de48f7 JD |
280 | ----------------------------------------------------------------------------------|------------------------------------------------------------ |
281 | 1: <TIMER_SOFTIRQ> 495 0.202 21.058 51.060 11.047 | 53 2.141 11.217 20.005 7.233 | |
282 | 3: <NET_RX_SOFTIRQ> 14 0.133 9.177 32.774 10.483 | 14 0.763 3.703 10.902 3.448 | |
283 | 4: <BLOCK_SOFTIRQ> 257 5.981 29.064 125.862 15.891 | 257 0.891 3.104 15.054 2.046 | |
284 | 6: <TASKLET_SOFTIRQ> 26 0.309 1.198 1.748 0.329 | 26 9.636 39.222 51.430 11.246 | |
285 | 7: <SCHED_SOFTIRQ> 299 1.185 14.768 90.465 15.992 | 298 1.286 31.387 61.700 11.866 | |
286 | 9: <RCU_SOFTIRQ> 338 0.592 3.387 13.745 1.356 | 147 2.480 29.299 64.453 14.286 | |
287 | ``` | |
288 | ||
289 | #### Handler duration frequency distribution | |
290 | ```bash | |
fe109453 | 291 | $ ./lttng-irqfreq --timerange [16:05:42,16:05:45] --irq 44 --stats mytrace/ |
0f582ce8 | 292 | Timerange: [2014-03-11 16:05:42.042034570, 2014-03-11 16:05:44.998914297] |
e3de48f7 | 293 | Hard IRQ Duration (us) |
0f582ce8 | 294 | count min avg max stdev |
e3de48f7 | 295 | ----------------------------------------------------------------------------------| |
0f582ce8 | 296 | 44: <iwlwifi> 72 0.300 4.018 6.542 2.164 | |
e3de48f7 JD |
297 | Frequency distribution iwlwifi (44) |
298 | ############################################################################### | |
0f582ce8 AB |
299 | 0.300 █████ 1.00 |
300 | 0.612 ██████████████████████████████████████████████████████████████ 12.00 | |
301 | 0.924 ████████████████████ 4.00 | |
302 | 1.236 ██████████ 2.00 | |
303 | 1.548 0.00 | |
304 | 1.861 █████ 1.00 | |
305 | 2.173 0.00 | |
306 | 2.485 █████ 1.00 | |
307 | 2.797 ██████████████████████████ 5.00 | |
308 | 3.109 █████ 1.00 | |
309 | 3.421 ███████████████ 3.00 | |
310 | 3.733 0.00 | |
311 | 4.045 █████ 1.00 | |
312 | 4.357 █████ 1.00 | |
313 | 4.669 ██████████ 2.00 | |
314 | 4.981 ██████████ 2.00 | |
315 | 5.294 █████████████████████████████████████████ 8.00 | |
316 | 5.606 ████████████████████████████████████████████████████████████████████ 13.00 | |
317 | 5.918 ██████████████████████████████████████████████████████████████ 12.00 | |
318 | 6.230 ███████████████ 3.00 | |
e3de48f7 JD |
319 | ``` |
320 | ||
321 | ### Others | |
322 | There are a lot of other scripts, we encourage you to try them and read the | |
323 | ```--help``` to see all the available options. | |
324 | ||
173fa485 | 325 | ## Work in progress |
19952e0f JD |
326 | Track the page cache and extract the latencies associated with pages flush to disk. |
327 | In order to do that, we rely on the assumption that the pages are flushed in a FIFO | |
328 | order. It might not be 100% accurate, but it already gives great results : | |
329 | ||
330 | An example here when saving a file in vim: | |
173fa485 | 331 | ``` |
19952e0f JD |
332 | [19:57:51.173332284 - 19:57:51.177794657] vim (31517) syscall_entry_fsync(fd = 4 <blabla>) = 0, 4.462 ms |
333 | 1 dirty page(s) were flushed (assuming FIFO): | |
334 | vim (31517): 1 pages | |
335 | - blabla : 1 pages | |
336 | 13 active dirty filesystem page(s) (known): | |
337 | redis-server (2092): 2 pages | |
338 | - /var/log/redis/redis-server.log : 2 pages | |
339 | vim (31517): 2 pages | |
340 | - .blabla.swp : 2 pages | |
341 | lttng-consumerd (6750): 9 pages | |
342 | - unknown (origin not found) : 9 pages | |
343 | ``` | |
344 | ||
345 | An other example when running the 'sync' command: | |
173fa485 | 346 | ``` |
19952e0f JD |
347 | [19:57:53.046840755 - 19:57:53.072809609] sync (31554) syscall_entry_sync(fd = <unknown>) = 0, 25.969 ms |
348 | 23 dirty page(s) were flushed (assuming FIFO): | |
349 | redis-server (2092): 2 pages | |
350 | - /var/log/redis/redis-server.log : 2 pages | |
351 | vim (31517): 9 pages | |
352 | - /home/julien/.viminfo.tmp : 6 pages | |
353 | - .blabla.swp : 3 pages | |
354 | lttng-consumerd (6750): 12 pages | |
355 | - unknown (origin not found) : 12 pages | |
356 | ``` | |
357 | ||
c6bb95df | 358 | PostgreSQL with 'sys_fdatasync': |
19952e0f JD |
359 | ``` |
360 | [13:49:39.908599447 - 13:49:39.915930730] postgres (1137) sys_fdatasync(fd = 7 </var/lib/postgresql/9.1/main/pg_xlog/000000010000000000000008>) = 0, 7.331 ms | |
361 | 2 pages allocated during the period | |
362 | 88 dirty page(s) were flushed (assuming FIFO): | |
363 | postgres (1137): 88 pages | |
364 | - /var/lib/postgresql/9.1/main/pg_xlog/000000010000000000000008 : 88 pages | |
365 | 68 last dirtied filesystem page(s): | |
366 | postgres (2419): 68 pages | |
367 | - base/11951/18410 : 46 pages | |
368 | - base/11951/18407 : 10 pages | |
369 | - base/11951/18407_fsm : 6 pages | |
370 | - base/11951/18410_fsm : 6 pages | |
371 | ``` | |
372 | ||
898ff628 | 373 | Detecting a fight for the I/O between a huge write and postgresql: |
19952e0f JD |
374 | ``` |
375 | [13:49:47.242730583 - 13:49:47.442835037] python (2353) sys_write(fd = 3 </root/bla>, count = 102395904) = 102395904, 200.104 ms | |
376 | 34760 pages allocated during the period | |
377 | woke up kswapd during the period | |
378 | 10046 pages written on disk | |
379 | freed 33753 pages from the cache during the period | |
380 | 1397 last dirtied filesystem page(s): | |
381 | python (2353): 1325 pages | |
382 | - /root/bla : 1325 pages | |
383 | postgres (2419): 72 pages | |
384 | - base/11951/18419 : 72 pages | |
385 | ``` | |
386 | ||
e3de48f7 JD |
387 | ## Limitations |
388 | ||
389 | The main limitation of this project is the fact that it can be quite slow to | |
390 | process a large trace. This project is a work in progress and we focus on the | |
0f582ce8 AB |
391 | problem-solving aspect. Therefore, features have been prioritized over |
392 | performance for now. | |
e3de48f7 | 393 | |
898ff628 JG |
394 | One other aspect is the fact that the state is not persistent; the trace has |
395 | to be re-processed if another analysis script is to be used on the same trace. | |
396 | Some scripts belonging to the same category allow the combination of multiple | |
397 | analyses into a single pass (see ```--freq```, ```--log```, ```--usage```, | |
e3de48f7 JD |
398 | ```--latencystats```, etc). We are planning to add a way to save the state |
399 | and/or create an interactive environment to allow the user to run multiple | |
400 | analyses on the same trace without having to process the trace every time. | |
401 | ||
402 | ||
19952e0f | 403 | ## Conclusion |
898ff628 JG |
404 | We hope you have fun trying this project and please remember it is a work in |
405 | progress; feedback, bug reports and improvement ideas are always welcome! |