LTTng calibrate command documentation
Mathieu Desnoyers, August 6, 2011

The LTTng calibrate command can be used to find out the combined average
overhead of the LTTng tracer and the instrumentation mechanisms used.
This overhead can be calibrated in terms of time or using any of the PMU
performance counters available on the system.

For now, the only calibration implemented is that of the kernel function
instrumentation (kretprobes).


* Calibrate kernel function instrumentation

Let's use an example to show this calibration. We use an i7 processor
with 4 general-purpose PMU registers. This information is available by
issuing dmesg, looking for "generic registers".

This sequence of commands gathers a trace of a kretprobe hooked on an
empty function, recording the LLC (Last Level Cache) miss PMU counters
(see lttng add-context --help for the list of available PMU counters).

(as root)
lttng create calibrate-function
lttng enable-event calibrate --kernel --function lttng_calibrate_kretprobe
lttng add-context --kernel -t perf:LLC-load-misses -t perf:LLC-store-misses \
	-t perf:LLC-prefetch-misses
lttng start
for a in $(seq 1 10); do
	lttng calibrate --kernel --function;
done
lttng destroy
babeltrace $(ls -1drt ~/lttng-traces/calibrate-function-* | tail -n 1)

The output from babeltrace can be saved to a text file and opened in a
spreadsheet (e.g. oocalc) to focus on the per-PMU counter delta between
consecutive "calibrate_entry" and "calibrate_return" events. Note that
these counters are per-CPU, so scheduling events would need to be
present to account for migration between CPUs. Therefore, for
calibration purposes, only events staying on the same CPU must be
considered.

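The spreadsheet step can also be scripted. The following is a minimal
Python sketch, not part of LTTng. The function names counter_deltas and
summarize are hypothetical, and the sketch assumes that each event in
the babeltrace text output occupies one line containing the event name,
a "cpu_id = N" field and "perf_... = N" context fields. Check these
assumptions against your own babeltrace output, since its layout varies
between versions.

```python
import re
from statistics import mean, stdev

# Assumed line layout: event name, "cpu_id = N" and "perf_xxx = N"
# all appear on the same line of babeltrace text output.
EVENT_RE = re.compile(r"\b(calibrate_entry|calibrate_return)\b")
CPU_RE = re.compile(r"\bcpu_id\s*=\s*(\d+)")
PERF_RE = re.compile(r"\b(perf_\w+)\s*=\s*(\d+)")


def counter_deltas(lines):
    """Yield {counter: delta} for each entry/return pair on one CPU."""
    pending = None  # (cpu, counters) of the last calibrate_entry seen
    for line in lines:
        event = EVENT_RE.search(line)
        cpu = CPU_RE.search(line)
        if not event or not cpu:
            continue  # not a calibrate event, or no CPU information
        counters = {name: int(val) for name, val in PERF_RE.findall(line)}
        if event.group(1) == "calibrate_entry":
            pending = (cpu.group(1), counters)
        elif pending is not None:
            entry_cpu, entry_counters = pending
            pending = None
            # Counters are per-CPU: drop pairs that migrated.
            if entry_cpu == cpu.group(1):
                yield {name: counters[name] - entry_counters[name]
                       for name in counters if name in entry_counters}


def summarize(deltas):
    """Return {counter: (average, sample standard deviation)}."""
    deltas = list(deltas)
    summary = {}
    for name in sorted({n for d in deltas for n in d}):
        values = [d[name] for d in deltas if name in d]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[name] = (mean(values), spread)
    return summary


# Example use (trace.txt is a hypothetical file of saved babeltrace output):
# with open("trace.txt") as f:
#     for name, (avg, sd) in summarize(counter_deltas(f)).items():
#         print(f"{name}: {avg:.1f} {sd:.3f}")
```

The sketch discards any entry/return pair whose two events report
different CPU ids, which matches the same-CPU restriction described
above without requiring scheduling events in the trace.
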
The average result, for the i7, on 10 samples:

                          Average     Std.Dev.
perf_LLC_load_misses:         5.0        0.577
perf_LLC_store_misses:        1.6        0.516
perf_LLC_prefetch_misses:     9.0       14.742

As we can see, the load and store misses are relatively stable across
runs (their standard deviation is relatively low) compared to the
prefetch misses. We can conclude from this information that LLC load and
store misses can be accounted for quite precisely, but prefetches within
a function seem to behave too erratically (little causal link between
the code executed and the CPU prefetch activity) to be accounted for.
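
One way to put a number on "relatively stable" is the ratio of standard
deviation to average (the coefficient of variation). This measure is an
illustration added here, not something the calibrate command reports.
The short Python sketch below applies it to the figures from the results
table above.

```python
# Averages and standard deviations copied from the results table.
results = {
    "perf_LLC_load_misses": (5.0, 0.577),
    "perf_LLC_store_misses": (1.6, 0.516),
    "perf_LLC_prefetch_misses": (9.0, 14.742),
}


def coefficient_of_variation(avg, std_dev):
    """Return std_dev / avg, a unitless measure of relative spread."""
    return std_dev / avg


cv = {name: coefficient_of_variation(avg, sd)
      for name, (avg, sd) in results.items()}

for name, value in cv.items():
    print(f"{name}: {value:.2f}")
```

This gives about 0.12 for load misses, 0.32 for store misses and 1.64
for prefetch misses. The prefetch standard deviation is larger than its
average, which is consistent with treating that counter as too erratic
to account for.
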