		function tracer guts
		====================
		By Mike Frysinger

Introduction
------------

Here we will cover the architecture pieces that the common function tracing
code relies on for proper functioning.  Things are broken down into increasing
complexity so that you can start simple and at least get basic functionality.

Note that this focuses on architecture implementation details only.  If you
want more explanation of a feature in terms of common code, review the common
ftrace.txt file.
Ideally, everyone who wishes to retain performance while supporting tracing in
their kernel should make it all the way to dynamic ftrace support.

Prerequisites
-------------

Ftrace relies on these features being implemented:
  STACKTRACE_SUPPORT - implement save_stack_trace()
  TRACE_IRQFLAGS_SUPPORT - implement include/asm/irqflags.h

27 | ||
28 | HAVE_FUNCTION_TRACER | |
29 | -------------------- | |
30 | ||
31 | You will need to implement the mcount and the ftrace_stub functions. | |
32 | ||
33 | The exact mcount symbol name will depend on your toolchain. Some call it | |
34 | "mcount", "_mcount", or even "__mcount". You can probably figure it out by | |
35 | running something like: | |
36 | $ echo 'main(){}' | gcc -x c -S -o - - -pg | grep mcount | |
37 | call mcount | |
38 | We'll make the assumption below that the symbol is "mcount" just to keep things | |
39 | nice and simple in the examples. | |
40 | ||
41 | Keep in mind that the ABI that is in effect inside of the mcount function is | |
42 | *highly* architecture/toolchain specific. We cannot help you in this regard, | |
43 | sorry. Dig up some old documentation and/or find someone more familiar than | |
44 | you to bang ideas off of. Typically, register usage (argument/scratch/etc...) | |
45 | is a major issue at this point, especially in relation to the location of the | |
46 | mcount call (before/after function prologue). You might also want to look at | |
47 | how glibc has implemented the mcount function for your architecture. It might | |
48 | be (semi-)relevant. | |
49 | ||
50 | The mcount function should check the function pointer ftrace_trace_function | |
51 | to see if it is set to ftrace_stub. If it is, there is nothing for you to do, | |
52 | so return immediately. If it isn't, then call that function in the same way | |
53 | the mcount function normally calls __mcount_internal -- the first argument is | |
54 | the "frompc" while the second argument is the "selfpc" (adjusted to remove the | |
55 | size of the mcount call that is embedded in the function). | |
56 | ||
57 | For example, if the function foo() calls bar(), when the bar() function calls | |
58 | mcount(), the arguments mcount() will pass to the tracer are: | |
59 | "frompc" - the address bar() will use to return to foo() | |
7e25f44c | 60 | "selfpc" - the address bar() (with mcount() size adjustment) |
555f386c MF |
61 | |
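Since the frompc/selfpc pair can be hard to visualize from assembly alone,
here is a purely illustrative userspace C sketch (not kernel code; the
demo_tracer name is made up, and no mcount size adjustment is applied) showing
what the two values correspond to when foo() calls bar():

```c
#include <assert.h>

/* Hypothetical stand-in for ftrace_trace_function: just records what it
 * was handed so the values can be inspected afterwards. */
static unsigned long recorded_frompc;
static unsigned long recorded_selfpc;

static void demo_tracer(unsigned long frompc, unsigned long selfpc)
{
	recorded_frompc = frompc;
	recorded_selfpc = selfpc;
}

/* bar() plays the role of an instrumented function: "frompc" is the
 * address it will return to (somewhere inside foo()), and "selfpc" is
 * its own entry address. */
__attribute__((noinline)) static void bar(void)
{
	unsigned long frompc = (unsigned long)__builtin_return_address(0);
	unsigned long selfpc = (unsigned long)&bar;
	demo_tracer(frompc, selfpc);
}

static void foo(void)
{
	bar();
}
```

In the real mcount, both values come out of registers or stack slots per
your ABI, and selfpc additionally has MCOUNT_INSN_SIZE subtracted.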
Also keep in mind that this mcount function will be called *a lot*, so
optimizing for the default case of no tracer will help the smooth running of
your system when tracing is disabled.  So the start of the mcount function is
typically the bare minimum -- check things and return as quickly as possible.
That also means the code flow should usually be kept linear (i.e. no branching
in the nop case).  This is of course an optimization and not a hard
requirement.

Here is some pseudo code that should help (these functions should actually be
implemented in assembly):

void ftrace_stub(void)
{
	return;
}

void mcount(void)
{
	/* save any bare state needed in order to do initial checking */

	extern void (*ftrace_trace_function)(unsigned long, unsigned long);
	if (ftrace_trace_function != ftrace_stub)
		goto do_trace;

	/* restore any bare state */

	return;

do_trace:
	/* save all state needed by the ABI (see paragraph above) */

	unsigned long frompc = ...;
	unsigned long selfpc = <return address> - MCOUNT_INSN_SIZE;
	ftrace_trace_function(frompc, selfpc);

	/* restore all state needed by the ABI */
}

Don't forget to export mcount for modules!
extern void mcount(void);
EXPORT_SYMBOL(mcount);

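To make the stub-check dispatch concrete, here is a hedged userspace C sketch
of what the mcount fast path boils down to.  The names mcount_body, my_tracer,
and trace_hits are invented for illustration, and ftrace_stub is given the
tracer's signature here purely to keep the sketch type-clean:

```c
#include <assert.h>

/* The "do nothing" default; swapping the pointer away from it is what
 * the common register_ftrace_function() code effectively does. */
static void ftrace_stub(unsigned long frompc, unsigned long selfpc)
{
	(void)frompc;
	(void)selfpc;
}

static void (*ftrace_trace_function)(unsigned long, unsigned long) = ftrace_stub;

static int trace_hits;

static void my_tracer(unsigned long frompc, unsigned long selfpc)
{
	(void)frompc;
	(void)selfpc;
	trace_hits++;
}

/* What the assembly mcount fast path does, expressed in C. */
static void mcount_body(unsigned long frompc, unsigned long selfpc)
{
	if (ftrace_trace_function == ftrace_stub)
		return;		/* common case: no tracer, get out fast */
	ftrace_trace_function(frompc, selfpc);
}
```

The whole point of the fast path is that first comparison: when no tracer is
registered, nothing else runs.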
104 | ||
105 | HAVE_FUNCTION_TRACE_MCOUNT_TEST | |
106 | ------------------------------- | |
107 | ||
108 | This is an optional optimization for the normal case when tracing is turned off | |
109 | in the system. If you do not enable this Kconfig option, the common ftrace | |
110 | code will take care of doing the checking for you. | |
111 | ||
112 | To support this feature, you only need to check the function_trace_stop | |
113 | variable in the mcount function. If it is non-zero, there is no tracing to be | |
114 | done at all, so you can return. | |
115 | ||
116 | This additional pseudo code would simply be: | |
117 | void mcount(void) | |
118 | { | |
119 | /* save any bare state needed in order to do initial checking */ | |
120 | ||
121 | + if (function_trace_stop) | |
122 | + return; | |
123 | ||
124 | extern void (*ftrace_trace_function)(unsigned long, unsigned long); | |
125 | if (ftrace_trace_function != ftrace_stub) | |
126 | ... | |
127 | ||
128 | ||
129 | HAVE_FUNCTION_GRAPH_TRACER | |
130 | -------------------------- | |
131 | ||
132 | Deep breath ... time to do some real work. Here you will need to update the | |
133 | mcount function to check ftrace graph function pointers, as well as implement | |
134 | some functions to save (hijack) and restore the return address. | |
135 | ||
136 | The mcount function should check the function pointers ftrace_graph_return | |
137 | (compare to ftrace_stub) and ftrace_graph_entry (compare to | |
7e25f44c | 138 | ftrace_graph_entry_stub). If either of those is not set to the relevant stub |
555f386c MF |
139 | function, call the arch-specific function ftrace_graph_caller which in turn |
140 | calls the arch-specific function prepare_ftrace_return. Neither of these | |
7e25f44c | 141 | function names is strictly required, but you should use them anyway to stay |
555f386c MF |
142 | consistent across the architecture ports -- easier to compare & contrast |
143 | things. | |
144 | ||
145 | The arguments to prepare_ftrace_return are slightly different than what are | |
146 | passed to ftrace_trace_function. The second argument "selfpc" is the same, | |
147 | but the first argument should be a pointer to the "frompc". Typically this is | |
148 | located on the stack. This allows the function to hijack the return address | |
149 | temporarily to have it point to the arch-specific function return_to_handler. | |
150 | That function will simply call the common ftrace_return_to_handler function and | |
7e25f44c | 151 | that will return the original return address with which you can return to the |
555f386c MF |
152 | original call site. |
153 | ||
Here is the updated mcount pseudo code:
void mcount(void)
{
	...
	if (ftrace_trace_function != ftrace_stub)
		goto do_trace;

+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	extern void (*ftrace_graph_return)(...);
+	extern void (*ftrace_graph_entry)(...);
+	if (ftrace_graph_return != ftrace_stub ||
+	    ftrace_graph_entry != ftrace_graph_entry_stub)
+		ftrace_graph_caller();
+#endif

	/* restore any bare state */
	...

Here is the pseudo code for the new ftrace_graph_caller assembly function:
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
void ftrace_graph_caller(void)
{
	/* save all state needed by the ABI */

	unsigned long *frompc = &...;
	unsigned long selfpc = <return address> - MCOUNT_INSN_SIZE;
	/* passing frame pointer up is optional -- see below */
	prepare_ftrace_return(frompc, selfpc, frame_pointer);

	/* restore all state needed by the ABI */
}
#endif
186 | ||
03688970 MF |
187 | For information on how to implement prepare_ftrace_return(), simply look at the |
188 | x86 version (the frame pointer passing is optional; see the next section for | |
189 | more information). The only architecture-specific piece in it is the setup of | |
555f386c MF |
190 | the fault recovery table (the asm(...) code). The rest should be the same |
191 | across architectures. | |
192 | ||
193 | Here is the pseudo code for the new return_to_handler assembly function. Note | |
194 | that the ABI that applies here is different from what applies to the mcount | |
195 | code. Since you are returning from a function (after the epilogue), you might | |
196 | be able to skimp on things saved/restored (usually just registers used to pass | |
197 | return values). | |
198 | ||
199 | #ifdef CONFIG_FUNCTION_GRAPH_TRACER | |
200 | void return_to_handler(void) | |
201 | { | |
202 | /* save all state needed by the ABI (see paragraph above) */ | |
203 | ||
204 | void (*original_return_point)(void) = ftrace_return_to_handler(); | |
205 | ||
206 | /* restore all state needed by the ABI */ | |
207 | ||
208 | /* this is usually either a return or a jump */ | |
209 | original_return_point(); | |
210 | } | |
211 | #endif | |
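The hijack/restore dance can be sketched in plain C if we model the stack slot
holding the return address as an ordinary variable.  Everything here is
illustrative: the _sim suffixes, the fixed-size shadow stack, and the HIJACKED
constant standing in for the address of return_to_handler are all made up:

```c
#include <assert.h>

#define HIJACKED 0xdeadbeefUL	/* stands in for &return_to_handler */

static unsigned long shadow_stack[16];	/* saved original return addresses */
static int shadow_depth;

/* prepare_ftrace_return() flavor: remember *parent, then redirect it so
 * the traced function "returns" into the handler instead. */
static void prepare_ftrace_return_sim(unsigned long *parent, unsigned long selfpc)
{
	(void)selfpc;
	shadow_stack[shadow_depth++] = *parent;
	*parent = HIJACKED;
}

/* ftrace_return_to_handler() flavor: hand back the original address so
 * the trampoline can jump to the real call site. */
static unsigned long ftrace_return_to_handler_sim(void)
{
	return shadow_stack[--shadow_depth];
}
```

The real common code keeps this per-task state and also records entry/exit
timestamps, which is what the graph tracer output is built from.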
212 | ||
213 | ||
03688970 MF |
214 | HAVE_FUNCTION_GRAPH_FP_TEST |
215 | --------------------------- | |
216 | ||
217 | An arch may pass in a unique value (frame pointer) to both the entering and | |
218 | exiting of a function. On exit, the value is compared and if it does not | |
219 | match, then it will panic the kernel. This is largely a sanity check for bad | |
220 | code generation with gcc. If gcc for your port sanely updates the frame | |
9849ed4d | 221 | pointer under different optimization levels, then ignore this option. |
03688970 MF |
222 | |
223 | However, adding support for it isn't terribly difficult. In your assembly code | |
224 | that calls prepare_ftrace_return(), pass the frame pointer as the 3rd argument. | |
225 | Then in the C version of that function, do what the x86 port does and pass it | |
226 | along to ftrace_push_return_trace() instead of a stub value of 0. | |
227 | ||
228 | Similarly, when you call ftrace_return_to_handler(), pass it the frame pointer. | |
229 | ||
230 | ||

HAVE_FTRACE_NMI_ENTER
---------------------

If you can't trace NMI functions, then skip this option.

<details to be filled>

238 | ||
459c6d15 | 239 | HAVE_SYSCALL_TRACEPOINTS |
9849ed4d | 240 | ------------------------ |
555f386c | 241 | |
459c6d15 FW |
242 | You need very few things to get the syscalls tracing in an arch. |
243 | ||
e7b8e675 | 244 | - Support HAVE_ARCH_TRACEHOOK (see arch/Kconfig). |
459c6d15 FW |
245 | - Have a NR_syscalls variable in <asm/unistd.h> that provides the number |
246 | of syscalls supported by the arch. | |
e7b8e675 | 247 | - Support the TIF_SYSCALL_TRACEPOINT thread flags. |
459c6d15 FW |
248 | - Put the trace_sys_enter() and trace_sys_exit() tracepoints calls from ptrace |
249 | in the ptrace syscalls tracing path. | |
c763ba06 IM |
250 | - If the system call table on this arch is more complicated than a simple array |
251 | of addresses of the system calls, implement an arch_syscall_addr to return | |
252 | the address of a given system call. | |
b2d55496 IM |
253 | - If the symbol names of the system calls do not match the function names on |
254 | this arch, define ARCH_HAS_SYSCALL_MATCH_SYM_NAME in asm/ftrace.h and | |
255 | implement arch_syscall_match_sym_name with the appropriate logic to return | |
256 | true if the function name corresponds with the symbol name. | |
459c6d15 | 257 | - Tag this arch as HAVE_SYSCALL_TRACEPOINTS. |
555f386c MF |
258 | |
259 | ||

HAVE_FTRACE_MCOUNT_RECORD
-------------------------

See scripts/recordmcount.pl for more info.  Just fill in the arch-specific
details for how to locate the addresses of mcount call sites via objdump.
This option doesn't make much sense without also implementing dynamic ftrace.


HAVE_DYNAMIC_FTRACE
-------------------

You will first need HAVE_FTRACE_MCOUNT_RECORD and HAVE_FUNCTION_TRACER, so
scroll your reader back up if you got overeager.

Once those are out of the way, you will need to implement:
	- asm/ftrace.h:
		- MCOUNT_ADDR
		- ftrace_call_adjust()
		- struct dyn_arch_ftrace{}
	- asm code:
		- mcount() (new stub)
		- ftrace_caller()
		- ftrace_call()
		- ftrace_stub()
	- C code:
		- ftrace_dyn_arch_init()
		- ftrace_make_nop()
		- ftrace_make_call()
		- ftrace_update_ftrace_func()

First you will need to fill out some arch details in your asm/ftrace.h.

Define MCOUNT_ADDR as the address of your mcount symbol similar to:
	#define MCOUNT_ADDR ((unsigned long)mcount)
Since no one else will have a decl for that function, you will need to:
	extern void mcount(void);

You will also need the helper function ftrace_call_adjust().  Most people
will be able to stub it out like so:
	static inline unsigned long ftrace_call_adjust(unsigned long addr)
	{
		return addr;
	}
<details to be filled>

Lastly you will need the custom dyn_arch_ftrace structure.  If you need
some extra state when runtime patching arbitrary call sites, this is the
place.  For now though, create an empty struct:
	struct dyn_arch_ftrace {
		/* No extra data needed */
	};

With the header out of the way, we can fill out the assembly code.  While we
did already create a mcount() function earlier, dynamic ftrace only wants a
stub function.  This is because the mcount() will only be used during boot
and then all references to it will be patched out never to return.  Instead,
the guts of the old mcount() will be used to create a new ftrace_caller()
function.  Because the two are hard to merge, it will most likely be a lot
easier to have two separate definitions split up by #ifdefs.  Same goes for
the ftrace_stub() as that will now be inlined in ftrace_caller().

Before we get confused any more, let's check out some pseudo code so you can
implement your own stuff in assembly:

void mcount(void)
{
	return;
}

void ftrace_caller(void)
{
	/* implement HAVE_FUNCTION_TRACE_MCOUNT_TEST if you desire */

	/* save all state needed by the ABI (see paragraph above) */

	unsigned long frompc = ...;
	unsigned long selfpc = <return address> - MCOUNT_INSN_SIZE;

ftrace_call:
	ftrace_stub(frompc, selfpc);

	/* restore all state needed by the ABI */

ftrace_stub:
	return;
}

This might look a little odd at first, but keep in mind that we will be runtime
patching multiple things.  First, only functions that we actually want to trace
will be patched to call ftrace_caller().  Second, since we only have one tracer
active at a time, we will patch the ftrace_caller() function itself to call the
specific tracer in question.  That is the point of the ftrace_call label.

With that in mind, let's move on to the C code that will actually be doing the
runtime patching.  You'll need a little knowledge of your arch's opcodes in
order to make it through the next section.

Every arch has an init callback function.  If you need to do something early on
to initialize some state, this is the time to do that.  Otherwise, this simple
function below should be sufficient for most people:

int __init ftrace_dyn_arch_init(void *data)
{
	/* return value is done indirectly via data */
	*(unsigned long *)data = 0;

	return 0;
}

There are two functions that are used to do runtime patching of arbitrary
functions.  The first is used to turn the mcount call site into a nop (which
is what helps us retain runtime performance when not tracing).  The second is
used to turn the mcount call site into a call to an arbitrary location (but
typically that is ftrace_caller()).  See the general function definition in
linux/ftrace.h for the functions:
	ftrace_make_nop()
	ftrace_make_call()
The rec->ip value is the address of the mcount call site that was collected
by scripts/recordmcount.pl during build time.

The last function is used to do runtime patching of the active tracer.  This
will be modifying the assembly code at the location of the ftrace_call symbol
inside of the ftrace_caller() function.  So you should have sufficient padding
at that location to support the new function calls you'll be inserting.  Some
people will be using a "call" type instruction while others will be using a
"branch" type instruction.  Specifically, the function is:
	ftrace_update_ftrace_func()

388 | ||
389 | HAVE_DYNAMIC_FTRACE + HAVE_FUNCTION_GRAPH_TRACER | |
390 | ------------------------------------------------ | |
391 | ||
392 | The function grapher needs a few tweaks in order to work with dynamic ftrace. | |
393 | Basically, you will need to: | |
394 | - update: | |
395 | - ftrace_caller() | |
396 | - ftrace_graph_call() | |
397 | - ftrace_graph_caller() | |
398 | - implement: | |
399 | - ftrace_enable_ftrace_graph_caller() | |
400 | - ftrace_disable_ftrace_graph_caller() | |
555f386c MF |
401 | |
402 | <details to be filled> | |
9849ed4d MF |
403 | Quick notes: |
404 | - add a nop stub after the ftrace_call location named ftrace_graph_call; | |
405 | stub needs to be large enough to support a call to ftrace_graph_caller() | |
406 | - update ftrace_graph_caller() to work with being called by the new | |
407 | ftrace_caller() since some semantics may have changed | |
408 | - ftrace_enable_ftrace_graph_caller() will runtime patch the | |
409 | ftrace_graph_call location with a call to ftrace_graph_caller() | |
410 | - ftrace_disable_ftrace_graph_caller() will runtime patch the | |
411 | ftrace_graph_call location with nops |