Commit | Line | Data |
---|---|---|
1b577b00 NC |
1 | README for GPROF |
2 | ||
3 | This is the GNU profiler. It is distributed with other "binary | |
4 | utilities" which should be in ../binutils. See ../binutils/README for | |
5 | more general notes, including where to send bug reports. | |
252b5132 RH |
6 | |
7 | This file documents the changes and new features available with this | |
8 | version of GNU gprof. | |
9 | ||
10 | * New Features | |
11 | ||
12 | o Long options | |
13 | ||
14 | o Supports generalized file format, without breaking backward compatibility: | |
15 | new file format supports basic-block execution counts and non-realtime | |
16 | histograms (see below) | |
17 | ||
18 | o Supports profiling at the line level: flat profiles, call-graph profiles, | |
19 | and execution-counts can all be displayed at a level that identifies | |
20 | individual lines rather than just functions | |
21 | ||
22 | o Test-coverage support (similar to Sun tcov program): source files | |
23 | can be annotated with the number of times a function was invoked | |
24 | or with the number of times each basic-block in a function was | |
25 | executed | |
26 | ||
27 | o Generalized histograms: not just execution-time, but arbitrary | |
28 | histograms are supported (for example, performance counter based | |
29 | profiles) | |
30 | ||
31 | o Powerful mechanism to select data to be included/excluded from | |
32 | analysis and/or output | |
33 | ||
34 | o Support for DEC OSF/1 v3.0 | |
35 | ||
36 | o Full cross-platform profiling support: gprof uses BFD to support | |
37 | arbitrary, non-native object file formats and non-native byte-orders | |
38 | (this feature has not been tested yet) | |
39 | ||
40 | o In the call-graph function index, static function names are now | |
41 | printed together with the filename in which the function was defined | |
42 | (requires bfd_find_nearest_line() support and symbolic debugging | |
43 | information to be present in the executable file) | |
44 | ||
45 | o Major overhaul of source code (compiles cleanly with -Wall, etc.) | |
46 | ||
47 | * Supported Platforms | |
48 | ||
49 | The current version is known to work on: | |
50 | ||
51 | o DEC OSF/1 v3.0 | |
52 | All features supported. | |
53 | ||
54 | o SunOS 4.1.x | |
55 | All features supported. | |
56 | ||
57 | o Solaris 2.3 | |
58 | Line-level profiling unsupported because bfd_find_nearest_line() | |
59 | is not fully implemented for Elf binaries. | |
60 | ||
61 | o HP-UX 9.01 | |
62 | Line-level profiling unsupported because bfd_find_nearest_line() | |
63 | is not fully implemented for SOM binaries. | |
64 | ||
65 | * Detailed Description | |
66 | ||
67 | ** User Interface Changes | |
68 | ||
69 | The command-line interface is backwards compatible with earlier | |
70 | versions of GNU gprof and Berkeley gprof. The only exception is | |
71 | the option to delete arcs from the call graph. The old syntax | |
72 | was: | |
73 | ||
74 | -k fromname toname | |
75 | ||
76 | while the new syntax is: | |
77 | ||
78 | -k fromname/toname | |
79 | ||
80 | This change was necessary to be compatible with long-option parsing. | |
81 | Also, "fromname" and "toname" can now be arbitrary symspecs rather | |
82 | than just function names (see below for an explanation of symspecs). | |
83 | For example, option "-k gprof.c/" suppresses all arcs due to calls out | |
84 | of file "gprof.c". | |
85 | ||
86 | *** Sym Specs | |
87 | ||
88 | It is often necessary to apply gprof only to specific parts of a | |
89 | program. GNU gprof has a simple but powerful mechanism to achieve | |
90 | this. So-called symspecs provide the foundation for this | |
91 | mechanism. A symspec selects the parts of a profiled program to which | |
92 | an operation should be applied. The syntax of a symspec is | |
93 | simple: | |
94 | ||
95 | filename_containing_a_dot | |
96 | | funcname_not_containing_a_dot | |
97 | | linenumber | |
98 | | ( [ any_filename ] `:' ( any_funcname | linenumber ) ) | |
99 | ||
100 | Here are some examples: | |
101 | ||
102 | main.c Selects everything in file "main.c"---the | |
103 | dot in the string tells gprof to interpret | |
104 | the string as a filename, rather than as | |
105 | a function name. To select a file whose | |
106 | name does not contain a dot, a trailing colon | |
107 | should be specified. For example, "odd:" is | |
108 | interpreted as the file named "odd". | |
109 | ||
110 | main Selects all functions named "main". Notice | |
111 | that there may be multiple instances of the | |
112 | same function name because some of the | |
113 | definitions may be local (i.e., static). | |
114 | Unless a function name is unique in a program, | |
115 | you must use the colon notation explained | |
116 | below to specify a function from a specific | |
117 | source file. Sometimes, function names contain | |
1b577b00 | 118 | dots. In such cases, it is necessary to |
252b5132 RH |
119 | add a leading colon to the name. For example, |
120 | ":.mul" selects function ".mul". | |
121 | ||
122 | main.c:main Selects function "main" in file "main.c". | |
123 | ||
124 | main.c:134 Selects line 134 in file "main.c". | |
125 | ||
126 | IMPLEMENTATION NOTE: The source code uses the type sym_id for symspecs. | |
127 | At some point, this probably ought to be changed to "sym_spec" to make | |
128 | reading the code easier. | |
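
The symspec grammar above can be sketched as a small classifier. This is
only an illustration in Python, not the parser gprof uses: the names
SymSpec and parse_symspec are hypothetical, and splitting at the first
colon is an assumption made here for simplicity.

```python
from typing import NamedTuple, Optional


class SymSpec(NamedTuple):
    """Hypothetical container: which file, function, or line a symspec selects."""
    file: Optional[str]
    func: Optional[str]
    line: Optional[int]


def _is_number(text: str) -> bool:
    # Only plain ASCII digits count as a line number.
    return text.isascii() and text.isdigit()


def parse_symspec(spec: str) -> SymSpec:
    if ":" in spec:
        # Colon form: [ any_filename ] ':' ( any_funcname | linenumber ).
        # Assumption: split at the first colon.
        file_part, _, rest = spec.partition(":")
        file = file_part or None
        if not rest:
            # "odd:" selects the file named "odd".
            return SymSpec(file, None, None)
        if _is_number(rest):
            return SymSpec(file, None, int(rest))
        return SymSpec(file, rest, None)
    if _is_number(spec):
        return SymSpec(None, None, int(spec))
    if "." in spec:
        # A dot without a colon marks a filename.
        return SymSpec(spec, None, None)
    return SymSpec(None, spec, None)
```

With this sketch, "main.c:134" yields file "main.c" and line 134, while
":.mul" yields the function ".mul" with no file restriction.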
129 | ||
130 | *** Long options | |
131 | ||
132 | GNU gprof now supports long options. The following is a list of all | |
133 | supported options. Options that are listed without description | |
134 | operate in the same manner as the corresponding option in older | |
135 | versions of gprof. | |
136 | ||
137 | Short Form: Long Form: | |
138 | ----------- ---------- | |
139 | -l --line | |
140 | Request profiling at the line-level rather | |
141 | than just at the function level. Source | |
142 | lines are identified by symbols of the form: | |
143 | ||
144 | func (file:line) | |
145 | ||
146 | where "func" is the function name, "file" is the | |
147 | file name and "line" is the line-number that | |
148 | corresponds to the line. | |
149 | ||
150 | To work properly, the binary must contain symbolic | |
151 | debugging information. This means that the source files | |
152 | have to be compiled with option "-g" specified. | |
153 | Functions for which there is no symbolic debugging | |
154 | information available are treated as if "--line" | |
155 | had not been specified. However, the line number | |
156 | printed with such symbols is usually incorrect | |
157 | and should be ignored. | |
158 | ||
159 | -a --no-static | |
160 | -A[symspec] --annotated-source[=symspec] | |
161 | Request output in the form of annotated source | |
162 | files. If "symspec" is specified, print output only | |
163 | for symbols selected by "symspec". If the option | |
164 | is specified multiple times, annotated output is | |
165 | generated for the union of all symspecs. | |
166 | ||
167 | Examples: | |
168 | ||
169 | -A Prints annotated source for all | |
170 | source files. | |
171 | -Agprof.c Prints annotated source for file | |
172 | gprof.c. | |
173 | -Afoobar Prints annotated source for files | |
174 | containing a function named "foobar". | |
175 | The entire file will be printed, but | |
176 | only the function itself will be | |
177 | annotated with profile data. | |
178 | ||
179 | -J[symspec] --no-annotated-source[=symspec] | |
180 | Suppress annotated source output. If specified | |
181 | without argument, annotated output is suppressed | |
182 | completely. With an argument, annotated output | |
183 | is suppressed only for the symbols selected by | |
184 | "symspec". If the option is specified multiple | |
185 | times, annotated output is suppressed for the | |
186 | union of all symspecs. This option has lower | |
187 | precedence than --annotated-source. | |
188 | ||
189 | -p[symspec] --flat-profile[=symspec] | |
190 | Request output in the form of a flat profile | |
191 | (unless any other output-style option is specified, | |
192 | this option is turned on by default). If | |
193 | "symspec" is specified, include only symbols | |
194 | selected by "symspec" in flat profile. If the | |
195 | option is specified multiple times, the flat | |
196 | profile includes symbols selected by the union | |
197 | of all symspecs. | |
198 | ||
199 | -P[symspec] --no-flat-profile[=symspec] | |
200 | Suppress output in the flat profile. If given | |
201 | without an argument, the flat profile is suppressed | |
202 | completely. If "symspec" is specified, suppress | |
203 | the selected symbols in the flat profile. If the | |
204 | option is specified multiple times, the union of | |
205 | the selected symbols is suppressed. This option | |
206 | has lower precedence than --flat-profile. | |
207 | ||
208 | -q[symspec] --graph[=symspec] | |
209 | Request output in the form of a call-graph | |
210 | (unless any other output-style option is specified, | |
211 | this option is turned on by default). If "symspec" | |
212 | is specified, include only symbols selected by | |
213 | "symspec" in the call-graph. If the option is | |
214 | specified multiple times, the call-graph includes | |
215 | symbols selected by the union of all symspecs. | |
216 | ||
217 | -Q[symspec] --no-graph[=symspec] | |
218 | Suppress output in the call-graph. If given without | |
219 | an argument, the call-graph is suppressed completely. | |
220 | With a "symspec", suppress the selected symbols | |
221 | from the call-graph. If the option is specified | |
222 | multiple times, the union of the selected symbols | |
223 | is suppressed. This option has lower precedence | |
224 | than --graph. | |
225 | ||
226 | -C[symspec] --exec-counts[=symspec] | |
227 | Request output in the form of execution counts. | |
228 | If "symspec" is present, include only symbols | |
229 | selected by "symspec" in the execution count | |
230 | listing. If the option is specified multiple | |
231 | times, the execution count listing includes | |
232 | symbols selected by the union of all symspecs. | |
233 | ||
234 | -Z[symspec] --no-exec-counts[=symspec] | |
235 | Suppress output in the execution count listing. | |
236 | If given without an argument, the listing is | |
237 | suppressed completely. With a "symspec", suppress | |
238 | the selected symbols from the listing. If the | |
239 | option is specified multiple times, the union of | |
240 | the selected symbols is suppressed. This option | |
241 | has lower precedence than --exec-counts. | |
242 | ||
243 | -i --file-info | |
244 | Print information about the profile files that | |
245 | are read. The information consists of the | |
246 | number and types of records present in the | |
247 | profile file. Currently, a profile file can | |
248 | contain any number and any combination of histogram, | |
249 | call-graph, or basic-block count records. | |
250 | ||
251 | -s --sum | |
252 | ||
253 | -x --all-lines | |
254 | This option affects annotated source output only. | |
255 | By default, only the lines at the beginning of | |
256 | a basic-block are annotated. If this option is | |
257 | specified, every line in a basic-block is annotated | |
258 | by repeating the annotation for the first line. | |
259 | This option is identical to tcov's "-a". | |
260 | ||
261 | -I dirs --directory-path=dirs | |
262 | This option affects annotated source output only. | |
263 | Specifies the list of directories to be searched | |
264 | for source files. The argument "dirs" is a | |
265 | colon-separated list of directories. By default, gprof | |
266 | searches for source files relative to the current | |
267 | working directory only. | |
268 | ||
269 | -z --display-unused-functions | |
270 | ||
271 | -m num --min-count=num | |
272 | This option affects annotated source and execution | |
273 | count output only. Symbols that are executed | |
274 | fewer than "num" times are suppressed. For annotated | |
275 | source output, suppressed symbols are marked | |
276 | by five hash-marks (#####). In an execution count | |
277 | output, suppressed symbols do not appear at all. | |
278 | ||
279 | -L --print-path | |
280 | Normally, source filenames are printed with the path | |
281 | component suppressed. With this option, gprof | |
282 | can be forced to print the full pathname of | |
283 | source filenames. The full pathname is determined | |
284 | from symbolic debugging information in the image file | |
285 | and is relative to the directory in which the compiler | |
286 | was invoked. | |
287 | ||
288 | -y --separate-files | |
289 | This option affects annotated source output only. | |
290 | Normally, gprof prints annotated source files | |
291 | to standard-output. If this option is specified, | |
292 | annotated source for a file named "path/filename" | |
293 | is generated in the file "filename-ann". That is, | |
294 | annotated output is always generated in | |
295 | gprof's current working directory. Care has to | |
296 | be taken if a program consists of files that have | |
297 | identical filenames, but distinct paths. | |
298 | ||
299 | -c --static-call-graph | |
300 | ||
301 | -t num --table-length=num | |
302 | This option affects annotated source output only. | |
303 | After annotating a source file, gprof generates | |
304 | an execution count summary consisting of a table | |
305 | of lines with the top execution counts. By | |
306 | default, this table is ten entries long. | |
307 | This option can be used to change the table length | |
308 | or, by specifying an argument value of 0, it can be | |
309 | suppressed completely. | |
310 | ||
311 | -n symspec --time=symspec | |
312 | Only symbols selected by "symspec" are considered | |
313 | in total and percentage time computations. | |
314 | However, this option does not affect percentage time | |
315 | computation for the flat profile. | |
316 | If the option is specified multiple times, the union | |
317 | of all selected symbols is used in time computations. | |
318 | ||
319 | -N symspec --no-time=symspec | |
320 | Exclude the symbols selected by "symspec" from | |
321 | total and percentage time computations. | |
322 | However, this option does not affect percentage time | |
323 | computation for the flat profile. | |
324 | This option is ignored if any --time options are | |
325 | specified. | |
326 | ||
327 | -w num --width=num | |
328 | Sets the output line width. Currently, this option | |
329 | affects the printing of the call-graph function index | |
330 | only. | |
331 | ||
332 | -e <no long form---for backwards compatibility only> | |
333 | -E <no long form---for backwards compatibility only> | |
334 | -f <no long form---for backwards compatibility only> | |
335 | -F <no long form---for backwards compatibility only> | |
336 | -k <no long form---for backwards compatibility only> | |
337 | -b --brief | |
338 | -dnum --debug[=num] | |
339 | ||
340 | -h --help | |
341 | Prints a usage message. | |
342 | ||
343 | -O name --file-format=name | |
344 | Selects the format of the profile data files. | |
345 | Recognized formats are "auto", "bsd", "magic", | |
346 | and "prof". The last one is not yet supported. | |
347 | Format "auto" attempts to detect the file format | |
348 | automatically (this is the default behavior). | |
349 | It attempts to read the profile data files as | |
350 | "magic" files and if this fails, falls back to | |
351 | the "bsd" format. "bsd" forces gprof to read | |
352 | the data files in the BSD format. "magic" forces | |
353 | gprof to read the data files in the "magic" format. | |
354 | ||
355 | -T --traditional | |
356 | -v --version | |
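
Several option pairs in the list above (-p/-P, -q/-Q, -C/-Z, -A/-J)
describe the suppressing form as having "lower precedence" than the
requesting form. One possible reading of that rule is sketched below in
Python. The function name is_shown is hypothetical, and symspec matching
is reduced to membership in plain sets of symbol names; gprof's actual
logic may differ.

```python
def is_shown(symbol, include, exclude):
    """Decide whether `symbol` appears in one output style.

    include: union of symbols selected by all requesting symspecs
    exclude: union of symbols selected by all suppressing symspecs
    Assumption: when any include symspec was given, the exclude set is
    not consulted, which is how "lower precedence" is interpreted here.
    """
    if include:
        return symbol in include
    return symbol not in exclude
```

Under this reading, a symbol named by both -p and -P still appears in the
flat profile, and -P alone removes only the symbols it selects.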
357 | ||
358 | ** File Format Changes | |
359 | ||
360 | The old BSD-derived format used for profile data does not contain a | |
361 | magic cookie that would allow checking whether a data file really is a | |
362 | gprof file. Furthermore, it does not provide a version number, thus | |
363 | rendering changes to the file format almost impossible. GNU gprof | |
364 | uses a new file format that provides these features. For backward | |
365 | compatibility, GNU gprof continues to support the old BSD-derived | |
366 | format, but not all features are supported with it. For example, | |
367 | basic-block execution counts cannot be accommodated by the old file | |
368 | format. | |
369 | ||
370 | The new file format is defined in header file gmon_out.h. It | |
371 | consists of a header containing the magic cookie and a version number, | |
372 | as well as some spare bytes available for future extensions. All data | |
373 | in a profile data file is in the native format of the host on which | |
374 | the profile was collected. GNU gprof adapts automatically to the | |
375 | byte-order in use. | |
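
The header check behind the "auto" file format can be sketched as follows.
This is a Python illustration under stated assumptions, not code from
gprof: the cookie value "gmon", the 4-byte version field, and the 12 spare
bytes are assumed here and should be checked against gmon_out.h. The byte
order is passed in by the caller because this README does not say how
gprof determines it.

```python
GMON_MAGIC = b"gmon"  # assumed cookie value
HEADER_SIZE = 20      # assumed layout: 4-byte cookie, 4-byte version, 12 spare bytes


def sniff_profile(data, byteorder="little"):
    """Return ("magic", version) for the new format, else ("bsd", None)."""
    if len(data) >= HEADER_SIZE and data[:4] == GMON_MAGIC:
        # The version field is stored in the byte order of the profiled host.
        version = int.from_bytes(data[4:8], byteorder)
        return "magic", version
    # No cookie found: fall back to the old BSD-derived format.
    return "bsd", None
```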
376 | ||
377 | In the new file format, the header is followed by a sequence of | |
378 | records. Currently, there are three different record types: histogram | |
379 | records, call-graph arc records, and basic-block execution count | |
380 | records. Each file can contain any number of each record type. When | |
381 | reading a file, GNU gprof will ensure records of the same type are | |
382 | compatible with each other and compute the union of all records. For | |
383 | example, for basic-block execution counts, the union is simply the sum | |
384 | of all execution counts for each basic-block. | |
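
The union described above for basic-block execution counts amounts to a
per-address sum. A minimal Python sketch, with each record modeled as a
list of (address, count) pairs and the hypothetical name merge_bb_counts:

```python
from collections import Counter


def merge_bb_counts(records):
    """Sum the execution counts of each basic-block across all records."""
    total = Counter()
    for record in records:
        for address, count in record:
            total[address] += count
    return dict(total)
```

For example, two records that report 3 and 2 executions of the block at
the same address merge into a single count of 5 for that block.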
385 | ||
386 | *** Histogram Records | |
387 | ||
388 | Histogram records consist of a header that is followed by an array of | |
389 | bins. The header contains the text-segment range that the histogram | |
390 | spans, the size of the histogram in bytes (unlike in the old BSD | |
391 | format, this does not include the size of the header), the rate of the | |
392 | profiling clock, and the physical dimension that the bin counts | |
393 | represent after being scaled by the profiling clock rate. The | |
394 | physical dimension is specified in two parts: a long name of up to 15 | |
395 | characters and a single character abbreviation. For example, a | |
396 | histogram representing real-time would specify the long name as | |
397 | "seconds" and the abbreviation as "s". This feature is useful for | |
398 | architectures that support performance monitor hardware (which, | |
399 | fortunately, is becoming increasingly common). For example, under DEC | |
400 | OSF/1, the "uprofile" command can be used to produce a histogram of, | |
401 | say, instruction cache misses. In this case, the dimension in the | |
402 | histogram header could be set to "i-cache misses" and the abbreviation | |
403 | could be set to "1" (because it is simply a count, not a physical | |
404 | dimension). Also, the profiling rate would have to be set to 1 in | |
405 | this case. | |
406 | ||
407 | Histogram bins are 16-bit numbers and each bin represents an equal | |
408 | amount of text-space. For example, if the text-segment is one | |
409 | thousand bytes long and if there are ten bins in the histogram, each | |
410 | bin represents one hundred bytes. | |
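
The bin arithmetic can be sketched in Python as shown below. The names
bin_range and bin_value are hypothetical. Dividing a bin count by the
profiling clock rate is an assumed interpretation of "scaled by the
profiling clock rate" (ticks divided by ticks per second gives seconds).

```python
def bin_range(lowpc, highpc, num_bins, index):
    """Return the text-space range [start, end) covered by bin `index`."""
    span = highpc - lowpc
    start = lowpc + span * index // num_bins
    end = lowpc + span * (index + 1) // num_bins
    return start, end


def bin_value(count, rate):
    """Scale a raw bin count by the profiling clock rate (assumed: divide)."""
    return count / rate
```

With a text-segment of one thousand bytes and ten bins, bin 0 covers bytes
0 through 99 and bin 9 covers bytes 900 through 999. With the rate set to
1, as in the i-cache example, the scaled value equals the raw count.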
411 | ||
412 | ||
413 | *** Call-Graph Records | |
414 | ||
415 | Call-graph records have a format that is identical to the one used in | |
416 | the BSD-derived file format. It consists of an arc in the call graph | |
417 | and a count indicating the number of times the arc was traversed | |
418 | during program execution. Arcs are specified by a pair of addresses: | |
419 | the first must be within the caller's function and the second must be | |
420 | within the callee's function. When performing profiling at the | |
421 | function level, these addresses can point anywhere within the | |
422 | respective function. However, when profiling at the line-level, it is | |
423 | better if the addresses are as close to the call-site/entry-point as | |
424 | possible. This will ensure that the line-level call-graph is able to | |
425 | identify exactly which line of source code performed calls to a | |
426 | function. | |
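
Because arc addresses may point anywhere within a function, resolving an
arc at the function level comes down to finding the symbol whose start
address is the closest one at or below each address. A Python sketch under
simplifying assumptions: the symbol table is a list of (start_address,
name) pairs sorted by address, function end addresses are ignored, and the
names find_function and resolve_arc are hypothetical.

```python
import bisect


def find_function(symbols, address):
    """Return the name of the function containing `address`, or None."""
    starts = [start for start, _ in symbols]
    # Index of the last symbol that starts at or before `address`.
    i = bisect.bisect_right(starts, address) - 1
    return symbols[i][1] if i >= 0 else None


def resolve_arc(symbols, from_pc, to_pc, count):
    """Map a raw (from_pc, to_pc, count) arc to (caller, callee, count)."""
    return find_function(symbols, from_pc), find_function(symbols, to_pc), count
```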
427 | ||
428 | *** Basic-Block Execution Count Records | |
429 | ||
430 | Basic-block execution count records consist of a header followed by a | |
431 | sequence of address/count pairs. The header simply specifies the | |
432 | length of the sequence. In an address/count pair, the address | |
433 | identifies a basic-block and the count specifies the number of times | |
434 | that basic-block was executed. Any address within the basic-block can | |
435 | be used. | |
436 | ||
437 | IMPLEMENTATION NOTE: gcc -a can be used to instrument a program to | |
438 | record basic-block execution counts. However, the __bb_exit_func() | |
439 | that is currently present in libgcc2.c does not generate a gmon.out | |
1b577b00 | 440 | file in a suitable format. This should be fixed for future releases |
252b5132 RH |
441 | of gcc. In the meantime, contact davidm@cs.arizona.edu for a version |
442 | of __bb_exit_func() that is appropriate. |