Commit | Line | Data |
---|---|---|
5ba9f198 | 1 | |
4767a9e7 | 2 | RFC: Common Trace Format (CTF) Proposal (pre-v1.7) |
5ba9f198 MD |
3 | |
4 | Mathieu Desnoyers, EfficiOS Inc. | |
5 | ||
6 | The goal of the present document is to propose a trace format that suits the | |
cc089c3a | 7 | needs of the embedded, telecom, high-performance and kernel communities. It is |
5ba9f198 | 8 | based on the Common Trace Format Requirements (v1.4) document. It is designed to |
cc089c3a MD |
9 | allow traces to be natively generated by the Linux kernel, Linux user-space |
10 | applications written in C/C++, and hardware components. | |
11 | ||
12 | The latest version of this document can be found at: | |
13 | ||
14 | git tree: git://git.efficios.com/ctf.git | |
15 | gitweb: http://git.efficios.com/?p=ctf.git | |
5ba9f198 MD |
16 | |
17 | A reference implementation of a library to read and write this trace format is | |
18 | being implemented within the BabelTrace project, a converter between trace | |
19 | formats. The development tree is available at: | |
20 | ||
21 | git tree: git://git.efficios.com/babeltrace.git | |
22 | gitweb: http://git.efficios.com/?p=babeltrace.git | |
23 | ||
24 | ||
25 | 1. Preliminary definitions | |
26 | ||
3bf79539 MD |
27 | - Event Trace: An ordered sequence of events. |
28 | - Event Stream: An ordered sequence of events, containing a subset of the | |
29 | trace event types. | |
30 | - Event Packet: A sequence of physically contiguous events within an event | |
31 | stream. | |
5ba9f198 MD |
32 | - Event: This is the basic entry in a trace. (aka: a trace record). |
33 | - An event identifier (ID) relates to the class (a type) of event within | |
3bf79539 MD |
34 | an event stream. |
35 | e.g. event: irq_entry. | |
5ba9f198 MD |
36 | - An event (or event record) relates to a specific instance of an event |
37 | class. | |
3bf79539 MD |
38 | e.g. event: irq_entry, at time X, on CPU Y |
39 | - Source Architecture: Architecture writing the trace. | |
40 | - Reader Architecture: Architecture reading the trace. | |
5ba9f198 MD |
41 | |
42 | ||
43 | 2. High-level representation of a trace | |
44 | ||
3bf79539 MD |
45 | A trace is divided into multiple event streams. Each event stream contains a |
46 | subset of the trace event types. | |
5ba9f198 | 47 | |
3bf79539 MD |
48 | The final output of the trace, after its generation and optional transport over |
49 | the network, is expected to be either on permanent or temporary storage in a | |
50 | virtual file system. Because each event stream is appended to while a trace is | |
51 | being recorded, each is associated with a separate file for output. Therefore, | |
52 | a stored trace can be represented as a directory containing one file per stream. | |
5ba9f198 | 53 | |
3bf79539 | 54 | A metadata event stream contains information on trace event types. It describes: |
5ba9f198 MD |
55 | |
56 | - Trace version. | |
57 | - Types available. | |
3bf79539 MD |
58 | - Per-stream event header description. |
59 | - Per-stream event header selection. | |
60 | - Per-stream event context fields. | |
5ba9f198 | 61 | - Per-event |
3bf79539 | 62 | - Event type to stream mapping. |
5ba9f198 MD |
63 | - Event type to name mapping. |
64 | - Event type to ID mapping. | |
65 | - Event fields description. | |
66 | ||
67 | ||
3bf79539 | 68 | 3. Event stream |
5ba9f198 | 69 | |
3bf79539 MD |
70 | An event stream is divided into contiguous event packets of variable size. An |
71 | event packet can contain a certain amount of padding at the end. The rationale | |
72 | for the event stream design choices is explained in Appendix B. Stream Header | |
73 | Rationale. | |
5ba9f198 | 74 | |
3bf79539 MD |
75 | The stream header is repeated at the beginning of each event packet. |
5ba9f198 | 79 | |
3bf79539 MD |
80 | The event stream header will therefore be referred to as the "event packet |
81 | header" throughout the rest of this document. | |
5ba9f198 MD |
82 | |
83 | ||
84 | 4. Types | |
85 | ||
86 | 4.1 Basic types | |
87 | ||
88 | A basic type is a scalar type, as described in this section. | |
89 | ||
90 | 4.1.1 Type inheritance | |
91 | ||
80fd2569 MD |
92 | Type specifications can be inherited to allow deriving types from a |
93 | type class. For example, see the uint32_t named type derived from the "integer" | |
94 | type class below ("Integers" section). Types have a precise binary | |
95 | representation in the trace. A type class has methods to read and write these | |
96 | types, but must be derived into a type to be usable in an event field. | |
5ba9f198 MD |
97 | |
98 | 4.1.2 Alignment | |
99 | ||
100 | We define "byte-packed" types as aligned on the byte size, namely 8-bit. | |
101 | We define "bit-packed" types as following on the next bit, as defined by the | |
102 | "bitfields" section. | |
5ba9f198 | 103 | |
3bf79539 MD |
104 | All basic types, except bitfields, are either aligned on an architecture-defined |
105 | specific alignment or byte-packed, depending on the architecture preference. | |
106 | Architectures providing fast unaligned writes byte-pack basic types to save | |
5ba9f198 | 107 | space, aligning each type on byte boundaries (8-bit). Architectures with slow |
3bf79539 MD |
108 | unaligned writes align types on specific alignment values. If no specific |
109 | alignment is declared for a type nor its parents, it is assumed to be bit-packed | |
110 | for bitfields and byte-packed for other types. | |
5ba9f198 | 111 | |
3bf79539 | 112 | Metadata attribute representation of a specific alignment: |
5ba9f198 MD |
113 | |
114 | align = value; /* value in bits */ | |
115 | ||
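As an illustration of how a writer or reader might apply the "align" attribute, the following minimal C sketch rounds a bit offset up to the next alignment boundary. The function name ctf_align_offset is hypothetical and not part of this proposal; the only assumption is that the alignment value is non-zero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round offset_bits up to the next multiple of align_bits.
 * align_bits mirrors the "align = value;" attribute (value in bits):
 * 8 for byte-packed types, 1 for bit-packed bitfields.
 * Hypothetical helper, not defined by the CTF proposal.
 */
static uint64_t ctf_align_offset(uint64_t offset_bits, uint64_t align_bits)
{
	uint64_t rem;

	assert(align_bits != 0);	/* assumption: alignment is non-zero */
	rem = offset_bits % align_bits;
	return rem ? offset_bits + (align_bits - rem) : offset_bits;
}
```

With align = 1 the offset is returned unchanged, which corresponds to the bit-packed case.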
116 | 4.1.3 Byte order | |
117 | ||
3bf79539 MD |
118 | By default, the native endianness of the trace's source architecture is used. |
119 | Byte order can be overridden for a basic type by specifying a "byte_order" | |
120 | attribute. Typical use-case is to specify the network byte order (big endian: | |
121 | "be") to save data captured from the network into the trace without conversion. | |
122 | If not specified, the byte order is native. | |
5ba9f198 MD |
123 | |
124 | Metadata representation: | |
125 | ||
126 | byte_order = native OR network OR be OR le; /* network and be are aliases */ | |
127 | ||
128 | 4.1.4 Size | |
129 | ||
130 | Type size, in bits, for integers and floats is that returned by "sizeof()" in C | |
131 | multiplied by CHAR_BIT. | |
132 | We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed | |
133 | to 8 bits for cross-endianness compatibility. | |
134 | ||
135 | Metadata representation: | |
136 | ||
137 | size = value; /* value in bits */ | |
138 | ||
139 | 4.1.5 Integers | |
140 | ||
141 | Signed integers are represented in two's complement. Integer alignment, size, | |
142 | signedness and byte ordering are defined in the metadata. Integers aligned on | |
143 | byte size (8-bit) and with length multiple of byte size (8-bit) correspond to | |
144 | the C99 standard integers. In addition, integers with alignment and/or size that | |
145 | are _not_ a multiple of the byte size are permitted; these correspond to the C99 | |
146 | standard bitfields, with the added specification that the CTF integer bitfields | |
147 | have a fixed binary representation. A MIT-licensed reference implementation of | |
148 | the CTF portable bitfields is available at: | |
149 | ||
150 | http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h | |
151 | ||
152 | Binary representation of integers: | |
153 | ||
154 | - On little and big endian: | |
155 | - Within a byte, high bits correspond to an integer high bits, and low bits | |
156 | correspond to low bits. | |
157 | - On little endian: | |
158 | - Integers spanning multiple bytes are placed from the least significant | |
159 | byte to the most significant byte. | |
160 | - Consecutive integers are placed from lower bits to higher bits (even within | |
161 | a byte). | |
162 | - On big endian: | |
163 | - Integers spanning multiple bytes are placed from the most significant | |
164 | byte to the least significant byte. | |
165 | - Consecutive integers are placed from higher bits to lower bits (even within | |
166 | a byte). | |
167 | ||
168 | This binary representation is derived from the bitfield implementation in GCC | |
169 | for little and big endian. However, contrary to what GCC does, integers can | |
170 | cross unit boundaries (no padding is required). Padding can be explicitly | |
171 | added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed. | |
172 | ||
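The bit placement rules above can be sketched as two small C readers, one per byte order. This is an illustrative sketch only, not the BabelTrace bitfield.h implementation: the function names are hypothetical, values are read as unsigned (no sign extension), and the size is assumed to be between 1 and 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Little endian: consecutive integers fill each byte from its low bits
 * upward, and the first bit read is the least significant bit of the value.
 */
static uint64_t ctf_read_bits_le(const uint8_t *buf, size_t bit_offset,
				 unsigned size_bits)
{
	uint64_t value = 0;
	unsigned i;

	assert(size_bits >= 1 && size_bits <= 64);
	for (i = 0; i < size_bits; i++) {
		size_t pos = bit_offset + i;
		uint64_t bit = (buf[pos / 8] >> (pos % 8)) & 1u;

		value |= bit << i;
	}
	return value;
}

/*
 * Big endian: consecutive integers fill each byte from its high bits
 * downward, and the first bit read is the most significant bit of the value.
 */
static uint64_t ctf_read_bits_be(const uint8_t *buf, size_t bit_offset,
				 unsigned size_bits)
{
	uint64_t value = 0;
	unsigned i;

	assert(size_bits >= 1 && size_bits <= 64);
	for (i = 0; i < size_bits; i++) {
		size_t pos = bit_offset + i;
		uint64_t bit = (buf[pos / 8] >> (7 - pos % 8)) & 1u;

		value = (value << 1) | bit;
	}
	return value;
}
```

Both readers let an integer cross byte boundaries without padding, which is the point where the CTF layout departs from the GCC bitfield layout.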
173 | Metadata representation: | |
174 | ||
80fd2569 | 175 | integer { |
5ba9f198 MD |
176 | signed = true OR false; /* default false */ |
177 | byte_order = native OR network OR be OR le; /* default native */ | |
178 | size = value; /* value in bits, no default */ | |
179 | align = value; /* value in bits */ | |
2152348f | 180 | } |
5ba9f198 | 181 | |
80fd2569 | 182 | Example of type inheritance (creation of a uint32_t named type): |
5ba9f198 | 183 | |
80fd2569 | 184 | typedef integer { |
9e4e34e9 | 185 | size = 32; |
5ba9f198 MD |
186 | signed = false; |
187 | align = 32; | |
80fd2569 | 188 | } uint32_t; |
5ba9f198 | 189 | |
80fd2569 | 190 | Definition of a named 5-bit signed bitfield: |
5ba9f198 | 191 | |
80fd2569 | 192 | typedef integer { |
5ba9f198 MD |
193 | size = 5; |
194 | signed = true; | |
195 | align = 1; | |
80fd2569 | 196 | } int5_t; |
5ba9f198 MD |
197 | |
198 | 4.1.6 GNU/C bitfields | |
199 | ||
200 | The GNU/C bitfields follow closely the integer representation, with a | |
201 | particularity on alignment: if a bitfield cannot fit in the current unit, the | |
80fd2569 MD |
202 | unit is padded and the bitfield starts at the following unit. The unit size is |
203 | defined by the size of the type "unit_type". | |
5ba9f198 | 204 | |
2152348f | 205 | Metadata representation: |
80fd2569 MD |
206 | |
207 | unit_type name:size; | |
208 | ||
5ba9f198 MD |
209 | As an example, the following structure declared in C compiled by GCC: |
210 | ||
211 | struct example { | |
212 | short a:12; | |
213 | short b:5; | |
214 | }; | |
215 | ||
2152348f MD |
216 | The example structure is aligned on the largest element (short). The second |
217 | bitfield would be aligned on the next unit boundary, because it would not fit in | |
218 | the current unit. | |
5ba9f198 MD |
219 | |
220 | 4.1.7 Floating point | |
221 | ||
222 | The floating point values byte ordering is defined in the metadata. | |
223 | ||
224 | Floating point values follow the IEEE 754-2008 standard interchange formats. | |
225 | The description of floating point values includes the exponent and mantissa size | |
226 | in bits. Some requirements are imposed on the floating point values: | |
227 | ||
228 | - FLT_RADIX must be 2. | |
229 | - mant_dig is the number of digits represented in the mantissa. It is specified | |
230 | by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and | |
231 | LDBL_MANT_DIG as defined by <float.h>. | |
232 | - exp_dig is the number of digits represented in the exponent. Given that | |
233 | mant_dig is one bit more than its actual size in bits (leading 1 is not | |
234 | needed) and also given that the sign bit always takes one bit, exp_dig can be | |
235 | specified as: | |
236 | ||
237 | - sizeof(float) * CHAR_BIT - FLT_MANT_DIG | |
238 | - sizeof(double) * CHAR_BIT - DBL_MANT_DIG | |
239 | - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG | |
240 | ||
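The exp_dig formulas above can be checked directly against <float.h>. The following sketch is illustrative only; the function names are hypothetical, and the long double case is left out because its storage size may include padding bytes on some platforms, in which case the subtraction would no longer give the exponent width.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* exp_dig for "float": total size in bits minus FLT_MANT_DIG. */
static int ctf_float_exp_dig(void)
{
	assert(FLT_RADIX == 2);		/* required by this section */
	return (int)(sizeof(float) * CHAR_BIT) - FLT_MANT_DIG;
}

/* exp_dig for "double": total size in bits minus DBL_MANT_DIG. */
static int ctf_double_exp_dig(void)
{
	assert(FLT_RADIX == 2);
	return (int)(sizeof(double) * CHAR_BIT) - DBL_MANT_DIG;
}
```

On a platform using the IEEE 754 binary32 and binary64 formats, these are expected to yield 8 and 11 respectively, matching the exp_dig = 8 value used for the "float" type in this section.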
241 | Metadata representation: | |
242 | ||
80fd2569 | 243 | floating_point { |
5ba9f198 MD |
244 | exp_dig = value; |
245 | mant_dig = value; | |
246 | byte_order = native OR network OR be OR le; | |
2152348f | 247 | } |
5ba9f198 MD |
248 | |
249 | Example of type inheritance: | |
250 | ||
80fd2569 | 251 | typedef floating_point { |
5ba9f198 MD |
252 | exp_dig = 8; /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */ |
253 | mant_dig = 24; /* FLT_MANT_DIG */ | |
254 | byte_order = native; | |
80fd2569 | 255 | } float; |
5ba9f198 MD |
256 | |
257 | TODO: define NaN, +inf, -inf behavior. | |
258 | ||
259 | 4.1.8 Enumerations | |
260 | ||
261 | Enumerations are a mapping between an integer type and a table of strings. The | |
262 | numerical representation of the enumeration follows the integer type specified | |
263 | by the metadata. The enumeration mapping table is detailed in the enumeration | |
3bf79539 MD |
264 | description within the metadata. The mapping table maps inclusive value ranges |
265 | (or single values) to strings. Instead of being limited to simple | |
266 | "value -> string" mappings, these enumerations map | |
80fd2569 | 267 | "[ start_value ... end_value ] -> string", which map inclusive ranges of |
3bf79539 MD |
268 | values to strings. An enumeration from the C language can be represented in |
269 | this format by having the same start_value and end_value for each element, which | |
270 | is in fact a range of size 1. This single-value range is supported without | |
4767a9e7 | 271 | repeating the start and end values with the value = string declaration. |
80fd2569 | 272 | |
4767a9e7 MD |
273 | If a numeric value is encountered between < >, it represents the integer type |
274 | size used to hold the enumeration, in bits. | |
275 | ||
276 | enum <integer_type OR size> name { | |
80fd2569 MD |
277 | string = start_value1 ... end_value1, |
278 | "other string" = start_value2 ... end_value2, | |
279 | yet_another_string, /* will be assigned to end_value2 + 1 */ | |
280 | "some other string" = value, | |
281 | ... | |
282 | }; | |
283 | ||
284 | If the values are omitted, the enumeration starts at 0 and increments by 1 for | |
285 | each entry: | |
286 | ||
4767a9e7 | 287 | enum <32> name { |
80fd2569 MD |
288 | ZERO, |
289 | ONE, | |
290 | TWO, | |
291 | TEN = 10, | |
292 | ELEVEN, | |
3bf79539 | 293 | }; |
5ba9f198 | 294 | |
80fd2569 | 295 | Overlapping ranges within a single enumeration are implementation defined. |
5ba9f198 | 296 | |
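A reader could represent an enumeration mapping table as an array of inclusive ranges. The sketch below is one possible approach, not a prescribed one: the struct and function names are hypothetical, the "TWENTIES" entry is an invented range added for illustration, and returning the first matching entry is only one of the behaviors permitted for overlapping ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One "[ start ... end ] -> string" entry; start == end for single values. */
struct enum_range {
	int64_t start;
	int64_t end;		/* inclusive */
	const char *label;
};

/* Table for the "enum <32> name" example, plus one hypothetical range. */
static const struct enum_range example_map[] = {
	{  0,  0, "ZERO" },
	{  1,  1, "ONE" },
	{  2,  2, "TWO" },
	{ 10, 10, "TEN" },
	{ 11, 11, "ELEVEN" },
	{ 20, 29, "TWENTIES" },	/* hypothetical, not in the example above */
};

#define EXAMPLE_MAP_LEN (sizeof(example_map) / sizeof(example_map[0]))

/* Return the label of the first range containing value, or NULL if none. */
static const char *enum_lookup(const struct enum_range *map, size_t len,
			       int64_t value)
{
	size_t i;

	assert(map != NULL);
	for (i = 0; i < len; i++) {
		if (value >= map[i].start && value <= map[i].end)
			return map[i].label;
	}
	return NULL;
}
```

A value that falls in no range (for instance 5 in the table above) yields NULL, leaving the reader free to print the raw integer instead.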
2152348f MD |
297 | A nameless enumeration can be declared as a field type or as part of a typedef: |
298 | ||
299 | enum <integer_type> { | |
300 | ... | |
301 | } | |
302 | ||
5ba9f198 MD |
303 | 4.2 Compound types |
304 | ||
305 | 4.2.1 Structures | |
306 | ||
307 | Structures are aligned on the largest alignment required by basic types | |
308 | contained within the structure. (This follows the ISO/C standard for structures) | |
309 | ||
80fd2569 | 310 | Metadata representation of a named structure: |
5ba9f198 | 311 | |
80fd2569 MD |
312 | struct name { |
313 | field_type field_name; | |
314 | field_type field_name; | |
315 | ... | |
316 | }; | |
5ba9f198 MD |
317 | |
318 | Example: | |
319 | ||
80fd2569 MD |
320 | struct example { |
321 | integer { /* Nameless type */ | |
322 | size = 16; | |
323 | signed = true; | |
324 | align = 16; | |
325 | } first_field_name; | |
326 | uint64_t second_field_name; /* Named type declared in the metadata */ | |
3bf79539 | 327 | }; |
5ba9f198 MD |
328 | |
329 | The fields are placed in a sequence next to each other. They each possess a | |
330 | field name, which is a unique identifier within the structure. | |
331 | ||
2152348f | 332 | A nameless structure can be declared as a field type or as part of a typedef: |
80fd2569 MD |
333 | |
334 | struct { | |
335 | ... | |
2152348f | 336 | } |
80fd2569 | 337 | |
5ba9f198 MD |
338 | 4.2.2 Arrays |
339 | ||
340 | Arrays are fixed-length. Their length is declared in the type declaration within | |
341 | the metadata. They contain an array of "inner type" elements, which can refer to | |
342 | any type not containing the type of the array being declared (no circular | |
3bf79539 | 343 | dependency). The length is the number of elements in an array. |
5ba9f198 | 344 | |
2152348f | 345 | Metadata representation of a named array: |
80fd2569 MD |
346 | |
347 | typedef elem_type name[length]; | |
5ba9f198 | 348 | |
2152348f | 349 | A nameless array can be declared as a field type within a structure, e.g.: |
5ba9f198 | 350 | |
2152348f | 351 | uint8_t field_name[10]; |
80fd2569 | 352 | |
5ba9f198 MD |
353 | |
354 | 4.2.3 Sequences | |
355 | ||
356 | Sequences are dynamically-sized arrays. They start with an integer that specifies | |
357 | the length of the sequence, followed by an array of "inner type" elements. | |
3bf79539 | 358 | The length is the number of elements in the sequence. |
5ba9f198 | 359 | |
2152348f | 360 | Metadata representation for a named sequence: |
80fd2569 MD |
361 | |
362 | typedef elem_type name[length_type]; | |
363 | ||
364 | A nameless sequence can be declared as a field type, e.g.: | |
365 | ||
80fd2569 MD |
366 | long field_name[int]; |
367 | ||
368 | The length type follows the integer types specifications, and the sequence | |
5ba9f198 MD |
369 | elements follow the "array" specifications. |
370 | ||
371 | 4.2.4 Strings | |
372 | ||
373 | Strings are an array of bytes of variable size and are terminated by a '\0' | |
374 | "NUL" character. Their encoding is described in the metadata. In absence of | |
375 | encoding attribute information, the default encoding is UTF-8. | |
376 | ||
80fd2569 MD |
377 | Metadata representation of a named string type: |
378 | ||
379 | typedef string { | |
5ba9f198 | 380 | encoding = UTF8 OR ASCII; |
80fd2569 | 381 | } name; |
5ba9f198 | 382 | |
80fd2569 MD |
383 | A nameless string type can be declared as a field type: |
384 | ||
385 | string field_name; /* Use default UTF8 encoding */ | |
5ba9f198 | 386 | |
3bf79539 MD |
387 | 5. Event Packet Header |
388 | ||
389 | The event packet header consists of two parts: the first is mandatory and has a fixed |
390 | layout. The second part, the "event packet context", has its layout described in | |
391 | the metadata. | |
5ba9f198 | 392 | |
3bf79539 MD |
393 | - Aligned on page size. Fixed size. Fields either aligned or packed (depending |
394 | on the architecture preference). | |
395 | No padding at the end of the event packet header. Native architecture byte | |
5ba9f198 | 396 | ordering. |
3bf79539 MD |
397 | |
398 | Fixed layout (event packet header): | |
399 | ||
5ba9f198 MD |
400 | - Magic number (CTF magic numbers: 0xC1FC1FC1 and its reverse endianness |
401 | representation: 0xC11FFCC1). It needs to have a non-symmetric bytewise | |
402 | representation. Used to distinguish between big and little endian traces (this | |
403 | information is determined by knowing the endianness of the architecture | |
404 | reading the trace and comparing the magic number against its value and the | |
405 | reverse, 0xC11FFCC1). This magic number specifies that we use the CTF metadata | |
406 | description language described in this document. Different magic numbers | |
407 | should be used for other metadata description languages. | |
3bf79539 | 408 | - Trace UUID, used to ensure the event packet matches the metadata used. |
5ba9f198 MD |
409 | (note: we cannot use a metadata checksum because metadata can be appended to |
410 | while tracing is active) | |
3bf79539 MD |
411 | - Stream ID, used as reference to stream description in metadata. |
412 | ||
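The magic number comparison described above can be sketched in C as follows. This is a minimal illustration with hypothetical names (ctf_check_magic, enum ctf_order); it assumes the caller passes a pointer to the first four bytes of an event packet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CTF_MAGIC		0xC1FC1FC1u
#define CTF_MAGIC_SWAPPED	0xC11FFCC1u	/* byte-reversed CTF_MAGIC */

enum ctf_order {
	CTF_ORDER_SAME,		/* trace byte order matches the reader */
	CTF_ORDER_REVERSED,	/* trace byte order is the opposite */
	CTF_ORDER_INVALID,	/* not a CTF event packet header */
};

/* Load the magic in the reader's native byte order and classify it. */
static enum ctf_order ctf_check_magic(const uint8_t *packet)
{
	uint32_t magic;

	assert(packet != NULL);
	memcpy(&magic, packet, sizeof(magic));	/* alignment-safe load */
	if (magic == CTF_MAGIC)
		return CTF_ORDER_SAME;
	if (magic == CTF_MAGIC_SWAPPED)
		return CTF_ORDER_REVERSED;
	return CTF_ORDER_INVALID;
}
```

Combined with the reader's own endianness, the result tells whether the trace is big or little endian; the detection works because the magic number is not symmetric bytewise.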
413 | Metadata-defined layout (event packet context): | |
414 | ||
415 | - Event packet content size (in bytes). | |
416 | - Event packet size (in bytes, includes padding). | |
417 | - Event packet content checksum (optional). Checksum excludes the event packet | |
418 | header. | |
419 | - Per-stream event packet sequence count (to deal with UDP packet loss). The | |
420 | number of significant sequence counter bits should also be present, so | |
421 | wrap-arounds are dealt with correctly. | |
422 | - Timestamp at the beginning and timestamp at the end of the event packet. | |
423 | Both timestamps are written in the packet header, but sampled respectively | |
424 | while (or before) writing the first event and while (or after) writing the | |
425 | last event in the packet. The inclusive range between these timestamps should | |
426 | include all event timestamps assigned to events contained within the packet. | |
5ba9f198 | 427 | - Events discarded count |
3bf79539 MD |
428 | - Snapshot of a per-stream free-running counter, counting the number of |
429 | events discarded that were supposed to be written in the stream prior to | |
430 | the first event in the event packet. | |
5ba9f198 | 431 | * Note: producer-consumer buffer full condition should fill the current |
3bf79539 | 432 | event packet with padding so we know exactly where events have been |
5ba9f198 | 433 | discarded. |
3bf79539 MD |
434 | - Lossless compression scheme used for the event packet content. Applied |
435 | directly to raw data. New types of compression can be added in following | |
436 | versions of the format. | |
5ba9f198 MD |
437 | 0: no compression scheme |
438 | 1: bzip2 | |
439 | 2: gzip | |
3bf79539 MD |
440 | 3: xz |
441 | - Cipher used for the event packet content. Applied after compression. | |
5ba9f198 MD |
442 | 0: no encryption |
443 | 1: AES | |
3bf79539 | 444 | - Checksum scheme used for the event packet content. Applied after encryption. |
5ba9f198 MD |
445 | 0: no checksum |
446 | 1: md5 | |
447 | 2: sha1 | |
448 | 3: crc32 | |
449 | ||
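To illustrate why the number of significant sequence counter bits matters, the sketch below computes how many event packets were lost between two consecutively received packets of a stream. The function name is hypothetical, and two assumptions are made that this document does not state: the counter increments by one per packet, and fewer than 2^count_bits packets are lost in a row.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of packets missing between the previously received packet
 * (prev_count) and the current one (cur_count), for a counter that wraps
 * around after count_bits significant bits.
 */
static uint32_t ctf_packets_lost(uint32_t prev_count, uint32_t cur_count,
				 unsigned count_bits)
{
	uint32_t mask;

	assert(count_bits >= 1 && count_bits <= 32);
	mask = (count_bits == 32) ? UINT32_MAX
				  : ((UINT32_C(1) << count_bits) - 1);
	/* Unsigned arithmetic wraps, so the mask handles counter wrap-around. */
	return (cur_count - prev_count - 1) & mask;
}
```

For an 8-bit counter, receiving count 1 right after count 254 means that packets 255 and 0 are missing, so the function returns 2.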
3bf79539 MD |
450 | 5.1 Event Packet Header Fixed Layout Description |
451 | ||
80fd2569 MD |
452 | struct event_packet_header { |
453 | uint32_t magic; | |
454 | uint8_t trace_uuid[16]; | |
3bf79539 | 455 | uint32_t stream_id; |
80fd2569 | 456 | }; |
5ba9f198 | 457 | |
3bf79539 MD |
458 | 5.2 Event Packet Context Description |
459 | ||
460 | Event packet context example. These are declared within the stream declaration | |
461 | in the metadata. All these fields are optional except for "content_size" and | |
462 | "packet_size", which must be present in the context. | |
463 | ||
464 | An example event packet context type: | |
465 | ||
80fd2569 | 466 | struct event_packet_context { |
3bf79539 MD |
467 | uint64_t timestamp_begin; |
468 | uint64_t timestamp_end; | |
469 | uint32_t checksum; | |
470 | uint32_t stream_packet_count; | |
471 | uint32_t events_discarded; | |
472 | uint32_t cpu_id; | |
473 | uint32_t/uint16_t content_size; | |
474 | uint32_t/uint16_t packet_size; | |
475 | uint8_t stream_packet_count_bits; /* Significant counter bits */ | |
476 | uint8_t compression_scheme; | |
477 | uint8_t encryption_scheme; | |
478 | uint8_t checksum_scheme; | |
479 | }; | |
5ba9f198 MD |
480 | |
481 | 6. Event Structure | |
482 | ||
483 | The overall structure of an event is: | |
484 | ||
3bf79539 | 485 | - Event Header (as specified by the stream metadata) |
5ba9f198 | 486 | - Extended Event Header (as specified by the event header) |
3bf79539 | 487 | - Event Context (as specified by the stream metadata) |
5ba9f198 MD |
488 | - Event Payload (as specified by the event metadata) |
489 | ||
490 | ||
491 | 6.1 Event Header | |
492 | ||
3bf79539 MD |
493 | One major factor can vary between streams: the number of event IDs assigned to |
494 | a stream. Luckily, this information tends to stay relatively constant (modulo | |
5ba9f198 | 495 | event registration while trace is being recorded), so we can specify different |
3bf79539 | 496 | representations for streams containing few event IDs and streams containing |
5ba9f198 MD |
497 | many event IDs, so we end up representing the event ID and timestamp as densely |
498 | as possible in each case. | |
499 | ||
3bf79539 MD |
500 | We therefore provide two types of event headers. Type 1 accommodates streams |
501 | with fewer than 31 event IDs. Type 2 accommodates streams with 31 or more event | |
5ba9f198 MD |
502 | IDs. |
503 | ||
504 | The "extended headers" are used in the rare occasions where the information | |
3bf79539 MD |
505 | cannot be represented in the ranges available in the event header. They are also |
506 | used in the rare occasions where the data required for a field could not be | |
507 | collected: the flag corresponding to the missing field within the missing_fields | |
508 | array is then set to 1. | |
5ba9f198 MD |
509 | |
510 | Types uintX_t represent an X-bit unsigned integer. | |
511 | ||
512 | ||
513 | 6.1.1 Type 1 - Few event IDs | |
514 | ||
515 | - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture | |
516 | preference). | |
517 | - Fixed size: 32 bits. | |
518 | - Native architecture byte ordering. | |
519 | ||
80fd2569 MD |
520 | struct event_header_1 { |
521 | uint5_t id; /* | |
5ba9f198 MD |
522 | * id: range: 0 - 30. |
523 | * id 31 is reserved to indicate a following | |
524 | * extended header. | |
525 | */ | |
80fd2569 | 526 | uint27_t timestamp; |
5ba9f198 MD |
527 | }; |
528 | ||
529 | The end of a type 1 header is aligned on a 32-bit boundary (or packed). | |
530 | ||
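As an illustration of the type 1 layout, the sketch below decodes a header from a little-endian trace. It relies on the integer representation rules of section 4.1.5: on little endian, the first field (id) occupies the lowest 5 bits of the 32-bit word and the timestamp the remaining 27 bits. The struct and function names are hypothetical, and a big-endian trace would need the mirrored layout.

```c
#include <assert.h>
#include <stdint.h>

#define EVENT_ID_EXTENDED	31u	/* reserved id: extended header follows */

struct event_header_1_decoded {
	uint8_t id;		/* 0 - 30, or 31 when extended */
	uint32_t timestamp;	/* 27 significant bits */
	int extended;		/* non-zero if an extended header follows */
};

/* Decode a type 1 event header stored in little-endian byte order. */
static struct event_header_1_decoded
decode_event_header_1_le(const uint8_t raw[4])
{
	struct event_header_1_decoded hdr;
	uint32_t word;

	assert(raw != 0);
	word = (uint32_t)raw[0] | ((uint32_t)raw[1] << 8) |
	       ((uint32_t)raw[2] << 16) | ((uint32_t)raw[3] << 24);
	hdr.id = (uint8_t)(word & 0x1Fu);	/* low 5 bits */
	hdr.timestamp = word >> 5;		/* high 27 bits */
	hdr.extended = (hdr.id == EVENT_ID_EXTENDED);
	return hdr;
}
```

When the extended flag is set, the reader would go on to parse struct event_header_1_ext to obtain the 32-bit event ID and 64-bit timestamp.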
531 | ||
532 | 6.1.2 Extended Type 1 Event Header | |
533 | ||
534 | - Follows struct event_header_1, which is aligned on 32-bit, so no need to | |
535 | realign. | |
3bf79539 | 536 | - Variable size (depends on the number of fields per event). |
5ba9f198 | 537 | - Native architecture byte ordering. |
80fd2569 | 538 | - NR_FIELDS is the number of fields within the event. |
5ba9f198 | 539 | |
80fd2569 MD |
540 | struct event_header_1_ext { |
541 | uint32_t id; /* 32-bit event IDs */ | |
542 | uint64_t timestamp; /* 64-bit timestamps */ | |
543 | uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */ | |
5ba9f198 MD |
544 | }; |
545 | ||
5ba9f198 MD |
546 | |
547 | 6.1.3 Type 2 - Many event IDs | |
548 | ||
549 | - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture | |
550 | preference). | |
551 | - Fixed size: 48 bits. | |
552 | - Native architecture byte ordering. | |
553 | ||
80fd2569 MD |
554 | struct event_header_2 { |
555 | uint32_t timestamp; | |
556 | uint16_t id; /* | |
5ba9f198 MD |
557 | * id: range: 0 - 65534. |
558 | * id 65535 is reserved to indicate a following | |
559 | * extended header. | |
560 | */ | |
5ba9f198 MD |
561 | }; |
562 | ||
563 | The end of a type 2 header is aligned on a 16-bit boundary (or 8-bit if | |
564 | byte-packed). | |
565 | ||
566 | ||
567 | 6.1.4 Extended Type 2 Event Header | |
568 | ||
569 | - Follows struct event_header_2, whose end is aligned on a 16-bit boundary, so | |
3bf79539 | 570 | we need to align on 64-bit integer architecture alignment (or 8-bit if |
5ba9f198 | 571 | byte-packed). |
3bf79539 | 572 | - Variable size (depends on the number of fields per event). |
5ba9f198 | 573 | - Native architecture byte ordering. |
80fd2569 | 574 | - NR_FIELDS is the number of fields within the event. |
5ba9f198 | 575 | |
80fd2569 MD |
576 | struct event_header_2_ext { |
577 | uint64_t timestamp; /* 64-bit timestamps */ | |
578 | uint32_t id; /* 32-bit event IDs */ | |
579 | uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */ | |
5ba9f198 MD |
580 | }; |
581 | ||
5ba9f198 MD |
582 | |
583 | 6.2 Event Context | |
584 | ||
585 | The event context contains information relative to the current event. The choice | |
3bf79539 | 586 | and meaning of this information is specified by the metadata "stream" |
5ba9f198 | 587 | information. For this trace format, event context is usually empty, except when |
3bf79539 | 588 | the metadata "stream" information specifies otherwise by declaring a non-empty |
5ba9f198 MD |
589 | structure for the event context. An example of event context is to save the |
590 | event payload size with each event, or to save the current PID with each event. | |
3bf79539 | 591 | These are declared within the stream declaration within the metadata. |
5ba9f198 | 592 | |
3bf79539 | 593 | An example event context type: |
5ba9f198 | 594 | |
80fd2569 MD |
595 | struct event_context { |
596 | uint pid; | |
597 | uint16_t payload_size; | |
3bf79539 | 598 | }; |
5ba9f198 MD |
599 | |
600 | ||
601 | 6.3 Event Payload | |
602 | ||
603 | An event payload contains fields specific to a given event type. The fields | |
604 | belonging to an event type are described in the event-specific metadata | |
605 | within a structure type. | |
606 | ||
607 | 6.3.1 Padding | |
608 | ||
609 | No padding at the end of the event payload. This differs from the ISO/C standard | |
610 | for structures, but follows the CTF standard for structures. In a trace, even | |
611 | though it makes sense to align the beginning of a structure, it really makes no | |
612 | sense to add padding at the end of the structure, because structures are usually | |
613 | not followed by a structure of the same type. | |
614 | ||
615 | This trick can be done by adding a zero-length "end" field at the end of the C | |
616 | structures, and by using the offset of this field rather than using sizeof() | |
3bf79539 | 617 | when calculating the size of a structure (see Appendix "A. Helper macros"). |
5ba9f198 MD |
618 | |
619 | 6.3.2 Alignment | |
620 | ||
621 | The event payload is aligned on the largest alignment required by types | |
622 | contained within the payload. (This follows the ISO/C standard for structures) | |
623 | ||
624 | ||
625 | ||
626 | 7. Metadata | |
627 | ||
3bf79539 MD |
628 | The metadata is located in a stream named "metadata". It is made of "event |
629 | packets", which each start with an event packet header. Events within the | |
630 | metadata stream have neither an event header nor an event context. Each event only | |
5ba9f198 | 631 | contains a null-terminated "string" payload, which is a metadata description |
3bf79539 MD |
632 | entry. The events are packed one next to another. Each event packet starts with |
633 | an event packet header, which contains, amongst other fields, the magic number | |
634 | and trace UUID. | |
5ba9f198 MD |
635 | |
636 | The metadata can be parsed by reading through the metadata strings, skipping | |
3bf79539 | 637 | newlines and null-characters. Type names may contain spaces. |
5ba9f198 MD |
638 | |
639 | trace { | |
640 | major = value; /* Trace format version */ | |
641 | minor = value; | |
3bf79539 MD |
642 | uuid = value; /* Trace UUID */ |
643 | word_size = value; | |
644 | }; | |
5ba9f198 | 645 | |
3bf79539 MD |
646 | stream { |
647 | id = stream_id; | |
5ba9f198 | 648 | event { |
3bf79539 MD |
649 | /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */ |
650 | header_type = event_header_1 OR event_header_2; | |
651 | /* | |
652 | * Extended event header type. Only present if specified in event header | |
653 | * on a per-event basis. | |
654 | */ | |
655 | header_type_ext = event_header_1_ext OR event_header_2_ext; | |
80fd2569 MD |
656 | context_type = struct { |
657 | ... | |
658 | }; | |
3bf79539 MD |
659 | }; |
660 | packet { | |
80fd2569 MD |
661 | context_type = struct { |
662 | ... | |
663 | }; | |
3bf79539 MD |
664 | }; |
665 | }; | |
5ba9f198 MD |
666 | |
667 | event { | |
3d13ef1a | 668 | name = event_name; |
3bf79539 MD |
669 | id = value; /* Numeric identifier within the stream */ |
670 | stream = stream_id; | |
80fd2569 MD |
671 | fields = struct { |
672 | ... | |
673 | }; | |
3bf79539 | 674 | }; |
5ba9f198 MD |
675 | |
676 | /* More detail on types in section 4. Types */ | |
677 | ||
3d13ef1a MD |
678 | /* |
679 | * Named types: | |
680 | * | |
681 | * A named type can only have a prefix and postfix if it aliases a CTF basic | |
682 | * type. A type name aliasing another type name cannot have prefix nor postfix, | |
683 | * but the type aliased can have a prefix and/or postfix. | |
684 | */ | |
685 | ||
686 | typedef aliased_type_prefix aliased_type new_type aliased_type_postfix; | |
2152348f | 687 | |
3d13ef1a | 688 | /* e.g.: typedef struct example new_type_name[10]; */ |
80fd2569 MD |
689 | |
690 | typedef type_class { | |
691 | ... | |
3d13ef1a | 692 | } new_type_prefix new_type new_type_postfix; |
2152348f | 693 | |
3d13ef1a MD |
694 | /* |
695 | * e.g.: | |
696 | * typedef integer { | |
697 | * size = 32; | |
698 | * align = 32; | |
699 | * signed = false; | |
700 | * } struct page *; | |
701 | */ | |
80fd2569 MD |
702 | |
703 | struct name { | |
3bf79539 MD |
704 | ... |
705 | }; | |
5ba9f198 | 706 | |
4767a9e7 | 707 | enum <integer_type or size> name { |
3bf79539 MD |
708 | ... |
709 | }; | |
710 | ||
2152348f MD |
711 | |
712 | /* Unnamed types, contained within compound type fields or typedef. */ | |
713 | ||
80fd2569 MD |
714 | struct { |
715 | ... | |
2152348f | 716 | } |
5ba9f198 | 717 | |
4767a9e7 | 718 | enum <integer_type or size> { |
80fd2569 | 719 | ... |
2152348f MD |
720 | } |
721 | ||
722 | typedef type new_type[length]; | |
3bf79539 | 723 | |
2152348f MD |
724 | struct { |
725 | type field_name[length]; | |
726 | } | |
727 | ||
728 | typedef type new_type[length_type]; | |
729 | ||
730 | struct { | |
731 | type field_name[length_type]; | |
732 | } | |
733 | ||
734 | integer { | |
80fd2569 | 735 | ... |
2152348f | 736 | } |
3bf79539 | 737 | |
2152348f | 738 | floating_point { |
80fd2569 | 739 | ... |
2152348f MD |
740 | } |
741 | ||
742 | struct { | |
743 | integer_type field_name:size; /* GNU/C bitfield */ | |
744 | } | |
745 | ||
746 | struct { | |
747 | string field_name; | |
748 | } | |
3bf79539 MD |
749 | |
750 | A. Helper macros | |
5ba9f198 MD |
751 | |
752 | The two following macros keep track of the size of a GNU/C structure without | |
753 | padding at the end by placing HEADER_END as the last field. A one byte end field | |
754 | is used for C90 compatibility (C99 flexible arrays could be used here). Note | |
755 | that this does not affect the effective structure size, which should always be | |
756 | calculated with the header_sizeof() helper. | |
757 | ||
758 | #define HEADER_END char end_field | |
759 | #define header_sizeof(type) offsetof(typeof(type), end_field) | |
3bf79539 MD |
760 | |
761 | ||
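A short usage sketch of the helper macros follows. The struct demo_header and the function demo_header_size are hypothetical names introduced for illustration. As a deliberate variation, header_sizeof here passes the type straight to offsetof() instead of going through the GNU typeof extension, so the sketch also builds with a strict ISO C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_END		char end_field
/* Variant of the macro above without typeof, for strict ISO C compilers. */
#define header_sizeof(type)	offsetof(type, end_field)

/* Hypothetical header: 4 + 1 bytes of data, then the end marker. */
struct demo_header {
	uint32_t a;
	uint8_t b;
	HEADER_END;
};

/* Size of the header without the compiler's trailing padding. */
static size_t demo_header_size(void)
{
	/* The unpadded size can never exceed what sizeof() reports. */
	assert(header_sizeof(struct demo_header) < sizeof(struct demo_header));
	return header_sizeof(struct demo_header);
}
```

Here demo_header_size() returns 5, whereas sizeof(struct demo_header) is typically 8 because the compiler pads the structure up to the alignment of its uint32_t member.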
762 | B. Stream Header Rationale | |
763 | ||
764 | An event stream is divided in contiguous event packets of variable size. These | |
765 | subdivisions allow the trace analyzer to perform a fast binary search by time | |
766 | within the stream (typically requiring to index only the event packet headers) | |
767 | without reading the whole stream. These subdivisions have a variable size to | |
768 | eliminate the need to transfer the event packet padding when partially filled | |
769 | event packets must be sent when streaming a trace for live viewing/analysis. | |
770 | An event packet can contain a certain amount of padding at the end. Dividing | |
771 | streams into event packets is also useful for network streaming over UDP and | |
772 | flight recorder mode tracing (a whole event packet can be swapped out of the | |
773 | buffer atomically for reading). | |
774 | ||
775 | The stream header is repeated at the beginning of each event packet to allow | |
776 | flexibility in terms of: | |
777 | ||
778 | - streaming support, | |
779 | - allowing arbitrary buffers to be discarded without making the trace | |
780 | unreadable, | |
781 | - allowing UDP packet loss handling by either dealing with missing event packets | |
782 | or asking for re-transmission. | |
783 | - transparently support flight recorder mode, | |
784 | - transparently support crash dump. | |
785 | ||
786 | The event stream header will therefore be referred to as the "event packet | |
787 | header" throughout the rest of this document. |