# Common Trace Format (CTF) Specification (v1.8.3)

**Author**: Mathieu Desnoyers, [EfficiOS Inc.](http://www.efficios.com/)
5 The goal of the present document is to specify a trace format that suits
6 the needs of the embedded, telecom, high-performance and kernel
7 communities. It is based on the
8 [Common Trace Format Requirements (v1.4)](http://git.efficios.com/?p=ctf.git;a=blob_plain;f=common-trace-format-reqs.txt;hb=master)
9 document. It is designed to allow traces to be natively generated by the
10 Linux kernel, Linux user space applications written in C/C++, and
hardware components. One major element of CTF is the Trace Stream
Description Language (TSDL), whose flexibility enables the description of
various binary trace stream layouts.
15 The latest version of this document can be found at:
17 * Git: `git clone git://git.efficios.com/ctf.git`
18 * [Gitweb](http://git.efficios.com/?p=ctf.git)
20 A reference implementation of a library to read and write this trace
format is being developed within the
22 [Babeltrace](http://www.efficios.com/babeltrace) project, a converter
23 between trace formats. The development tree is available at:
25 * Git: `git clone git://git.efficios.com/babeltrace.git`
26 * [Gitweb](http://git.efficios.com/?p=babeltrace.git)
28 The [CE Workgroup](http://www.linuxfoundation.org/collaborate/workgroups/celf)
29 of the Linux Foundation, [Ericsson](http://www.ericsson.com/), and
30 [EfficiOS](http://www.efficios.com/) have sponsored this work.
34 1. Preliminary definitions
35 2. High-level representation of a trace
39 4.1.1 Type inheritance
49 4.2.2 Variants (discriminated/tagged unions)
53 5. Event packet header
54 5.1 Event packet header description
55 5.2 Event packet context description
58 6.1.1 Type 1: few event IDs
59 6.1.2 Type 2: many event IDs
60 6.2 Stream event context and event context
64 7. Trace Stream Description Language (TSDL)
66 7.2 Declaration vs definition
69 7.3.2 Static and dynamic scopes
73 B. Stream header rationale
76 C.1.1 Lexical elements
79 C.1.4 Universal character names
83 C.2 Phrase structure grammar
85 C.2.3 CTF-specific declarations
## 1. Preliminary definitions
90 * **Event trace**: an ordered sequence of events.
91 * **Event stream**: an ordered sequence of events, containing a
92 subset of the trace event types.
93 * **Event packet**: a sequence of physically contiguous events within
95 * **Event**: this is the basic entry in a trace. Also known as
97 * An **event identifier** (ID) relates to the class (a type) of
98 event within an event stream, e.g. event `irq_entry`.
99 * An **event** (or event record) relates to a specific instance of
100 an event class, e.g. event `irq_entry`, at time _X_, on CPU _Y_.
101 * Source architecture: architecture writing the trace.
102 * Reader architecture: architecture reading the trace.
## 2. High-level representation of a trace
107 A _trace_ is divided into multiple event streams. Each event stream
108 contains a subset of the trace event types.
110 The final output of the trace, after its generation and optional
111 transport over the network, is expected to be either on permanent or
112 temporary storage in a virtual file system. Because each event stream
113 is appended to while a trace is being recorded, each is associated with
114 a distinct set of files for output. Therefore, a stored trace can be
115 represented as a directory containing zero, one or more files
118 Metadata description associated with the trace contains information on
119 trace event types expressed in the _Trace Stream Description Language_
120 (TSDL). This language describes:
124 * Per-trace event header description
125 * Per-stream event header description
126 * Per-stream event context description
128 * Event type to stream mapping
129 * Event type to name mapping
130 * Event type to ID mapping
131 * Event context description
132 * Event fields description
137 An _event stream_ can be divided into contiguous event packets of
138 variable size. An event packet can contain a certain amount of padding
139 at the end. The stream header is repeated at the beginning of each
140 event packet. The rationale for the event stream design choices is
141 explained in [Stream header rationale](#specB).
143 The event stream header will therefore be referred to as the
144 _event packet header_ throughout the rest of this document.
Types are organized as type classes. Each type class belongs to one
of two kinds of types: _basic types_ or _compound types_.
155 A basic type is a scalar type, as described in this section. It
156 includes integers, GNU/C bitfields, enumerations, and floating
#### 4.1.1 Type inheritance
162 Type specifications can be inherited to allow deriving types from a
163 type class. For example, see the uint32_t named type derived from the
164 [_integer_ type](#spec4.1.5) class. Types have a precise binary
165 representation in the trace. A type class has methods to read and write
166 these types, but must be derived into a type to be usable in an event
172 We define _byte-packed_ types as aligned on the byte size, namely 8-bit.
173 We define _bit-packed_ types as following on the next bit, as defined
174 by the [Integers](#spec4.1.5) section.
176 Each basic type must specify its alignment, in bits. Examples of
177 possible alignments are: bit-packed (`align = 1`), byte-packed
178 (`align = 8`), or word-aligned (e.g. `align = 32` or `align = 64`).
179 The choice depends on the architecture preference and compactness vs
performance trade-offs of the implementation. Architectures providing
fast unaligned writes byte-pack basic types to save space, aligning
each type on byte boundaries (8-bit). Architectures with slow unaligned
writes align types on specific alignment values. If no specific
alignment is declared for a type, it is assumed to be bit-packed for
integers whose size is not a multiple of 8 bits and for GCC bitfields. All
other basic types are byte-packed by default. It is however recommended
to always specify the alignment explicitly. Alignment values must be
a power of two. Compound types are aligned as specified in their
189 individual specification.
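
As a rough illustration of the alignment rule, the padding that must be skipped before a field can be derived from the current bit offset. This is only a sketch: the helper name `align_offset` is hypothetical, and offsets are assumed to be counted in bits from the start of the packet.

```python
def align_offset(offset_bits: int, align_bits: int) -> int:
    """Round offset_bits up to the next multiple of align_bits."""
    if align_bits <= 0 or align_bits & (align_bits - 1):
        raise ValueError("alignment must be a power of two")
    # The mask trick is valid only because align_bits is a power of two.
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)


# Where a field starts when 5 bits have already been written:
print(align_offset(5, 1))   # bit-packed   -> 5
print(align_offset(5, 8))   # byte-packed  -> 8
print(align_offset(5, 32))  # word-aligned -> 32
```

The difference between the returned offset and the input offset is the number of padding bits.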
191 The base offset used for field alignment is the start of the packet
192 containing the field. For instance, a field aligned on 32-bit needs to
193 be at an offset multiple of 32-bit from the start of the packet that
196 TSDL metadata attribute representation of a specific alignment:
199 align = /* value in bits */;
#### 4.1.3 Byte order

By default, the byte order of a basic type is the byte order described in
the trace description. It can be overridden by specifying a
`byte_order` attribute for a basic type. A typical use case is to specify
the network byte order (big endian: `be`) to save data captured from
the network into the trace without conversion.
210 TSDL metadata representation:
213 /* network and be are aliases */
214 byte_order = /* native OR network OR be OR le */;
217 The `native` keyword selects the byte order described in the trace
218 description. The `network` byte order is an alias for big endian.
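
A reader could resolve the `byte_order` attribute of a type along these lines. This is a minimal sketch; the function name `resolve_byte_order` is hypothetical, and the trace-level byte order is assumed to have been parsed already as `be` or `le`.

```python
def resolve_byte_order(attr: str, trace_byte_order: str) -> str:
    """Map a type-level byte_order attribute to 'be' or 'le'."""
    # The trace description itself only accepts 'be' or 'le'.
    if trace_byte_order not in ("be", "le"):
        raise ValueError("trace byte_order must be 'be' or 'le'")
    if attr == "native":
        return trace_byte_order       # byte order of the trace description
    if attr in ("network", "be"):
        return "be"                   # network is an alias for big endian
    if attr == "le":
        return "le"
    raise ValueError(f"unknown byte_order: {attr!r}")


print(resolve_byte_order("native", "le"))   # -> le
print(resolve_byte_order("network", "le"))  # -> be
```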
Even though the trace description section is not per se a type, for
the sake of clarity, it should be noted that `native` and `network` byte
orders are only allowed within type declarations. The `byte_order`
223 specified in the trace description section only accepts `be` or `le`
229 Type size, in bits, for integers and floats is that returned by
230 `sizeof()` in C multiplied by `CHAR_BIT`. We require the size of `char`
231 and `unsigned char` types (`CHAR_BIT`) to be fixed to 8 bits for
232 cross-endianness compatibility.
234 TSDL metadata representation:
237 size = /* value is in bits */;
Signed integers are represented in two's complement. Integer alignment,
244 size, signedness and byte ordering are defined in the TSDL metadata.
245 Integers aligned on byte size (8-bit) and with length multiple of byte
246 size (8-bit) correspond to the C99 standard integers. In addition,
247 integers with alignment and/or size that are _not_ a multiple of the
248 byte size are permitted; these correspond to the C99 standard bitfields,
249 with the added specification that the CTF integer bitfields have a fixed
250 binary representation. Integer size needs to be a positive integer.
251 Integers of size 0 are **forbidden**. An MIT-licensed reference
252 implementation of the CTF portable bitfields is available
253 [here](http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h).
255 Binary representation of integers:
* On little and big endian:
  * Within a byte, high bits correspond to an integer's high bits, and
    low bits correspond to low bits
* On little endian:
  * Integers spanning multiple bytes are placed from the least significant
    byte to the most significant
  * Consecutive integers are placed from lower bits to higher bits
* On big endian:
  * Integers spanning multiple bytes are placed from the most significant
    byte to the least significant
  * Consecutive integers are placed from higher bits to lower bits
271 This binary representation is derived from the bitfield implementation
in GCC for little and big endian. However, contrary to what GCC does,
integers can cross unit boundaries (no padding is required). Padding
274 can be [explicitly added](#spec4.1.6) to follow the GCC layout if needed.
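
The bit placement rules above can be sketched as a small reader. This is an illustration rather than the reference bitfield implementation: the name `read_int` is hypothetical, and converting the whole buffer to one Python integer is chosen for clarity, not speed.

```python
def read_int(buf: bytes, offset_bits: int, size_bits: int,
             byte_order: str = "le", signed: bool = False) -> int:
    """Read a size_bits-wide integer starting at offset_bits in buf."""
    if size_bits <= 0:
        raise ValueError("integers of size 0 are forbidden")
    total_bits = len(buf) * 8
    if offset_bits < 0 or offset_bits + size_bits > total_bits:
        raise ValueError("integer does not fit in the buffer")
    mask = (1 << size_bits) - 1
    if byte_order == "le":
        # Stream bit 0 is the lowest bit of the first byte.
        whole = int.from_bytes(buf, "little")
        value = (whole >> offset_bits) & mask
    elif byte_order == "be":
        # Stream bit 0 is the highest bit of the first byte.
        whole = int.from_bytes(buf, "big")
        value = (whole >> (total_bits - offset_bits - size_bits)) & mask
    else:
        raise ValueError("byte_order must be 'le' or 'be'")
    if signed and value >= 1 << (size_bits - 1):
        value -= 1 << size_bits  # two's complement
    return value


buf = bytes([0b10110100, 0x01])
print(read_int(buf, 2, 5, "le"))               # -> 13
print(read_int(buf, 2, 5, "be"))               # -> 26
print(read_int(buf, 2, 5, "be", signed=True))  # -> -6
```

The same 5-bit field at bit offset 2 decodes differently under each byte order because the bit numbering within a byte is reversed.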
276 TSDL metadata representation:
280 signed = /* true OR false */; /* default: false */
281 byte_order = /* native OR network OR be OR le */; /* default: native */
282 size = /* value in bits */; /* no default */
283 align = /* value in bits */;
285 /* base used for pretty-printing output; default: decimal */
286 base = /* decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex
287 OR x OR X OR p OR 16 OR octal OR oct OR o OR 8 OR binary
290 /* character encoding */
291 encoding = /* none or UTF8 or ASCII */; /* default: none */
295 Example of type inheritance (creation of a `uint32_t` named type):
305 Definition of a named 5-bit signed bitfield:
315 The character encoding field can be used to specify that the integer
must be printed as a text character when read, e.g.:
#### 4.1.6 GNU/C bitfields

The GNU/C bitfields closely follow the integer representation, with a
330 particularity on alignment: if a bitfield cannot fit in the current
331 unit, the unit is padded and the bitfield starts at the following unit.
332 The unit size is defined by the size of the type `unit_type`.
334 TSDL metadata representation:
340 As an example, the following structure declared in C compiled by GCC:
349 The example structure is aligned on the largest element (short). The
350 second bitfield would be aligned on the next unit boundary, because it
351 would not fit in the current unit.
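
The unit padding rule can be sketched as follows. The helper `place_bitfield` is hypothetical, and the field widths used in the example (12 and 5 bits inside a 16-bit `short` unit) are assumed for illustration only.

```python
def place_bitfield(offset_bits: int, size_bits: int, unit_bits: int) -> int:
    """Return the bit offset at which a GNU/C-style bitfield starts."""
    if size_bits > unit_bits:
        raise ValueError("bitfield is larger than its unit")
    used = offset_bits % unit_bits
    if used + size_bits > unit_bits:
        # Does not fit in the current unit: pad up to the next unit.
        return offset_bits + (unit_bits - used)
    return offset_bits


first = place_bitfield(0, 12, 16)            # starts at bit 0
second = place_bitfield(first + 12, 5, 16)   # 12 + 5 > 16: moved to bit 16
print(first, second)                         # -> 0 16
```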
#### 4.1.7 Floating point
356 The floating point values byte ordering is defined in the TSDL metadata.
358 Floating point values follow the IEEE 754-2008 standard interchange
formats. The description of floating point values includes the exponent
360 and mantissa size in bits. Some requirements are imposed on the
361 floating point values:
363 * `FLT_RADIX` must be 2.
364 * `mant_dig` is the number of digits represented in the mantissa. It is
365 specified by the ISO C99 standard, section 5.2.4, as `FLT_MANT_DIG`,
366 `DBL_MANT_DIG` and `LDBL_MANT_DIG` as defined by `<float.h>`.
367 * `exp_dig` is the number of digits represented in the exponent. Given
368 that `mant_dig` is one bit more than its actual size in bits (leading
369 1 is not needed) and also given that the sign bit always takes one
370 bit, `exp_dig` can be specified as:
371 * `sizeof(float) * CHAR_BIT - FLT_MANT_DIG`
372 * `sizeof(double) * CHAR_BIT - DBL_MANT_DIG`
373 * `sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG`
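
The `exp_dig` formulas above are simple enough to check numerically. The sketch below assumes a platform where `double` is the IEEE 754 binary64 format, which is what `sys.float_info` reports on common Python builds; the function name `exp_dig` mirrors the attribute but is otherwise hypothetical.

```python
import struct
import sys


def exp_dig(size_bits: int, mant_dig: int) -> int:
    """Exponent width: total size minus mant_dig (which counts the sign bit's slot)."""
    return size_bits - mant_dig


# sizeof(double) * CHAR_BIT - DBL_MANT_DIG
double_bits = struct.calcsize("d") * 8
print(exp_dig(double_bits, sys.float_info.mant_dig))  # -> 11

# sizeof(float) * CHAR_BIT - FLT_MANT_DIG, with FLT_MANT_DIG = 24
print(exp_dig(32, 24))                                # -> 8
```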
375 TSDL metadata representation:
379 exp_dig = /* value */;
380 mant_dig = /* value */;
381 byte_order = /* native OR network OR be OR le */;
386 Example of type inheritance:
389 typealias floating_point {
390 exp_dig = 8; /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
391 mant_dig = 24; /* FLT_MANT_DIG */
397 TODO: define NaN, +inf, -inf behavior.
399 Bit-packed, byte-packed or larger alignments can be used for floating
400 point values, similarly to integers.
#### 4.1.8 Enumerations
405 Enumerations are a mapping between an integer type and a table of
406 strings. The numerical representation of the enumeration follows the
407 integer type specified by the metadata. The enumeration mapping table
408 is detailed in the enumeration description within the metadata. The
mapping table maps inclusive value ranges (or single values) to strings:
instead of being limited to simple `value -> string` mappings, these
enumerations map `[ start_value ... end_value ] -> string`. An enumeration from the C
413 language can be represented in this format by having the same
414 `start_value` and `end_value` for each mapping, which is in fact a
415 range of size 1. This single-value range is supported without repeating
416 the start and end values with the `value = string` declaration.
417 Enumerations need to contain at least one entry.
420 enum name : integer_type {
421 somestring = /* start_value1 */ ... /* end_value1 */,
422 "other string" = /* start_value2 */ ... /* end_value2 */,
423 yet_another_string, /* will be assigned to end_value2 + 1 */
424 "some other string" = /* value */,
If the values are omitted, the enumeration starts at 0 and increments
by 1 for each entry. An entry with an omitted value that follows a range
431 entry takes as value the `end_value` of the previous range + 1:
434 enum name : unsigned int {
443 Overlapping ranges within a single enumeration are implementation
446 A nameless enumeration can be declared as a field type or as part of
450 enum : integer_type {
455 Enumerations omitting the container type `: integer_type` use the `int`
456 type (for compatibility with C99). The `int` type _must be_ previously
460 typealias integer { size = 32; align = 32; signed = true; } := int;
467 An enumeration field can have an integral value for which the associated
468 enumeration type does not map to a string.
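
To make the value assignment and lookup rules concrete, here is a minimal sketch of an enumeration table. The names `build_enum` and `lookup` are hypothetical. Since the handling of overlapping ranges is left to the implementation, this sketch simply returns the first matching entry.

```python
from typing import List, Optional, Tuple

# (label, start_value, end_value); None means "omitted in the declaration".
Entry = Tuple[str, Optional[int], Optional[int]]


def build_enum(entries: List[Entry]) -> List[Tuple[int, int, str]]:
    """Resolve omitted values into explicit inclusive ranges."""
    if not entries:
        raise ValueError("enumerations need at least one entry")
    table = []
    next_value = 0  # enumerations start at 0 when values are omitted
    for label, start, end in entries:
        if start is None:
            start = next_value
        if end is None:
            end = start  # single value: a range of size 1
        if end < start:
            raise ValueError(f"invalid range for {label!r}")
        table.append((start, end, label))
        next_value = end + 1  # an omitted value follows the previous end
    return table


def lookup(table: List[Tuple[int, int, str]], value: int) -> Optional[str]:
    """Return the label mapped to value, or None if no mapping exists."""
    for start, end, label in table:
        if start <= value <= end:
            return label
    return None


table = build_enum([("ZERO", None, None), ("ONE", None, None),
                    ("RANGE", 10, 19), ("AFTER", None, None)])
print(table)              # -> [(0, 0, 'ZERO'), (1, 1, 'ONE'), (10, 19, 'RANGE'), (20, 20, 'AFTER')]
print(lookup(table, 15))  # -> RANGE
print(lookup(table, 5))   # -> None
```

Returning `None` reflects the rule that a field may hold an integral value with no associated string.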
### 4.2 Compound types

Compound types are aggregations of type declarations. Compound types include
structures, variants, arrays, sequences, and strings.
#### 4.2.1 Structures
478 Structures are aligned on the largest alignment required by basic types
479 contained within the structure. (This follows the ISO/C standard for
482 TSDL metadata representation of a named structure:
486 field_type field_name;
487 field_type field_name;
496 integer { /* nameless type */
501 uint64_t second_field_name; /* named type declared in the metadata */
505 The fields are placed in a sequence next to each other. They each
506 possess a field name, which is a unique identifier within the structure.
507 The identifier is not allowed to use any [reserved keyword](#specC.1.2).
508 Replacing reserved keywords with underscore-prefixed field names is
509 **recommended**. Fields starting with an underscore should have their
510 leading underscore removed by the CTF trace readers.
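
A reader might lay out the fields of a structure as sketched below. The names `layout_struct` and `align_up` are hypothetical, fields are described only by size and alignment in bits, and variants are left out of this simplified model.

```python
from typing import Dict, List, Tuple

Field = Tuple[str, int, int]  # (name, size_bits, align_bits)


def align_up(offset: int, align: int) -> int:
    """Round offset up to the next multiple of align."""
    return (offset + align - 1) // align * align


def layout_struct(start_bits: int, fields: List[Field]) -> Tuple[Dict[str, int], int]:
    """Return ({field name: bit offset}, end offset) for a structure."""
    # The structure is aligned on the largest alignment of its fields.
    struct_align = max(align for _, _, align in fields)
    offset = align_up(start_bits, struct_align)
    offsets = {}
    for name, size, align in fields:
        offset = align_up(offset, align)
        # Readers drop one leading underscore from field names.
        shown = name[1:] if name.startswith("_") else name
        offsets[shown] = offset
        offset += size
    return offsets, offset


offsets, end = layout_struct(8, [("a", 8, 8), ("_event", 32, 32), ("c", 16, 8)])
print(offsets)  # -> {'a': 32, 'event': 64, 'c': 96}
print(end)      # -> 112
```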
512 A nameless structure can be declared as a field type or as part of
521 Alignment for a structure compound type can be forced to a minimum
522 value by adding an `align` specifier after the declaration of a
523 structure body. This attribute is read as: `align(value)`. The value is
524 specified in bits. The structure will be aligned on the maximum value
525 between this attribute and the alignment required by the basic types
526 contained within the structure. e.g.
#### 4.2.2 Variants (discriminated/tagged unions)
536 A CTF variant is a selection between different types. A CTF variant must
537 always be defined within the scope of a structure or within fields
538 contained within a structure (defined recursively). A _tag_ enumeration
539 field must appear in either the same static scope, prior to the variant
540 field (in field declaration order), in an upper static scope, or in an
541 upper dynamic scope (see [Static and dynamic scopes](#spec7.3.2)).
542 The type selection is indicated by the mapping from the enumeration
543 value to the string used as variant type selector. The field to use as
544 tag is specified by the `tag_field`, specified between `< >` after the
545 `variant` keyword for unnamed variants, and after _variant name_ for
546 named variants. It is not required that each enumeration mapping appears
547 as variant type tag field. It is also not required that each variant
548 type tag appears as enumeration mapping. However, it is required that
549 any enumeration mapping encountered within a stream has a matching
550 variant type tag field.
552 The alignment of the variant is the alignment of the type as selected
553 by the tag value for the specific instance of the variant. The size of
554 the variant is the size as selected by the tag value for the specific
555 instance of the variant.
557 The alignment of the type containing the variant is independent of the
558 variant alignment. For instance, if a structure contains two fields, a
559 32-bit integer, aligned on 32 bits, and a variant, which contains two
560 choices: either a 32-bit field, aligned on 32 bits, or a 64-bit field,
aligned on 64 bits, the alignment of the outermost structure will be
562 32-bit (the alignment of its largest field, disregarding the alignment
563 of the variant). The alignment of the variant will depend on the
564 selector: if the variant's 32-bit field is selected, its alignment will
565 be 32-bit, or 64-bit otherwise. It is important to note that variants
566 are specifically tailored for compactness in a stream. Therefore, the
567 relative offsets of compound type fields can vary depending on the
568 offset at which the compound type starts if it contains a variant
569 that itself contains a type with alignment larger than the largest field
570 contained within the compound type. This is caused by the fact that the
compound type may contain the enumeration that selects the variant's
572 choice, and therefore the alignment to be applied to the compound type
573 cannot be determined before encountering the enumeration.
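
The effect described above can be checked with a few lines of arithmetic. This sketch models the example structure (a 32-bit integer aligned on 32 bits, followed by a variant) and shows that the variant's offset relative to the structure start depends on where the structure begins. The function names are hypothetical.

```python
def align_up(offset: int, align: int) -> int:
    """Round offset up to the next multiple of align."""
    return (offset + align - 1) // align * align


def variant_relative_offset(struct_start_bits: int, variant_align: int) -> int:
    """Offset of the variant, relative to the start of the structure."""
    start = align_up(struct_start_bits, 32)  # structure alignment: 32 bits
    after_int = start + 32                   # the 32-bit integer field
    return align_up(after_int, variant_align) - start


# 64-bit choice selected: the relative offset depends on the structure start.
print(variant_relative_offset(0, 64))   # -> 64
print(variant_relative_offset(32, 64))  # -> 32

# 32-bit choice selected: the relative offset is stable.
print(variant_relative_offset(0, 32))   # -> 32
print(variant_relative_offset(32, 32))  # -> 32
```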
Each variant type selector possesses a field name, which is a unique
identifier within the variant. The identifier is not allowed to use any
[reserved keyword](#specC.1.2). Replacing reserved keywords with
578 underscore-prefixed field names is recommended. Fields starting with an
579 underscore should have their leading underscore removed by the CTF trace
582 A named variant declaration followed by its definition within a
583 structure declaration:
594 enum : integer_type { sel1, sel2, sel3, /* ... */ } tag_field;
596 variant name <tag_field> v;
600 An unnamed variant definition within a structure is expressed by the
601 following TSDL metadata:
605 enum : integer_type { sel1, sel2, sel3, /* ... */ } tag_field;
607 variant <tag_field> {
616 Example of a named variant within a sequence that refers to a single
627 enum : uint2_t { a, b, c } choice;
629 variant example <choice> v[seqlen];
633 Example of an unnamed variant:
637 enum : uint2_t { a, b, c, d } choice;
639 /* Unrelated fields can be added between the variant and its tag */
653 Example of an unnamed variant within an array:
657 enum : uint2_t { a, b, c } choice;
666 Example of a variant type definition within a structure, where the
667 defined type is then declared within an array of structures. This
668 variant refers to a tag located in an upper static scope. This example
669 clearly shows that a variant type definition referring to the tag `x`
670 uses the closest preceding field from the static scope of the type
675 enum : uint2_t { a, b, c, d } x;
678 * "x" refers to the preceding "x" enumeration in the
679 * static scope of the type definition.
681 typedef variant <x> {
688 enum : int { x, y, z } x; /* This enumeration is not used by "v". */
690 /* "v" uses the "enum : uint2_t { a, b, c, d }" tag. */
699 Arrays are fixed-length. Their length is declared in the type
700 declaration within the metadata. They contain an array of _inner type_
701 elements, which can refer to any type not containing the type of the
702 array being declared (no circular dependency). The length is the number
703 of elements in an array.
705 TSDL metadata representation of a named array:
708 typedef elem_type name[/* length */];
711 A nameless array can be declared as a field type within a
715 uint8_t field_name[10];
718 Arrays are always aligned on their element alignment requirement.
723 Sequences are dynamically-sized arrays. They refer to a _length_
724 unsigned integer field, which must appear in either the same static
725 scope, prior to the sequence field (in field declaration order),
726 in an upper static scope, or in an upper dynamic scope
727 (see [Static and dynamic scopes](#spec7.3.2)). This length field represents
728 the number of elements in the sequence. The sequence per se is an
729 array of _inner type_ elements.
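
Reading a sequence therefore takes two steps: fetch the length field, then read that many elements. The sketch below makes several simplifying assumptions that are not part of the format: the length is a little-endian 32-bit unsigned integer placed immediately before the sequence, the elements are byte-packed little-endian 16-bit unsigned integers, and `read_sequence` is a hypothetical name.

```python
import struct
from typing import List, Tuple


def read_sequence(buf: bytes, offset: int = 0) -> Tuple[List[int], int]:
    """Return (elements, byte offset just past the sequence)."""
    (length,) = struct.unpack_from("<I", buf, offset)   # length field
    offset += 4
    elems = struct.unpack_from(f"<{length}H", buf, offset)
    offset += 2 * length
    return list(elems), offset


data = struct.pack("<I3H", 3, 10, 20, 30) + b"\xff"
print(read_sequence(data))  # -> ([10, 20, 30], 10)
```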
731 TSDL metadata representation for a sequence type definition:
735 unsigned int length_field;
736 typedef elem_type typename[length_field];
737 typename seq_field_name;
741 A sequence can also be declared as a field type, e.g.:
745 unsigned int length_field;
746 long seq_field_name[length_field];
750 Multiple sequences can refer to the same length field, and these length
751 fields can be in a different upper dynamic scope, e.g., assuming the
752 `stream.event.header` defines:
758 event.header := struct {
767 long seq_a[stream.event.header.seq_len];
768 char seq_b[stream.event.header.seq_len];
773 The sequence elements follow the [array](#spec4.2.3) specifications.
Strings are arrays of _bytes_ of variable size, terminated by
a `'\0'` (null) character. Their encoding is described in the TSDL
780 metadata. In absence of encoding attribute information, the default
783 TSDL metadata representation of a named string type:
787 encoding = /* UTF8 OR ASCII */;
791 A nameless string type can be declared as a field type:
794 string field_name; /* use default UTF8 encoding */
797 Strings are always aligned on byte size.
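
A string field can be read by aligning to the next byte and scanning for the terminator. This is a minimal sketch with a hypothetical helper name, `read_string`; it takes and returns offsets in bits.

```python
from typing import Tuple


def read_string(buf: bytes, offset_bits: int,
                encoding: str = "utf-8") -> Tuple[str, int]:
    """Return (decoded string, bit offset just past the terminator)."""
    start = (offset_bits + 7) // 8    # strings are aligned on byte size
    end = buf.index(b"\x00", start)   # raises ValueError if unterminated
    return buf[start:end].decode(encoding), (end + 1) * 8


buf = b"\x01hello\x00rest"
print(read_string(buf, 3))  # -> ('hello', 56)
```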
## 5. Event packet header

The event packet header consists of two parts. The first part, the
_event packet header_, is the same for all streams of a trace. The
804 second part, the _event packet context_, is described on a per-stream
805 basis. Both are described in the TSDL metadata.
807 Event packet header (all fields are optional, specified by
810 * **Magic number** (CTF magic number: 0xC1FC1FC1) specifies that this is
811 a CTF packet. This magic number is optional, but when present, it
812 should come at the very beginning of the packet.
* **Trace UUID**, used to ensure the event packet matches the metadata used.
Note: we cannot use a metadata checksum in every case instead of a
815 UUID because metadata can be appended to while tracing is active.
816 This field is optional.
817 * **Stream ID**, used as reference to stream description in metadata.
818 This field is optional if there is only one stream description in
the metadata, but becomes required if there is more than one
820 stream in the TSDL metadata description.
822 Event packet context (all fields are optional, specified by
825 * Event packet **content size** (in bits).
826 * Event packet **size** (in bits, includes padding).
827 * Event packet content checksum. Checksum excludes the event packet
829 * Per-stream event **packet sequence count** (to deal with UDP packet
830 loss). The number of significant sequence counter bits should also
831 be present, so wrap-arounds are dealt with correctly.
* Timestamp at the beginning and timestamp at the end of the event
833 packet. Both timestamps are written in the packet header, but
834 sampled respectively while (or before) writing the first event and
835 while (or after) writing the last event in the packet. The inclusive
836 range between these timestamps should include all event timestamps
837 assigned to events contained within the packet. The timestamp at the
838 beginning of an event packet is guaranteed to be less than or equal
839 to the timestamp at the end of that event packet. The timestamp at
the beginning of an event packet is guaranteed to be greater than or
841 equal to timestamps at the beginning of any prior packet within the
842 same stream. The timestamp at the end of an event packet is
843 guaranteed to be less than or equal to the timestamps at the end of
844 any following packet within the same stream. See [Clocks](#spec8)
846 * **Events discarded count**. Snapshot of a per-stream
847 free-running counter, counting the number of events discarded that
848 were supposed to be written in the stream after the last event in
849 the event packet. Note: producer-consumer buffer full condition can
850 fill the current event packet with padding so we know exactly where
851 events have been discarded. However, if the buffer full condition
852 chooses not to fill the current event packet with padding, all we
853 know about the timestamp range in which the events have been
854 discarded is that it is somewhere between the beginning and the end
856 * Lossless **compression scheme** used for the event packet content.
857 Applied directly to raw data. New types of compression can be added
858 in following versions of the format.
859 * 0: no compression scheme
863 * **Cypher** used for the event packet content. Applied after
867 * **Checksum scheme** used for the event packet content. Applied after
### 5.1 Event packet header description
877 The event packet header layout is indicated by the
878 `trace.packet.header` field. Here is a recommended structure type for
879 the packet header with the fields typically expected (although these
880 fields are each optional):
883 struct event_packet_header {
891 packet.header := struct event_packet_header;
895 If the magic number (`magic` field) is not present,
tools such as `file` will have no means to discover the file type.
898 If the `uuid` field is not present, no validation that the metadata
899 actually corresponds to the stream is performed.
901 If the `stream_id` packet header field is missing, the trace can only
902 contain a single stream. Its `id` field can be left out, and its events
903 don't need to declare a `stream_id` field.
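
A reader could validate a packet header along these lines. The layout used here is an assumption made for the sake of the example: a 32-bit `magic`, a 16-byte `uuid` and a 32-bit `stream_id`, all present, byte-packed and in that order. Since every field is optional, a real reader must follow the layout declared in the metadata instead. The name `parse_packet_header` is hypothetical.

```python
import struct
import uuid
from typing import Tuple

CTF_MAGIC = 0xC1FC1FC1


def parse_packet_header(buf: bytes, byte_order: str = "le") -> Tuple[uuid.UUID, int]:
    """Return (trace UUID, stream ID) after checking the magic number."""
    prefix = "<" if byte_order == "le" else ">"
    # Assumed layout: uint32 magic, 16-byte uuid, uint32 stream_id.
    magic, raw_uuid, stream_id = struct.unpack_from(prefix + "I16sI", buf, 0)
    if magic != CTF_MAGIC:
        raise ValueError("not a CTF packet (bad magic number)")
    return uuid.UUID(bytes=raw_uuid), stream_id


trace_uuid = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")  # made-up UUID
header = struct.pack("<I16sI", CTF_MAGIC, trace_uuid.bytes, 3)
print(parse_packet_header(header) == (trace_uuid, 3))  # -> True
```

Comparing the returned UUID with the one declared in the metadata is what lets a reader confirm that the stream and the metadata belong together.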
### 5.2 Event packet context description
908 Event packet context example. These are declared within the stream
909 declaration in the metadata. All these fields are optional. If the
910 packet size field is missing, the whole stream only contains a single
911 packet. If the content size field is missing, the packet is filled
912 (no padding). The content and packet sizes include all headers.
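
The defaults for missing size fields can be summarized in a short helper. This is a sketch with a hypothetical name, `effective_sizes`; all sizes are in bits, and `None` stands for a field that the packet context does not declare.

```python
from typing import Optional, Tuple


def effective_sizes(stream_bits: int,
                    packet_size: Optional[int],
                    content_size: Optional[int]) -> Tuple[int, int, int]:
    """Return (packet size, content size, padding), all in bits."""
    if packet_size is None:
        packet_size = stream_bits    # the whole stream is a single packet
    if content_size is None:
        content_size = packet_size   # the packet is filled: no padding
    if content_size > packet_size:
        raise ValueError("content size exceeds packet size")
    return packet_size, content_size, packet_size - content_size


print(effective_sizes(4096, None, None))  # -> (4096, 4096, 0)
print(effective_sizes(8192, 4096, 3000))  # -> (4096, 3000, 1096)
```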
914 An example event packet context type:
917 struct event_packet_context {
918 uint64_t timestamp_begin;
919 uint64_t timestamp_end;
921 uint32_t stream_packet_count;
922 uint32_t events_discarded;
924 uint64_t content_size;
925 uint64_t packet_size;
926 uint8_t compression_scheme;
927 uint8_t encryption_scheme;
928 uint8_t checksum_scheme;
## 6. Event Structure
935 The overall structure of an event is:
937 1. Event header (as specified by the stream metadata)
938 2. Stream event context (as specified by the stream metadata)
939 3. Event context (as specified by the event metadata)
940 4. Event payload (as specified by the event metadata)
942 This structure defines an implicit dynamic scoping, where variants
943 located in inner structures (those with a higher number in the listing
944 above) can refer to the fields of outer structures (with lower number
945 in the listing above). See [TSDL scopes](#spec7.3) for more detail.
947 The total length of an event is defined as the difference between the
948 end of its event payload and the end of the previous event's event
949 payload. Therefore, it includes the event header alignment padding, and
950 all its fields and their respective alignment padding. Events of length
956 Event headers can be described within the metadata. We hereby propose,
as an example, two types of event headers. Type 1 accommodates streams
with fewer than 31 event IDs. Type 2 accommodates streams with 31 or
961 One major factor can vary between streams: the number of event IDs
962 assigned to a stream. Luckily, this information tends to stay
relatively constant (modulo event registration while the trace is being
recorded). We can therefore specify different representations for streams
containing few event IDs and streams containing many event IDs, and
end up representing the event ID and timestamp as densely as possible
The header is extended on the rare occasions where the information
cannot be represented in the ranges available in the standard event
header. Extended headers are also used on the rare occasions where the data
972 required for a field could not be collected: the flag corresponding to
973 the missing field within the `missing_fields` array is then set to 1.
975 Types `uintX_t` represent an `X`-bit unsigned integer, as declared with
996 For more information about timestamp fields, see [Clocks](#spec8).
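
As an illustration of how a reader might tell compact and extended headers apart, the sketch below decodes the 32-bit compact form of the Type 1 header described in section 6.1.1. It rests on assumptions: a little-endian trace, a header aligned on a byte boundary, and a compact timestamp that fills the 27 bits left after the 5-bit `id`. The function name is hypothetical, and decoding of the extended form is left out.

```python
import struct
from typing import Optional, Tuple

EXTENDED_ID = 31  # reserved id announcing an extended header


def decode_compact_header(buf: bytes, offset: int = 0) -> Optional[Tuple[int, int]]:
    """Return (event id, timestamp), or None if the header is extended."""
    (word,) = struct.unpack_from("<I", buf, offset)
    event_id = word & 0x1F   # low 5 bits: first field on little endian
    if event_id == EXTENDED_ID:
        return None          # caller must decode the extended fields
    timestamp = word >> 5    # assumed: remaining 27 bits
    return event_id, timestamp


print(decode_compact_header(struct.pack("<I", (123456 << 5) | 7)))  # -> (7, 123456)
print(decode_compact_header(struct.pack("<I", 31)))                 # -> None
```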
#### 6.1.1 Type 1: few event IDs
1001 * Aligned on 32-bit (or 8-bit if byte-packed, depending on the
1002 architecture preference)
1003 * Native architecture byte ordering
1004 * For `compact` selection, fixed size of 32 bits
1005 * For "extended" selection, size depends on the architecture and
1009 struct event_header_1 {
1011 * id: range: 0 - 30.
1012 * id 31 is reserved to indicate an extended header.
1014 enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
1020 uint32_t id; /* 32-bit event IDs */
1021 uint64_t timestamp; /* 64-bit timestamps */
1024 } align(32); /* or align(8) */
#### 6.1.2 Type 2: many event IDs
1030 * Aligned on 16-bit (or 8-bit if byte-packed, depending on the
1031 architecture preference)
1032 * Native architecture byte ordering
1033 * For `compact` selection, size depends on the architecture and
1035 * For `extended` selection, size depends on the architecture and
1039 struct event_header_2 {
1041 * id: range: 0 - 65534.
1042 * id 65535 is reserved to indicate an extended header.
1044 enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
1050 uint32_t id; /* 32-bit event IDs */
1051 uint64_t timestamp; /* 64-bit timestamps */
1054 } align(16); /* or align(8) */
### 6.2 Stream event context and event context
1060 The event context contains information relative to the current event.
1061 The choice and meaning of this information is specified by the TSDL
1062 stream and event metadata descriptions. The stream context is applied
1063 to all events within the stream. The stream context structure follows
1064 the event header. The event context is applied to specific events. Its
1065 structure follows the stream context structure.
1067 An example of stream-level event context is to save the event payload
1068 size with each event, or to save the current PID with each event.
1069 These are declared within the stream declaration within the metadata:
1074 event.context := struct {
1076 uint16_t payload_size;
1081 An example of event-specific event context is to declare a bitmap of
1082 missing fields, only appended after the stream event context if the
1083 extended event header is selected. `NR_FIELDS` is the number of fields
1084 within the event (a numeric value).
1092 /* missing event fields bitmap */
1093 uint1_t missing_fields[NR_FIELDS];
### 6.3 Event payload
1104 An event payload contains fields specific to a given event type. The
1105 fields belonging to an event type are described in the event-specific
1106 metadata within a structure type.
There is no padding at the end of the event payload. This differs from the ISO/C
1112 standard for structures, but follows the CTF standard for structures.
1113 In a trace, even though it makes sense to align the beginning of a
1114 structure, it really makes no sense to add padding at the end of the
1115 structure, because structures are usually not followed by a structure
1118 This trick can be done by adding a zero-length `end` field at the end
1119 of the C structures, and by using the offset of this field rather than
1120 using `sizeof()` when calculating the size of a structure
1121 (see [Helper macros](#specA)).
1124 #### 6.3.2 Alignment
The event payload is aligned on the largest alignment required by types
contained within the payload. This follows the ISO/C standard for
structures.
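
As an illustration of this rule, the following C sketch derives a payload alignment from its field alignments and rounds a bit offset up to it. The function names are hypothetical, alignments are expressed in bits, and each alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest alignment (in bits) among the payload fields; 1 if there are none. */
static uint64_t payload_alignment(const uint64_t *field_align_bits, size_t n)
{
	uint64_t max = 1;
	size_t i;

	for (i = 0; i < n; i++)
		if (field_align_bits[i] > max)
			max = field_align_bits[i];
	return max;
}

/* Round a bit offset up to the next multiple of a power-of-two alignment. */
static uint64_t align_up_bits(uint64_t offset_bits, uint64_t align_bits)
{
	assert(align_bits != 0 && (align_bits & (align_bits - 1)) == 0);
	return (offset_bits + align_bits - 1) & ~(align_bits - 1);
}
```

A reader would typically call `align_up_bits()` on the current stream position before decoding the first payload field.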
1131 ## 7. Trace Stream Description Language (TSDL)
1133 The Trace Stream Description Language (TSDL) allows expression of the
1134 binary trace streams layout in a C99-like Domain Specific Language
1140 The trace stream layout description is located in the trace metadata.
1141 The metadata is itself located in a stream identified by its name:
1144 The metadata description can be expressed in two different formats:
1145 text-only and packet-based. The text-only description facilitates
1146 generation of metadata and provides a convenient way to enter the
1147 metadata information by hand. The packet-based metadata provides the
1148 CTF stream packet facilities (checksumming, compression, encryption,
1149 network-readiness) for metadata stream generated and transported by a
1152 The text-only metadata file is a plain-text TSDL description. This file
1153 must begin with the following characters to identify the file as a CTF
1154 TSDL text-based metadata file (without the double-quotes):
It must be followed by a space, and the version of the specification
that the CTF trace follows, e.g.:
These characters allow automated discovery of file type and CTF
specification version. They are interpreted as the beginning of a
comment by the TSDL metadata parser. The comment can be continued to
contain extra commented characters before it is closed.
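
A minimal C sketch of this discovery step is given here for illustration. It assumes that the identifying characters are the comment opener followed by a space and `CTF`, as in CTF 1.8 metadata files, and that the version is written as `major.minor`. The function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 and stores the version when `buf` starts with the assumed
 * text-only metadata signature, 0 otherwise.
 */
static int ctf_text_metadata_version(const char *buf,
		unsigned int *major, unsigned int *minor)
{
	/* Assumed signature: comment opener, space, "CTF", space. */
	static const char sig[] = "/" "* CTF ";

	if (strncmp(buf, sig, sizeof(sig) - 1) != 0)
		return 0;
	/* The version directly follows the signature, e.g. "1.8". */
	return sscanf(buf + sizeof(sig) - 1, "%u.%u", major, minor) == 2;
}
```

A file that fails this check is not necessarily invalid: it may use the packet-based representation instead.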
The packet-based metadata is made of _metadata packets_, each of which
starts with a metadata packet header. The packet-based metadata
description is detected by reading the magic number 0x75D11D57 at the
beginning of the file. This magic number is also used to detect the
endianness of the architecture by trying to read the CTF magic number
and its counterpart in reversed endianness. The events within the
metadata stream have neither event header nor event context. Each event
only contains a special _sequence_ payload, which is a sequence of bits
whose length is implicitly calculated by using the
`trace.packet.header.content_size` field, minus the packet header size.
The formatting of this sequence of bits is a plain-text representation
of the TSDL description. Each metadata packet starts with a special
packet header, specific to the metadata stream, which contains:
1188 struct metadata_packet_header {
1189 uint32_t magic; /* 0x75D11D57 */
uint8_t uuid[16]; /* Universally Unique Identifier */
1191 uint32_t checksum; /* 0 if unused */
1192 uint32_t content_size; /* in bits */
1193 uint32_t packet_size; /* in bits */
1194 uint8_t compression_scheme; /* 0 if unused */
1195 uint8_t encryption_scheme; /* 0 if unused */
1196 uint8_t checksum_scheme; /* 0 if unused */
1197 uint8_t major; /* CTF spec version major number */
1198 uint8_t minor; /* CTF spec version minor number */
1202 The packet-based metadata can be converted to a text-only metadata by
1203 concatenating all the strings it contains.
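
The following C sketch illustrates how a reader might detect packet-based metadata and compute the size of the text payload of one packet. It assumes the header is laid out without padding (4 + 16 + 4 + 4 + 4 + 5 bytes, 37 bytes in total). All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CTF_METADATA_MAGIC	0x75D11D57u
/* Assumed packed size of struct metadata_packet_header, in bits. */
#define METADATA_HEADER_BITS	(37u * 8u)

enum md_byte_order { MD_NOT_PACKET_BASED, MD_NATIVE, MD_REVERSED };

static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
		((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Inspect the first four bytes of the metadata file. */
static enum md_byte_order md_detect(const unsigned char *buf)
{
	uint32_t magic;

	memcpy(&magic, buf, sizeof(magic));
	if (magic == CTF_METADATA_MAGIC)
		return MD_NATIVE;
	if (swap32(magic) == CTF_METADATA_MAGIC)
		return MD_REVERSED;
	return MD_NOT_PACKET_BASED;	/* possibly text-only metadata */
}

/* Number of payload bits: content size minus the packet header size. */
static uint32_t md_payload_bits(uint32_t content_size_bits)
{
	assert(content_size_bits >= METADATA_HEADER_BITS);
	return content_size_bits - METADATA_HEADER_BITS;
}
```

When `md_detect()` reports `MD_REVERSED`, the remaining multi-byte header fields need the same byte swap before use.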
1205 In the textual representation of the metadata, the text contained
1206 within `/*` and `*/`, as well as within `//` and end of line, are
1207 treated as comments. Boolean values can be represented as `true`,
1208 `TRUE`, or `1` for true, and `false`, `FALSE`, or `0` for false. Within
1209 the string-based metadata description, the trace UUID is represented as
1210 a string of hexadecimal digits and dashes `-`. In the event packet
1211 header, the trace UUID is represented as an array of bytes.
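
As an illustration of the two UUID representations, this C sketch converts the textual form into the 16-byte array form. The function names are hypothetical, and the parser is deliberately lenient: it accepts a dash between any two bytes rather than enforcing the usual 8-4-4-4-12 grouping.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hexadecimal digit, or -1 if `c` is not a hex digit. */
static int hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Returns 0 on success, -1 if `text` does not encode exactly 16 bytes. */
static int uuid_text_to_bytes(const char *text, uint8_t out[16])
{
	size_t n = 0;

	while (*text != '\0') {
		int hi, lo;

		if (*text == '-') {	/* dashes only separate groups */
			text++;
			continue;
		}
		hi = hex_value(text[0]);
		if (hi < 0)
			return -1;
		lo = hex_value(text[1]);
		if (lo < 0 || n == 16)
			return -1;
		out[n++] = (uint8_t)((hi << 4) | lo);
		text += 2;
	}
	return n == 16 ? 0 : -1;
}
```

The resulting array can then be compared byte for byte with the `uuid` field found in packet headers.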
1214 ### 7.2 Declaration vs definition
1216 A declaration associates a layout to a type, without specifying where
1217 this type is located in the event [structure hierarchy](#spec6).
1218 This therefore includes `typedef`, `typealias`, as well as all type
specifiers. In certain circumstances (`typedef`, structure field and
variant field), a declaration is followed by a declarator, which
specifies the newly defined type name (for `typedef`), or the field name
(for declarations located within structures and variants). Arrays and
sequences, declared with square brackets (`[` `]`), are part of the
declarator,
1224 similarly to C99. The enumeration base type is specified by
1225 `: enum_base`, which is part of the type specifier. The variant tag
1226 name, specified between `<` `>`, is also part of the type specifier.
1228 A definition associates a type to a location in the event
1229 [structure hierarchy](#spec6). This association is denoted by `:=`,
1230 as shown in [TSDL scopes](#spec7.3).
TSDL uses three different types of scoping: a lexical scope is used for
declarations and type definitions, and static and dynamic scopes are
used for variant references to tag fields (with relative and absolute
path lookups) and for sequence references to length fields.
1241 #### 7.3.1 Lexical Scope
Each of `trace`, `env`, `stream`, `event`, `struct` and `variant` has
its own nestable declaration scope, within which types can be declared
using `typedef` and `typealias`. A root declaration scope also contains
all declarations located outside of any of the aforementioned
declarations. An inner declaration scope can refer to types declared
within its container lexical scope prior to the inner declaration scope.
Redefinition of a typedef or typealias is not valid, although hiding an
upper scope typedef or typealias is allowed within a sub-scope.
1253 #### 7.3.2 Static and dynamic scopes
A local static scope consists of the scope generated by the declaration
of fields within a compound type. A static scope is a local static scope
augmented with the nested sub-static-scopes it contains.

A dynamic scope consists of the static scope augmented with the
implicit [event structure](#spec6) definition hierarchy.
Multiple declarations of the same field name within a local static scope
are not valid. It is however valid to re-use the same field name in
different local scopes.
Nested static and dynamic scopes form lookup paths. These are used for
variant tag and sequence length references. They are used at the variant
and sequence definition site to look up the location of the tag field
associated with a variant, and to look up the location of the length
field associated with a sequence.
Variants and sequences can refer to a tag or length field using either
a relative path or an absolute path. The relative path is relative to
the scope in which the variant or sequence performing the lookup is
located. Relative paths are only allowed to look up within the same
static scope, which includes its nested static scopes. Lookups targeting
parent static scopes need to be performed with an absolute path.
1279 Absolute path lookups use the full path including the dynamic scope
1280 followed by a `.` and then the static scope. Therefore, variants (or
1281 sequences) in lower levels in the dynamic scope (e.g., event context)
1282 can refer to a tag (or length) field located in upper levels
1283 (e.g., in the event header) by specifying, in this case, the associated
1284 tag with `<stream.event.header.field_name>`. This allows, for instance,
1285 the event context to define a variant referring to the `id` field of
1286 the event header as selector.
1288 The dynamic scope prefixes are thus:
1290 * Trace environment: `<env. >`
1291 * Trace packet header: `<trace.packet.header. >`
1292 * Stream packet context: `<stream.packet.context. >`
1293 * Event header: `<stream.event.header. >`
1294 * Stream event context: `<stream.event.context. >`
1295 * Event context: `<event.context. >`
1296 * Event payload: `<event.fields. >`
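
As an illustration, the following C sketch tells absolute references from relative ones by matching a path against the prefixes listed above. The function and table names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Dynamic scope prefixes, as listed in this section. */
static const char *const dynamic_scope_prefixes[] = {
	"env.",
	"trace.packet.header.",
	"stream.packet.context.",
	"stream.event.header.",
	"stream.event.context.",
	"event.context.",
	"event.fields.",
};

/*
 * Returns the dynamic scope prefix that `path` starts with, or NULL
 * when the path is relative to the current static scope.
 */
static const char *dynamic_scope_prefix(const char *path)
{
	size_t i;
	size_t n = sizeof(dynamic_scope_prefixes) /
		sizeof(dynamic_scope_prefixes[0]);

	for (i = 0; i < n; i++) {
		const char *prefix = dynamic_scope_prefixes[i];

		if (strncmp(path, prefix, strlen(prefix)) == 0)
			return prefix;
	}
	return NULL;
}
```

For example, `stream.event.header.id` resolves to the event header scope, while a bare `len` is treated as a relative path.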
1298 The target dynamic scope must be specified explicitly when referring to
1299 a field outside of the static scope (absolute scope reference). No
1300 conflict can occur between relative and dynamic paths, because the
1301 keywords `trace`, `stream`, and `event` are reserved, and thus not
1302 permitted as field names. It is recommended that field names clashing
1303 with CTF and C99 reserved keywords use an underscore prefix to
1304 eliminate the risk of generating a description containing an invalid
1305 field name. Consequently, fields starting with an underscore should have
1306 their leading underscore removed by the CTF trace readers.
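
A trace reader could apply the underscore rule with a helper such as the following C sketch. The function name is hypothetical, and stripping exactly one leading underscore is an assumption about how the rule is applied.

```c
/* Returns the name to present to the user for a TSDL field name. */
static const char *ctf_display_name(const char *field_name)
{
	/* Drop a single leading underscore added to avoid keyword clashes. */
	return field_name[0] == '_' ? field_name + 1 : field_name;
}
```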
1308 The information available in the dynamic scopes can be thought of as the
1309 current tracing context. At trace production, information about the
1310 current context is saved into the specified scope field levels. At trace
1311 consumption, for each event, the current trace context is therefore
1312 readable by accessing the upper dynamic scopes.
1315 ### 7.4 TSDL examples
The grammar representing the TSDL metadata is presented in
[TSDL grammar](#specC). This section offers lighter reading: examples
of TSDL metadata, with template values.
1321 The stream ID can be left out if there is only one stream in the
1322 trace. The event `id` field can be left out if there is only one event
1327 major = /* value */; /* CTF spec version major number */
1328 minor = /* value */; /* CTF spec version minor number */
1329 uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"; /* Trace UUID */
1330 byte_order = /* be OR le */; /* Endianness (required) */
1331 packet.header := struct {
1339 * The "env" (environment) scope contains assignment expressions. The
1340 * field names and content are implementation-defined.
1343 pid = /* value */; /* example */
1344 proc_name = "name"; /* example */
1349 id = /* stream_id */;
1350 /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
1351 event.header := /* event_header_1 OR event_header_2 */;
1352 event.context := struct {
1355 packet.context := struct {
1361 name = "event_name";
1362 id = /* value */; /* Numeric identifier within the stream */
1363 stream_id = /* stream_id */;
1364 loglevel = /* value */;
1365 model.emf.uri = "string";
1375 name = "event_name";
1383 More detail on [types](#spec4):
1389 * Type declarations behave similarly to the C standard.
1392 typedef aliased_type_specifiers new_type_declarators;
1394 /* e.g.: typedef struct example new_type_name[10]; */
1399 * The "typealias" declaration can be used to give a name (including
1400 * pointer declarator specifier) to a type. It should also be used to
1401 * map basic C types (float, int, unsigned long, ...) to a CTF type.
1402 * Typealias is a superset of "typedef": it also allows assignment of a
1403 * simple variable identifier to a type.
1406 typealias type_class {
1408 } := type_specifiers type_declarator;
1412 * typealias integer {
1416 * } := struct page *;
1418 * typealias integer {
1433 enum name : integer_type {
1438 Unnamed types, contained within compound type fields, `typedef` or
1460 enum : integer_type {
1466 typedef type new_type[length];
1469 type field_name[length];
1474 typedef type new_type[length_type];
1477 type field_name[length_type];
1495 integer_type field_name:size; /* GNU/C bitfield */
Clock metadata describes the clock topology of the system and details
each clock's parameters. In the absence of a clock description, it is
assumed that all fields named `timestamp` use the same clock source,
which increments once per nanosecond.
1513 Describing a clock and how it is used by streams is threefold: first,
1514 the clock and clock topology should be described in a `clock`
1515 description block, e.g.:
1519 name = cycle_counter_sync;
1520 uuid = "62189bee-96dc-11e0-91a8-cfa3d89f3923";
1521 description = "Cycle counter synchronized across CPUs";
1522 freq = 1000000000; /* frequency, in Hz */
1523 /* precision in seconds is: 1000 * (1/freq) */
1526 * clock value offset from Epoch is:
1527 * offset_s + (offset * (1/freq))
1529 offset_s = 1326476837;
1535 The mandatory `name` field specifies the name of the clock identifier,
1536 which can later be used as a reference. The optional field `uuid` is
1537 the unique identifier of the clock. It can be used to correlate
1538 different traces that use the same clock. An optional textual
1539 description string can be added with the `description` field. The
1540 `freq` field is the initial frequency of the clock, in Hz. If the
1541 `freq` field is not present, the frequency is assumed to be 1000000000
1542 (providing clock increment of 1 ns). The optional `precision` field
1543 details the uncertainty on the clock measurements, in (1/freq) units.
The `offset_s` and `offset` fields indicate the offset from
POSIX.1 Epoch, 1970-01-01 00:00:00 +0000 (UTC), to the zero value of
the clock. The `offset_s` field is in seconds. The `offset` field is
in (1/freq) units. If either the `offset_s` or the `offset` field is
not present, it is assigned the value 0. The field `absolute` is `TRUE`
if the clock is a global reference across different clock UUIDs
(e.g. NTP time). Otherwise, `absolute` is `FALSE`, and the clock can
be considered as synchronized only with other clocks that have the same
UUID.
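
As an illustration of how these fields combine, the following C sketch converts a raw clock value to nanoseconds since Epoch, reading the description above as `offset_s + (offset + value) / freq` seconds. The structure and function names are hypothetical, and the sketch assumes `freq` is small enough for the intermediate product to fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory form of the relevant clock description fields. */
struct clock_desc {
	uint64_t freq;		/* frequency, in Hz */
	int64_t offset_s;	/* offset from Epoch, in seconds */
	uint64_t offset;	/* additional offset, in (1/freq) units */
};

static int64_t clock_value_to_epoch_ns(const struct clock_desc *clk,
		uint64_t value)
{
	uint64_t cycles, secs, rem;

	/* Keep rem * 10^9 below 2^64. */
	assert(clk->freq > 0 &&
		clk->freq <= UINT64_MAX / UINT64_C(1000000000));
	cycles = clk->offset + value;
	secs = cycles / clk->freq;	/* whole seconds */
	rem = cycles % clk->freq;	/* leftover (1/freq) units */
	return (clk->offset_s + (int64_t)secs) * INT64_C(1000000000)
		+ (int64_t)(rem * UINT64_C(1000000000) / clk->freq);
}
```

Splitting the value into whole seconds and a remainder avoids multiplying the full counter by 10^9, which would overflow for long-running clocks.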
1554 Secondly, a reference to this clock should be added within an integer
1559 size = 64; align = 1; signed = false;
1560 map = clock.cycle_counter_sync.value;
1564 Thirdly, stream declarations can reference the clock they use as a
1568 struct packet_context {
1569 uint64_ccnt_t ccnt_begin;
1570 uint64_ccnt_t ccnt_end;
1576 event.header := struct {
1577 uint64_ccnt_t timestamp;
1580 packet.context := struct packet_context;
The following rule applies to fields of N-bit integer type referring to
a clock within the stream event context, event context, and event
payload. If the integer overflows compared to the N low-order bits of
the prior clock value found in the same stream, then it is assumed that
one, and only one, overflow occurred. It is therefore important that
events encoding time on a small number of bits happen frequently
enough that no more than one N-bit overflow occurs between two
consecutive events.
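
The single-overflow assumption can be sketched in C as follows. The function name is hypothetical, and the reader is assumed to track the full clock value of each stream as a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Given the previous full clock value of the stream and a new N-bit
 * field, return the updated full clock value.
 */
static uint64_t clock_update(uint64_t prev, uint64_t field,
		unsigned int nbits)
{
	uint64_t mask, next;

	assert(nbits >= 1 && nbits <= 64);
	if (nbits == 64)
		return field;	/* complete representation */
	mask = (UINT64_C(1) << nbits) - 1;
	assert(field <= mask);
	/* Replace the N low-order bits with the new field. */
	next = (prev & ~mask) | field;
	/* A smaller low part means exactly one overflow is assumed. */
	if (field < (prev & mask))
		next += mask + 1;
	return next;
}
```

If two or more overflows occur between consecutive events, this reconstruction silently yields a value that is too small.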
In a packet context, clock field names ending with `_begin` and `_end`
have a special meaning: they refer to the timestamps at, respectively,
the beginning and the end of each packet. Those are required to be
complete representations of the clock value.
1599 The two following macros keep track of the size of a GNU/C structure
1600 without padding at the end by placing HEADER_END as the last field.
1601 A one byte end field is used for C90 compatibility (C99 flexible arrays
1602 could be used here). Note that this does not affect the effective
1603 structure size, which should always be calculated with the
1604 `header_sizeof()` helper.
1607 #define HEADER_END char end_field
1608 #define header_sizeof(type) offsetof(typeof(type), end_field)
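
A short usage sketch follows. It is a variant of the macros above that calls `offsetof()` directly on the type instead of going through the GNU `typeof` extension, so that it compiles as standard C. The `example_header` structure is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_END		char end_field
/* Portable variant: no typeof(), the argument must be a type name. */
#define header_sizeof(type)	offsetof(type, end_field)

/* Hypothetical header: 9 bytes of content, padded by the compiler. */
struct example_header {
	uint64_t timestamp;
	uint8_t id;
	HEADER_END;
};

/* Size to write into the trace: content only, without tail padding. */
static size_t example_header_size(void)
{
	return header_sizeof(struct example_header);
}
```

Here `example_header_size()` evaluates to 9, whereas `sizeof(struct example_header)` is larger because of the end field and the tail padding.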
1611 ## B. Stream header rationale
An event stream is divided into contiguous event packets of variable
size. These subdivisions allow the trace analyzer to perform a fast
binary search by time within the stream (typically requiring indexing
of only the event packet headers) without reading the whole stream. These
1617 subdivisions have a variable size to eliminate the need to transfer the
1618 event packet padding when partially filled event packets must be sent
1619 when streaming a trace for live viewing/analysis. An event packet can
1620 contain a certain amount of padding at the end. Dividing streams into
1621 event packets is also useful for network streaming over UDP and flight
1622 recorder mode tracing (a whole event packet can be swapped out of the
1623 buffer atomically for reading).
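
The binary search by time mentioned above can be sketched in C as follows. The index entry layout and function name are hypothetical. The index is assumed to be built from the packet headers and contexts, sorted by time, with non-overlapping packet time ranges.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet index entry. */
struct packet_index_entry {
	uint64_t ts_begin;	/* timestamp at the beginning of the packet */
	uint64_t ts_end;	/* timestamp at the end of the packet */
	uint64_t offset;	/* position of the packet within the stream */
};

/*
 * Returns the index of the packet whose time range contains `ts`,
 * or -1 if no packet covers that timestamp.
 */
static long find_packet(const struct packet_index_entry *idx, size_t n,
		uint64_t ts)
{
	size_t lo = 0, hi = n;

	/* Find the first packet that ends at or after `ts`. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (idx[mid].ts_end < ts)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == n || idx[lo].ts_begin > ts)
		return -1;
	return (long)lo;
}
```

Only the matching packet then needs to be read and decoded to locate the events around the requested time.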
1625 The stream header is repeated at the beginning of each event packet to
1626 allow flexibility in terms of:
1629 * allowing arbitrary buffers to be discarded without making the trace
* handling UDP packet loss by either dealing with missing event packets
  or asking for re-transmission
* transparently supporting flight recorder mode
* transparently supporting crash dump
1641 * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
1643 * Inspired from the C99 grammar:
1644 * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
1645 * and c++1x grammar (draft)
1646 * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
 * Specialized for CTF needs by including only constants and declarations
 * from C99 (excluding function declarations), and by adding support for
 * variants, sequences and CTF-specific specifiers. Enumeration container
 * type semantics are inspired from c++1x enum-base.
1656 ### C.1 Lexical grammar
1659 #### C.1.1 Lexical elements
1706 #### C.1.3 Identifiers
1711 identifier identifier-nondigit
1714 identifier-nondigit:
1716 universal-character-name
1717 any other implementation-defined characters
1721 [a-zA-Z] /* regular expression */
1724 [0-9] /* regular expression */
1728 #### C.1.4 Universal character names
1731 universal-character-name:
1733 \U hex-quad hex-quad
1736 hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
#### C.1.5 Constants
1745 enumeration-constant
1749 decimal-constant integer-suffix-opt
1750 octal-constant integer-suffix-opt
1751 hexadecimal-constant integer-suffix-opt
1755 decimal-constant digit
1759 octal-constant octal-digit
1761 hexadecimal-constant:
1762 hexadecimal-prefix hexadecimal-digit
1763 hexadecimal-constant hexadecimal-digit
1773 unsigned-suffix long-suffix-opt
1774 unsigned-suffix long-long-suffix
1775 long-suffix unsigned-suffix-opt
1776 long-long-suffix unsigned-suffix-opt
1790 enumeration-constant:
1796 L' c-char-sequence '
1800 c-char-sequence c-char
1803 any member of source charset except single-quote ('), backslash
1804 (\), or new-line character.
1808 simple-escape-sequence
1809 octal-escape-sequence
1810 hexadecimal-escape-sequence
1811 universal-character-name
1813 simple-escape-sequence: one of
1814 \' \" \? \\ \a \b \f \n \r \t \v
1816 octal-escape-sequence:
1818 \ octal-digit octal-digit
1819 \ octal-digit octal-digit octal-digit
1821 hexadecimal-escape-sequence:
1822 \x hexadecimal-digit
1823 hexadecimal-escape-sequence hexadecimal-digit
1827 #### C.1.6 String literals
1831 " s-char-sequence-opt "
1832 L" s-char-sequence-opt "
1836 s-char-sequence s-char
1839 any member of source charset except double-quote ("), backslash
1840 (\), or new-line character.
1845 #### C.1.7 Punctuators
1849 [ ] ( ) { } . -> * + - < > : ; ... = ,
1853 ### C.2 Phrase structure grammar
1860 ( unary-expression )
1864 postfix-expression [ unary-expression ]
1865 postfix-expression . identifier
postfix-expression -> identifier
1870 unary-operator postfix-expression
1872 unary-operator: one of
1875 assignment-operator:
1878 type-assignment-operator:
1881 constant-expression-range:
1882 unary-expression ... unary-expression
1886 #### C.2.2 Declarations:
1890 declaration-specifiers declarator-list-opt ;
1893 declaration-specifiers:
1894 storage-class-specifier declaration-specifiers-opt
1895 type-specifier declaration-specifiers-opt
1896 type-qualifier declaration-specifiers-opt
1900 declarator-list , declarator
1902 abstract-declarator-list:
1904 abstract-declarator-list , abstract-declarator
1906 storage-class-specifier:
1929 align ( unary-expression )
1932 struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt
1933 struct identifier align-attribute-opt
1935 struct-or-variant-declaration-list:
1936 struct-or-variant-declaration
1937 struct-or-variant-declaration-list struct-or-variant-declaration
1939 struct-or-variant-declaration:
1940 specifier-qualifier-list struct-or-variant-declarator-list ;
1941 declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ;
1942 typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ;
1943 typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ;
1945 specifier-qualifier-list:
1946 type-specifier specifier-qualifier-list-opt
1947 type-qualifier specifier-qualifier-list-opt
1949 struct-or-variant-declarator-list:
1950 struct-or-variant-declarator
1951 struct-or-variant-declarator-list , struct-or-variant-declarator
1953 struct-or-variant-declarator:
1955 declarator-opt : unary-expression
1958 variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
1959 variant identifier variant-tag
1962 < unary-expression >
1965 enum identifier-opt { enumerator-list }
1966 enum identifier-opt { enumerator-list , }
1968 enum identifier-opt : declaration-specifiers { enumerator-list }
1969 enum identifier-opt : declaration-specifiers { enumerator-list , }
1973 enumerator-list , enumerator
1976 enumeration-constant
1977 enumeration-constant assignment-operator unary-expression
1978 enumeration-constant assignment-operator constant-expression-range
1984 pointer-opt direct-declarator
1989 direct-declarator [ unary-expression ]
1991 abstract-declarator:
1992 pointer-opt direct-abstract-declarator
1994 direct-abstract-declarator:
1996 ( abstract-declarator )
1997 direct-abstract-declarator [ unary-expression ]
1998 direct-abstract-declarator [ ]
2001 * type-qualifier-list-opt
2002 * type-qualifier-list-opt pointer
2004 type-qualifier-list:
2006 type-qualifier-list type-qualifier
2013 #### C.2.3 CTF-specific declarations
2017 clock { ctf-assignment-expression-list-opt }
2018 event { ctf-assignment-expression-list-opt }
2019 stream { ctf-assignment-expression-list-opt }
2020 env { ctf-assignment-expression-list-opt }
2021 trace { ctf-assignment-expression-list-opt }
2022 callsite { ctf-assignment-expression-list-opt }
2023 typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
2024 typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
2027 floating_point { ctf-assignment-expression-list-opt }
2028 integer { ctf-assignment-expression-list-opt }
2029 string { ctf-assignment-expression-list-opt }
2032 ctf-assignment-expression-list:
2033 ctf-assignment-expression ;
2034 ctf-assignment-expression-list ctf-assignment-expression ;
2036 ctf-assignment-expression:
2037 unary-expression assignment-operator unary-expression
2038 unary-expression type-assignment-operator type-specifier
2039 declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list
2040 typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
2041 typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list