+Common Trace Format (CTF) Specification (v1.7)
+
+Mathieu Desnoyers, EfficiOS Inc.
+
+The goal of the present document is to specify a trace format that suits the
+needs of the embedded, telecom, high-performance and kernel communities. It is
+based on the Common Trace Format Requirements (v1.4) document. It is designed to
+allow traces to be natively generated by the Linux kernel, Linux user-space
+applications written in C/C++, and hardware components. One major element of
+CTF is the Trace Stream Description Language (TSDL), whose flexibility
+enables the description of various binary trace stream layouts.
+
+The latest version of this document can be found at:
+
+ git tree: git://git.efficios.com/ctf.git
+ gitweb: http://git.efficios.com/?p=ctf.git
+
+A reference implementation of a library to read and write this trace format is
+being implemented within the BabelTrace project, a converter between trace
+formats. The development tree is available at:
+
+ git tree: git://git.efficios.com/babeltrace.git
+ gitweb: http://git.efficios.com/?p=babeltrace.git
+
+The CE Workgroup of the Linux Foundation, Ericsson, and EfficiOS have
+sponsored this work.
+
+
+Table of Contents
+
+1. Preliminary definitions
+2. High-level representation of a trace
+3. Event stream
+4. Types
+ 4.1 Basic types
+ 4.1.1 Type inheritance
+ 4.1.2 Alignment
+ 4.1.3 Byte order
+ 4.1.4 Size
+ 4.1.5 Integers
+ 4.1.6 GNU/C bitfields
+ 4.1.7 Floating point
+ 4.1.8 Enumerations
+ 4.2 Compound types
+ 4.2.1 Structures
+ 4.2.2 Variants (Discriminated/Tagged Unions)
+ 4.2.3 Arrays
+ 4.2.4 Sequences
+ 4.2.5 Strings
+5. Event Packet Header
+ 5.1 Event Packet Header Description
+ 5.2 Event Packet Context Description
+6. Event Structure
+ 6.1 Event Header
+ 6.1.1 Type 1 - Few event IDs
+ 6.1.2 Type 2 - Many event IDs
+ 6.2 Event Context
+ 6.3 Event Payload
+ 6.3.1 Padding
+ 6.3.2 Alignment
+7. Trace Stream Description Language (TSDL)
+ 7.1 Meta-data
+ 7.2 Declaration vs Definition
+ 7.3 TSDL Scopes
+ 7.3.1 Lexical Scope
+ 7.3.2 Dynamic Scope
+ 7.4 TSDL Examples
+
+
+1. Preliminary definitions
+
+ - Event Trace: An ordered sequence of events.
+ - Event Stream: An ordered sequence of events, containing a subset of the
+ trace event types.
+ - Event Packet: A sequence of physically contiguous events within an event
+ stream.
+ - Event: This is the basic entry in a trace. (aka: a trace record).
+ - An event identifier (ID) relates to the class (a type) of event within
+ an event stream.
+ e.g. event: irq_entry.
+ - An event (or event record) relates to a specific instance of an event
+ class.
+ e.g. event: irq_entry, at time X, on CPU Y
+ - Source Architecture: Architecture writing the trace.
+ - Reader Architecture: Architecture reading the trace.
+
+
+2. High-level representation of a trace
+
+A trace is divided into multiple event streams. Each event stream contains a
+subset of the trace event types.
+
+The final output of the trace, after its generation and optional transport over
+the network, is expected to be either on permanent or temporary storage in a
+virtual file system. Because each event stream is appended to while a trace is
+being recorded, each is associated with a separate file for output. Therefore,
+a stored trace can be represented as a directory containing one file per stream.
+
+The meta-data description associated with the trace contains information on
+trace event types, expressed in the Trace Stream Description Language
+(TSDL). This language describes:
+
+- Trace version.
+- Types available.
+- Per-trace event header description.
+- Per-stream event header description.
+- Per-stream event context description.
+- Per-event
+ - Event type to stream mapping.
+ - Event type to name mapping.
+ - Event type to ID mapping.
+ - Event context description.
+ - Event fields description.
+
+
+3. Event stream
+
+An event stream can be divided into contiguous event packets of variable
+size. An event packet can
+contain a certain amount of padding at the end. The stream header is
+repeated at the beginning of each event packet. The rationale for the
+event stream design choices is explained in Appendix B. Stream Header
+Rationale.
+
+The event stream header will therefore be referred to as the "event packet
+header" throughout the rest of this document.
+
+
+4. Types
+
+Types are organized as type classes. Each type class belongs to one of two
+kinds of types: basic types or compound types.
+
+4.1 Basic types
+
+A basic type is a scalar type, as described in this section. It includes
+integers, GNU/C bitfields, enumerations, and floating point values.
+
+4.1.1 Type inheritance
+
+Type specifications can be inherited to allow deriving types from a
+type class. For example, see the uint32_t named type derived from the "integer"
+type class below ("Integers" section). Types have a precise binary
+representation in the trace. A type class has methods to read and write these
+types, but must be derived into a type to be usable in an event field.
+
+4.1.2 Alignment
+
+We define "byte-packed" types as aligned on the byte size, namely 8-bit.
+We define "bit-packed" types as following on the next bit, as defined by the
+"Integers" section.
+
+Each basic type must specify its alignment, in bits. Examples of
+possible alignments are: bit-packed (align = 1), byte-packed (align =
+8), or word-aligned (e.g. align = 32 or align = 64). The choice depends
+on the architecture preference and compactness vs performance trade-offs
+of the implementation. Architectures providing fast unaligned writes can use
+byte-packed basic types to save space, aligning each type on byte
+boundaries (8-bit). Architectures with slow unaligned writes align types
+on specific alignment values. If no specific alignment is declared for a
+type, it is assumed to be bit-packed for integers whose size is not a multiple
+of 8 bits and for GNU/C bitfields. All other basic types are byte-packed
+by default. It is however recommended to always specify the alignment
+explicitly. Alignment values must be a power of two. Compound types are
+aligned as specified in their individual specification.
+
+TSDL meta-data attribute representation of a specific alignment:
+
+ align = value; /* value in bits */
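+
+As an illustration (a sketch, assuming a hypothetical structure that is not
+part of this specification, with bit offsets counted from the start of the
+structure), an 8-bit field followed by a 32-bit field declared with
+align = 32 leaves 24 bits of padding between the two fields:
+
+struct align_example {
+  integer { size = 8; align = 8; signed = false; } a;    /* bits 0 to 7 */
+  /* 24 bits of padding: bits 8 to 31 */
+  integer { size = 32; align = 32; signed = false; } b;  /* bits 32 to 63 */
+};
+
+Declaring "b" with align = 8 instead would place it at bits 8 to 39 and save
+the 24 padding bits, at the cost of an unaligned 32-bit access.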
+
+4.1.3 Byte order
+
+By default, the native endianness of the source architecture is used.
+Byte order can be overridden for a basic type by specifying a "byte_order"
+attribute. A typical use case is to specify the network byte order (big endian:
+"be") to save data captured from the network into the trace without conversion.
+If not specified, the byte order is native.
+
+TSDL meta-data representation:
+
+ byte_order = native OR network OR be OR le; /* network and be are aliases */
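+
+For example (a sketch; the "uint16_net_t" name is hypothetical), a 16-bit
+integer captured from a network header can be declared in network byte order:
+
+typealias integer {
+  size = 16;
+  align = 16;
+  signed = false;
+  byte_order = network;
+} := uint16_net_t;
+
+With this declaration, the value 0x1234 is stored as the bytes 0x12 0x34,
+whatever the endianness of the source architecture. With byte_order = le, it
+would be stored as 0x34 0x12.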
+
+4.1.4 Size
+
+Type size, in bits, for integers and floats is that returned by "sizeof()" in C
+multiplied by CHAR_BIT.
+We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed
+to 8 bits for cross-endianness compatibility.
+
+TSDL meta-data representation:
+
+ size = value; /* value in bits */
+
+4.1.5 Integers
+
+Signed integers are represented in two's complement. Integer alignment,
+size, signedness and byte ordering are defined in the TSDL meta-data.
+Integers aligned on byte size (8-bit) and with length multiple of byte
+size (8-bit) correspond to the C99 standard integers. In addition,
+integers with alignment and/or size that are _not_ a multiple of the
+byte size are permitted; these correspond to the C99 standard bitfields,
+with the added specification that the CTF integer bitfields have a fixed
+binary representation. A MIT-licensed reference implementation of the
+CTF portable bitfields is available at:
+
+ http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h
+
+Binary representation of integers:
+
+- On little and big endian:
+ - Within a byte, high bits correspond to an integer's high bits, and low bits
+ correspond to low bits.
+- On little endian:
+ - Integers spanning multiple bytes are placed from the least significant to
+ the most significant.
+ - Consecutive integers are placed from lower bits to higher bits (even within
+ a byte).
+- On big endian:
+ - Integers spanning multiple bytes are placed from the most significant to
+ the least significant.
+ - Consecutive integers are placed from higher bits to lower bits (even within
+ a byte).
+
+This binary representation is derived from the bitfield implementation in GCC
+for little and big endian. However, contrary to what GCC does, integers can
+cross unit boundaries (no padding is required). Padding can be explicitly
+added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed.
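+
+As a worked example of the rules above (the field names and values are
+hypothetical; bit 0 is the least significant bit of the byte), consider two
+consecutive bit-packed unsigned integers sharing one byte: a 3-bit integer
+"a" holding 5 (binary 101), followed by a 5-bit integer "b" holding 3
+(binary 00011):
+
+  little endian: "a" takes bits 0 to 2, "b" takes bits 3 to 7
+                 byte = (3 << 3) | 5 = 0x1D (binary 00011 101)
+  big endian:    "a" takes bits 5 to 7, "b" takes bits 0 to 4
+                 byte = (5 << 5) | 3 = 0xA3 (binary 101 00011)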
+
+TSDL meta-data representation:
+
+ integer {
+ signed = true OR false; /* default false */
+ byte_order = native OR network OR be OR le; /* default native */
+ size = value; /* value in bits, no default */
+ align = value; /* value in bits */
+ /* base used for pretty-printing output, default: decimal. */
+ base = decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex OR x OR X OR p OR 16
+ OR octal OR oct OR o OR 8 OR binary OR b OR 2;
+ /* character encoding, default: none */
+ encoding = none OR UTF8 OR ASCII;
+ }
+
+Example of type inheritance (creation of a uint32_t named type):
+
+typealias integer {
+ size = 32;
+ signed = false;
+ align = 32;
+} := uint32_t;
+
+Definition of a named 5-bit signed bitfield:
+
+typealias integer {
+ size = 5;
+ signed = true;
+ align = 1;
+} := int5_t;
+
+The character encoding field can be used to specify that the integer
+must be printed as a text character when read. e.g.:
+
+typealias integer {
+ size = 8;
+ align = 8;
+ signed = false;
+ encoding = UTF8;
+} := utf_char;
+
+
+4.1.6 GNU/C bitfields
+
+The GNU/C bitfields follow closely the integer representation, with a
+particularity on alignment: if a bitfield cannot fit in the current unit, the
+unit is padded and the bitfield starts at the following unit. The unit size is
+defined by the size of the type "unit_type".
+
+TSDL meta-data representation:
+
+ unit_type name:size;
+
+As an example, consider the following structure declared in C and compiled
+by GCC:
+
+struct example {
+ short a:12;
+ short b:5;
+};
+
+The example structure is aligned on the largest element (short). The second
+bitfield would be aligned on the next unit boundary, because it would not fit in
+the current unit.
+
+4.1.7 Floating point
+
+The floating point values byte ordering is defined in the TSDL meta-data.
+
+Floating point values follow the IEEE 754-2008 standard interchange formats.
+The description of floating point values includes the exponent and mantissa
+size in bits. Some requirements are imposed on the floating point values:
+
+- FLT_RADIX must be 2.
+- mant_dig is the number of digits represented in the mantissa. It is specified
+ by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and
+ LDBL_MANT_DIG as defined by <float.h>.
+- exp_dig is the number of digits represented in the exponent. Given that
+ mant_dig is one bit more than its actual size in bits (leading 1 is not
+ needed) and also given that the sign bit always takes one bit, exp_dig can be
+ specified as:
+
+ - sizeof(float) * CHAR_BIT - FLT_MANT_DIG
+ - sizeof(double) * CHAR_BIT - DBL_MANT_DIG
+ - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG
+
+TSDL meta-data representation:
+
+floating_point {
+ exp_dig = value;
+ mant_dig = value;
+ byte_order = native OR network OR be OR le;
+ align = value;
+}
+
+Example of type inheritance:
+
+typealias floating_point {
+ exp_dig = 8; /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
+ mant_dig = 24; /* FLT_MANT_DIG */
+ byte_order = native;
+ align = 32;
+} := float;
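+
+Under the same rules, and assuming a source architecture where "double" is the
+IEEE 754 binary64 format (sizeof(double) * CHAR_BIT = 64, DBL_MANT_DIG = 53),
+a double-precision type could be declared as:
+
+typealias floating_point {
+  exp_dig = 11;   /* sizeof(double) * CHAR_BIT - DBL_MANT_DIG = 64 - 53 */
+  mant_dig = 53;  /* DBL_MANT_DIG */
+  byte_order = native;
+  align = 64;
+} := double;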
+
+TODO: define NaN, +inf, -inf behavior.
+
+Bit-packed, byte-packed or larger alignments can be used for floating
+point values, similarly to integers.
+
+4.1.8 Enumerations
+
+Enumerations are a mapping between an integer type and a table of strings. The
+numerical representation of the enumeration follows the integer type specified
+by the meta-data. The enumeration mapping table is detailed in the enumeration
+description within the meta-data. The mapping table maps inclusive value
+ranges (or single values) to strings. Instead of being limited to simple
+"value -> string" mappings, these enumerations map
+"[ start_value ... end_value ] -> string", which map inclusive ranges of
+values to strings. An enumeration from the C language can be represented in
+this format by having the same start_value and end_value for each element, which
+is in fact a range of size 1. This single-value range is supported without
+repeating the start and end values with the value = string declaration.
+
+enum name : integer_type {
+ somestring = start_value1 ... end_value1,
+ "other string" = start_value2 ... end_value2,
+ yet_another_string, /* will be assigned to end_value2 + 1 */
+ "some other string" = value,
+ ...
+};
+
+If the values are omitted, the enumeration starts at 0 and increments by 1 for
+each entry:
+
+enum name : unsigned int {
+ ZERO,
+ ONE,
+ TWO,
+ TEN = 10,
+ ELEVEN,
+};
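+
+In the example above, ZERO, ONE and TWO map to 0, 1 and 2, TEN maps to 10, and
+ELEVEN, which follows it without an explicit value, maps to 11.
+
+Ranges and single values can be mixed in the same enumeration. A sketch,
+assuming a hypothetical "task_state" enumeration over a previously declared
+uint8_t type:
+
+enum task_state : uint8_t {
+  IDLE = 0,               /* single value: 0 */
+  RUNNING = 1 ... 3,      /* inclusive range: 1, 2 and 3 */
+  BLOCKED,                /* implicit value: 3 + 1 = 4 */
+  "unknown state" = 255,  /* quoted string used as label */
+};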
+
+Overlapping ranges within a single enumeration are implementation defined.
+
+A nameless enumeration can be declared as a field type or as part of a typedef:
+
+enum : integer_type {
+ ...
+}
+
+Enumerations omitting the container type ": integer_type" use the "int"
+type (for compatibility with C99). The "int" type must be previously
+declared. E.g.:
+
+typealias integer { size = 32; align = 32; signed = true; } := int;
+
+enum {
+ ...
+}
+
+
+4.2 Compound types
+
+Compound types are aggregations of type declarations. Compound types include
+structures, variants, arrays, sequences, and strings.
+
+4.2.1 Structures
+
+Structures are aligned on the largest alignment required by basic types
+contained within the structure. (This follows the ISO/C standard for structures)
+
+TSDL meta-data representation of a named structure:
+
+struct name {
+ field_type field_name;
+ field_type field_name;
+ ...
+};
+
+Example:
+
+struct example {
+ integer { /* Nameless type */
+ size = 16;
+ signed = true;
+ align = 16;
+ } first_field_name;
+ uint64_t second_field_name; /* Named type declared in the meta-data */
+};
+
+The fields are placed in a sequence next to each other. They each possess a
+field name, which is a unique identifier within the structure.
+
+A nameless structure can be declared as a field type or as part of a typedef:
+
+struct {
+ ...
+}
+
+Alignment for a structure compound type can be forced to a minimum value
+by adding an "align" specifier after the declaration of a structure
+body. This attribute is read as: align(value). The value is specified in
+bits. The structure will be aligned on the maximum value between this
+attribute and the alignment required by the basic types contained within
+the structure. e.g.
+
+struct {
+ ...
+} align(32)
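+
+For instance (a sketch with hypothetical field names, assuming uint8_t and
+uint16_t are declared with align = 8 and align = 16 respectively, and with
+bit offsets counted from the start of the structure):
+
+struct {
+  uint8_t a;   /* bits 0 to 7 */
+  uint16_t b;  /* bits 16 to 31, after 8 bits of padding */
+} align(32)
+
+The basic types contained in this structure require at most a 16-bit
+alignment, so the structure is aligned on max(32, 16) = 32 bits.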
+
+4.2.2 Variants (Discriminated/Tagged Unions)
+
+A CTF variant is a selection between different types. A CTF variant must
+always be defined within the scope of a structure or within fields
+contained within a structure (defined recursively). A "tag" enumeration
+field must appear in either the same lexical scope, prior to the variant
+field (in field declaration order), in an upper lexical scope (see
+Section 7.3.1), or in an upper dynamic scope (see Section 7.3.2). The
+type selection is indicated by the mapping from the enumeration value to
+the string used as variant type selector. The field to use as tag is
+specified by the "tag_field", specified between "< >" after the
+"variant" keyword for unnamed variants, and after "variant name" for
+named variants.
+
+The alignment of the variant is the alignment of the type as selected by the tag
+value for the specific instance of the variant. The alignment of the type
+containing the variant is independent of the variant alignment. The size of the
+variant is the size as selected by the tag value for the specific instance of
+the variant.
+
+A named variant declaration followed by its definition within a structure
+declaration:
+
+variant name {
+ field_type sel1;
+ field_type sel2;
+ field_type sel3;
+ ...
+};
+
+struct {
+ enum : integer_type { sel1, sel2, sel3, ... } tag_field;
+ ...
+ variant name <tag_field> v;
+}
+
+An unnamed variant definition within a structure is expressed by the following
+TSDL meta-data:
+
+struct {
+ enum : integer_type { sel1, sel2, sel3, ... } tag_field;
+ ...
+ variant <tag_field> {
+ field_type sel1;
+ field_type sel2;
+ field_type sel3;
+ ...
+ } v;
+}
+
+Example of a named variant within a sequence that refers to a single tag field:
+
+variant example {
+ uint32_t a;
+ uint64_t b;
+ short c;
+};
+
+struct {
+ enum : uint2_t { a, b, c } choice;
+ unsigned int seqlen;
+ variant example <choice> v[seqlen];
+}
+
+Example of an unnamed variant:
+
+struct {
+ enum : uint2_t { a, b, c, d } choice;
+ /* Unrelated fields can be added between the variant and its tag */
+ int32_t somevalue;
+ variant <choice> {
+ uint32_t a;
+ uint64_t b;
+ short c;
+ struct {
+ unsigned int field1;
+ uint64_t field2;
+ } d;
+ } s;
+}
+
+Example of an unnamed variant within an array:
+
+struct {
+ enum : uint2_t { a, b, c } choice;
+ variant <choice> {
+ uint32_t a;
+ uint64_t b;
+ short c;
+ } v[10];
+}
+
+Example of a variant type definition within a structure, where the defined type
+is then declared within an array of structures. This variant refers to a tag
+located in an upper lexical scope. This example clearly shows that a variant
+type definition referring to the tag "x" uses the closest preceding field from
+the lexical scope of the type definition.
+
+struct {
+ enum : uint2_t { a, b, c, d } x;
+
+ typedef variant <x> { /*
+ * "x" refers to the preceding "x" enumeration in the
+ * lexical scope of the type definition.
+ */
+ uint32_t a;
+ uint64_t b;
+ short c;
+ } example_variant;
+
+ struct {
+ enum : int { x, y, z } x; /* This enumeration is not used by "v". */
+ example_variant v; /*
+ * "v" uses the "enum : uint2_t { a, b, c, d }"
+ * tag.
+ */
+ } a[10];
+}
+
+4.2.3 Arrays
+
+Arrays are fixed-length. Their length is declared in the type
+declaration within the meta-data. They contain an array of "inner type"
+elements, which can refer to any type not containing the type of the
+array being declared (no circular dependency). The length is the number
+of elements in an array.
+
+TSDL meta-data representation of a named array:
+
+typedef elem_type name[length];
+
+A nameless array can be declared as a field type within a structure, e.g.:
+
+ uint8_t field_name[10];
+
+Arrays are always aligned on their element alignment requirement.
+
+4.2.4 Sequences
+
+Sequences are dynamically-sized arrays. They refer to a "length"
+unsigned integer field, which must appear in either the same lexical scope,
+prior to the sequence field (in field declaration order), in an upper
+lexical scope (see Section 7.3.1), or in an upper dynamic scope (see
+Section 7.3.2). This length field represents the number of elements in
+the sequence. The sequence per se is an array of "inner type" elements.
+
+TSDL meta-data representation for a sequence type definition:
+
+struct {
+ unsigned int length_field;
+ typedef elem_type typename[length_field];
+ typename seq_field_name;
+}
+
+A sequence can also be declared as a field type, e.g.:
+
+struct {
+ unsigned int length_field;
+ long seq_field_name[length_field];
+}
+
+Multiple sequences can refer to the same length field, and these length
+fields can be in a different upper dynamic scope:
+
+e.g., assuming the stream.event.header defines:
+
+stream {
+ ...
+ id = 1;
+ event.header := struct {
+ uint16_t seq_len;
+ };
+};
+
+event {
+ ...
+ stream_id = 1;
+ fields := struct {
+ long seq_a[stream.event.header.seq_len];
+ char seq_b[stream.event.header.seq_len];
+ };
+};
+
+The sequence elements follow the "array" specifications.
+
+4.2.5 Strings
+
+Strings are an array of bytes of variable size and are terminated by a '\0'
+"NUL" character. Their encoding is described in the TSDL meta-data. In
+absence of encoding attribute information, the default encoding is
+UTF-8.
+
+TSDL meta-data representation of a named string type:
+
+typealias string {
+ encoding = UTF8 OR ASCII;
+} := name;
+
+A nameless string type can be declared as a field type:
+
+string field_name; /* Use default UTF8 encoding */
+
+Strings are always aligned on byte size.
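+
+As an illustration (the field name is hypothetical), a field declared as
+
+string name;
+
+and holding the text "irq" is stored as four bytes, the three characters
+followed by the terminating '\0':
+
+  0x69 0x72 0x71 0x00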
+
+5. Event Packet Header
+
+The event packet header consists of two parts. The first part, the "event
+packet header", is the same for all streams of a trace. The second part, the
+"event packet context", is described on a per-stream basis. Both are described
+in the TSDL meta-data. The packets are aligned on architecture-page-sized
+addresses.
+
+Event packet header (all fields are optional, specified by TSDL meta-data):
+
+- Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a
+ CTF packet. This magic number is optional, but when present, it should
+ come at the very beginning of the packet.
+- Trace UUID, used to ensure the event packet matches the meta-data used.
+ (note: we cannot use a meta-data checksum in every case instead of a
+ UUID because meta-data can be appended to while tracing is active)
+ This field is optional.
+- Stream ID, used as reference to stream description in meta-data.
+ This field is optional if there is only one stream description in the
+ meta-data, but becomes required if there is more than one stream in
+ the TSDL meta-data description.
+
+Event packet context (all fields are optional, specified by TSDL meta-data):
+
+- Event packet content size (in bytes).
+- Event packet size (in bytes, includes padding).
+- Event packet content checksum (optional). Checksum excludes the event packet
+ header.
+- Per-stream event packet sequence count (to deal with UDP packet loss). The
+ number of significant sequence counter bits should also be present, so
+ wrap-arounds are dealt with correctly.
+- Time-stamp at the beginning and time-stamp at the end of the event packet.
+ Both timestamps are written in the packet header, but sampled respectively
+ while (or before) writing the first event and while (or after) writing the
+ last event in the packet. The inclusive range between these timestamps should
+ include all event timestamps assigned to events contained within the packet.
+- Events discarded count
+ - Snapshot of a per-stream free-running counter, counting the number of
+ events discarded that were supposed to be written in the stream prior to
+ the first event in the event packet.
+ * Note: producer-consumer buffer full condition should fill the current
+ event packet with padding so we know exactly where events have been
+ discarded.
+- Lossless compression scheme used for the event packet content. Applied
+ directly to raw data. New types of compression can be added in following
+ versions of the format.
+ 0: no compression scheme
+ 1: bzip2
+ 2: gzip
+ 3: xz
+- Cipher used for the event packet content. Applied after compression.
+ 0: no encryption
+ 1: AES
+- Checksum scheme used for the event packet content. Applied after encryption.
+ 0: no checksum
+ 1: md5
+ 2: sha1
+ 3: crc32
+
+5.1 Event Packet Header Description
+
+The event packet header layout is indicated by the trace packet.header
+field. Here is a recommended structure type for the packet header with
+the fields typically expected (although these fields are each optional):
+
+struct event_packet_header {
+ uint32_t magic;
+ uint8_t uuid[16];
+ uint32_t stream_id;
+};
+
+trace {
+ ...
+ packet.header := struct event_packet_header;
+};
+
+If the magic number is not present, tools such as "file" will have no
+means of discovering the file type.
+
+If the uuid is not present, no validation that the meta-data actually
+corresponds to the stream is performed.
+
+If the stream_id packet header field is missing, the trace can only
+contain a single stream. Its "id" field can be left out, and its events
+don't need to declare a "stream_id" field.
+
+
+5.2 Event Packet Context Description
+
+Event packet context example. These are declared within the stream declaration
+in the meta-data. All these fields are optional. If the packet size field is
+missing, the whole stream only contains a single packet. If the content
+size field is missing, the packet is filled (no padding). The content
+and packet sizes include all headers.
+
+An example event packet context type:
+
+struct event_packet_context {
+ uint64_t timestamp_begin;
+ uint64_t timestamp_end;
+ uint32_t checksum;
+ uint32_t stream_packet_count;
+ uint32_t events_discarded;
+ uint32_t cpu_id;
+ uint32_t/uint16_t content_size;
+ uint32_t/uint16_t packet_size;
+ uint8_t stream_packet_count_bits; /* Significant counter bits */
+ uint8_t compression_scheme;
+ uint8_t encryption_scheme;
+ uint8_t checksum_scheme;
+};
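+
+As a worked example (the values are hypothetical, expressed in bytes as in the
+field list of Section 5), a packet whose context holds packet_size = 4096 and
+content_size = 4000 carries 4000 bytes of headers and events, followed by
+4096 - 4000 = 96 bytes of padding:
+
+  offset 0 ............ 3999 : headers and events (content_size = 4000)
+  offset 4000 ......... 4095 : padding (packet_size - content_size = 96)
+  offset 4096                : beginning of the next packet in the stream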
+
+
+6. Event Structure
+
+The overall structure of an event is:
+
+1 - Stream Packet Context (as specified by the stream meta-data)
+ 2 - Event Header (as specified by the stream meta-data)
+ 3 - Stream Event Context (as specified by the stream meta-data)
+ 4 - Event Context (as specified by the event meta-data)
+ 5 - Event Payload (as specified by the event meta-data)
+
+This structure defines an implicit dynamic scoping, where variants
+located in inner structures (those with a higher number in the listing
+above) can refer to the fields of outer structures (with lower number in
+the listing above). See Section 7.3 TSDL Scopes for more detail.
+
+6.1 Event Header
+
+Event headers can be described within the meta-data. We hereby propose, as an
+example, two types of event headers. Type 1 accommodates streams with fewer than
+31 event IDs. Type 2 accommodates streams with 31 or more event IDs.
+
+One major factor can vary between streams: the number of event IDs assigned to
+a stream. Luckily, this information tends to stay relatively constant (modulo
+event registration while trace is being recorded), so we can specify different
+representations for streams containing few event IDs and streams containing
+many event IDs, so we end up representing the event ID and time-stamp as
+densely as possible in each case.
+
+The header is extended on the rare occasions where the information cannot be
+represented in the ranges available in the standard event header. Extended
+headers are also used on the rare occasions where the data required for a field
+could not be collected: the flag corresponding to the missing field within the
+missing_fields array is then set to 1.
+
+Types uintX_t represent an X-bit unsigned integer, as declared with
+either:
+
+ typealias integer { size = X; align = X; signed = false; } := uintX_t;
+
+ or
+
+ typealias integer { size = X; align = 1; signed = false; } := uintX_t;
+
+6.1.1 Type 1 - Few event IDs
+
+ - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture
+ preference).
+ - Native architecture byte ordering.
+ - For "compact" selection
+ - Fixed size: 32 bits.
+ - For "extended" selection
+ - Size depends on the architecture and variant alignment.
+
+struct event_header_1 {
+ /*
+ * id: range: 0 - 30.
+ * id 31 is reserved to indicate an extended header.
+ */
+ enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
+ variant <id> {
+ struct {
+ uint27_t timestamp;
+ } compact;
+ struct {
+ uint32_t id; /* 32-bit event IDs */
+ uint64_t timestamp; /* 64-bit timestamps */
+ } extended;
+ } v;
+} align(32); /* or align(8) */
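+
+The uint5_t and uint27_t types used above need to be bit-packed for the
+compact layout to fit in 32 bits. A sketch of suitable declarations, following
+the second uintX_t form given in Section 6.1:
+
+typealias integer { size = 5; align = 1; signed = false; } := uint5_t;
+typealias integer { size = 27; align = 1; signed = false; } := uint27_t;
+
+With these declarations, the compact selection takes 5 + 27 = 32 bits. For
+example, an ID field of 3 selects "compact" (3 lies within 0 ... 30) and is
+followed by a 27-bit timestamp, while an ID field of 31 selects "extended", in
+which case the event ID and timestamp are read from the 32-bit and 64-bit
+fields that follow.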
+
+
+6.1.2 Type 2 - Many event IDs
+
+ - Aligned on 16-bit (or 8-bit if byte-packed, depending on the architecture
+ preference).
+ - Native architecture byte ordering.
+ - For "compact" selection
+ - Size depends on the architecture and variant alignment.
+ - For "extended" selection
+ - Size depends on the architecture and variant alignment.
+
+struct event_header_2 {
+ /*
+ * id: range: 0 - 65534.
+ * id 65535 is reserved to indicate an extended header.
+ */
+ enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
+ variant <id> {
+ struct {
+ uint32_t timestamp;
+ } compact;
+ struct {
+ uint32_t id; /* 32-bit event IDs */
+ uint64_t timestamp; /* 64-bit timestamps */
+ } extended;
+ } v;
+} align(16); /* or align(8) */
+
+
+6.2 Event Context
+
+The event context contains information relative to the current event.
+The choice and meaning of this information is specified by the TSDL
+stream and event meta-data descriptions. The stream context is applied
+to all events within the stream. The stream context structure follows
+the event header. The event context is applied to specific events. Its
+structure follows the stream context structure.
+
+An example of stream-level event context is to save the event payload size with
+each event, or to save the current PID with each event. These are declared
+within the stream declaration within the meta-data:
+
+ stream {
+ ...
+ event.context := struct {
+ uint pid;
+ uint16_t payload_size;
+ };
+ };
+
+An example of event-specific event context is to declare a bitmap of missing
+fields, only appended after the stream event context if the extended event
+header is selected. NR_FIELDS is the number of fields within the event (a
+numeric value).
+
+ event {
+ context := struct {
+ variant <id> {
+ struct { } compact;
+ struct {
+ uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */
+ } extended;
+ } v;
+ };
+ ...
+ }
+
+6.3 Event Payload
+
+An event payload contains fields specific to a given event type. The fields
+belonging to an event type are described in the event-specific meta-data
+within a structure type.
+
+6.3.1 Padding
+
+There is no padding at the end of the event payload. This differs from the
+ISO/C standard for structures, but follows the CTF standard for structures. In
+a trace, even though it makes sense to align the beginning of a structure, it
+makes no sense to add padding at the end of the structure, because structures
+are usually not followed by a structure of the same type.
+
+This trick can be done by adding a zero-length "end" field at the end of the C
+structures, and by using the offset of this field rather than using sizeof()
+when calculating the size of a structure (see Appendix "A. Helper macros").
+
+6.3.2 Alignment
+
+The event payload is aligned on the largest alignment required by types
+contained within the payload. (This follows the ISO/C standard for structures)
+
+
+7. Trace Stream Description Language (TSDL)
+
+The Trace Stream Description Language (TSDL) allows expression of the
+binary trace streams layout in a C99-like Domain Specific Language
+(DSL).
+
+
+7.1 Meta-data
+
+The trace stream layout description is located in the trace meta-data.
+The meta-data is itself located in a stream identified by its name:
+"metadata".
+
+The meta-data description can be expressed in two different formats:
+text-only and packet-based. The text-only description facilitates
+generation of meta-data and provides a convenient way to enter the
+meta-data information by hand. The packet-based meta-data provides the
+CTF stream packet facilities (checksumming, compression, encryption,
+network-readiness) for meta-data stream generated and transported by a
+tracer.
+
+The text-only meta-data file is a plain text TSDL description.
+
+The packet-based meta-data is made of "meta-data packets", which each
+start with a meta-data packet header. The packet-based meta-data
+description is detected by reading the magic number "0x75D11D57" at the
+beginning of the file. This magic number is also used to detect the
+endianness of the architecture by trying to read the CTF magic number
+and its counterpart in reversed endianness. The events within the
+meta-data stream have no event header nor event context. Each event only
+contains a "sequence" payload, which is a sequence of bits using the
+"trace.packet.header.content_size" field as a placeholder for its length
+(the packet header size should be subtracted). The formatting of this
+sequence of bits is a plain-text representation of the TSDL description.
+Each meta-data packet starts with a special packet header, specific to
+the meta-data stream, which contains, exactly:
+
+struct metadata_packet_header {
+ uint32_t magic; /* 0x75D11D57 */
+ uint8_t uuid[16]; /* Universally Unique Identifier */
+ uint32_t checksum; /* 0 if unused */
+ uint32_t content_size; /* in bits */
+ uint32_t packet_size; /* in bits */
+ uint8_t compression_scheme; /* 0 if unused */
+ uint8_t encryption_scheme; /* 0 if unused */
+ uint8_t checksum_scheme; /* 0 if unused */
+};
+
+The packet-based meta-data can be converted to a text-only meta-data by
+concatenating all the strings it contains.
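+
+The detection described above can be sketched as follows. This is an
+illustration under stated assumptions rather than the reference
+implementation: the enum, the function names and the choice of reading
+the magic number in host byte order are hypothetical, and the header
+size is passed in by the caller instead of being hard-coded.
+
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TSDL_MAGIC 0x75D11D57u

enum metadata_kind {
    METADATA_TEXT,            /* no magic number: plain-text TSDL */
    METADATA_PACKET_NATIVE,   /* magic matches in host byte order */
    METADATA_PACKET_SWAPPED   /* magic matches once byte-swapped */
};

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Classify a meta-data file from its first bytes. */
static enum metadata_kind detect_metadata_kind(const unsigned char *buf,
                                               size_t len)
{
    uint32_t magic;

    if (len < sizeof(magic))
        return METADATA_TEXT;
    memcpy(&magic, buf, sizeof(magic)); /* avoids unaligned access */
    if (magic == TSDL_MAGIC)
        return METADATA_PACKET_NATIVE;
    if (bswap32(magic) == TSDL_MAGIC)
        return METADATA_PACKET_SWAPPED;
    return METADATA_TEXT;
}

/*
 * Length in bytes of the text carried by one meta-data packet:
 * content_size minus the packet header size, both expressed in bits.
 */
static size_t metadata_payload_bytes(uint32_t content_size_bits,
                                     uint32_t header_size_bits)
{
    if (content_size_bits < header_size_bits)
        return 0;
    return (content_size_bits - header_size_bits) / 8;
}
```
+
+A reader would then concatenate the payload of each packet, in order, to
+recover the text-only description.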
+
+In the textual representation of the meta-data, the text contained
+within "/*" and "*/", as well as between "//" and the end of the line,
+is treated as a comment. Boolean values can be represented as true, TRUE,
+or 1 for true, and false, FALSE, or 0 for false. Within the string-based
+meta-data description, the trace UUID is represented as a string of
+hexadecimal digits and dashes "-". In the event packet header, the trace
+UUID is represented as an array of bytes.
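+
+A small sketch of both conversions follows. It assumes the 8-4-4-4-12
+digit grouping used by the UUID in the Section 7.4 example; the function
+names are hypothetical.
+
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Convert "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into the 16-byte array
 * form used in the packet header. Returns 0 on success, -1 if malformed.
 */
static int uuid_from_string(const char *str, uint8_t out[16])
{
    size_t i, n = 0;

    if (strlen(str) != 36)
        return -1;
    for (i = 0; i < 36; ) {
        int hi, lo;

        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (str[i] != '-')
                return -1;
            i++;
            continue;
        }
        hi = hex_value(str[i]);
        lo = hex_value(str[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[n++] = (uint8_t)((hi << 4) | lo);
        i += 2;
    }
    return 0;
}

/* Returns 1 for true/TRUE/1, 0 for false/FALSE/0, -1 for anything else. */
static int parse_tsdl_bool(const char *s)
{
    if (!strcmp(s, "true") || !strcmp(s, "TRUE") || !strcmp(s, "1"))
        return 1;
    if (!strcmp(s, "false") || !strcmp(s, "FALSE") || !strcmp(s, "0"))
        return 0;
    return -1;
}
```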
+
+
+7.2 Declaration vs Definition
+
+A declaration associates a layout to a type, without specifying where
+this type is located in the event structure hierarchy (see Section 6).
+This therefore includes typedef, typealias, as well as all type
+specifiers. In certain circumstances (typedef, structure field and
+variant field), a declaration is followed by a declarator, which specifies
+the newly defined type name (for typedef), or the field name (for
+declarations located within structures and variants). Arrays and sequences,
+declared with square brackets ("[" "]"), are part of the declarator,
+similarly to C99. The enumeration base type is specified by
+": enum_base", which is part of the type specifier. The variant tag
+name, specified between "<" ">", is also part of the type specifier.
+
+A definition associates a type to a location in the event structure
+hierarchy (see Section 6). This association is denoted by ":=", as shown
+in Section 7.3.
+
+
+7.3 TSDL Scopes
+
+TSDL uses two different types of scoping: a lexical scope is used for
+declarations and type definitions, and a dynamic scope is used for
+variant references to tag fields and for sequence references to length
+fields.
+
+7.3.1 Lexical Scope
+
+Each of "trace", "stream", "event", "struct" and "variant" has its own
+nestable declaration scope, within which types can be declared using "typedef"
+and "typealias". A root declaration scope also contains all declarations
+located outside of any of the aforementioned declarations. An inner
+declaration scope can refer to types declared within its container
+lexical scope prior to the inner declaration scope. Redefinition of a
+typedef or typealias is not valid, although hiding an upper scope
+typedef or typealias is allowed within a sub-scope.
+
+7.3.2 Dynamic Scope
+
+A dynamic scope consists of the lexical scope augmented with the
+implicit event structure definition hierarchy presented in Section 6.
+The dynamic scope is used for variant tag and sequence length
+definitions. It is used at definition time to look up the location of
+the tag field associated with a variant, and to look up the location
+of the length field associated with a sequence.
+
+Therefore, variants (or sequences) in lower levels in the dynamic scope
+(e.g. event context) can refer to a tag (or length) field located in
+upper levels (e.g. in the event header) by specifying, in this case, the
+associated tag with <header.field_name>. This allows, for instance, the
+event context to define a variant referring to the "id" field of the
+event header as selector.
+
+The target dynamic scope must be specified explicitly when referring to
+a field outside of the local static scope. The dynamic scope prefixes
+are thus:
+
+ - Trace Packet Header: <trace.packet.header. >,
+ - Stream Packet Context: <stream.packet.context. >,
+ - Event Header: <stream.event.header. >,
+ - Stream Event Context: <stream.event.context. >,
+ - Event Context: <event.context. >,
+ - Event Payload: <event.fields. >.
+
+Multiple declarations of the same field name within a single scope are
+not valid. It is however valid to re-use the same field name in
+different scopes. There is no possible conflict, because the dynamic
+scope must be specified when a variant refers to a tag field located in
+a different dynamic scope.
+
+The information available in the dynamic scopes can be thought of as the
+current tracing context. At trace production, information about the
+current context is saved into the specified scope field levels. At trace
+consumption, for each event, the current trace context is therefore
+readable by accessing the upper dynamic scopes.
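+
+One way a trace reader might split a tag or length reference into its
+dynamic scope and field path is sketched below. The enum, the table and
+the function name are hypothetical. Only the six prefixes listed above
+come from this section; a reference without any of these prefixes is
+treated as local.
+
```c
#include <stddef.h>
#include <string.h>

enum dynamic_scope {
    SCOPE_TRACE_PACKET_HEADER,
    SCOPE_STREAM_PACKET_CONTEXT,
    SCOPE_STREAM_EVENT_HEADER,
    SCOPE_STREAM_EVENT_CONTEXT,
    SCOPE_EVENT_CONTEXT,
    SCOPE_EVENT_FIELDS,
    SCOPE_LOCAL     /* no known prefix: looked up in the local scope */
};

static const struct {
    const char *prefix;
    enum dynamic_scope scope;
} scope_prefixes[] = {
    { "trace.packet.header.",   SCOPE_TRACE_PACKET_HEADER },
    { "stream.packet.context.", SCOPE_STREAM_PACKET_CONTEXT },
    { "stream.event.header.",   SCOPE_STREAM_EVENT_HEADER },
    { "stream.event.context.",  SCOPE_STREAM_EVENT_CONTEXT },
    { "event.context.",         SCOPE_EVENT_CONTEXT },
    { "event.fields.",          SCOPE_EVENT_FIELDS },
};

/*
 * Return the dynamic scope named by the prefix of ref, and store in
 * *field a pointer to the remaining field path within that scope.
 */
static enum dynamic_scope resolve_scope(const char *ref, const char **field)
{
    size_t i;

    for (i = 0; i < sizeof(scope_prefixes) / sizeof(scope_prefixes[0]); i++) {
        size_t len = strlen(scope_prefixes[i].prefix);

        if (strncmp(ref, scope_prefixes[i].prefix, len) == 0) {
            *field = ref + len;
            return scope_prefixes[i].scope;
        }
    }
    *field = ref;
    return SCOPE_LOCAL;
}
```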
+
+
+7.4 TSDL Examples
+
+The grammar representing the TSDL meta-data is presented in Appendix
+"C. TSDL Grammar". This section offers lighter reading: examples of TSDL
+meta-data, with template values.
+
+The stream "id" can be left out if there is only one stream in the
+trace. The event "id" field can be left out if there is only one event
+in a stream.
+
+trace {
+ major = value; /* Trace format version */
+ minor = value;
+ uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"; /* Trace UUID */
+ byte_order = be OR le; /* Endianness (required) */
+ packet.header := struct {
+ uint32_t magic;
+ uint8_t uuid[16];
+ uint32_t stream_id;
+ };
+};
+
+stream {
+ id = stream_id;
+ /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
+ event.header := event_header_1 OR event_header_2;
+ event.context := struct {
+ ...
+ };
+ packet.context := struct {
+ ...
+ };
+};
+
+event {
+ name = event_name;
+ id = value; /* Numeric identifier within the stream */
+ stream_id = stream_id;
+ context := struct {
+ ...
+ };
+ fields := struct {
+ ...
+ };
+};
+
+/* More detail on types in section 4. Types */
+
+/*
+ * Named types:
+ *
+ * Type declarations behave similarly to the C standard.
+ */
+
+typedef aliased_type_specifiers new_type_declarators;
+
+/* e.g.: typedef struct example new_type_name[10]; */
+
+/*
+ * typealias
+ *
+ * The "typealias" declaration can be used to give a name (including
+ * pointer declarator specifier) to a type. It should also be used to
+ * map basic C types (float, int, unsigned long, ...) to a CTF type.
+ * Typealias is a superset of "typedef": it also allows assignment of a
+ * simple variable identifier to a type.
+ */
+
+typealias type_class {
+ ...
+} := type_specifiers type_declarator;
+
+/*
+ * e.g.:
+ * typealias integer {
+ * size = 32;
+ * align = 32;
+ * signed = false;
+ * } := struct page *;
+ *
+ * typealias integer {
+ * size = 32;
+ * align = 32;
+ * signed = true;
+ * } := int;
+ */
+
+struct name {
+ ...
+};
+
+variant name {
+ ...
+};
+
+enum name : integer_type {
+ ...
+};
+
+
+/*
+ * Unnamed types, contained within compound type fields, typedef or typealias.
+ */
+
+struct {
+ ...
+}
+
+struct {
+ ...
+} align(value)
+
+variant {
+ ...
+}
+
+enum : integer_type {
+ ...
+}
+
+typedef type new_type[length];
+
+struct {
+ type field_name[length];
+}
+
+typedef type new_type[length_type];
+
+struct {
+ type field_name[length_type];
+}
+
+integer {
+ ...
+}
+
+floating_point {
+ ...
+}
+
+struct {
+ integer_type field_name:size; /* GNU/C bitfield */
+}
+
+struct {
+ string field_name;
+}
+
+
+A. Helper macros
+
+The following two macros keep track of the size of a GNU/C structure without
+padding at the end by placing HEADER_END as the last field. A one-byte end field
+is used for C90 compatibility (C99 flexible arrays could be used here). Note
+that this does not affect the effective structure size, which should always be
+calculated with the header_sizeof() helper.
+
+#define HEADER_END char end_field
+#define header_sizeof(type) offsetof(typeof(type), end_field)
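+
+As an illustration, the macros can be applied to a hypothetical header
+as shown below. The sketch repeats the two macros so that it compiles on
+its own, and spells the GNU extension as __typeof__ so that it also
+builds in strict ISO modes of GCC and Clang. The structure and its
+fields are assumptions made for the example.
+
```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_END              char end_field
#define header_sizeof(type)     offsetof(__typeof__(type), end_field)

/* Hypothetical header: 4 + 1 bytes of content, then the end marker. */
struct example_header {
    uint32_t id;
    uint8_t  flags;
    HEADER_END;
};

/* Bytes to write for the header, excluding any trailing padding. */
static size_t example_header_size(void)
{
    return header_sizeof(struct example_header);
}
```
+
+On common ABIs where uint32_t is 4-byte aligned, header_sizeof() yields 5
+for this structure, whereas sizeof() yields 8 because of the trailing
+padding.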
+
+
+B. Stream Header Rationale
+
+An event stream is divided into contiguous event packets of variable size. These
+subdivisions allow the trace analyzer to perform a fast binary search by time
+within the stream (typically requiring indexing of only the event packet headers)
+without reading the whole stream. These subdivisions have a variable size to
+eliminate the need to transfer the event packet padding when partially filled
+event packets must be sent when streaming a trace for live viewing/analysis.
+An event packet can contain a certain amount of padding at the end. Dividing
+streams into event packets is also useful for network streaming over UDP and
+flight recorder mode tracing (a whole event packet can be swapped out of the
+buffer atomically for reading).
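+
+The binary search by time mentioned above could look like the following
+sketch. It assumes an index has already been built with one entry per
+event packet, sorted by time. The structure, its "timestamp_begin" field
+and the function name are hypothetical.
+
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry built from one event packet header. */
struct packet_index_entry {
    uint64_t offset;            /* packet position in the stream, in bytes */
    uint64_t timestamp_begin;   /* time of the first event in the packet */
};

/*
 * Return the index of the last packet whose timestamp_begin is <= ts,
 * or -1 if ts precedes every packet. Entries must be sorted by time.
 */
static long find_packet(const struct packet_index_entry *idx, size_t count,
                        uint64_t ts)
{
    size_t lo = 0, hi = count;  /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (idx[mid].timestamp_begin <= ts)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (long)lo - 1;
}
```
+
+Only the packet found this way needs to be read to locate the events
+around the requested time.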
+
+The stream header is repeated at the beginning of each event packet to allow
+flexibility in terms of:
+
+ - supporting streaming,
+ - allowing arbitrary buffers to be discarded without making the trace
+   unreadable,
+ - handling UDP packet loss, either by dealing with missing event packets
+   or by asking for re-transmission,
+ - transparently supporting flight recorder mode,
+ - transparently supporting crash dumps.
+
+
+C. TSDL Grammar
+
+/*
+ * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
+ *
+ * Inspired from the C99 grammar:
+ * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
+ * and c++1x grammar (draft)
+ * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
+ *
+ * Specialized for CTF needs by including only constants and declarations from
+ * C99 (excluding function declarations), and by adding support for variants,
+ * sequences and CTF-specific specifiers. Enumeration container type
+ * semantics are inspired by c++1x enum-base.
+ */
+
+1) Lexical grammar
+
+1.1) Lexical elements
+
+token:
+ keyword
+ identifier
+ constant
+ string-literal
+ punctuator
+
+1.2) Keywords
+
+keyword: is one of
+
+align
+const
+char
+double
+enum
+event
+floating_point
+float
+integer
+int
+long
+short
+signed
+stream
+string
+struct
+trace
+typealias
+typedef
+unsigned
+variant
+void
+_Bool
+_Complex
+_Imaginary
+
+
+1.3) Identifiers
+
+identifier:
+ identifier-nondigit
+ identifier identifier-nondigit
+ identifier digit
+
+identifier-nondigit:
+ nondigit
+ universal-character-name
+ any other implementation-defined characters
+
+nondigit:
+ _
+ [a-zA-Z] /* regular expression */
+
+digit:
+ [0-9] /* regular expression */
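+
+A recognizer for the identifier rule might be sketched as follows. It
+covers only the ASCII subset (no universal character names or
+implementation-defined characters) and rejects the keywords of Section
+1.2. The function and table names are hypothetical.
+
```c
#include <stddef.h>
#include <string.h>

static const char *const tsdl_keywords[] = {
    "align", "const", "char", "double", "enum", "event", "floating_point",
    "float", "integer", "int", "long", "short", "signed", "stream",
    "string", "struct", "trace", "typealias", "typedef", "unsigned",
    "variant", "void", "_Bool", "_Complex", "_Imaginary",
};

/* nondigit: "_" or [a-zA-Z] */
static int is_nondigit(char c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* digit: [0-9] */
static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

static int is_keyword(const char *s)
{
    size_t i;

    for (i = 0; i < sizeof(tsdl_keywords) / sizeof(tsdl_keywords[0]); i++)
        if (strcmp(s, tsdl_keywords[i]) == 0)
            return 1;
    return 0;
}

/* Returns 1 if s is an ASCII identifier that is not a keyword, else 0. */
static int is_identifier(const char *s)
{
    size_t i;

    if (!is_nondigit(s[0]))
        return 0;   /* also rejects the empty string */
    for (i = 1; s[i] != '\0'; i++)
        if (!is_nondigit(s[i]) && !is_digit(s[i]))
            return 0;
    return !is_keyword(s);
}
```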
+
+1.4) Universal character names
+
+universal-character-name:
+ \u hex-quad
+ \U hex-quad hex-quad
+
+hex-quad:
+ hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
+
+1.5) Constants
+
+constant:
+ integer-constant
+ enumeration-constant
+ character-constant
+
+integer-constant:
+ decimal-constant integer-suffix-opt
+ octal-constant integer-suffix-opt
+ hexadecimal-constant integer-suffix-opt
+
+decimal-constant:
+ nonzero-digit
+ decimal-constant digit
+
+octal-constant:
+ 0
+ octal-constant octal-digit
+
+hexadecimal-constant:
+ hexadecimal-prefix hexadecimal-digit
+ hexadecimal-constant hexadecimal-digit
+
+hexadecimal-prefix:
+ 0x
+ 0X
+
+nonzero-digit:
+ [1-9]
+
+integer-suffix:
+ unsigned-suffix long-suffix-opt
+ unsigned-suffix long-long-suffix
+ long-suffix unsigned-suffix-opt
+ long-long-suffix unsigned-suffix-opt
+
+unsigned-suffix:
+ u
+ U
+
+long-suffix:
+ l
+ L
+
+long-long-suffix:
+ ll
+ LL
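+
+The integer-constant rules can be approximated with the standard
+strtoull() function, which already recognizes the decimal, octal and
+hexadecimal forms when given base 0. The sketch below adds a check of
+the optional suffix. The function names are hypothetical, and the value
+is always returned as an unsigned long long regardless of the suffix.
+
```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 if s is empty or a valid integer-suffix (u, l, ll, ul, llu, ...). */
static int is_integer_suffix(const char *s)
{
    int seen_u = 0, seen_l = 0;

    while (*s != '\0') {
        if ((*s == 'u' || *s == 'U') && !seen_u) {
            seen_u = 1;
            s++;
        } else if ((*s == 'l' || *s == 'L') && !seen_l) {
            seen_l = 1;
            s += (s[1] == s[0]) ? 2 : 1;    /* "ll" or "LL", same case */
        } else {
            return 0;
        }
    }
    return 1;
}

/* Parse an integer-constant. Returns 0 on success, -1 on error. */
static int parse_integer_constant(const char *s, unsigned long long *value)
{
    char *end;

    /* Reject the signs and leading blanks that strtoull() would accept. */
    if (s[0] < '0' || s[0] > '9')
        return -1;
    errno = 0;
    *value = strtoull(s, &end, 0);
    if (errno == ERANGE)
        return -1;
    return is_integer_suffix(end) ? 0 : -1;
}
```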
+
+enumeration-constant:
+ identifier
+ string-literal
+
+character-constant:
+ ' c-char-sequence '
+ L' c-char-sequence '
+
+c-char-sequence:
+ c-char
+ c-char-sequence c-char
+
+c-char:
+ any member of source charset except single-quote ('), backslash
+ (\), or new-line character.
+ escape-sequence
+
+escape-sequence:
+ simple-escape-sequence
+ octal-escape-sequence
+ hexadecimal-escape-sequence
+ universal-character-name
+
+simple-escape-sequence: one of
+ \' \" \? \\ \a \b \f \n \r \t \v
+
+octal-escape-sequence:
+ \ octal-digit
+ \ octal-digit octal-digit
+ \ octal-digit octal-digit octal-digit
+
+hexadecimal-escape-sequence:
+ \x hexadecimal-digit
+ hexadecimal-escape-sequence hexadecimal-digit
+
+1.6) String literals
+
+string-literal:
+ " s-char-sequence-opt "
+ L" s-char-sequence-opt "
+
+s-char-sequence:
+ s-char
+ s-char-sequence s-char
+
+s-char:
+ any member of source charset except double-quote ("), backslash
+ (\), or new-line character.
+ escape-sequence
+
+1.7) Punctuators
+
+punctuator: one of
+ [ ] ( ) { } . -> * + - < > : ; ... = ,
+
+
+2) Phrase structure grammar
+
+primary-expression:
+ identifier
+ constant
+ string-literal
+ ( unary-expression )
+
+postfix-expression:
+ primary-expression
+ postfix-expression [ unary-expression ]
+ postfix-expression . identifier
+ postfix-expression -> identifier
+
+unary-expression:
+ postfix-expression
+ unary-operator postfix-expression
+
+unary-operator: one of
+ + -
+
+assignment-operator:
+ =
+
+type-assignment-operator:
+ :=
+
+constant-expression-range:
+ unary-expression ... unary-expression
+
+2.2) Declarations:
+
+declaration:
+ declaration-specifiers declarator-list-opt ;
+ ctf-specifier ;
+
+declaration-specifiers:
+ storage-class-specifier declaration-specifiers-opt
+ type-specifier declaration-specifiers-opt
+ type-qualifier declaration-specifiers-opt
+
+declarator-list:
+ declarator
+ declarator-list , declarator
+
+abstract-declarator-list:
+ abstract-declarator
+ abstract-declarator-list , abstract-declarator
+
+storage-class-specifier:
+ typedef
+
+type-specifier:
+ void
+ char
+ short
+ int
+ long
+ float
+ double
+ signed
+ unsigned
+ _Bool
+ _Complex
+ _Imaginary
+ struct-specifier
+ variant-specifier
+ enum-specifier
+ typedef-name
+ ctf-type-specifier
+
+align-attribute:
+ align ( unary-expression )
+
+struct-specifier:
+ struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt
+ struct identifier align-attribute-opt
+
+struct-or-variant-declaration-list:
+ struct-or-variant-declaration
+ struct-or-variant-declaration-list struct-or-variant-declaration
+
+struct-or-variant-declaration:
+ specifier-qualifier-list struct-or-variant-declarator-list ;
+ declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ;
+ typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ;
+ typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ;
+
+specifier-qualifier-list:
+ type-specifier specifier-qualifier-list-opt
+ type-qualifier specifier-qualifier-list-opt
+
+struct-or-variant-declarator-list:
+ struct-or-variant-declarator
+ struct-or-variant-declarator-list , struct-or-variant-declarator
+
+struct-or-variant-declarator:
+ declarator
+ declarator-opt : unary-expression
+
+variant-specifier:
+ variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
+ variant identifier variant-tag
+
+variant-tag:
+ < identifier >
+
+enum-specifier:
+ enum identifier-opt { enumerator-list }
+ enum identifier-opt { enumerator-list , }
+ enum identifier
+ enum identifier-opt : declaration-specifiers { enumerator-list }
+ enum identifier-opt : declaration-specifiers { enumerator-list , }
+
+enumerator-list:
+ enumerator
+ enumerator-list , enumerator
+
+enumerator:
+ enumeration-constant
+ enumeration-constant assignment-operator unary-expression
+ enumeration-constant assignment-operator constant-expression-range
+
+type-qualifier:
+ const
+
+declarator:
+ pointer-opt direct-declarator
+
+direct-declarator:
+ identifier
+ ( declarator )
+ direct-declarator [ unary-expression ]
+
+abstract-declarator:
+ pointer-opt direct-abstract-declarator
+
+direct-abstract-declarator:
+ identifier-opt
+ ( abstract-declarator )
+ direct-abstract-declarator [ unary-expression ]
+ direct-abstract-declarator [ ]
+
+pointer:
+ * type-qualifier-list-opt
+ * type-qualifier-list-opt pointer
+
+type-qualifier-list:
+ type-qualifier
+ type-qualifier-list type-qualifier
+
+typedef-name:
+ identifier
+
+2.3) CTF-specific declarations
+
+ctf-specifier:
+ event { ctf-assignment-expression-list-opt }
+ stream { ctf-assignment-expression-list-opt }
+ trace { ctf-assignment-expression-list-opt }
+ typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
+ typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
+
+ctf-type-specifier:
+ floating_point { ctf-assignment-expression-list-opt }
+ integer { ctf-assignment-expression-list-opt }
+ string { ctf-assignment-expression-list-opt }
+ string
+
+ctf-assignment-expression-list:
+ ctf-assignment-expression ;
+ ctf-assignment-expression-list ctf-assignment-expression ;
+
+ctf-assignment-expression:
+ unary-expression assignment-operator unary-expression
+ unary-expression type-assignment-operator type-specifier
+ declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list
+ typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
+ typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list