From: Mathieu Desnoyers
Date: Wed, 22 Jun 2011 18:21:36 +0000 (-0400)
Subject: Rename proposal into "specification", add credits.
X-Git-Tag: v1.8~26
X-Git-Url: http://git.efficios.com/?p=ctf.git;a=commitdiff_plain;h=339a7dde945687b835bcdd379b13a72c2cfde239;hp=beabf088015cd7de7fc53cedf59c0551e692fc3b;ds=sidebyside

Rename proposal into "specification", add credits.

Signed-off-by: Mathieu Desnoyers
---

diff --git a/common-trace-format-proposal.txt b/common-trace-format-proposal.txt
deleted file mode 100644
index 3103722..0000000
--- a/common-trace-format-proposal.txt
+++ /dev/null
@@ -1,1605 +0,0 @@
-
-RFC: Common Trace Format (CTF) Proposal (pre-v1.7)
-
-Mathieu Desnoyers, EfficiOS Inc.
-
-The goal of the present document is to propose a trace format that suits the
-needs of the embedded, telecom, high-performance and kernel communities. It is
-based on the Common Trace Format Requirements (v1.4) document. It is designed to
-allow traces to be natively generated by the Linux kernel, Linux user-space
-applications written in C/C++, and hardware components. One major element of
-CTF is the Trace Stream Description Language (TSDL), whose flexibility
-enables description of various binary trace stream layouts.
-
-The latest version of this document can be found at:
-
-  git tree:   git://git.efficios.com/ctf.git
-  gitweb:     http://git.efficios.com/?p=ctf.git
-
-A reference implementation of a library to read and write this trace format is
-being implemented within the BabelTrace project, a converter between trace
-formats. The development tree is available at:
-
-  git tree:   git://git.efficios.com/babeltrace.git
-  gitweb:     http://git.efficios.com/?p=babeltrace.git
-
-
-Table of Contents
-
-1. Preliminary definitions
-2. High-level representation of a trace
-3. Event stream
-4.
Types
-  4.1 Basic types
-    4.1.1 Type inheritance
-    4.1.2 Alignment
-    4.1.3 Byte order
-    4.1.4 Size
-    4.1.5 Integers
-    4.1.6 GNU/C bitfields
-    4.1.7 Floating point
-    4.1.8 Enumerations
-  4.2 Compound types
-    4.2.1 Structures
-    4.2.2 Variants (Discriminated/Tagged Unions)
-    4.2.3 Arrays
-    4.2.4 Sequences
-    4.2.5 Strings
-5. Event Packet Header
-  5.1 Event Packet Header Description
-  5.2 Event Packet Context Description
-6. Event Structure
-  6.1 Event Header
-    6.1.1 Type 1 - Few event IDs
-    6.1.2 Type 2 - Many event IDs
-  6.2 Event Context
-  6.3 Event Payload
-    6.3.1 Padding
-    6.3.2 Alignment
-7. Trace Stream Description Language (TSDL)
-  7.1 Meta-data
-  7.2 Declaration vs Definition
-  7.3 TSDL Scopes
-    7.3.1 Lexical Scope
-    7.3.2 Dynamic Scope
-  7.4 TSDL Examples
-
-
-1. Preliminary definitions
-
-  - Event Trace: An ordered sequence of events.
-  - Event Stream: An ordered sequence of events, containing a subset of the
-                  trace event types.
-  - Event Packet: A sequence of physically contiguous events within an event
-                  stream.
-  - Event: This is the basic entry in a trace. (aka: a trace record).
-    - An event identifier (ID) relates to the class (a type) of event within
-      an event stream.
-      e.g. event: irq_entry.
-    - An event (or event record) relates to a specific instance of an event
-      class.
-      e.g. event: irq_entry, at time X, on CPU Y
-  - Source Architecture: Architecture writing the trace.
-  - Reader Architecture: Architecture reading the trace.
-
-
-2. High-level representation of a trace
-
-A trace is divided into multiple event streams. Each event stream contains a
-subset of the trace event types.
-
-The final output of the trace, after its generation and optional transport over
-the network, is expected to be either on permanent or temporary storage in a
-virtual file system. Because each event stream is appended to while a trace is
-being recorded, each is associated with a separate file for output.
Therefore,
-a stored trace can be represented as a directory containing one file per stream.
-
-Meta-data description associated with the trace contains information on
-trace event types expressed in the Trace Stream Description Language
-(TSDL). This language describes:
-
-- Trace version.
-- Types available.
-- Per-trace event header description.
-- Per-stream event header description.
-- Per-stream event context description.
-- Per-event
-  - Event type to stream mapping.
-  - Event type to name mapping.
-  - Event type to ID mapping.
-  - Event context description.
-  - Event fields description.
-
-
-3. Event stream
-
-An event stream can be divided into contiguous event packets of variable
-size. An event packet can contain a certain amount of padding at the end.
-The stream header is repeated at the beginning of each event packet. The
-rationale for the event stream design choices is explained in Appendix B.
-Stream Header Rationale.
-
-The event stream header will therefore be referred to as the "event packet
-header" throughout the rest of this document.
-
-
-4. Types
-
-Types are organized as type classes. Each type class belongs to one of two
-kinds of types: basic types or compound types.
-
-4.1 Basic types
-
-A basic type is a scalar type, as described in this section. It includes
-integers, GNU/C bitfields, enumerations, and floating point values.
-
-4.1.1 Type inheritance
-
-Type specifications can be inherited to allow deriving types from a
-type class. For example, see the uint32_t named type derived from the "integer"
-type class below ("Integers" section). Types have a precise binary
-representation in the trace. A type class has methods to read and write these
-types, but must be derived into a type to be usable in an event field.
-
-4.1.2 Alignment
-
-We define "byte-packed" types as aligned on the byte size, namely 8-bit.
-We define "bit-packed" types as following on the next bit, as defined by the
-"Integers" section.
-
-Each basic type must specify its alignment, in bits. Examples of
-possible alignments are: bit-packed (align = 1), byte-packed (align =
-8), or word-aligned (e.g. align = 32 or align = 64). The choice depends
-on the architecture preference and compactness vs performance trade-offs
-of the implementation. Architectures providing fast unaligned writes
-byte-pack basic types to save space, aligning each type on byte
-boundaries (8-bit). Architectures with slow unaligned writes align types
-on specific alignment values. If no specific alignment is declared for a
-type, it is assumed to be bit-packed for integers with size not multiple
-of 8 bits and for gcc bitfields. All other basic types are byte-packed
-by default. It is however recommended to always specify the alignment
-explicitly. Alignment values must be a power of two. Compound types are
-aligned as specified in their individual specification.
-
-TSDL meta-data attribute representation of a specific alignment:
-
-  align = value;    /* value in bits */
-
-4.1.3 Byte order
-
-By default, the native endianness of the source architecture is used.
-Byte order can be overridden for a basic type by specifying a "byte_order"
-attribute. A typical use case is to specify the network byte order (big endian:
-"be") to save data captured from the network into the trace without conversion.
-If not specified, the byte order is native.
-
-TSDL meta-data representation:
-
-  byte_order = native OR network OR be OR le;    /* network and be are aliases */
-
-4.1.4 Size
-
-Type size, in bits, for integers and floats is that returned by "sizeof()" in C
-multiplied by CHAR_BIT.
-We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed
-to 8 bits for cross-endianness compatibility.
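
The alignment and size rules above (both expressed in bits, with alignments restricted to powers of two) can be sketched as a small offset calculation. This is only an illustration in Python, not part of the specification; the helper names align_up and field_offsets are hypothetical.

```python
def align_up(offset_bits, align_bits):
    """Round a bit offset up to the next multiple of align_bits."""
    # The text requires alignment values to be a power of two.
    if align_bits < 1 or align_bits & (align_bits - 1):
        raise ValueError("alignment must be a power of two")
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)


def field_offsets(fields):
    """Return the bit offset of each (size_bits, align_bits) field, in order."""
    offsets = []
    position = 0
    for size_bits, align_bits in fields:
        position = align_up(position, align_bits)  # insert padding if needed
        offsets.append(position)
        position += size_bits
    return offsets
```

For instance, a byte-packed 8-bit integer followed by a 32-bit integer with align = 32 leaves 24 bits of padding, while a bit-packed field (align = 1) never introduces padding.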
-
-TSDL meta-data representation:
-
-  size = value;    (value is in bits)
-
-4.1.5 Integers
-
-Signed integers are represented in two's complement. Integer alignment,
-size, signedness and byte ordering are defined in the TSDL meta-data.
-Integers aligned on byte size (8-bit) and with length multiple of byte
-size (8-bit) correspond to the C99 standard integers. In addition,
-integers with alignment and/or size that are _not_ a multiple of the
-byte size are permitted; these correspond to the C99 standard bitfields,
-with the added specification that the CTF integer bitfields have a fixed
-binary representation. An MIT-licensed reference implementation of the
-CTF portable bitfields is available at:
-
-  http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h
-
-Binary representation of integers:
-
-- On little and big endian:
-  - Within a byte, high bits correspond to an integer's high bits, and low bits
-    correspond to low bits.
-- On little endian:
-  - Integers spanning multiple bytes are placed from the least significant to
-    the most significant.
-  - Consecutive integers are placed from lower bits to higher bits (even within
-    a byte).
-- On big endian:
-  - Integers spanning multiple bytes are placed from the most significant to
-    the least significant.
-  - Consecutive integers are placed from higher bits to lower bits (even within
-    a byte).
-
-This binary representation is derived from the bitfield implementation in GCC
-for little and big endian. However, contrary to what GCC does, integers can
-cross unit boundaries (no padding is required). Padding can be explicitly
-added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed.
-
-TSDL meta-data representation:
-
-  integer {
-    signed = true OR false;                       /* default false */
-    byte_order = native OR network OR be OR le;   /* default native */
-    size = value;                                 /* value in bits, no default */
-    align = value;                                /* value in bits */
-    /* base used for pretty-printing output, default: decimal.
*/
-    base = decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex OR x OR X OR p OR 16
-      OR octal OR oct OR o OR 8 OR binary OR b OR 2;
-    /* character encoding, default: none */
-    encoding = none OR UTF8 OR ASCII;
-  }
-
-Example of type inheritance (creation of a uint32_t named type):
-
-typealias integer {
-  size = 32;
-  signed = false;
-  align = 32;
-} := uint32_t;
-
-Definition of a named 5-bit signed bitfield:
-
-typealias integer {
-  size = 5;
-  signed = true;
-  align = 1;
-} := int5_t;
-
-The character encoding field can be used to specify that the integer
-must be printed as a text character when read. e.g.:
-
-typealias integer {
-  size = 8;
-  align = 8;
-  signed = false;
-  encoding = UTF8;
-} := utf_char;
-
-
-4.1.6 GNU/C bitfields
-
-The GNU/C bitfields follow closely the integer representation, with a
-particularity on alignment: if a bitfield cannot fit in the current unit, the
-unit is padded and the bitfield starts at the following unit. The unit size is
-defined by the size of the type "unit_type".
-
-TSDL meta-data representation:
-
-  unit_type name:size;
-
-As an example, the following structure declared in C compiled by GCC:
-
-struct example {
-  short a:12;
-  short b:5;
-};
-
-The example structure is aligned on the largest element (short). The second
-bitfield would be aligned on the next unit boundary, because it would not fit in
-the current unit.
-
-4.1.7 Floating point
-
-The floating point values byte ordering is defined in the TSDL meta-data.
-
-Floating point values follow the IEEE 754-2008 standard interchange formats.
-Description of the floating point values includes the exponent and mantissa size
-in bits. Some requirements are imposed on the floating point values:
-
-- FLT_RADIX must be 2.
-- mant_dig is the number of digits represented in the mantissa. It is specified
-  by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and
-  LDBL_MANT_DIG as defined by <float.h>.
-- exp_dig is the number of digits represented in the exponent. Given that
-  mant_dig is one bit more than its actual size in bits (leading 1 is not
-  needed) and also given that the sign bit always takes one bit, exp_dig can be
-  specified as:
-
-  - sizeof(float) * CHAR_BIT - FLT_MANT_DIG
-  - sizeof(double) * CHAR_BIT - DBL_MANT_DIG
-  - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG
-
-TSDL meta-data representation:
-
-floating_point {
-  exp_dig = value;
-  mant_dig = value;
-  byte_order = native OR network OR be OR le;
-  align = value;
-}
-
-Example of type inheritance:
-
-typealias floating_point {
-  exp_dig = 8;         /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
-  mant_dig = 24;       /* FLT_MANT_DIG */
-  byte_order = native;
-  align = 32;
-} := float;
-
-TODO: define NaN, +inf, -inf behavior.
-
-Bit-packed, byte-packed or larger alignments can be used for floating
-point values, similarly to integers.
-
-4.1.8 Enumerations
-
-Enumerations are a mapping between an integer type and a table of strings. The
-numerical representation of the enumeration follows the integer type specified
-by the meta-data. The enumeration mapping table is detailed in the enumeration
-description within the meta-data. The mapping table maps inclusive value
-ranges (or single values) to strings. Instead of being limited to simple
-"value -> string" mappings, these enumerations map
-"[ start_value ... end_value ] -> string", which map inclusive ranges of
-values to strings. An enumeration from the C language can be represented in
-this format by having the same start_value and end_value for each element, which
-is in fact a range of size 1. This single-value range is supported without
-repeating the start and end values with the value = string declaration.
-
-enum name : integer_type {
-  somestring          = start_value1 ... end_value1,
-  "other string"      = start_value2 ... end_value2,
-  yet_another_string,   /* will be assigned to end_value2 + 1 */
-  "some other string" = value,
-  ...
-};
-
-If the values are omitted, the enumeration starts at 0 and increments by 1 for
-each entry:
-
-enum name : unsigned int {
-  ZERO,
-  ONE,
-  TWO,
-  TEN = 10,
-  ELEVEN,
-};
-
-Overlapping ranges within a single enumeration are implementation defined.
-
-A nameless enumeration can be declared as a field type or as part of a typedef:
-
-enum : integer_type {
-  ...
-}
-
-Enumerations omitting the container type ": integer_type" use the "int"
-type (for compatibility with C99). The "int" type must be previously
-declared. E.g.:
-
-typealias integer { size = 32; align = 32; signed = true } := int;
-
-enum {
-  ...
-}
-
-
-4.2 Compound types
-
-Compound types are aggregations of type declarations. Compound types include
-structures, variants, arrays, sequences, and strings.
-
-4.2.1 Structures
-
-Structures are aligned on the largest alignment required by basic types
-contained within the structure. (This follows the ISO/C standard for structures)
-
-TSDL meta-data representation of a named structure:
-
-struct name {
-  field_type field_name;
-  field_type field_name;
-  ...
-};
-
-Example:
-
-struct example {
-  integer {                     /* Nameless type */
-    size = 16;
-    signed = true;
-    align = 16;
-  } first_field_name;
-  uint64_t second_field_name;   /* Named type declared in the meta-data */
-};
-
-The fields are placed in a sequence next to each other. They each possess a
-field name, which is a unique identifier within the structure.
-
-A nameless structure can be declared as a field type or as part of a typedef:
-
-struct {
-  ...
-}
-
-Alignment for a structure compound type can be forced to a minimum value
-by adding an "align" specifier after the declaration of a structure
-body. This attribute is read as: align(value). The value is specified in
-bits. The structure will be aligned on the maximum value between this
-attribute and the alignment required by the basic types contained within
-the structure. e.g.
-
-struct {
-  ...
-} align(32)
-
-4.2.2 Variants (Discriminated/Tagged Unions)
-
-A CTF variant is a selection between different types. A CTF variant must
-always be defined within the scope of a structure or within fields
-contained within a structure (defined recursively). A "tag" enumeration
-field must appear in either the same lexical scope, prior to the variant
-field (in field declaration order), in an upper lexical scope (see
-Section 7.3.1), or in an upper dynamic scope (see Section 7.3.2). The
-type selection is indicated by the mapping from the enumeration value to
-the string used as variant type selector. The field to use as tag is
-specified by the "tag_field", specified between "< >" after the
-"variant" keyword for unnamed variants, and after "variant name" for
-named variants.
-
-The alignment of the variant is the alignment of the type as selected by the tag
-value for the specific instance of the variant. The alignment of the type
-containing the variant is independent of the variant alignment. The size of the
-variant is the size as selected by the tag value for the specific instance of
-the variant.
-
-A named variant declaration followed by its definition within a structure
-declaration:
-
-variant name {
-  field_type sel1;
-  field_type sel2;
-  field_type sel3;
-  ...
-};
-
-struct {
-  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
-  ...
-  variant name <tag_field> v;
-}
-
-An unnamed variant definition within a structure is expressed by the following
-TSDL meta-data:
-
-struct {
-  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
-  ...
-  variant <tag_field> {
-    field_type sel1;
-    field_type sel2;
-    field_type sel3;
-    ...
-  } v;
-}
-
-Example of a named variant within a sequence that refers to a single tag field:
-
-variant example {
-  uint32_t a;
-  uint64_t b;
-  short c;
-};
-
-struct {
-  enum : uint2_t { a, b, c } choice;
-  unsigned int seqlen;
-  variant example <choice> v[seqlen];
-}
-
-Example of an unnamed variant:
-
-struct {
-  enum : uint2_t { a, b, c, d } choice;
-  /* Unrelated fields can be added between the variant and its tag */
-  int32_t somevalue;
-  variant <choice> {
-    uint32_t a;
-    uint64_t b;
-    short c;
-    struct {
-      unsigned int field1;
-      uint64_t field2;
-    } d;
-  } s;
-}
-
-Example of an unnamed variant within an array:
-
-struct {
-  enum : uint2_t { a, b, c } choice;
-  variant <choice> {
-    uint32_t a;
-    uint64_t b;
-    short c;
-  } v[10];
-}
-
-Example of a variant type definition within a structure, where the defined type
-is then declared within an array of structures. This variant refers to a tag
-located in an upper lexical scope. This example clearly shows that a variant
-type definition referring to the tag "x" uses the closest preceding field from
-the lexical scope of the type definition.
-
-struct {
-  enum : uint2_t { a, b, c, d } x;
-
-  typedef variant <x> {   /*
-                           * "x" refers to the preceding "x" enumeration in the
-                           * lexical scope of the type definition.
-                           */
-    uint32_t a;
-    uint64_t b;
-    short c;
-  } example_variant;
-
-  struct {
-    enum : int { x, y, z } x;   /* This enumeration is not used by "v". */
-    example_variant v;          /*
-                                 * "v" uses the "enum : uint2_t { a, b, c, d }"
-                                 * tag.
-                                 */
-  } a[10];
-}
-
-4.2.3 Arrays
-
-Arrays are fixed-length. Their length is declared in the type
-declaration within the meta-data. They contain an array of "inner type"
-elements, which can refer to any type not containing the type of the
-array being declared (no circular dependency). The length is the number
-of elements in an array.
-
-TSDL meta-data representation of a named array:
-
-typedef elem_type name[length];
-
-A nameless array can be declared as a field type within a structure, e.g.:
-
-  uint8_t field_name[10];
-
-Arrays are always aligned on their element alignment requirement.
-
-4.2.4 Sequences
-
-Sequences are dynamically-sized arrays. They refer to a "length"
-unsigned integer field, which must appear in either the same lexical scope,
-prior to the sequence field (in field declaration order), in an upper
-lexical scope (see Section 7.3.1), or in an upper dynamic scope (see
-Section 7.3.2). This length field represents the number of elements in
-the sequence. The sequence per se is an array of "inner type" elements.
-
-TSDL meta-data representation for a sequence type definition:
-
-struct {
-  unsigned int length_field;
-  typedef elem_type typename[length_field];
-  typename seq_field_name;
-}
-
-A sequence can also be declared as a field type, e.g.:
-
-struct {
-  unsigned int length_field;
-  long seq_field_name[length_field];
-}
-
-Multiple sequences can refer to the same length field, and these length
-fields can be in a different upper dynamic scope:
-
-e.g., assuming the stream.event.header defines:
-
-stream {
-  ...
-  id = 1;
-  event.header := struct {
-    uint16_t seq_len;
-  };
-};
-
-event {
-  ...
-  stream_id = 1;
-  fields := struct {
-    long seq_a[stream.event.header.seq_len];
-    char seq_b[stream.event.header.seq_len];
-  };
-};
-
-The sequence elements follow the "array" specifications.
-
-4.2.5 Strings
-
-Strings are an array of bytes of variable size and are terminated by a '\0'
-"NULL" character. Their encoding is described in the TSDL meta-data. In
-absence of encoding attribute information, the default encoding is
-UTF-8.
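
As an illustration of the sequence and string rules above, here is a minimal Python sketch of a reader. It assumes a byte-packed layout (align = 8 for every field), a uint16_t length field immediately followed by uint32_t elements, and little-endian byte order by default; the function names read_sequence_u32 and read_string are hypothetical and not taken from any CTF library.

```python
import struct


def read_sequence_u32(buf, offset, byte_order="<"):
    """Read a uint16 length field, then that many uint32 elements."""
    (length,) = struct.unpack_from(byte_order + "H", buf, offset)
    offset += 2
    values = list(struct.unpack_from(f"{byte_order}{length}I", buf, offset))
    return values, offset + 4 * length


def read_string(buf, offset, encoding="utf-8"):
    """Read a '\\0'-terminated string; UTF-8 is the default encoding."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode(encoding), end + 1
```

Both helpers return the decoded value together with the offset of the next field, so they can be chained when walking an event payload.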
-
-TSDL meta-data representation of a named string type:
-
-typealias string {
-  encoding = UTF8 OR ASCII;
-} := name;
-
-A nameless string type can be declared as a field type:
-
-string field_name;    /* Use default UTF8 encoding */
-
-Strings are always aligned on byte size.
-
-5. Event Packet Header
-
-The event packet header consists of two parts: the "event packet header"
-is the same for all streams of a trace. The second part, the "event
-packet context", is described on a per-stream basis. Both are described
-in the TSDL meta-data. The packets are aligned on architecture-page-sized
-addresses.
-
-Event packet header (all fields are optional, specified by TSDL meta-data):
-
-- Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a
-  CTF packet. This magic number is optional, but when present, it should
-  come at the very beginning of the packet.
-- Trace UUID, used to ensure the event packet matches the meta-data used.
-  (note: we cannot use a meta-data checksum in every case instead of a
-  UUID because meta-data can be appended to while tracing is active)
-  This field is optional.
-- Stream ID, used as reference to stream description in meta-data.
-  This field is optional if there is only one stream description in the
-  meta-data, but becomes required if there is more than one stream in
-  the TSDL meta-data description.
-
-Event packet context (all fields are optional, specified by TSDL meta-data):
-
-- Event packet content size (in bytes).
-- Event packet size (in bytes, includes padding).
-- Event packet content checksum (optional). Checksum excludes the event packet
-  header.
-- Per-stream event packet sequence count (to deal with UDP packet loss). The
-  number of significant sequence counter bits should also be present, so
-  wrap-arounds are dealt with correctly.
-- Time-stamp at the beginning and time-stamp at the end of the event packet.
-  Both timestamps are written in the packet header, but sampled respectively
-  while (or before) writing the first event and while (or after) writing the
-  last event in the packet. The inclusive range between these timestamps should
-  include all event timestamps assigned to events contained within the packet.
-- Events discarded count
-  - Snapshot of a per-stream free-running counter, counting the number of
-    events discarded that were supposed to be written in the stream prior to
-    the first event in the event packet.
-  * Note: producer-consumer buffer full condition should fill the current
-    event packet with padding so we know exactly where events have been
-    discarded.
-- Lossless compression scheme used for the event packet content. Applied
-  directly to raw data. New types of compression can be added in following
-  versions of the format.
-  0: no compression scheme
-  1: bzip2
-  2: gzip
-  3: xz
-- Cipher used for the event packet content. Applied after compression.
-  0: no encryption
-  1: AES
-- Checksum scheme used for the event packet content. Applied after encryption.
-  0: no checksum
-  1: md5
-  2: sha1
-  3: crc32
-
-5.1 Event Packet Header Description
-
-The event packet header layout is indicated by the trace packet.header
-field. Here is a recommended structure type for the packet header with
-the fields typically expected (although these fields are each optional):
-
-struct event_packet_header {
-  uint32_t magic;
-  uint8_t  uuid[16];
-  uint32_t stream_id;
-};
-
-trace {
-  ...
-  packet.header := struct event_packet_header;
-};
-
-If the magic number is not present, tools such as "file" will have no
-means to discover the file type.
-
-If the uuid is not present, no validation that the meta-data actually
-corresponds to the stream is performed.
-
-If the stream_id packet header field is missing, the trace can only
-contain a single stream. Its "id" field can be left out, and its events
-don't need to declare a "stream_id" field.
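
The recommended event_packet_header layout above can be read with a short Python sketch. It assumes all three fields are present, that uint32_t is declared with size = 32 and align = 32 (so the structure occupies 24 bytes with no padding), and that the byte order can be inferred by trying the CTF magic number in both endiannesses. That last point is an assumption of this sketch; parse_packet_header is a hypothetical name.

```python
import struct
import uuid

CTF_MAGIC = 0xC1FC1FC1  # magic number given in section 5


def parse_packet_header(buf):
    """Decode magic, uuid[16] and stream_id from the start of a packet."""
    for prefix, name in (("<", "le"), (">", "be")):
        magic, raw_uuid, stream_id = struct.unpack_from(prefix + "I16sI", buf, 0)
        if magic == CTF_MAGIC:
            return {
                "byte_order": name,
                "uuid": uuid.UUID(bytes=raw_uuid),
                "stream_id": stream_id,
            }
    raise ValueError("CTF magic number not found at start of packet")
```

A reader would then compare the returned UUID against the one in the meta-data and use stream_id to select the matching stream description.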
-
-
-5.2 Event Packet Context Description
-
-Event packet context example. These are declared within the stream declaration
-in the meta-data. All these fields are optional. If the packet size field is
-missing, the whole stream only contains a single packet. If the content
-size field is missing, the packet is filled (no padding). The content
-and packet sizes include all headers.
-
-An example event packet context type:
-
-struct event_packet_context {
-  uint64_t timestamp_begin;
-  uint64_t timestamp_end;
-  uint32_t checksum;
-  uint32_t stream_packet_count;
-  uint32_t events_discarded;
-  uint32_t cpu_id;
-  uint32_t/uint16_t content_size;
-  uint32_t/uint16_t packet_size;
-  uint8_t  stream_packet_count_bits;    /* Significant counter bits */
-  uint8_t  compression_scheme;
-  uint8_t  encryption_scheme;
-  uint8_t  checksum_scheme;
-};
-
-
-6. Event Structure
-
-The overall structure of an event is:
-
-1 - Stream Packet Context (as specified by the stream meta-data)
-  2 - Event Header (as specified by the stream meta-data)
-   3 - Stream Event Context (as specified by the stream meta-data)
-    4 - Event Context (as specified by the event meta-data)
-     5 - Event Payload (as specified by the event meta-data)
-
-This structure defines an implicit dynamic scoping, where variants
-located in inner structures (those with a higher number in the listing
-above) can refer to the fields of outer structures (with lower number in
-the listing above). See Section 7.3 TSDL Scopes for more detail.
-
-6.1 Event Header
-
-Event headers can be described within the meta-data. We hereby propose, as an
-example, two types of event headers. Type 1 accommodates streams with fewer than
-31 event IDs. Type 2 accommodates streams with 31 or more event IDs.
-
-One major factor can vary between streams: the number of event IDs assigned to
-a stream.
Luckily, this information tends to stay relatively constant (modulo
-event registration while the trace is being recorded), so we can specify different
-representations for streams containing few event IDs and streams containing
-many event IDs, so we end up representing the event ID and time-stamp as
-densely as possible in each case.
-
-The header is extended in the rare occasions where the information cannot be
-represented in the ranges available in the standard event header. Extended
-headers are also used in the rare occasions where the data required for a field
-could not be collected: the flag corresponding to the missing field within the
-missing_fields array is then set to 1.
-
-Types uintX_t represent an X-bit unsigned integer, as declared with
-either:
-
-  typealias integer { size = X; align = X; signed = false } := uintX_t;
-
-    or
-
-  typealias integer { size = X; align = 1; signed = false } := uintX_t;
-
-6.1.1 Type 1 - Few event IDs
-
-  - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture
-    preference).
-  - Native architecture byte ordering.
-  - For "compact" selection
-    - Fixed size: 32 bits.
-  - For "extended" selection
-    - Size depends on the architecture and variant alignment.
-
-struct event_header_1 {
-  /*
-   * id: range: 0 - 30.
-   * id 31 is reserved to indicate an extended header.
-   */
-  enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
-  variant <id> {
-    struct {
-      uint27_t timestamp;
-    } compact;
-    struct {
-      uint32_t id;         /* 32-bit event IDs */
-      uint64_t timestamp;  /* 64-bit timestamps */
-    } extended;
-  } v;
-} align(32);    /* or align(8) */
-
-
-6.1.2 Type 2 - Many event IDs
-
-  - Aligned on 16-bit (or 8-bit if byte-packed, depending on the architecture
-    preference).
-  - Native architecture byte ordering.
-  - For "compact" selection
-    - Size depends on the architecture and variant alignment.
-  - For "extended" selection
-    - Size depends on the architecture and variant alignment.
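
To make the Type 1 "compact" layout of section 6.1.1 concrete, here is a hedged Python sketch that unpacks the 5-bit id and 27-bit timestamp from one 32-bit word. It assumes uint5_t and uint27_t are bit-packed (align = 1) and applies the bit placement rules of section 4.1.5: on little endian, consecutive integers go from lower bits to higher bits; on big endian, from higher bits to lower bits. The extended case is only detected, not decoded, because its size depends on the chosen alignment. decode_compact_header is a hypothetical name.

```python
import struct

EXTENDED_ID = 31  # id value reserved for the extended header


def decode_compact_header(buf, offset=0, little_endian=True):
    """Return (event_id, timestamp); timestamp is None for extended headers."""
    if little_endian:
        (word,) = struct.unpack_from("<I", buf, offset)
        event_id = word & 0x1F          # first field sits in the lowest 5 bits
        timestamp = word >> 5           # remaining 27 bits
    else:
        (word,) = struct.unpack_from(">I", buf, offset)
        event_id = word >> 27           # first field sits in the highest 5 bits
        timestamp = word & 0x07FFFFFF   # remaining 27 bits
    if event_id == EXTENDED_ID:
        return event_id, None           # caller must parse the extended struct
    return event_id, timestamp
```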
-
-struct event_header_2 {
-  /*
-   * id: range: 0 - 65534.
-   * id 65535 is reserved to indicate an extended header.
-   */
-  enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
-  variant <id> {
-    struct {
-      uint32_t timestamp;
-    } compact;
-    struct {
-      uint32_t id;         /* 32-bit event IDs */
-      uint64_t timestamp;  /* 64-bit timestamps */
-    } extended;
-  } v;
-} align(16);    /* or align(8) */
-
-
-6.2 Event Context
-
-The event context contains information relative to the current event.
-The choice and meaning of this information is specified by the TSDL
-stream and event meta-data descriptions. The stream context is applied
-to all events within the stream. The stream context structure follows
-the event header. The event context is applied to specific events. Its
-structure follows the stream context structure.
-
-An example of stream-level event context is to save the event payload size with
-each event, or to save the current PID with each event. These are declared
-within the stream declaration within the meta-data:
-
-  stream {
-    ...
-    event.context := struct {
-      uint pid;
-      uint16_t payload_size;
-    };
-  };
-
-An example of event-specific event context is to declare a bitmap of missing
-fields, only appended after the stream event context if the extended event
-header is selected. NR_FIELDS is the number of fields within the event (a
-numeric value).
-
-  event {
-    context = struct {
-      variant {
-        struct { } compact;
-        struct {
-          uint1_t missing_fields[NR_FIELDS];  /* missing event fields bitmap */
-        } extended;
-      } v;
-    };
-    ...
-  }
-
-6.3 Event Payload
-
-An event payload contains fields specific to a given event type. The fields
-belonging to an event type are described in the event-specific meta-data
-within a structure type.
-
-6.3.1 Padding
-
-No padding at the end of the event payload. This differs from the ISO/C standard
-for structures, but follows the CTF standard for structures.
In a trace, even
-though it makes sense to align the beginning of a structure, it really makes no
-sense to add padding at the end of the structure, because structures are usually
-not followed by a structure of the same type.
-
-This trick can be done by adding a zero-length "end" field at the end of the C
-structures, and by using the offset of this field rather than using sizeof()
-when calculating the size of a structure (see Appendix "A. Helper macros").
-
-6.3.2 Alignment
-
-The event payload is aligned on the largest alignment required by types
-contained within the payload. (This follows the ISO/C standard for structures)
-
-
-7. Trace Stream Description Language (TSDL)
-
-The Trace Stream Description Language (TSDL) allows expression of the
-binary trace streams layout in a C99-like Domain Specific Language
-(DSL).
-
-
-7.1 Meta-data
-
-The trace stream layout description is located in the trace meta-data.
-The meta-data is itself located in a stream identified by its name:
-"metadata".
-
-The meta-data description can be expressed in two different formats:
-text-only and packet-based. The text-only description facilitates
-generation of meta-data and provides a convenient way to enter the
-meta-data information by hand. The packet-based meta-data provides the
-CTF stream packet facilities (checksumming, compression, encryption,
-network-readiness) for meta-data stream generated and transported by a
-tracer.
-
-The text-only meta-data file is a plain text TSDL description.
-
-The packet-based meta-data is made of "meta-data packets", which each
-start with a meta-data packet header. The packet-based meta-data
-description is detected by reading the magic number "0x75D11D57" at the
-beginning of the file. This magic number is also used to detect the
-endianness of the architecture by trying to read the CTF magic number
-and its counterpart in reversed endianness. The events within the
-meta-data stream have no event header nor event context.
Each event only -contains a "sequence" payload, which is a sequence of bits using the -"trace.packet.header.content_size" field as a placeholder for its length -(the packet header size should be subtracted). The formatting of this -sequence of bits is a plain-text representation of the TSDL description. -Each meta-data packet starts with a special packet header, specific to -the meta-data stream, which contains, exactly: - -struct metadata_packet_header { - uint32_t magic; /* 0x75D11D57 */ - uint8_t uuid[16]; /* Universally Unique Identifier */ - uint32_t checksum; /* 0 if unused */ - uint32_t content_size; /* in bits */ - uint32_t packet_size; /* in bits */ - uint8_t compression_scheme; /* 0 if unused */ - uint8_t encryption_scheme; /* 0 if unused */ - uint8_t checksum_scheme; /* 0 if unused */ -}; - -The packet-based meta-data can be converted to a text-only meta-data by -concatenating all the strings it contains. - -In the textual representation of the meta-data, the text contained -within "/*" and "*/", as well as within "//" and end of line, is -treated as comments. Boolean values can be represented as true, TRUE, -or 1 for true, and false, FALSE, or 0 for false. Within the string-based -meta-data description, the trace UUID is represented as a string of -hexadecimal digits and dashes "-". In the event packet header, the trace -UUID is represented as an array of bytes. - - -7.2 Declaration vs Definition - -A declaration associates a layout to a type, without specifying where -this type is located in the event structure hierarchy (see Section 6). -This therefore includes typedef, typealias, as well as all type -specifiers. In certain circumstances (typedef, structure field and -variant field), a declaration is followed by a declarator, which specifies -the newly defined type name (for typedef), or the field name (for -declarations located within structure and variants).
Arrays and sequences, -declared with square brackets ("[" "]"), are part of the declarator, -similarly to C99. The enumeration base type is specified by -": enum_base", which is part of the type specifier. The variant tag -name, specified between "<" ">", is also part of the type specifier. - -A definition associates a type to a location in the event structure -hierarchy (see Section 6). This association is denoted by ":=", as shown -in Section 7.3. - - -7.3 TSDL Scopes - -TSDL uses two different types of scoping: a lexical scope is used for -declarations and type definitions, and a dynamic scope is used for -variant references to tag fields and for sequence references to length -fields. - -7.3.1 Lexical Scope - -Each of "trace", "stream", "event", "struct" and "variant" has its own -nestable declaration scope, within which types can be declared using "typedef" -and "typealias". A root declaration scope also contains all declarations -located outside of any of the aforementioned declarations. An inner -declaration scope can refer to types declared within its container -lexical scope prior to the inner declaration scope. Redefinition of a -typedef or typealias is not valid, although hiding an upper scope -typedef or typealias is allowed within a sub-scope. - -7.3.2 Dynamic Scope - -A dynamic scope consists of the lexical scope augmented with the -implicit event structure definition hierarchy presented in Section 6. -The dynamic scope is used for variant tag and sequence length -definitions. It is used at definition time to look up the location of -the tag field associated with a variant, and to look up the location -of the length field associated with a sequence. - -Therefore, variants (or sequences) in lower levels in the dynamic scope -(e.g. event context) can refer to a tag (or length) field located in -upper levels (e.g. in the event header) by specifying, in this case, the -associated tag with .
This allows, for instance, the -event context to define a variant referring to the "id" field of the -event header as selector. - -The target dynamic scope must be specified explicitly when referring to -a field outside of the local static scope. The dynamic scope prefixes -are thus: - - - Trace Packet Header: , - - Stream Packet Context: , - - Event Header: , - - Stream Event Context: , - - Event Context: , - - Event Payload: . - -Multiple declarations of the same field name within a single scope is -not valid. It is however valid to re-use the same field name in -different scopes. There is no possible conflict, because the dynamic -scope must be specified when a variant refers to a tag field located in -a different dynamic scope. - -The information available in the dynamic scopes can be thought of as the -current tracing context. At trace production, information about the -current context is saved into the specified scope field levels. At trace -consumption, for each event, the current trace context is therefore -readable by accessing the upper dynamic scopes. - - -7.4 TSDL Examples - -The grammar representing the TSDL meta-data is presented in Appendix C. -TSDL Grammar. This section presents a rather lighter reading that -consists in examples of TSDL meta-data, with template values. - -The stream "id" can be left out if there is only one stream in the -trace. The event "id" field can be left out if there is only one event -in a stream. - -trace { - major = value; /* Trace format version */ - minor = value; - uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"; /* Trace UUID */ - byte_order = be OR le; /* Endianness (required) */ - packet.header := struct { - uint32_t magic; - uint8_t uuid[16]; - uint32_t stream_id; - }; -}; - -stream { - id = stream_id; - /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */ - event.header := event_header_1 OR event_header_2; - event.context := struct { - ... - }; - packet.context := struct { - ... 
- }; -}; - -event { - name = event_name; - id = value; /* Numeric identifier within the stream */ - stream_id = stream_id; - context := struct { - ... - }; - fields := struct { - ... - }; -}; - -/* More detail on types in section 4. Types */ - -/* - * Named types: - * - * Type declarations behave similarly to the C standard. - */ - -typedef aliased_type_specifiers new_type_declarators; - -/* e.g.: typedef struct example new_type_name[10]; */ - -/* - * typealias - * - * The "typealias" declaration can be used to give a name (including - * pointer declarator specifier) to a type. It should also be used to - * map basic C types (float, int, unsigned long, ...) to a CTF type. - * Typealias is a superset of "typedef": it also allows assignment of a - * simple variable identifier to a type. - */ - -typealias type_class { - ... -} := type_specifiers type_declarator; - -/* - * e.g.: - * typealias integer { - * size = 32; - * align = 32; - * signed = false; - * } := struct page *; - * - * typealias integer { - * size = 32; - * align = 32; - * signed = true; - * } := int; - */ - -struct name { - ... -}; - -variant name { - ... -}; - -enum name : integer_type { - ... -}; - - -/* - * Unnamed types, contained within compound type fields, typedef or typealias. - */ - -struct { - ... -} - -struct { - ... -} align(value) - -variant { - ... -} - -enum : integer_type { - ... -} - -typedef type new_type[length]; - -struct { - type field_name[length]; -} - -typedef type new_type[length_type]; - -struct { - type field_name[length_type]; -} - -integer { - ... -} - -floating_point { - ... -} - -struct { - integer_type field_name:size; /* GNU/C bitfield */ -} - -struct { - string field_name; -} - - -A. Helper macros - -The two following macros keep track of the size of a GNU/C structure without -padding at the end by placing HEADER_END as the last field. A one byte end field -is used for C90 compatibility (C99 flexible arrays could be used here). 
Note -that this does not affect the effective structure size, which should always be -calculated with the header_sizeof() helper. - -#define HEADER_END char end_field -#define header_sizeof(type) offsetof(typeof(type), end_field) - - -B. Stream Header Rationale - -An event stream is divided into contiguous event packets of variable size. These -subdivisions allow the trace analyzer to perform a fast binary search by time -within the stream (typically requiring only the event packet headers to be -indexed) without reading the whole stream. These subdivisions have a variable size to -eliminate the need to transfer the event packet padding when partially filled -event packets must be sent when streaming a trace for live viewing/analysis. -An event packet can contain a certain amount of padding at the end. Dividing -streams into event packets is also useful for network streaming over UDP and -flight recorder mode tracing (a whole event packet can be swapped out of the -buffer atomically for reading). - -The stream header is repeated at the beginning of each event packet to allow -flexibility in terms of: - - - streaming support, - - allowing arbitrary buffers to be discarded without making the trace - unreadable, - - allowing UDP packet loss handling by either dealing with missing event packets - or asking for re-transmission, - - transparent support for flight recorder mode, - - transparent support for crash dumps. - - -C. TSDL Grammar - -/* - * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar. - * - * Inspired by the C99 grammar: - * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A) - * and c++1x grammar (draft) - * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A) - * - * Specialized for CTF needs by including only constants and declarations from - * C99 (excluding function declarations), and by adding support for variants, - * sequences and CTF-specific specifiers.
Enumeration container types - * semantic is inspired from c++1x enum-base. - */ - -1) Lexical grammar - -1.1) Lexical elements - -token: - keyword - identifier - constant - string-literal - punctuator - -1.2) Keywords - -keyword: is one of - -align -const -char -double -enum -event -floating_point -float -integer -int -long -short -signed -stream -string -struct -trace -typealias -typedef -unsigned -variant -void -_Bool -_Complex -_Imaginary - - -1.3) Identifiers - -identifier: - identifier-nondigit - identifier identifier-nondigit - identifier digit - -identifier-nondigit: - nondigit - universal-character-name - any other implementation-defined characters - -nondigit: - _ - [a-zA-Z] /* regular expression */ - -digit: - [0-9] /* regular expression */ - -1.4) Universal character names - -universal-character-name: - \u hex-quad - \U hex-quad hex-quad - -hex-quad: - hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit - -1.5) Constants - -constant: - integer-constant - enumeration-constant - character-constant - -integer-constant: - decimal-constant integer-suffix-opt - octal-constant integer-suffix-opt - hexadecimal-constant integer-suffix-opt - -decimal-constant: - nonzero-digit - decimal-constant digit - -octal-constant: - 0 - octal-constant octal-digit - -hexadecimal-constant: - hexadecimal-prefix hexadecimal-digit - hexadecimal-constant hexadecimal-digit - -hexadecimal-prefix: - 0x - 0X - -nonzero-digit: - [1-9] - -integer-suffix: - unsigned-suffix long-suffix-opt - unsigned-suffix long-long-suffix - long-suffix unsigned-suffix-opt - long-long-suffix unsigned-suffix-opt - -unsigned-suffix: - u - U - -long-suffix: - l - L - -long-long-suffix: - ll - LL - -enumeration-constant: - identifier - string-literal - -character-constant: - ' c-char-sequence ' - L' c-char-sequence ' - -c-char-sequence: - c-char - c-char-sequence c-char - -c-char: - any member of source charset except single-quote ('), backslash - (\), or new-line character. 
- escape-sequence - -escape-sequence: - simple-escape-sequence - octal-escape-sequence - hexadecimal-escape-sequence - universal-character-name - -simple-escape-sequence: one of - \' \" \? \\ \a \b \f \n \r \t \v - -octal-escape-sequence: - \ octal-digit - \ octal-digit octal-digit - \ octal-digit octal-digit octal-digit - -hexadecimal-escape-sequence: - \x hexadecimal-digit - hexadecimal-escape-sequence hexadecimal-digit - -1.6) String literals - -string-literal: - " s-char-sequence-opt " - L" s-char-sequence-opt " - -s-char-sequence: - s-char - s-char-sequence s-char - -s-char: - any member of source charset except double-quote ("), backslash - (\), or new-line character. - escape-sequence - -1.7) Punctuators - -punctuator: one of - [ ] ( ) { } . -> * + - < > : ; ... = , - - -2) Phrase structure grammar - -primary-expression: - identifier - constant - string-literal - ( unary-expression ) - -postfix-expression: - primary-expression - postfix-expression [ unary-expression ] - postfix-expression . identifier - postfix-expressoin -> identifier - -unary-expression: - postfix-expression - unary-operator postfix-expression - -unary-operator: one of - + - - -assignment-operator: - = - -type-assignment-operator: - := - -constant-expression-range: - unary-expression ... 
unary-expression - -2.2) Declarations: - -declaration: - declaration-specifiers declarator-list-opt ; - ctf-specifier ; - -declaration-specifiers: - storage-class-specifier declaration-specifiers-opt - type-specifier declaration-specifiers-opt - type-qualifier declaration-specifiers-opt - -declarator-list: - declarator - declarator-list , declarator - -abstract-declarator-list: - abstract-declarator - abstract-declarator-list , abstract-declarator - -storage-class-specifier: - typedef - -type-specifier: - void - char - short - int - long - float - double - signed - unsigned - _Bool - _Complex - _Imaginary - struct-specifier - variant-specifier - enum-specifier - typedef-name - ctf-type-specifier - -align-attribute: - align ( unary-expression ) - -struct-specifier: - struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt - struct identifier align-attribute-opt - -struct-or-variant-declaration-list: - struct-or-variant-declaration - struct-or-variant-declaration-list struct-or-variant-declaration - -struct-or-variant-declaration: - specifier-qualifier-list struct-or-variant-declarator-list ; - declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ; - typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ; - typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ; - -specifier-qualifier-list: - type-specifier specifier-qualifier-list-opt - type-qualifier specifier-qualifier-list-opt - -struct-or-variant-declarator-list: - struct-or-variant-declarator - struct-or-variant-declarator-list , struct-or-variant-declarator - -struct-or-variant-declarator: - declarator - declarator-opt : unary-expression - -variant-specifier: - variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list } - variant identifier variant-tag - -variant-tag: - < identifier > - -enum-specifier: 
- enum identifier-opt { enumerator-list } - enum identifier-opt { enumerator-list , } - enum identifier - enum identifier-opt : declaration-specifiers { enumerator-list } - enum identifier-opt : declaration-specifiers { enumerator-list , } - -enumerator-list: - enumerator - enumerator-list , enumerator - -enumerator: - enumeration-constant - enumeration-constant assignment-operator unary-expression - enumeration-constant assignment-operator constant-expression-range - -type-qualifier: - const - -declarator: - pointer-opt direct-declarator - -direct-declarator: - identifier - ( declarator ) - direct-declarator [ unary-expression ] - -abstract-declarator: - pointer-opt direct-abstract-declarator - -direct-abstract-declarator: - identifier-opt - ( abstract-declarator ) - direct-abstract-declarator [ unary-expression ] - direct-abstract-declarator [ ] - -pointer: - * type-qualifier-list-opt - * type-qualifier-list-opt pointer - -type-qualifier-list: - type-qualifier - type-qualifier-list type-qualifier - -typedef-name: - identifier - -2.3) CTF-specific declarations - -ctf-specifier: - event { ctf-assignment-expression-list-opt } - stream { ctf-assignment-expression-list-opt } - trace { ctf-assignment-expression-list-opt } - typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list - typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list - -ctf-type-specifier: - floating_point { ctf-assignment-expression-list-opt } - integer { ctf-assignment-expression-list-opt } - string { ctf-assignment-expression-list-opt } - string - -ctf-assignment-expression-list: - ctf-assignment-expression ; - ctf-assignment-expression-list ctf-assignment-expression ; - -ctf-assignment-expression: - unary-expression assignment-operator unary-expression - unary-expression type-assignment-operator type-specifier - declaration-specifiers-opt storage-class-specifier 
declaration-specifiers-opt declarator-list - typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list - typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list diff --git a/common-trace-format-specification.txt b/common-trace-format-specification.txt new file mode 100644 index 0000000..86060da --- /dev/null +++ b/common-trace-format-specification.txt @@ -0,0 +1,1607 @@ +Common Trace Format (CTF) Specification (v1.7) + +Mathieu Desnoyers, EfficiOS Inc. + +The goal of the present document is to specify a trace format that suits the +needs of the embedded, telecom, high-performance and kernel communities. It is +based on the Common Trace Format Requirements (v1.4) document. It is designed to +allow traces to be natively generated by the Linux kernel, Linux user-space +applications written in C/C++, and hardware components. One major element of +CTF is the Trace Stream Description Language (TSDL) which flexibility +enables description of various binary trace stream layouts. + +The latest version of this document can be found at: + + git tree: git://git.efficios.com/ctf.git + gitweb: http://git.efficios.com/?p=ctf.git + +A reference implementation of a library to read and write this trace format is +being implemented within the BabelTrace project, a converter between trace +formats. The development tree is available at: + + git tree: git://git.efficios.com/babeltrace.git + gitweb: http://git.efficios.com/?p=babeltrace.git + +The CE Workgroup of the Linux Foundation, Ericsson, and EfficiOS have +sponsored this work. + + +Table of Contents + +1. Preliminary definitions +2. High-level representation of a trace +3. Event stream +4. 
Types + 4.1 Basic types + 4.1.1 Type inheritance + 4.1.2 Alignment + 4.1.3 Byte order + 4.1.4 Size + 4.1.5 Integers + 4.1.6 GNU/C bitfields + 4.1.7 Floating point + 4.1.8 Enumerations +4.2 Compound types + 4.2.1 Structures + 4.2.2 Variants (Discriminated/Tagged Unions) + 4.2.3 Arrays + 4.2.4 Sequences + 4.2.5 Strings +5. Event Packet Header + 5.1 Event Packet Header Description + 5.2 Event Packet Context Description +6. Event Structure + 6.1 Event Header + 6.1.1 Type 1 - Few event IDs + 6.1.2 Type 2 - Many event IDs + 6.2 Event Context + 6.3 Event Payload + 6.3.1 Padding + 6.3.2 Alignment +7. Trace Stream Description Language (TSDL) + 7.1 Meta-data + 7.2 Declaration vs Definition + 7.3 TSDL Scopes + 7.3.1 Lexical Scope + 7.3.2 Dynamic Scope + 7.4 TSDL Examples + + +1. Preliminary definitions + + - Event Trace: An ordered sequence of events. + - Event Stream: An ordered sequence of events, containing a subset of the + trace event types. + - Event Packet: A sequence of physically contiguous events within an event + stream. + - Event: This is the basic entry in a trace. (aka: a trace record). + - An event identifier (ID) relates to the class (a type) of event within + an event stream. + e.g. event: irq_entry. + - An event (or event record) relates to a specific instance of an event + class. + e.g. event: irq_entry, at time X, on CPU Y + - Source Architecture: Architecture writing the trace. + - Reader Architecture: Architecture reading the trace. + + +2. High-level representation of a trace + +A trace is divided into multiple event streams. Each event stream contains a +subset of the trace event types. + +The final output of the trace, after its generation and optional transport over +the network, is expected to be either on permanent or temporary storage in a +virtual file system. Because each event stream is appended to while a trace is +being recorded, each is associated with a separate file for output. 
Therefore, +a stored trace can be represented as a directory containing one file per stream. + +The meta-data description associated with the trace contains information on +trace event types expressed in the Trace Stream Description Language +(TSDL). This language describes: + +- Trace version. +- Types available. +- Per-trace event header description. +- Per-stream event header description. +- Per-stream event context description. +- Per-event: + - Event type to stream mapping. + - Event type to name mapping. + - Event type to ID mapping. + - Event context description. + - Event fields description. + + +3. Event stream + +An event stream can be divided into contiguous event packets of variable +size. An event packet can +contain a certain amount of padding at the end. The stream header is +repeated at the beginning of each event packet. The rationale for the +event stream design choices is explained in Appendix B. Stream Header +Rationale. + +The event stream header will therefore be referred to as the "event packet +header" throughout the rest of this document. + + +4. Types + +Types are organized as type classes. Each type class belongs to one of two +kinds of types: basic types or compound types. + +4.1 Basic types + +A basic type is a scalar type, as described in this section. It includes +integers, GNU/C bitfields, enumerations, and floating point values. + +4.1.1 Type inheritance + +Type specifications can be inherited to allow deriving types from a +type class. For example, see the uint32_t named type derived from the "integer" +type class below ("Integers" section). Types have a precise binary +representation in the trace. A type class has methods to read and write these +types, but must be derived into a type to be usable in an event field. + +4.1.2 Alignment + +We define "byte-packed" types as aligned on the byte size, namely 8-bit.
+We define "bit-packed" types as following on the next bit, as defined by the +"Integers" section. + +Each basic type must specify its alignment, in bits. Examples of +possible alignments are: bit-packed (align = 1), byte-packed (align = +8), or word-aligned (e.g. align = 32 or align = 64). The choice depends +on the architecture preference and compactness vs performance trade-offs +of the implementation. Architectures providing fast unaligned writes can +byte-pack basic types to save space, aligning each type on byte +boundaries (8-bit). Architectures with slow unaligned writes align types +on specific alignment values. If no specific alignment is declared for a +type, it is assumed to be bit-packed for integers whose size is not a multiple +of 8 bits and for gcc bitfields. All other basic types are byte-packed +by default. It is however recommended to always specify the alignment +explicitly. Alignment values must be powers of two. Compound types are +aligned as specified in their individual specification. + +TSDL meta-data attribute representation of a specific alignment: + + align = value; /* value in bits */ + +4.1.3 Byte order + +By default, the native endianness of the source architecture is used. +Byte order can be overridden for a basic type by specifying a "byte_order" +attribute. A typical use case is to specify the network byte order (big endian: +"be") to save data captured from the network into the trace without conversion. +If not specified, the byte order is native. + +TSDL meta-data representation: + + byte_order = native OR network OR be OR le; /* network and be are aliases */ + +4.1.4 Size + +Type size, in bits, for integers and floats is that returned by "sizeof()" in C +multiplied by CHAR_BIT. +We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed +to 8 bits for cross-endianness compatibility.
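As an illustration of the size rule above, a tracer written in C could derive the "size" attribute of a type from sizeof() and CHAR_BIT. This is only a minimal sketch: the TYPE_SIZE_BITS macro is a hypothetical helper introduced here, not something defined by this specification or by BabelTrace.

```c
#include <limits.h>
#include <stdint.h>

/* The specification requires 8-bit chars; refuse to build otherwise. */
#if CHAR_BIT != 8
#error "CTF requires CHAR_BIT to be 8"
#endif

/*
 * Hypothetical helper (an assumption of this sketch): size of a C type
 * in bits, computed as sizeof() multiplied by CHAR_BIT.
 */
#define TYPE_SIZE_BITS(type) (sizeof(type) * CHAR_BIT)
```

With this helper, TYPE_SIZE_BITS(uint32_t) yields the value that would be written as "size = 32;" in the TSDL meta-data.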
+ +TSDL meta-data representation: + + size = value; (value is in bits) + +4.1.5 Integers + +Signed integers are represented in two-complement. Integer alignment, +size, signedness and byte ordering are defined in the TSDL meta-data. +Integers aligned on byte size (8-bit) and with length multiple of byte +size (8-bit) correspond to the C99 standard integers. In addition, +integers with alignment and/or size that are _not_ a multiple of the +byte size are permitted; these correspond to the C99 standard bitfields, +with the added specification that the CTF integer bitfields have a fixed +binary representation. A MIT-licensed reference implementation of the +CTF portable bitfields is available at: + + http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h + +Binary representation of integers: + +- On little and big endian: + - Within a byte, high bits correspond to an integer high bits, and low bits + correspond to low bits. +- On little endian: + - Integer across multiple bytes are placed from the less significant to the + most significant. + - Consecutive integers are placed from lower bits to higher bits (even within + a byte). +- On big endian: + - Integer across multiple bytes are placed from the most significant to the + less significant. + - Consecutive integers are placed from higher bits to lower bits (even within + a byte). + +This binary representation is derived from the bitfield implementation in GCC +for little and big endian. However, contrary to what GCC does, integers can +cross units boundaries (no padding is required). Padding can be explicitly +added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed. + +TSDL meta-data representation: + + integer { + signed = true OR false; /* default false */ + byte_order = native OR network OR be OR le; /* default native */ + size = value; /* value in bits, no default */ + align = value; /* value in bits */ + /* based used for pretty-printing output, default: decimal. 
*/ + base = decimal OR dec OR OR d OR i OR u OR 10 OR hexadecimal OR hex OR x OR X OR p OR 16 + OR octal OR oct OR o OR 8 OR binary OR b OR 2; + /* character encoding, default: none */ + encoding = none or UTF8 or ASCII; + } + +Example of type inheritance (creation of a uint32_t named type): + +typealias integer { + size = 32; + signed = false; + align = 32; +} := uint32_t; + +Definition of a named 5-bit signed bitfield: + +typealias integer { + size = 5; + signed = true; + align = 1; +} := int5_t; + +The character encoding field can be used to specify that the integer +must be printed as a text character when read. e.g.: + +typealias integer { + size = 8; + align = 8; + signed = false; + encoding = UTF8; +} := utf_char; + + +4.1.6 GNU/C bitfields + +The GNU/C bitfields follow closely the integer representation, with a +particularity on alignment: if a bitfield cannot fit in the current unit, the +unit is padded and the bitfield starts at the following unit. The unit size is +defined by the size of the type "unit_type". + +TSDL meta-data representation: + + unit_type name:size; + +As an example, the following structure declared in C compiled by GCC: + +struct example { + short a:12; + short b:5; +}; + +The example structure is aligned on the largest element (short). The second +bitfield would be aligned on the next unit boundary, because it would not fit in +the current unit. + +4.1.7 Floating point + +The floating point values byte ordering is defined in the TSDL meta-data. + +Floating point values follow the IEEE 754-2008 standard interchange formats. +Description of the floating point values include the exponent and mantissa size +in bits. Some requirements are imposed on the floating point values: + +- FLT_RADIX must be 2. +- mant_dig is the number of digits represented in the mantissa. It is specified + by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and + LDBL_MANT_DIG as defined by . 
+- exp_dig is the number of digits represented in the exponent. Given that + mant_dig is one bit more than its actual size in bits (leading 1 is not + needed) and also given that the sign bit always takes one bit, exp_dig can be + specified as: + + - sizeof(float) * CHAR_BIT - FLT_MANT_DIG + - sizeof(double) * CHAR_BIT - DBL_MANT_DIG + - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG + +TSDL meta-data representation: + +floating_point { + exp_dig = value; + mant_dig = value; + byte_order = native OR network OR be OR le; + align = value; +} + +Example of type inheritance: + +typealias floating_point { + exp_dig = 8; /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */ + mant_dig = 24; /* FLT_MANT_DIG */ + byte_order = native; + align = 32; +} := float; + +TODO: define NaN, +inf, -inf behavior. + +Bit-packed, byte-packed or larger alignments can be used for floating +point values, similarly to integers. + +4.1.8 Enumerations + +Enumerations are a mapping between an integer type and a table of strings. The +numerical representation of the enumeration follows the integer type specified +by the meta-data. The enumeration mapping table is detailed in the enumeration +description within the meta-data. The mapping table maps inclusive value +ranges (or single values) to strings. Instead of being limited to simple +"value -> string" mappings, these enumerations map +"[ start_value ... end_value ] -> string", which map inclusive ranges of +values to strings. An enumeration from the C language can be represented in +this format by having the same start_value and end_value for each element, which +is in fact a range of size 1. This single-value range is supported without +repeating the start and end values with the value = string declaration. + +enum name : integer_type { + somestring = start_value1 ... end_value1, + "other string" = start_value2 ... end_value2, + yet_another_string, /* will be assigned to end_value2 + 1 */ + "some other string" = value, + ... 
+}; + +If the values are omitted, the enumeration starts at 0 and increments by 1 for +each entry: + +enum name : unsigned int { + ZERO, + ONE, + TWO, + TEN = 10, + ELEVEN, +}; + +Overlapping ranges within a single enumeration are implementation defined. + +A nameless enumeration can be declared as a field type or as part of a typedef: + +enum : integer_type { + ... +} + +Enumerations omitting the container type ": integer_type" use the "int" +type (for compatibility with C99). The "int" type must be previously +declared. E.g.: + +typealias integer { size = 32; align = 32; signed = true; } := int; + +enum { + ... +} + + +4.2 Compound types + +Compound types are aggregations of type declarations. Compound types include +structures, variants, arrays, sequences, and strings. + +4.2.1 Structures + +Structures are aligned on the largest alignment required by basic types +contained within the structure. (This follows the ISO/C standard for structures) + +TSDL meta-data representation of a named structure: + +struct name { + field_type field_name; + field_type field_name; + ... +}; + +Example: + +struct example { + integer { /* Nameless type */ + size = 16; + signed = true; + align = 16; + } first_field_name; + uint64_t second_field_name; /* Named type declared in the meta-data */ +}; + +The fields are placed in a sequence next to each other. They each possess a +field name, which is a unique identifier within the structure. + +A nameless structure can be declared as a field type or as part of a typedef: + +struct { + ... +} + +Alignment for a structure compound type can be forced to a minimum value +by adding an "align" specifier after the declaration of a structure +body. This attribute is read as: align(value). The value is specified in +bits. The structure will be aligned on the maximum value between this +attribute and the alignment required by the basic types contained within +the structure. e.g. + +struct { + ...
+} align(32) + +4.2.2 Variants (Discriminated/Tagged Unions) + +A CTF variant is a selection between different types. A CTF variant must +always be defined within the scope of a structure or within fields +contained within a structure (defined recursively). A "tag" enumeration +field must appear in either the same lexical scope, prior to the variant +field (in field declaration order), in an upper lexical scope (see +Section 7.3.1), or in an upper dynamic scope (see Section 7.3.2). The +type selection is indicated by the mapping from the enumeration value to +the string used as variant type selector. The field to use as tag is +the "tag_field", specified between "< >" after the +"variant" keyword for unnamed variants, and after "variant name" for +named variants. + +The alignment of the variant is the alignment of the type as selected by the tag +value for the specific instance of the variant. The alignment of the type +containing the variant is independent of the variant alignment. The size of the +variant is the size as selected by the tag value for the specific instance of +the variant. + +A named variant declaration followed by its definition within a structure +declaration: + +variant name { + field_type sel1; + field_type sel2; + field_type sel3; + ... +}; + +struct { + enum : integer_type { sel1, sel2, sel3, ... } tag_field; + ... + variant name <tag_field> v; +} + +An unnamed variant definition within a structure is expressed by the following +TSDL meta-data: + +struct { + enum : integer_type { sel1, sel2, sel3, ... } tag_field; + ... + variant <tag_field> { + field_type sel1; + field_type sel2; + field_type sel3; + ...
+ } v; +} + +Example of a named variant within a sequence that refers to a single tag field: + +variant example { + uint32_t a; + uint64_t b; + short c; +}; + +struct { + enum : uint2_t { a, b, c } choice; + unsigned int seqlen; + variant example <choice> v[seqlen]; +} + +Example of an unnamed variant: + +struct { + enum : uint2_t { a, b, c, d } choice; + /* Unrelated fields can be added between the variant and its tag */ + int32_t somevalue; + variant <choice> { + uint32_t a; + uint64_t b; + short c; + struct { + unsigned int field1; + uint64_t field2; + } d; + } s; +} + +Example of an unnamed variant within an array: + +struct { + enum : uint2_t { a, b, c } choice; + variant <choice> { + uint32_t a; + uint64_t b; + short c; + } v[10]; +} + +Example of a variant type definition within a structure, where the defined type +is then declared within an array of structures. This variant refers to a tag +located in an upper lexical scope. This example shows that a variant +type definition referring to the tag "x" uses the closest preceding field from +the lexical scope of the type definition. + +struct { + enum : uint2_t { a, b, c, d } x; + + typedef variant <x> { /* + * "x" refers to the preceding "x" enumeration in the + * lexical scope of the type definition. + */ + uint32_t a; + uint64_t b; + short c; + } example_variant; + + struct { + enum : int { x, y, z } x; /* This enumeration is not used by "v". */ + example_variant v; /* + * "v" uses the "enum : uint2_t { a, b, c, d }" + * tag. + */ + } a[10]; +} + +4.2.3 Arrays + +Arrays are fixed-length. Their length is declared in the type +declaration within the meta-data. They contain an array of "inner type" +elements, which can refer to any type not containing the type of the +array being declared (no circular dependency). The length is the number +of elements in an array.
+ +TSDL meta-data representation of a named array: + +typedef elem_type name[length]; + +A nameless array can be declared as a field type within a structure, e.g.: + + uint8_t field_name[10]; + +Arrays are always aligned on their element alignment requirement. + +4.2.4 Sequences + +Sequences are dynamically-sized arrays. They refer to a "length" +unsigned integer field, which must appear in either the same lexical scope, +prior to the sequence field (in field declaration order), in an upper +lexical scope (see Section 7.3.1), or in an upper dynamic scope (see +Section 7.3.2). This length field represents the number of elements in +the sequence. The sequence per se is an array of "inner type" elements. + +TSDL meta-data representation for a sequence type definition: + +struct { + unsigned int length_field; + typedef elem_type typename[length_field]; + typename seq_field_name; +} + +A sequence can also be declared as a field type, e.g.: + +struct { + unsigned int length_field; + long seq_field_name[length_field]; +} + +Multiple sequences can refer to the same length field, and these length +fields can be in a different upper dynamic scope: + +e.g., assuming the stream.event.header defines: + +stream { + ... + id = 1; + event.header := struct { + uint16_t seq_len; + }; +}; + +event { + ... + stream_id = 1; + fields := struct { + long seq_a[stream.event.header.seq_len]; + char seq_b[stream.event.header.seq_len]; + }; +}; + +The sequence elements follow the "array" specifications. + +4.2.5 Strings + +Strings are arrays of bytes of variable size and are terminated by a '\0' +"NULL" character. Their encoding is described in the TSDL meta-data. In +the absence of encoding attribute information, the default encoding is +UTF-8.
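As an illustration of the string layout just described, a reader could size a string field by scanning for its terminator. This is a minimal sketch rather than part of the format: the function name ctf_string_field_size and the convention of returning 0 for an unterminated string are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not defined by CTF): return the number of bytes a
 * string field occupies at buf, including its terminating '\0', or 0 if
 * no terminator is found within the avail bytes that remain in the packet.
 */
static size_t ctf_string_field_size(const uint8_t *buf, size_t avail)
{
	const uint8_t *end = memchr(buf, '\0', avail);

	return end ? (size_t)(end - buf) + 1 : 0;
}
```

Bounding the scan by the remaining packet content keeps a truncated or corrupted packet from sending the reader past the end of its buffer.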
+ +TSDL meta-data representation of a named string type: + +typealias string { + encoding = UTF8 OR ASCII; +} := name; + +A nameless string type can be declared as a field type: + +string field_name; /* Use default UTF8 encoding */ + +Strings are always aligned on byte size. + +5. Event Packet Header + +The event packet header consists of two parts. The first part, the "event +packet header", is the same for all streams of a trace. The second part, the "event +packet context", is described on a per-stream basis. Both are described +in the TSDL meta-data. The packets are aligned on architecture-page-sized +addresses. + +Event packet header (all fields are optional, specified by TSDL meta-data): + +- Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a + CTF packet. This magic number is optional, but when present, it should + come at the very beginning of the packet. +- Trace UUID, used to ensure the event packet matches the meta-data used. + (note: we cannot use a meta-data checksum in every case instead of a + UUID because meta-data can be appended to while tracing is active) + This field is optional. +- Stream ID, used as reference to stream description in meta-data. + This field is optional if there is only one stream description in the + meta-data, but becomes required if there is more than one stream in + the TSDL meta-data description. + +Event packet context (all fields are optional, specified by TSDL meta-data): + +- Event packet content size (in bytes). +- Event packet size (in bytes, includes padding). +- Event packet content checksum (optional). Checksum excludes the event packet + header. +- Per-stream event packet sequence count (to deal with UDP packet loss). The + number of significant sequence counter bits should also be present, so + wrap-arounds are dealt with correctly. +- Time-stamp at the beginning and time-stamp at the end of the event packet.
+ Both timestamps are written in the packet header, but sampled respectively + while (or before) writing the first event and while (or after) writing the + last event in the packet. The inclusive range between these timestamps should + include all event timestamps assigned to events contained within the packet. +- Events discarded count + - Snapshot of a per-stream free-running counter, counting the number of + events discarded that were supposed to be written in the stream prior to + the first event in the event packet. + * Note: a producer-consumer buffer full condition should fill the current + event packet with padding so we know exactly where events have been + discarded. +- Lossless compression scheme used for the event packet content. Applied + directly to raw data. New types of compression can be added in following + versions of the format. + 0: no compression scheme + 1: bzip2 + 2: gzip + 3: xz +- Cipher used for the event packet content. Applied after compression. + 0: no encryption + 1: AES +- Checksum scheme used for the event packet content. Applied after encryption. + 0: no checksum + 1: md5 + 2: sha1 + 3: crc32 + +5.1 Event Packet Header Description + +The event packet header layout is indicated by the trace packet.header +field. Here is a recommended structure type for the packet header with +the fields typically expected (although these fields are each optional): + +struct event_packet_header { + uint32_t magic; + uint8_t uuid[16]; + uint32_t stream_id; +}; + +trace { + ... + packet.header := struct event_packet_header; +}; + +If the magic number is not present, tools such as "file" will have no +means to discover the file type. + +If the uuid is not present, no validation that the meta-data actually +corresponds to the stream is performed. + +If the stream_id packet header field is missing, the trace can only +contain a single stream. Its "id" field can be left out, and its events +don't need to declare a "stream_id" field.
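The checks implied by the recommended packet header can be sketched in C as follows. This is only an illustration under stated assumptions: it assumes all three fields are present, that uint32_t is 32-bit aligned and uint8_t is byte-aligned (a 24-byte header), and that the caller already knows the trace byte order from the meta-data. The names packet_header, read_u32 and parse_packet_header, as well as the error codes, are hypothetical and not defined by CTF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CTF_MAGIC 0xC1FC1FC1u

/* Hypothetical in-memory form of the recommended packet header. */
struct packet_header {
	uint32_t magic;
	uint8_t uuid[16];
	uint32_t stream_id;
};

/* Read a 32-bit unsigned integer stored in the given byte order. */
static uint32_t read_u32(const uint8_t *p, bool big_endian)
{
	if (big_endian)
		return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/*
 * Decode a 24-byte packet header (magic, uuid[16], stream_id) and compare
 * its UUID with the one found in the meta-data. Returns 0 on success, -1 if
 * the buffer is too short, -2 on a bad magic number, -3 on a UUID mismatch.
 */
static int parse_packet_header(const uint8_t *buf, size_t len, bool big_endian,
			       const uint8_t expected_uuid[16],
			       struct packet_header *out)
{
	if (len < 24)
		return -1;
	out->magic = read_u32(buf, big_endian);
	if (out->magic != CTF_MAGIC)
		return -2;
	memcpy(out->uuid, buf + 4, 16);
	if (memcmp(out->uuid, expected_uuid, 16) != 0)
		return -3;
	out->stream_id = read_u32(buf + 20, big_endian);
	return 0;
}
```

A reader would typically call such a function once per packet and use the returned stream_id to select the matching stream description.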
+ + +5.2 Event Packet Context Description + +The event packet context fields are declared within the stream declaration +in the meta-data. All these fields are optional. If the packet size field is +missing, the whole stream only contains a single packet. If the content +size field is missing, the packet is filled (no padding). The content +and packet sizes include all headers. + +An example event packet context type: + +struct event_packet_context { + uint64_t timestamp_begin; + uint64_t timestamp_end; + uint32_t checksum; + uint32_t stream_packet_count; + uint32_t events_discarded; + uint32_t cpu_id; + uint32_t/uint16_t content_size; + uint32_t/uint16_t packet_size; + uint8_t stream_packet_count_bits; /* Significant counter bits */ + uint8_t compression_scheme; + uint8_t encryption_scheme; + uint8_t checksum_scheme; +}; + + +6. Event Structure + +The overall structure of an event is: + +1 - Stream Packet Context (as specified by the stream meta-data) + 2 - Event Header (as specified by the stream meta-data) + 3 - Stream Event Context (as specified by the stream meta-data) + 4 - Event Context (as specified by the event meta-data) + 5 - Event Payload (as specified by the event meta-data) + +This structure defines an implicit dynamic scoping, where variants +located in inner structures (those with a higher number in the listing +above) can refer to the fields of outer structures (with a lower number in +the listing above). See Section 7.3 TSDL Scopes for more detail. + +6.1 Event Header + +Event headers can be described within the meta-data. We hereby propose, as an +example, two types of event headers. Type 1 accommodates streams with fewer than +31 event IDs. Type 2 accommodates streams with 31 or more event IDs. + +One major factor can vary between streams: the number of event IDs assigned to +a stream.
Luckily, this information tends to stay relatively constant (modulo +event registration while the trace is being recorded), so we can specify different +representations for streams containing few event IDs and streams containing +many event IDs, so we end up representing the event ID and time-stamp as +densely as possible in each case. + +The header is extended on the rare occasions where the information cannot be +represented in the ranges available in the standard event header. Extended headers are also +used on the rare occasions where the data required for a field could not be +collected: the flag corresponding to the missing field within the missing_fields +array is then set to 1. + +Types uintX_t represent an X-bit unsigned integer, as declared with +either: + + typealias integer { size = X; align = X; signed = false } := uintX_t; + + or + + typealias integer { size = X; align = 1; signed = false } := uintX_t; + +6.1.1 Type 1 - Few event IDs + + - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture + preference). + - Native architecture byte ordering. + - For "compact" selection + - Fixed size: 32 bits. + - For "extended" selection + - Size depends on the architecture and variant alignment. + +struct event_header_1 { + /* + * id: range: 0 - 30. + * id 31 is reserved to indicate an extended header. + */ + enum : uint5_t { compact = 0 ... 30, extended = 31 } id; + variant <id> { + struct { + uint27_t timestamp; + } compact; + struct { + uint32_t id; /* 32-bit event IDs */ + uint64_t timestamp; /* 64-bit timestamps */ + } extended; + } v; +} align(32); /* or align(8) */ + + +6.1.2 Type 2 - Many event IDs + + - Aligned on 16-bit (or 8-bit if byte-packed, depending on the architecture + preference). + - Native architecture byte ordering. + - For "compact" selection + - Size depends on the architecture and variant alignment. + - For "extended" selection + - Size depends on the architecture and variant alignment.
+ +struct event_header_2 { + /* + * id: range: 0 - 65534. + * id 65535 is reserved to indicate an extended header. + */ + enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id; + variant <id> { + struct { + uint32_t timestamp; + } compact; + struct { + uint32_t id; /* 32-bit event IDs */ + uint64_t timestamp; /* 64-bit timestamps */ + } extended; + } v; +} align(16); /* or align(8) */ + + +6.2 Event Context + +The event context contains information relative to the current event. +The choice and meaning of this information are specified by the TSDL +stream and event meta-data descriptions. The stream context is applied +to all events within the stream. The stream context structure follows +the event header. The event context is applied to specific events. Its +structure follows the stream context structure. + +An example of stream-level event context is to save the event payload size with +each event, or to save the current PID with each event. These are declared +within the stream declaration within the meta-data: + + stream { + ... + event.context := struct { + uint pid; + uint16_t payload_size; + }; + }; + +An example of event-specific event context is to declare a bitmap of missing +fields, only appended after the stream event context if the extended event +header is selected. NR_FIELDS is the number of fields within the event (a +numeric value). + + event { + context := struct { + variant { + struct { } compact; + struct { + uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */ + } extended; + } v; + }; + ... + } + +6.3 Event Payload + +An event payload contains fields specific to a given event type. The fields +belonging to an event type are described in the event-specific meta-data +within a structure type. + +6.3.1 Padding + +No padding is added at the end of the event payload. This differs from the ISO/C standard +for structures, but follows the CTF standard for structures.
In a trace, even +though it makes sense to align the beginning of a structure, it really makes no +sense to add padding at the end of the structure, because structures are usually +not followed by a structure of the same type. + +This trick can be done by adding a zero-length "end" field at the end of the C +structures, and by using the offset of this field rather than using sizeof() +when calculating the size of a structure (see Appendix "A. Helper macros"). + +6.3.2 Alignment + +The event payload is aligned on the largest alignment required by types +contained within the payload. (This follows the ISO/C standard for structures) + + +7. Trace Stream Description Language (TSDL) + +The Trace Stream Description Language (TSDL) allows expression of the +binary trace streams layout in a C99-like Domain Specific Language +(DSL). + + +7.1 Meta-data + +The trace stream layout description is located in the trace meta-data. +The meta-data is itself located in a stream identified by its name: +"metadata". + +The meta-data description can be expressed in two different formats: +text-only and packet-based. The text-only description facilitates +generation of meta-data and provides a convenient way to enter the +meta-data information by hand. The packet-based meta-data provides the +CTF stream packet facilities (checksumming, compression, encryption, +network-readiness) for meta-data stream generated and transported by a +tracer. + +The text-only meta-data file is a plain text TSDL description. + +The packet-based meta-data is made of "meta-data packets", which each +start with a meta-data packet header. The packet-based meta-data +description is detected by reading the magic number "0x75D11D57" at the +beginning of the file. This magic number is also used to detect the +endianness of the architecture by trying to read the CTF magic number +and its counterpart in reversed endianness. The events within the +meta-data stream have no event header nor event context. 
Each event only +contains a "sequence" payload, which is a sequence of bits using the +"trace.packet.header.content_size" field as a placeholder for its length +(the packet header size should be subtracted). The formatting of this +sequence of bits is a plain-text representation of the TSDL description. +Each meta-data packet starts with a special packet header, specific to +the meta-data stream, which contains, exactly: + +struct metadata_packet_header { + uint32_t magic; /* 0x75D11D57 */ + uint8_t uuid[16]; /* Universally Unique Identifier */ + uint32_t checksum; /* 0 if unused */ + uint32_t content_size; /* in bits */ + uint32_t packet_size; /* in bits */ + uint8_t compression_scheme; /* 0 if unused */ + uint8_t encryption_scheme; /* 0 if unused */ + uint8_t checksum_scheme; /* 0 if unused */ +}; + +The packet-based meta-data can be converted to a text-only meta-data by +concatenating all the strings it contains. + +In the textual representation of the meta-data, the text contained +within "/*" and "*/", as well as within "//" and end of line, is +treated as a comment. Boolean values can be represented as true, TRUE, +or 1 for true, and false, FALSE, or 0 for false. Within the string-based +meta-data description, the trace UUID is represented as a string of +hexadecimal digits and dashes "-". In the event packet header, the trace +UUID is represented as an array of bytes. + + +7.2 Declaration vs Definition + +A declaration associates a layout to a type, without specifying where +this type is located in the event structure hierarchy (see Section 6). +This therefore includes typedef, typealias, as well as all type +specifiers. In certain circumstances (typedef, structure field and +variant field), a declaration is followed by a declarator, which specifies +the newly defined type name (for typedef), or the field name (for +declarations located within structures and variants).
Arrays and sequences, +declared with square brackets ("[" "]"), are part of the declarator, +similarly to C99. The enumeration base type is specified by +": enum_base", which is part of the type specifier. The variant tag +name, specified between "<" ">", is also part of the type specifier. + +A definition associates a type to a location in the event structure +hierarchy (see Section 6). This association is denoted by ":=", as shown +in Section 7.3. + + +7.3 TSDL Scopes + +TSDL uses two different types of scoping: a lexical scope is used for +declarations and type definitions, and a dynamic scope is used for +variant references to tag fields and for sequence references to length +fields. + +7.3.1 Lexical Scope + +Each of "trace", "stream", "event", "struct" and "variant" has its own +nestable declaration scope, within which types can be declared using "typedef" +and "typealias". A root declaration scope also contains all declarations +located outside of any of the aforementioned declarations. An inner +declaration scope can refer to types declared within its container +lexical scope prior to the inner declaration scope. Redefinition of a +typedef or typealias is not valid, although hiding an upper scope +typedef or typealias is allowed within a sub-scope. + +7.3.2 Dynamic Scope + +A dynamic scope consists of the lexical scope augmented with the +implicit event structure definition hierarchy presented in Section 6. +The dynamic scope is used for variant tag and sequence length +definitions. It is used at definition time to look up the location of +the tag field associated with a variant, and to look up the location +of the length field associated with a sequence. + +Therefore, variants (or sequences) in lower levels in the dynamic scope +(e.g. event context) can refer to a tag (or length) field located in +upper levels (e.g. in the event header) by specifying, in this case, the +associated tag with <stream.event.header.field_name>.
This allows, for instance, the +event context to define a variant referring to the "id" field of the +event header as selector. + +The target dynamic scope must be specified explicitly when referring to +a field outside of the local static scope. The dynamic scope prefixes +are thus: + + - Trace Packet Header: <trace.packet.header. >, + - Stream Packet Context: <stream.packet.context. >, + - Event Header: <stream.event.header. >, + - Stream Event Context: <stream.event.context. >, + - Event Context: <event.context. >, + - Event Payload: <event.fields. >. + +Multiple declarations of the same field name within a single scope are +not valid. It is however valid to re-use the same field name in +different scopes. There is no possible conflict, because the dynamic +scope must be specified when a variant refers to a tag field located in +a different dynamic scope. + +The information available in the dynamic scopes can be thought of as the +current tracing context. At trace production, information about the +current context is saved into the specified scope field levels. At trace +consumption, for each event, the current trace context is therefore +readable by accessing the upper dynamic scopes. + + +7.4 TSDL Examples + +The grammar representing the TSDL meta-data is presented in Appendix C. +TSDL Grammar. This section offers lighter reading, +consisting of examples of TSDL meta-data, with template values. + +The stream "id" can be left out if there is only one stream in the +trace. The event "id" field can be left out if there is only one event +in a stream. + +trace { + major = value; /* Trace format version */ + minor = value; + uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"; /* Trace UUID */ + byte_order = be OR le; /* Endianness (required) */ + packet.header := struct { + uint32_t magic; + uint8_t uuid[16]; + uint32_t stream_id; + }; +}; + +stream { + id = stream_id; + /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */ + event.header := event_header_1 OR event_header_2; + event.context := struct { + ... + }; + packet.context := struct { + ...
+ }; +}; + +event { + name = event_name; + id = value; /* Numeric identifier within the stream */ + stream_id = stream_id; + context := struct { + ... + }; + fields := struct { + ... + }; +}; + +/* More detail on types in section 4. Types */ + +/* + * Named types: + * + * Type declarations behave similarly to the C standard. + */ + +typedef aliased_type_specifiers new_type_declarators; + +/* e.g.: typedef struct example new_type_name[10]; */ + +/* + * typealias + * + * The "typealias" declaration can be used to give a name (including + * pointer declarator specifier) to a type. It should also be used to + * map basic C types (float, int, unsigned long, ...) to a CTF type. + * Typealias is a superset of "typedef": it also allows assignment of a + * simple variable identifier to a type. + */ + +typealias type_class { + ... +} := type_specifiers type_declarator; + +/* + * e.g.: + * typealias integer { + * size = 32; + * align = 32; + * signed = false; + * } := struct page *; + * + * typealias integer { + * size = 32; + * align = 32; + * signed = true; + * } := int; + */ + +struct name { + ... +}; + +variant name { + ... +}; + +enum name : integer_type { + ... +}; + + +/* + * Unnamed types, contained within compound type fields, typedef or typealias. + */ + +struct { + ... +} + +struct { + ... +} align(value) + +variant { + ... +} + +enum : integer_type { + ... +} + +typedef type new_type[length]; + +struct { + type field_name[length]; +} + +typedef type new_type[length_type]; + +struct { + type field_name[length_type]; +} + +integer { + ... +} + +floating_point { + ... +} + +struct { + integer_type field_name:size; /* GNU/C bitfield */ +} + +struct { + string field_name; +} + + +A. Helper macros + +The two following macros keep track of the size of a GNU/C structure without +padding at the end by placing HEADER_END as the last field. A one byte end field +is used for C90 compatibility (C99 flexible arrays could be used here). 
Note +that this does not affect the effective structure size, which should always be +calculated with the header_sizeof() helper. + +#define HEADER_END char end_field +#define header_sizeof(type) offsetof(typeof(type), end_field) + + +B. Stream Header Rationale + +An event stream is divided in contiguous event packets of variable size. These +subdivisions allow the trace analyzer to perform a fast binary search by time +within the stream (typically requiring to index only the event packet headers) +without reading the whole stream. These subdivisions have a variable size to +eliminate the need to transfer the event packet padding when partially filled +event packets must be sent when streaming a trace for live viewing/analysis. +An event packet can contain a certain amount of padding at the end. Dividing +streams into event packets is also useful for network streaming over UDP and +flight recorder mode tracing (a whole event packet can be swapped out of the +buffer atomically for reading). + +The stream header is repeated at the beginning of each event packet to allow +flexibility in terms of: + + - streaming support, + - allowing arbitrary buffers to be discarded without making the trace + unreadable, + - allow UDP packet loss handling by either dealing with missing event packet + or asking for re-transmission. + - transparently support flight recorder mode, + - transparently support crash dump. + + +C. TSDL Grammar + +/* + * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar. + * + * Inspired from the C99 grammar: + * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A) + * and c++1x grammar (draft) + * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A) + * + * Specialized for CTF needs by including only constant and declarations from + * C99 (excluding function declarations), and by adding support for variants, + * sequences and CTF-specific specifiers. 
Enumeration container types + * semantic is inspired from c++1x enum-base. + */ + +1) Lexical grammar + +1.1) Lexical elements + +token: + keyword + identifier + constant + string-literal + punctuator + +1.2) Keywords + +keyword: is one of + +align +const +char +double +enum +event +floating_point +float +integer +int +long +short +signed +stream +string +struct +trace +typealias +typedef +unsigned +variant +void +_Bool +_Complex +_Imaginary + + +1.3) Identifiers + +identifier: + identifier-nondigit + identifier identifier-nondigit + identifier digit + +identifier-nondigit: + nondigit + universal-character-name + any other implementation-defined characters + +nondigit: + _ + [a-zA-Z] /* regular expression */ + +digit: + [0-9] /* regular expression */ + +1.4) Universal character names + +universal-character-name: + \u hex-quad + \U hex-quad hex-quad + +hex-quad: + hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit + +1.5) Constants + +constant: + integer-constant + enumeration-constant + character-constant + +integer-constant: + decimal-constant integer-suffix-opt + octal-constant integer-suffix-opt + hexadecimal-constant integer-suffix-opt + +decimal-constant: + nonzero-digit + decimal-constant digit + +octal-constant: + 0 + octal-constant octal-digit + +hexadecimal-constant: + hexadecimal-prefix hexadecimal-digit + hexadecimal-constant hexadecimal-digit + +hexadecimal-prefix: + 0x + 0X + +nonzero-digit: + [1-9] + +integer-suffix: + unsigned-suffix long-suffix-opt + unsigned-suffix long-long-suffix + long-suffix unsigned-suffix-opt + long-long-suffix unsigned-suffix-opt + +unsigned-suffix: + u + U + +long-suffix: + l + L + +long-long-suffix: + ll + LL + +enumeration-constant: + identifier + string-literal + +character-constant: + ' c-char-sequence ' + L' c-char-sequence ' + +c-char-sequence: + c-char + c-char-sequence c-char + +c-char: + any member of source charset except single-quote ('), backslash + (\), or new-line character. 
+ escape-sequence + +escape-sequence: + simple-escape-sequence + octal-escape-sequence + hexadecimal-escape-sequence + universal-character-name + +simple-escape-sequence: one of + \' \" \? \\ \a \b \f \n \r \t \v + +octal-escape-sequence: + \ octal-digit + \ octal-digit octal-digit + \ octal-digit octal-digit octal-digit + +hexadecimal-escape-sequence: + \x hexadecimal-digit + hexadecimal-escape-sequence hexadecimal-digit + +1.6) String literals + +string-literal: + " s-char-sequence-opt " + L" s-char-sequence-opt " + +s-char-sequence: + s-char + s-char-sequence s-char + +s-char: + any member of source charset except double-quote ("), backslash + (\), or new-line character. + escape-sequence + +1.7) Punctuators + +punctuator: one of + [ ] ( ) { } . -> * + - < > : ; ... = , + + +2) Phrase structure grammar + +primary-expression: + identifier + constant + string-literal + ( unary-expression ) + +postfix-expression: + primary-expression + postfix-expression [ unary-expression ] + postfix-expression . identifier + postfix-expression -> identifier + +unary-expression: + postfix-expression + unary-operator postfix-expression + +unary-operator: one of + + - + +assignment-operator: + = + +type-assignment-operator: + := + +constant-expression-range: + unary-expression ...
unary-expression + +2.2) Declarations: + +declaration: + declaration-specifiers declarator-list-opt ; + ctf-specifier ; + +declaration-specifiers: + storage-class-specifier declaration-specifiers-opt + type-specifier declaration-specifiers-opt + type-qualifier declaration-specifiers-opt + +declarator-list: + declarator + declarator-list , declarator + +abstract-declarator-list: + abstract-declarator + abstract-declarator-list , abstract-declarator + +storage-class-specifier: + typedef + +type-specifier: + void + char + short + int + long + float + double + signed + unsigned + _Bool + _Complex + _Imaginary + struct-specifier + variant-specifier + enum-specifier + typedef-name + ctf-type-specifier + +align-attribute: + align ( unary-expression ) + +struct-specifier: + struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt + struct identifier align-attribute-opt + +struct-or-variant-declaration-list: + struct-or-variant-declaration + struct-or-variant-declaration-list struct-or-variant-declaration + +struct-or-variant-declaration: + specifier-qualifier-list struct-or-variant-declarator-list ; + declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ; + typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ; + typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ; + +specifier-qualifier-list: + type-specifier specifier-qualifier-list-opt + type-qualifier specifier-qualifier-list-opt + +struct-or-variant-declarator-list: + struct-or-variant-declarator + struct-or-variant-declarator-list , struct-or-variant-declarator + +struct-or-variant-declarator: + declarator + declarator-opt : unary-expression + +variant-specifier: + variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list } + variant identifier variant-tag + +variant-tag: + < identifier > + +enum-specifier: 
+ enum identifier-opt { enumerator-list } + enum identifier-opt { enumerator-list , } + enum identifier + enum identifier-opt : declaration-specifiers { enumerator-list } + enum identifier-opt : declaration-specifiers { enumerator-list , } + +enumerator-list: + enumerator + enumerator-list , enumerator + +enumerator: + enumeration-constant + enumeration-constant assignment-operator unary-expression + enumeration-constant assignment-operator constant-expression-range + +type-qualifier: + const + +declarator: + pointer-opt direct-declarator + +direct-declarator: + identifier + ( declarator ) + direct-declarator [ unary-expression ] + +abstract-declarator: + pointer-opt direct-abstract-declarator + +direct-abstract-declarator: + identifier-opt + ( abstract-declarator ) + direct-abstract-declarator [ unary-expression ] + direct-abstract-declarator [ ] + +pointer: + * type-qualifier-list-opt + * type-qualifier-list-opt pointer + +type-qualifier-list: + type-qualifier + type-qualifier-list type-qualifier + +typedef-name: + identifier + +2.3) CTF-specific declarations + +ctf-specifier: + event { ctf-assignment-expression-list-opt } + stream { ctf-assignment-expression-list-opt } + trace { ctf-assignment-expression-list-opt } + typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list + typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list + +ctf-type-specifier: + floating_point { ctf-assignment-expression-list-opt } + integer { ctf-assignment-expression-list-opt } + string { ctf-assignment-expression-list-opt } + string + +ctf-assignment-expression-list: + ctf-assignment-expression ; + ctf-assignment-expression-list ctf-assignment-expression ; + +ctf-assignment-expression: + unary-expression assignment-operator unary-expression + unary-expression type-assignment-operator type-specifier + declaration-specifiers-opt storage-class-specifier 
declaration-specifiers-opt declarator-list + typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list + typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list