X-Git-Url: http://git.efficios.com/?p=ctf.git;a=blobdiff_plain;f=common-trace-format-proposal.txt;h=e8ee258137ce7e0b97fc58529736570dc2964b20;hp=4cb69429e9c0256ee8d46d261e34f924fe747ce4;hb=3fde5da106cc1adbb564ceaf283701d796355ede;hpb=457d8b0a0cff67a7489b7eead5b8b26701b430f8 diff --git a/common-trace-format-proposal.txt b/common-trace-format-proposal.txt index 4cb6942..e8ee258 100644 --- a/common-trace-format-proposal.txt +++ b/common-trace-format-proposal.txt @@ -7,7 +7,9 @@ The goal of the present document is to propose a trace format that suits the needs of the embedded, telecom, high-performance and kernel communities. It is based on the Common Trace Format Requirements (v1.4) document. It is designed to allow traces to be natively generated by the Linux kernel, Linux user-space -applications written in C/C++, and hardware components. +applications written in C/C++, and hardware components. One major element of +CTF is the Trace Stream Description Language (TSDL), whose flexibility +enables description of various binary trace stream layouts. The latest version of this document can be found at: @@ -51,31 +53,31 @@ virtual file system. Because each event stream is appended to while a trace is being recorded, each is associated with a separate file for output. Therefore, a stored trace can be represented as a directory containing one file per stream. -A metadata event stream contains information on trace event types. It describes: +Meta-data description associated with the trace contains information on +trace event types expressed in the Trace Stream Description Language +(TSDL). This language describes: - Trace version. - Types available. +- Per-trace event header description. - Per-stream event header description. -- Per-stream event header selection. -- Per-stream event context fields. +- Per-stream event context description. - Per-event - Event type to stream mapping. - Event type to name mapping. - Event type to ID mapping. + - Event context description. 
- Event fields description. 3. Event stream -An event stream is divided in contiguous event packets of variable size. These -subdivisions have a variable size. An event packet can contain a certain amount -of padding at the end. The rationale for the event stream design choices is -explained in Appendix B. Stream Header Rationale. - -An event stream is divided in contiguous event packets of variable size. These -subdivisions have a variable size. An event packet can contain a certain amount -of padding at the end. The stream header is repeated at the beginning of each -event packet. +An event stream can be divided into contiguous event packets of variable +size. These subdivisions have a variable size. An event packet can +contain a certain amount of padding at the end. The stream header is +repeated at the beginning of each event packet. The rationale for the +event stream design choices is explained in Appendix B. Stream Header +Rationale. The event stream header will therefore be referred to as the "event packet header" throughout the rest of this document. @@ -103,17 +105,23 @@ types, but must be derived into a type to be usable in an event field. We define "byte-packed" types as aligned on the byte size, namely 8-bit. We define "bit-packed" types as following on the next bit, as defined by the -"bitfields" section. - -All basic types, except bitfields, are either aligned on an architecture-defined -specific alignment or byte-packed, depending on the architecture preference. -Architectures providing fast unaligned write byte-packed basic types to save -space, aligning each type on byte boundaries (8-bit). Architectures with slow -unaligned writes align types on specific alignment values. If no specific -alignment is declared for a type nor its parents, it is assumed to be bit-packed -for bitfields and byte-packed for other types. - -Metadata attribute representation of a specific alignment: +"Integers" section. 
+ +Each basic type must specify its alignment, in bits. Examples of +possible alignments are: bit-packed (align = 1), byte-packed (align = +8), or word-aligned (e.g. align = 32 or align = 64). The choice depends +on the architecture preference and compactness vs performance trade-offs +of the implementation. Architectures providing fast unaligned write +byte-packed basic types to save space, aligning each type on byte +boundaries (8-bit). Architectures with slow unaligned writes align types +on specific alignment values. If no specific alignment is declared for a +type, it is assumed to be bit-packed for integers with size not multiple +of 8 bits and for gcc bitfields. All other basic types are byte-packed +by default. It is however recommended to always specify the alignment +explicitly. Alignment values must be power of two. Compound types are +aligned as specified in their individual specification. + +TSDL meta-data attribute representation of a specific alignment: align = value; /* value in bits */ @@ -125,7 +133,7 @@ attribute. Typical use-case is to specify the network byte order (big endian: "be") to save data captured from the network into the trace without conversion. If not specified, the byte order is native. -Metadata representation: +TSDL meta-data representation: byte_order = native OR network OR be OR le; /* network and be are aliases */ @@ -136,20 +144,21 @@ multiplied by CHAR_BIT. We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed to 8 bits for cross-endianness compatibility. -Metadata representation: +TSDL meta-data representation: size = value; (value is in bits) 4.1.5 Integers -Signed integers are represented in two-complement. Integer alignment, size, -signedness and byte ordering are defined in the metadata. Integers aligned on -byte size (8-bit) and with length multiple of byte size (8-bit) correspond to -the C99 standard integers. 
In addition, integers with alignment and/or size that -are _not_ a multiple of the byte size are permitted; these correspond to the C99 -standard bitfields, with the added specification that the CTF integer bitfields -have a fixed binary representation. A MIT-licensed reference implementation of -the CTF portable bitfields is available at: +Signed integers are represented in two's complement. Integer alignment, +size, signedness and byte ordering are defined in the TSDL meta-data. +Integers aligned on byte size (8-bit) and with length multiple of byte +size (8-bit) correspond to the C99 standard integers. In addition, +integers with alignment and/or size that are _not_ a multiple of the +byte size are permitted; these correspond to the C99 standard bitfields, +with the added specification that the CTF integer bitfields have a fixed +binary representation. An MIT-licensed reference implementation of the +CTF portable bitfields is available at: http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h @@ -171,16 +180,19 @@ Binary representation of integers: This binary representation is derived from the bitfield implementation in GCC for little and big endian. However, contrary to what GCC does, integers can -cross units boundaries (no padding is required). Padding can be explicitely +cross units boundaries (no padding is required). Padding can be explicitly added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed. -Metadata representation: +TSDL meta-data representation: integer { signed = true OR false; /* default false */ byte_order = native OR network OR be OR le; /* default native */ size = value; /* value in bits, no default */ align = value; /* value in bits */ + /* base used for pretty-printing output, default: decimal. 
*/ + base = decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex OR x OR X OR p OR 16 + OR octal OR oct OR o OR 8 OR binary OR b OR 2; } Example of type inheritance (creation of a uint32_t named type): @@ -189,7 +201,7 @@ typealias integer { size = 32; signed = false; align = 32; -} : uint32_t; +} := uint32_t; Definition of a named 5-bit signed bitfield: @@ -197,7 +209,7 @@ typealias integer { size = 5; signed = true; align = 1; -} : int5_t; +} := int5_t; 4.1.6 GNU/C bitfields @@ -206,7 +218,7 @@ particularity on alignment: if a bitfield cannot fit in the current unit, the unit is padded and the bitfield starts at the following unit. The unit size is defined by the size of the type "unit_type". -Metadata representation: +TSDL meta-data representation: unit_type name:size: @@ -223,7 +235,7 @@ the current unit. 4.1.7 Floating point -The floating point values byte ordering is defined in the metadata. +The floating point values byte ordering is defined in the TSDL meta-data. Floating point values follow the IEEE 754-2008 standard interchange formats. Description of the floating point values include the exponent and mantissa size @@ -242,12 +254,13 @@ in bits. Some requirements are imposed on the floating point values: - sizeof(double) * CHAR_BIT - DBL_MANT_DIG - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG -Metadata representation: +TSDL meta-data representation: floating_point { - exp_dig = value; - mant_dig = value; - byte_order = native OR network OR be OR le; + exp_dig = value; + mant_dig = value; + byte_order = native OR network OR be OR le; + align = value; } Example of type inheritance: @@ -256,17 +269,21 @@ typealias floating_point { exp_dig = 8; /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */ mant_dig = 24; /* FLT_MANT_DIG */ byte_order = native; -} : float; + align = 32; +} := float; TODO: define NaN, +inf, -inf behavior. +Bit-packed, byte-packed or larger alignments can be used for floating +point values, similarly to integers. 
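As a rough illustration of the "align" attribute shared by integers and floating point values, the Python sketch below (not part of the proposal) rounds a bit offset up to the next multiple of a power-of-two alignment, which is where a field with that alignment would start. The function name align_up is hypothetical.

```python
def align_up(offset_bits: int, align_bits: int) -> int:
    """Round offset_bits up to the next multiple of align_bits.

    align_bits is expressed in bits and must be a power of two,
    as the alignment rules above require.
    """
    if align_bits <= 0 or align_bits & (align_bits - 1):
        raise ValueError("alignment must be a power of two, in bits")
    # Adding (align - 1) and masking the low bits rounds up to the boundary.
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)
```

With align = 1 (bit-packed) the offset is returned unchanged; with align = 8 (byte-packed) it is rounded up to the next byte boundary.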
+ 4.1.8 Enumerations Enumerations are a mapping between an integer type and a table of strings. The numerical representation of the enumeration follows the integer type specified -by the metadata. The enumeration mapping table is detailed in the enumeration -description within the metadata. The mapping table maps inclusive value ranges -(or single values) to strings. Instead of being limited to simple +by the meta-data. The enumeration mapping table is detailed in the enumeration +description within the meta-data. The mapping table maps inclusive value +ranges (or single values) to strings. Instead of being limited to simple "value -> string" mappings, these enumerations map "[ start_value ... end_value ] -> string", which map inclusive ranges of values to strings. An enumeration from the C language can be represented in @@ -274,10 +291,7 @@ this format by having the same start_value and end_value for each element, which is in fact a range of size 1. This single-value range is supported without repeating the start and end values with the value = string declaration. -If a numeric value is encountered between < >, it represents the integer type -size used to hold the enumeration, in bits. - -enum name { +enum name : integer_type { somestring = start_value1 ... end_value1, "other string" = start_value2 ... end_value2, yet_another_string, /* will be assigned to end_value2 + 1 */ @@ -288,7 +302,7 @@ enum name { If the values are omitted, the enumeration starts at 0 and increment of 1 for each entry: -enum name <32> { +enum name : unsigned int { ZERO, ONE, TWO, @@ -300,7 +314,17 @@ Overlapping ranges within a single enumeration are implementation defined. A nameless enumeration can be declared as a field type or as part of a typedef: -enum { +enum : integer_type { + ... +} + +Enumerations omitting the container type ": integer_type" use the "int" +type (for compatibility with C99). The "int" type must be previously +declared. 
E.g.: + +typealias integer { size = 32; align = 32; signed = true } := int; + +enum { ... } @@ -315,7 +339,7 @@ structures, variant, arrays, sequences, and strings. Structures are aligned on the largest alignment required by basic types contained within the structure. (This follows the ISO/C standard for structures) -Metadata representation of a named structure: +TSDL meta-data representation of a named structure: struct name { field_type field_name; @@ -331,7 +355,7 @@ struct example { signed = true; align = 16; } first_field_name; - uint64_t second_field_name; /* Named type declared in the metadata */ + uint64_t second_field_name; /* Named type declared in the meta-data */ }; The fields are placed in a sequence next to each other. They each possess a @@ -343,17 +367,28 @@ struct { ... } +Alignment for a structure compound type can be forced to a minimum value +by adding an "align" specifier after the declaration of a structure +body. This attribute is read as: align(value). The value is specified in +bits. The structure will be aligned on the maximum value between this +attribute and the alignment required by the basic types contained within +the structure. e.g. + +struct { + ... +} align(32) + 4.2.2 Variants (Discriminated/Tagged Unions) A CTF variant is a selection between different types. A CTF variant must always be defined within the scope of a structure or within fields contained within a structure (defined recursively). A "tag" enumeration field must appear in either the same lexical scope, prior to the variant -field (in field declaration order), in an uppermost lexical scope (see -Section 7.2.1), or in an uppermost dynamic scope (see Section 7.2.2). -The type selection is indicated by the mapping from the enumeration -value to the string used as variant type selector. 
The field to use as -tag is specified by the "tag_field", specified between "< >" after the +field (in field declaration order), in an upper lexical scope (see +Section 7.3.1), or in an upper dynamic scope (see Section 7.3.2). The +type selection is indicated by the mapping from the enumeration value to +the string used as variant type selector. The field to use as tag is +specified by the "tag_field", specified between "< >" after the "variant" keyword for unnamed variants, and after "variant name" for named variants. @@ -374,16 +409,16 @@ variant name { }; struct { - enum { sel1, sel2, sel3, ... } tag_field; + enum : integer_type { sel1, sel2, sel3, ... } tag_field; ... variant name v; } An unnamed variant definition within a structure is expressed by the following -metadata: +TSDL meta-data: struct { - enum { sel1, sel2, sel3, ... } tag_field; + enum : integer_type { sel1, sel2, sel3, ... } tag_field; ... variant { field_type sel1; @@ -402,14 +437,15 @@ variant example { }; struct { - enum { a, b, c } choice; - variant example v[unsigned int]; + enum : uint2_t { a, b, c } choice; + unsigned int seqlen; + variant example v[seqlen]; } Example of an unnamed variant: struct { - enum { a, b, c, d } choice; + enum : uint2_t { a, b, c, d } choice; /* Unrelated fields can be added between the variant and its tag */ int32_t somevalue; variant { @@ -426,7 +462,7 @@ struct { Example of an unnamed variant within an array: struct { - enum { a, b, c } choice; + enum : uint2_t { a, b, c } choice; variant { uint32_t a; uint64_t b; @@ -441,7 +477,7 @@ type definition referring to the tag "x" uses the closest preceding field from the lexical scope of the type definition. struct { - enum { a, b, c, d } x; + enum : uint2_t { a, b, c, d } x; typedef variant { /* * "x" refers to the preceding "x" enumeration in the @@ -453,9 +489,9 @@ struct { } example_variant; struct { - enum { x, y, z } x; /* This enumeration is not used by "v". 
*/ + enum : int { x, y, z } x; /* This enumeration is not used by "v". */ example_variant v; /* - * "v" uses the "enum { a, b, c, d }" + * "v" uses the "enum : uint2_t { a, b, c, d }" * tag. */ } a[10]; @@ -463,12 +499,13 @@ struct { 4.2.3 Arrays -Arrays are fixed-length. Their length is declared in the type declaration within -the metadata. They contain an array of "inner type" elements, which can refer to -any type not containing the type of the array being declared (no circular -dependency). The length is the number of elements in an array. +Arrays are fixed-length. Their length is declared in the type +declaration within the meta-data. They contain an array of "inner type" +elements, which can refer to any type not containing the type of the +array being declared (no circular dependency). The length is the number +of elements in an array. -Metadata representation of a named array: +TSDL meta-data representation of a named array: typedef elem_type name[length]; @@ -476,67 +513,98 @@ A nameless array can be declared as a field type within a structure, e.g.: uint8_t field_name[10]; +Arrays are always aligned on their element alignment requirement. 4.2.4 Sequences -Sequences are dynamically-sized arrays. They start with an integer that specify -the length of the sequence, followed by an array of "inner type" elements. -The length is the number of elements in the sequence. +Sequences are dynamically-sized arrays. They refer to a "length" +unsigned integer field, which must appear in either the same lexical scope, +prior to the sequence field (in field declaration order), in an upper +lexical scope (see Section 7.3.1), or in an upper dynamic scope (see +Section 7.3.2). This length field represents the number of elements in +the sequence. The sequence per se is an array of "inner type" elements. 
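A minimal reader-side sketch of the length-field rule above, in Python and not part of the proposal. It assumes one particular layout: a little-endian, byte-packed uint16 length field placed immediately before a sequence of uint32 elements. TSDL permits other element types, alignments, and length-field locations; the function name read_sequence is hypothetical.

```python
import struct

def read_sequence(buf: bytes, offset: int = 0):
    """Decode a uint16 length field followed by that many uint32 elements.

    Returns the decoded elements and the byte offset just past the sequence.
    Assumes little-endian, byte-packed fields (an illustrative layout only).
    """
    (length,) = struct.unpack_from("<H", buf, offset)
    elems = struct.unpack_from(f"<{length}I", buf, offset + 2)
    return list(elems), offset + 2 + 4 * length
```

The length field is read first because the number of elements is not known from the type declaration alone.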
+ +TSDL meta-data representation for a sequence type definition: + +struct { + unsigned int length_field; + typedef elem_type typename[length_field]; + typename seq_field_name; +} -Metadata representation for a named sequence: +A sequence can also be declared as a field type, e.g.: -typedef elem_type name[length_type]; +struct { + unsigned int length_field; + long seq_field_name[length_field]; +} -A nameless sequence can be declared as a field type, e.g.: +Multiple sequences can refer to the same length field, and these length +fields can be in a different upper dynamic scope: -long field_name[int]; +e.g., assuming the stream.event.header defines: + +stream { + ... + id = 1; + event.header := struct { + uint16_t seq_len; + }; +}; + +event { + ... + stream_id = 1; + fields := struct { + long seq_a[stream.event.header.seq_len]; + char seq_b[stream.event.header.seq_len]; + }; +}; -The length type follows the integer types specifications, and the sequence -elements follow the "array" specifications. +The sequence elements follow the "array" specifications. 4.2.5 Strings Strings are an array of bytes of variable size and are terminated by a '\0' -"NULL" character. Their encoding is described in the metadata. In absence of -encoding attribute information, the default encoding is UTF-8. +"NULL" character. Their encoding is described in the TSDL meta-data. In +absence of encoding attribute information, the default encoding is +UTF-8. -Metadata representation of a named string type: +TSDL meta-data representation of a named string type: typealias string { encoding = UTF8 OR ASCII; -} : name; +} := name; A nameless string type can be declared as a field type: string field_name; /* Use default UTF8 encoding */ -5. Event Packet Header +Strings are always aligned on byte size. -The event packet header consists of two part: one is mandatory and have a fixed -layout. The second part, the "event packet context", has its layout described in -the metadata. +5. 
Event Packet Header -- Aligned on page size. Fixed size. Fields either aligned or packed (depending - on the architecture preference). - No padding at the end of the event packet header. Native architecture byte - ordering. +The event packet header consists of two parts: the "event packet header" +is the same for all streams of a trace. The second part, the "event +packet context", is described on a per-stream basis. Both are described +in the TSDL meta-data. The packets are aligned on architecture-page-sized +addresses. -Fixed layout (event packet header): +Event packet header (all fields are optional, specified by TSDL meta-data): -- Magic number (CTF magic numbers: 0xC1FC1FC1 and its reverse endianness - representation: 0xC11FFCC1) It needs to have a non-symmetric bytewise - representation. Used to distinguish between big and little endian traces (this - information is determined by knowing the endianness of the architecture - reading the trace and comparing the magic number against its value and the - reverse, 0xC11FFCC1). This magic number specifies that we use the CTF metadata - description language described in this document. Different magic numbers - should be used for other metadata description languages. -- Trace UUID, used to ensure the event packet match the metadata used. - (note: we cannot use a metadata checksum because metadata can be appended to - while tracing is active) -- Stream ID, used as reference to stream description in metadata. +- Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a + CTF packet. This magic number is optional, but when present, it should + come at the very beginning of the packet. +- Trace UUID, used to ensure the event packet match the meta-data used. + (note: we cannot use a meta-data checksum in every cases instead of a + UUID because meta-data can be appended to while tracing is active) + This field is optional. +- Stream ID, used as reference to stream description in meta-data. 
+ This field is optional if there is only one stream description in the + meta-data, but becomes required if there is more than one stream in + the TSDL meta-data description. -Metadata-defined layout (event packet context): +Event packet context (all fields are optional, specified by TSDL meta-data): - Event packet content size (in bytes). - Event packet size (in bytes, includes padding). @@ -544,8 +612,8 @@ Metadata-defined layout (event packet context): header. - Per-stream event packet sequence count (to deal with UDP packet loss). The number of significant sequence counter bits should also be present, so - wrap-arounds are deal with correctly. -- Timestamp at the beginning and timestamp at the end of the event packet. + wrap-arounds are dealt with correctly. +- Time-stamp at the beginning and time-stamp at the end of the event packet. Both timestamps are written in the packet header, but sampled respectively while (or before) writing the first event and while (or after) writing the last event in the packet. The inclusive range between these timestamps should @@ -573,19 +641,41 @@ Metadata-defined layout (event packet context): 2: sha1 3: crc32 -5.1 Event Packet Header Fixed Layout Description +5.1 Event Packet Header Description + +The event packet header layout is indicated by the trace packet.header +field. Here is a recommended structure type for the packet header with +the fields typically expected (although these fields are each optional): struct event_packet_header { uint32_t magic; - uint8_t trace_uuid[16]; + uint8_t uuid[16]; uint32_t stream_id; }; +trace { + ... + packet.header := struct event_packet_header; +}; + +If the magic number is not present, tools such as "file" will have no +means to discover the file type. + +If the uuid is not present, no validation that the meta-data actually +corresponds to the stream is performed. + +If the stream_id packet header field is missing, the trace can only +contain a single stream. 
Its "id" field can be left out, and its events +don't need to declare a "stream_id" field. + + 5.2 Event Packet Context Description Event packet context example. These are declared within the stream declaration -in the metadata. All these fields are optional except for "content_size" and -"packet_size", which must be present in the context. +in the meta-data. All these fields are optional. If the packet size field is +missing, the whole stream only contains a single packet. If the content +size field is missing, the packet is filled (no padding). The content +and packet sizes include all headers. An example event packet context type: @@ -609,20 +699,20 @@ struct event_packet_context { The overall structure of an event is: -1 - Stream Packet Context (as specified by the stream metadata) - 2 - Event Header (as specified by the stream metadata) - 3 - Stream Event Context (as specified by the stream metadata) - 4 - Event Context (as specified by the event metadata) - 5 - Event Payload (as specified by the event metadata) +1 - Stream Packet Context (as specified by the stream meta-data) + 2 - Event Header (as specified by the stream meta-data) + 3 - Stream Event Context (as specified by the stream meta-data) + 4 - Event Context (as specified by the event meta-data) + 5 - Event Payload (as specified by the event meta-data) This structure defines an implicit dynamic scoping, where variants located in inner structures (those with a higher number in the listing above) can refer to the fields of outer structures (with lower number in -the listing above). See Section 7.2 Metadata Scopes for more detail. +the listing above). See Section 7.3 TSDL Scopes for more detail. 6.1 Event Header -Event headers can be described within the metadata. We hereby propose, as an +Event headers can be described within the meta-data. We hereby propose, as an example, two types of events headers. Type 1 accommodates streams with less than 31 event IDs. 
Type 2 accommodates streams with 31 or more event IDs. @@ -630,8 +720,8 @@ One major factor can vary between streams: the number of event IDs assigned to a stream. Luckily, this information tends to stay relatively constant (modulo event registration while trace is being recorded), so we can specify different representations for streams containing few event IDs and streams containing -many event IDs, so we end up representing the event ID and timestamp as densely -as possible in each case. +many event IDs, so we end up representing the event ID and time-stamp as +densely as possible in each case. The header is extended in the rare occasions where the information cannot be represented in the ranges available in the standard event header. They are also @@ -639,8 +729,14 @@ used in the rare occasions where the data required for a field could not be collected: the flag corresponding to the missing field within the missing_fields array is then set to 1. -Types uintX_t represent an X-bit unsigned integer. +Types uintX_t represent an X-bit unsigned integer, as declared with +either: + typealias integer { size = X; align = X; signed = false } := uintX_t; + + or + + typealias integer { size = X; align = 1; signed = false } := uintX_t; 6.1.1 Type 1 - Few event IDs @@ -657,7 +753,7 @@ struct event_header_1 { * id: range: 0 - 30. * id 31 is reserved to indicate an extended header. */ - enum { compact = 0 ... 30, extended = 31 } id; + enum : uint5_t { compact = 0 ... 30, extended = 31 } id; variant { struct { uint27_t timestamp; @@ -685,7 +781,7 @@ struct event_header_2 { * id: range: 0 - 65534. * id 65535 is reserved to indicate an extended header. */ - enum { compact = 0 ... 65534, extended = 65535 } id; + enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id; variant { struct { uint32_t timestamp; @@ -700,26 +796,23 @@ struct event_header_2 { 6.2 Event Context -The event context contains information relative to the current event. 
The choice -and meaning of this information is specified by the metadata "stream" and -"event" information. The "stream" context is applied to all events within the -stream. The "stream" context structure follows the event header. The "event" -context is applied to specific events. Its structure follows the "stream" -context stucture. +The event context contains information relative to the current event. +The choice and meaning of this information is specified by the TSDL +stream and event meta-data descriptions. The stream context is applied +to all events within the stream. The stream context structure follows +the event header. The event context is applied to specific events. Its +structure follows the stream context structure. An example of stream-level event context is to save the event payload size with each event, or to save the current PID with each event. These are declared -within the stream declaration within the metadata: +within the stream declaration within the meta-data: stream { ... - event { - ... - context := struct { + event.context := struct { uint pid; uint16_t payload_size; - }; - } + }; }; An example of event-specific event context is to declare a bitmap of missing @@ -742,7 +835,7 @@ numeric value). 6.3 Event Payload An event payload contains fields specific to a given event type. The fields -belonging to an event type are described in the event-specific metadata +belonging to an event type are described in the event-specific meta-data within a structure type. 6.3.1 Padding @@ -763,25 +856,67 @@ The event payload is aligned on the largest alignment required by types contained within the payload. (This follows the ISO/C standard for structures) -7. Metadata +7. Trace Stream Description Language (TSDL) + +The Trace Stream Description Language (TSDL) allows expression of the +binary trace streams layout in a C99-like Domain Specific Language +(DSL). + + +7.1 Meta-data + +The trace stream layout description is located in the trace meta-data. 
+The meta-data is itself located in a stream identified by its name: +"metadata". + +The meta-data description can be expressed in two different formats: +text-only and packet-based. The text-only description facilitates +generation of meta-data and provides a convenient way to enter the +meta-data information by hand. The packet-based meta-data provides the +CTF stream packet facilities (checksumming, compression, encryption, +network-readiness) for meta-data stream generated and transported by a +tracer. + +The text-only meta-data file is a plain text TSDL description. + +The packet-based meta-data is made of "meta-data packets", which each +start with a meta-data packet header. The packet-based meta-data +description is detected by reading the magic number "0x75D11D57" at the +beginning of the file. This magic number is also used to detect the +endianness of the architecture by trying to read the CTF magic number +and its counterpart in reversed endianness. The events within the +meta-data stream have no event header nor event context. Each event only +contains a "sequence" payload, which is a sequence of bits using the +"trace.packet.header.content_size" field as a placeholder for its +length. The formatting of this sequence of bits is a plain-text +representation of the TSDL description. Each meta-data packet starts with +a special packet header, specific to the meta-data stream, which +contains, exactly: + +struct metadata_packet_header { + uint32_t magic; /* 0x75D11D57 */ + uint8_t uuid[16]; /* Universally Unique Identifier */ + uint32_t checksum; /* 0 if unused */ + uint32_t content_size; /* in bits */ + uint32_t packet_size; /* in bits */ + uint8_t compression_scheme; /* 0 if unused */ + uint8_t encryption_scheme; /* 0 if unused */ + uint8_t checksum_scheme; /* 0 if unused */ +}; -The meta-data is located in a stream named "metadata". It is made of "event -packets", which each start with an event packet header. 
The event type within -the metadata stream have no event header nor event context. Each event only -contains a null-terminated "string" payload, which is a metadata description -entry. The events are packed one next to another. Each event packet start with -an event packet header, which contains, amongst other fields, the magic number -and trace UUID. The trace UUID is represented as a string of hexadecimal digits -and dashes "-". +The packet-based meta-data can be converted to a text-only meta-data by +concatenating all the strings it contains. -The metadata can be parsed by reading through the metadata strings, skipping -newlines and null-characters. Type names are made of a single identifier, and -can be surrounded by prefix/postfix. Text contained within "/*" and "*/", as -well as within "//" and end of line, are treated as comments. Boolean values can -be represented as true, TRUE, or 1 for true, and false, FALSE, or 0 for false. +In the textual representation of the meta-data, the text contained +within "/*" and "*/", as well as within "//" and end of line, are +treated as comments. Boolean values can be represented as true, TRUE, +or 1 for true, and false, FALSE, or 0 for false. Within the string-based +meta-data description, the trace UUID is represented as a string of +hexadecimal digits and dashes "-". In the event packet header, the trace +UUID is represented as an array of bytes. -7.1 Declaration vs Definition +7.2 Declaration vs Definition A declaration associates a layout to a type, without specifying where this type is located in the event structure hierarchy (see Section 6). @@ -791,20 +926,23 @@ variant field), a declaration is followed by a declarator, which specify the newly defined type name (for typedef), or the field name (for declarations located within structure and variants). Array and sequence, declared with square brackets ("[" "]"), are part of the declarator, -similarly to C99. 
The enumeration type specifier and variant tag name -(both specified with "<" ">") are part of the type specifier. +similarly to C99. The enumeration base type is specified by +": enum_base", which is part of the type specifier. The variant tag +name, specified between "<" ">", is also part of the type specifier. A definition associates a type to a location in the event structure -hierarchy (see Section 6). +hierarchy (see Section 6). This association is denoted by ":=", as shown +in Section 7.3. -7.2 Metadata Scopes +7.3 TSDL Scopes -CTF metadata uses two different types of scoping: a lexical scope is -used for declarations and type definitions, and a dynamic scope is used -for variants references to tag fields. +TSDL uses two different types of scoping: a lexical scope is used for +declarations and type definitions, and a dynamic scope is used for +variants references to tag fields and for sequence references to length +fields. -7.2.1 Lexical Scope +7.3.1 Lexical Scope Each of "trace", "stream", "event", "struct" and "variant" have their own nestable declaration scope, within which types can be declared using "typedef" @@ -815,25 +953,27 @@ lexical scope prior to the inner declaration scope. Redefinition of a typedef or typealias is not valid, although hiding an upper scope typedef or typealias is allowed within a sub-scope. -7.2.2 Dynamic Scope +7.3.2 Dynamic Scope A dynamic scope consists in the lexical scope augmented with the implicit event structure definition hierarchy presented at Section 6. -The dynamic scope is only used for variant tag definitions. It is used -at definition time to look up the location of the tag field associated -with a variant. - -Therefore, variants in lower levels in the dynamic scope (e.g. event -context) can refer to a tag field located in upper levels (e.g. in the -event header) by specifying, in this case, the associated tag with -. 
This allows, for instance, the event context to -define a variant referring to the "id" field of the event header as -selector. +The dynamic scope is used for variant tag and sequence length +definitions. It is used at definition time to look up the location of +the tag field associated with a variant, and to look up the location +of the length field associated with a sequence. + +Therefore, variants (or sequences) in lower levels in the dynamic scope +(e.g. event context) can refer to a tag (or length) field located in +upper levels (e.g. in the event header) by specifying, in this case, the +associated tag with . This allows, for instance, the +event context to define a variant referring to the "id" field of the +event header as selector. The target dynamic scope must be specified explicitly when referring to a field outside of the local static scope. The dynamic scope prefixes are thus: + - Trace Packet Header: , - Stream Packet Context: , - Event Header: , - Stream Event Context: , @@ -853,17 +993,26 @@ consumption, for each event, the current trace context is therefore readable by accessing the upper dynamic scopes. -7.2 Metadata Examples +7.4 TSDL Examples + +The grammar representing the TSDL meta-data is presented in Appendix C. +TSDL Grammar. This section presents a rather lighter reading that +consists in examples of TSDL meta-data, with template values. -The grammar representing the CTF metadata is presented in -Appendix C. CTF Metadata Grammar. This section presents a rather ligher -reading that consists in examples of CTF metadata, with template values: +The stream "id" can be left out if there is only one stream in the +trace. The event "id" field can be left out if there is only one event +in a stream.
trace { major = value; /* Trace format version */ minor = value; uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"; /* Trace UUID */ - word_size = value; + byte_order = be OR le; /* Endianness (required) */ + packet.header := struct { + uint32_t magic; + uint8_t uuid[16]; + uint32_t stream_id; + }; }; stream { @@ -881,7 +1030,7 @@ stream { event { name = event_name; id = value; /* Numeric identifier within the stream */ - stream = stream_id; + stream_id = stream_id; context := struct { ... }; @@ -898,7 +1047,7 @@ event { * Type declarations behave similarly to the C standard. */ -typedef aliased_type_prefix aliased_type new_type aliased_type_postfix; +typedef aliased_type_specifiers new_type_declarators; /* e.g.: typedef struct example new_type_name[10]; */ @@ -906,15 +1055,15 @@ typedef aliased_type_prefix aliased_type new_type aliased_type_postfix; * typealias * * The "typealias" declaration can be used to give a name (including - * prefix/postfix) to a type. It should also be used to map basic C types - * (float, int, unsigned long, ...) to a CTF type. Typealias is a superset of - * "typedef": it also allows assignment of a simple variable identifier to a - * type. + * pointer declarator specifier) to a type. It should also be used to + * map basic C types (float, int, unsigned long, ...) to a CTF type. + * Typealias is a superset of "typedef": it also allows assignment of a + * simple variable identifier to a type. */ typealias type_class { ... -} : new_type_prefix new_type new_type_postfix; +} := type_specifiers type_declarator; /* * e.g.: @@ -922,13 +1071,13 @@ typealias type_class { * size = 32; * align = 32; * signed = false; - * } : struct page *; + * } := struct page *; * * typealias integer { * size = 32; * align = 32; * signed = true; - * } : int; + * } := int; */ struct name { @@ -939,7 +1088,7 @@ variant name { ... }; -enum name { +enum name : integer_type { ... }; @@ -952,11 +1101,15 @@ struct { ... } +struct { + ... +} align(value) + variant { ... 
} -enum { +enum : integer_type { ... } @@ -1025,20 +1178,21 @@ flexibility in terms of: - transparently support flight recorder mode, - transparently support crash dump. -The event stream header will therefore be referred to as the "event packet -header" throughout the rest of this document. -C. CTF Metadata Grammar +C. TSDL Grammar /* - * Common Trace Format (CTF) Metadata Grammar. + * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar. * * Inspired from the C99 grammar: * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A) + * and c++1x grammar (draft) + * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A) * * Specialized for CTF needs by including only constant and declarations from * C99 (excluding function declarations), and by adding support for variants, - * sequences and CTF-specific specifiers. + * sequences and CTF-specific specifiers. Enumeration container types + * semantic is inspired from c++1x enum-base. */ 1) Lexical grammar @@ -1056,6 +1210,7 @@ token: keyword: is one of +align const char double @@ -1159,14 +1314,6 @@ long-long-suffix: ll LL -digit-sequence: - digit - digit-sequence digit - -hexadecimal-digit-sequence: - hexadecimal-digit - hexadecimal-digit-sequence hexadecimal-digit - enumeration-constant: identifier string-literal @@ -1247,20 +1394,20 @@ unary-operator: one of assignment-operator: = -constant-expression: - unary-expression +type-assignment-operator: + := constant-expression-range: - constant-expression ... constant-expression + unary-expression ... 
unary-expression 2.2) Declarations: declaration: - declaration-specifiers ; - declaration-specifiers storage-class-specifier declaration-specifiers declarator-list ; + declaration-specifiers declarator-list-opt ; ctf-specifier ; declaration-specifiers: + storage-class-specifier declaration-specifiers-opt type-specifier declaration-specifiers-opt type-qualifier declaration-specifiers-opt @@ -1287,15 +1434,19 @@ type-specifier: unsigned _Bool _Complex + _Imaginary struct-specifier variant-specifier enum-specifier typedef-name ctf-type-specifier +align-attribute: + align ( unary-expression ) + struct-specifier: - struct identifier-opt { struct-or-variant-declaration-list-opt } - struct identifier + struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt + struct identifier align-attribute-opt struct-or-variant-declaration-list: struct-or-variant-declaration @@ -1303,9 +1454,9 @@ struct-or-variant-declaration-list: struct-or-variant-declaration: specifier-qualifier-list struct-or-variant-declarator-list ; - declaration-specifiers storage-class-specifier declaration-specifiers declarator-list ; - typealias declaration-specifiers abstract-declarator-list : declaration-specifiers abstract-declarator-list ; - typealias declaration-specifiers abstract-declarator-list : declarator-list ; + declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ; + typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list ; + typealias declaration-specifiers abstract-declarator-list := declarator-list ; specifier-qualifier-list: type-specifier specifier-qualifier-list-opt @@ -1317,7 +1468,7 @@ struct-or-variant-declarator-list: struct-or-variant-declarator: declarator - declarator-opt : constant-expression + declarator-opt : unary-expression variant-specifier: variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list } @@ -1330,12 +1481,8 @@ enum-specifier: 
enum identifier-opt { enumerator-list } enum identifier-opt { enumerator-list , } enum identifier - enum identifier-opt < declaration-specifiers > { enumerator-list } - enum identifier-opt < declaration-specifiers > { enumerator-list , } - enum identifier < declaration-specifiers > - enum identifier-opt < integer-constant > { enumerator-list } - enum identifier-opt < integer-constant > { enumerator-list , } - enum identifier < integer-constant > + enum identifier-opt : declaration-specifiers { enumerator-list } + enum identifier-opt : declaration-specifiers { enumerator-list , } enumerator-list: enumerator @@ -1343,7 +1490,7 @@ enumerator-list: enumerator: enumeration-constant - enumeration-constant = constant-expression + enumeration-constant = unary-expression enumeration-constant = constant-expression-range type-qualifier: @@ -1355,8 +1502,7 @@ declarator: direct-declarator: identifier ( declarator ) - direct-declarator [ type-specifier ] - direct-declarator [ constant-expression ] + direct-declarator [ unary-expression ] abstract-declarator: pointer-opt direct-abstract-declarator @@ -1364,8 +1510,7 @@ abstract-declarator: direct-abstract-declarator: identifier-opt ( abstract-declarator ) - direct-abstract-declarator [ type-specifier ] - direct-abstract-declarator [ constant-expression ] + direct-abstract-declarator [ unary-expression ] direct-abstract-declarator [ ] pointer: @@ -1385,8 +1530,8 @@ ctf-specifier: event { ctf-assignment-expression-list-opt } stream { ctf-assignment-expression-list-opt } trace { ctf-assignment-expression-list-opt } - typealias declaration-specifiers abstract-declarator-list : declaration-specifiers abstract-declarator-list ; - typealias declaration-specifiers abstract-declarator-list : declarator-list ; + typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list ; + typealias declaration-specifiers abstract-declarator-list := declarator-list ; ctf-type-specifier: floating_point { 
ctf-assignment-expression-list-opt } @@ -1400,6 +1545,6 @@ ctf-assignment-expression-list: ctf-assignment-expression: unary-expression assignment-operator unary-expression unary-expression type-assignment-operator type-specifier - declaration-specifiers storage-class-specifier declaration-specifiers declarator-list - typealias declaration-specifiers abstract-declarator-list : declaration-specifiers abstract-declarator-list - typealias declaration-specifiers abstract-declarator-list : declarator-list + declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list + typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list + typealias declaration-specifiers abstract-declarator-list := declarator-list
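
Note on the packet-based meta-data introduced by this change: a reader could tell the two meta-data formats apart, and detect endianness, roughly as sketched below. This is only an illustration, not part of the proposal. The function names (read_u32, parse_metadata_header), the return-value convention and the 35-byte size are assumptions made for the example; in particular, the sketch assumes the header fields are serialized byte-packed, in declaration order, with no padding, which the change does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TSDL_MAGIC 0x75D11D57u
/* Assumed serialized size: 4 + 16 + 4 + 4 + 4 + 1 + 1 + 1 bytes, no padding. */
#define METADATA_HEADER_BYTES 35u

/* In-memory copy of the fields listed in struct metadata_packet_header. */
struct metadata_packet_header {
	uint32_t magic;
	uint8_t uuid[16];
	uint32_t checksum;
	uint32_t content_size;	/* in bits */
	uint32_t packet_size;	/* in bits */
	uint8_t compression_scheme;
	uint8_t encryption_scheme;
	uint8_t checksum_scheme;
};

/* Hypothetical helper: read 4 bytes in the requested byte order. */
static uint32_t read_u32(const uint8_t *p, int big_endian)
{
	if (big_endian)
		return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/*
 * Hypothetical helper. Returns 1 if buf starts with a packet-based
 * meta-data header (h and big_endian are filled in), 0 if the magic
 * number is absent in both byte orders (the caller may then treat the
 * file as text-only TSDL), and -1 if the magic is present but the
 * header is truncated or its sizes are inconsistent.
 */
static int parse_metadata_header(const uint8_t *buf, size_t len,
				 struct metadata_packet_header *h,
				 int *big_endian)
{
	assert(buf != NULL && h != NULL && big_endian != NULL);

	if (len < 4)
		return 0;
	/* Try the magic number, then its counterpart in reversed endianness. */
	if (read_u32(buf, 1) == TSDL_MAGIC)
		*big_endian = 1;
	else if (read_u32(buf, 0) == TSDL_MAGIC)
		*big_endian = 0;
	else
		return 0;

	if (len < METADATA_HEADER_BYTES)
		return -1;

	h->magic = TSDL_MAGIC;
	memcpy(h->uuid, buf + 4, sizeof(h->uuid));
	h->checksum = read_u32(buf + 20, *big_endian);
	h->content_size = read_u32(buf + 24, *big_endian);
	h->packet_size = read_u32(buf + 28, *big_endian);
	h->compression_scheme = buf[32];
	h->encryption_scheme = buf[33];
	h->checksum_scheme = buf[34];

	/* The content cannot be larger than the packet that holds it. */
	if (h->content_size > h->packet_size)
		return -1;
	return 1;
}
```

Reading the magic byte by byte, rather than casting the buffer to a uint32_t, keeps the sketch independent of the host byte order and of alignment constraints. Converting packet-based meta-data to text would then amount to concatenating the payload of each packet, using content_size to find where each payload ends.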