Rename proposal into "specification", add credits.

author Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Wed, 22 Jun 2011 18:21:36 +0000 (14:21 -0400)

committer Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Wed, 22 Jun 2011 18:21:36 +0000 (14:21 -0400)
diff --git a/common-trace-format-proposal.txt b/common-trace-format-proposal.txt

deleted file mode 100644 (file)

index 3103722..0000000
--- a/common-trace-format-proposal.txt
+++ /dev/null
@@ -1,1605 +0,0 @@
-
-RFC: Common Trace Format (CTF) Proposal (pre-v1.7)
-
-Mathieu Desnoyers, EfficiOS Inc.
-
-The goal of the present document is to propose a trace format that suits the
-needs of the embedded, telecom, high-performance and kernel communities. It is
-based on the Common Trace Format Requirements (v1.4) document. It is designed to
-allow traces to be natively generated by the Linux kernel, Linux user-space
-applications written in C/C++, and hardware components. One major element of
-CTF is the Trace Stream Description Language (TSDL), whose flexibility
-enables the description of various binary trace stream layouts.
-
-The latest version of this document can be found at:
-
-  git tree:   git://git.efficios.com/ctf.git
-  gitweb:     http://git.efficios.com/?p=ctf.git
-
-A reference implementation of a library to read and write this trace format is
-being implemented within the BabelTrace project, a converter between trace
-formats. The development tree is available at:
-
-  git tree:   git://git.efficios.com/babeltrace.git
-  gitweb:     http://git.efficios.com/?p=babeltrace.git
-
-
-Table of Contents
-
-1. Preliminary definitions
-2. High-level representation of a trace
-3. Event stream
-4. Types
-   4.1 Basic types
-       4.1.1 Type inheritance
-       4.1.2 Alignment
-       4.1.3 Byte order
-       4.1.4 Size
-       4.1.5 Integers
-       4.1.6 GNU/C bitfields
-       4.1.7 Floating point
-       4.1.8 Enumerations
-   4.2 Compound types
-       4.2.1 Structures
-       4.2.2 Variants (Discriminated/Tagged Unions)
-       4.2.3 Arrays
-       4.2.4 Sequences
-       4.2.5 Strings
-5. Event Packet Header
-   5.1 Event Packet Header Description
-   5.2 Event Packet Context Description
-6. Event Structure
-   6.1 Event Header
-       6.1.1 Type 1 - Few event IDs
-       6.1.2 Type 2 - Many event IDs
-   6.2 Event Context
-   6.3 Event Payload
-       6.3.1 Padding
-       6.3.2 Alignment
-7. Trace Stream Description Language (TSDL)
-   7.1 Meta-data
-   7.2 Declaration vs Definition
-   7.3 TSDL Scopes
-       7.3.1 Lexical Scope
-       7.3.2 Dynamic Scope
-   7.4 TSDL Examples
-
-
-1. Preliminary definitions
-
-  - Event Trace: An ordered sequence of events.
-  - Event Stream: An ordered sequence of events, containing a subset of the
-                  trace event types.
-  - Event Packet: A sequence of physically contiguous events within an event
-                  stream.
-  - Event: This is the basic entry in a trace. (aka: a trace record).
-    - An event identifier (ID) relates to the class (a type) of event within
-      an event stream.
-        e.g. event: irq_entry.
-    - An event (or event record) relates to a specific instance of an event
-      class.
-        e.g. event: irq_entry, at time X, on CPU Y
-  - Source Architecture: Architecture writing the trace.
-  - Reader Architecture: Architecture reading the trace.
-
-
-2. High-level representation of a trace
-
-A trace is divided into multiple event streams. Each event stream contains a
-subset of the trace event types.
-
-The final output of the trace, after its generation and optional transport over
-the network, is expected to be either on permanent or temporary storage in a
-virtual file system. Because each event stream is appended to while a trace is
-being recorded, each is associated with a separate file for output.  Therefore,
-a stored trace can be represented as a directory containing one file per stream.
-
-Meta-data description associated with the trace contains information on
-trace event types expressed in the Trace Stream Description Language
-(TSDL). This language describes:
-
-- Trace version.
-- Types available.
-- Per-trace event header description.
-- Per-stream event header description.
-- Per-stream event context description.
-- Per-event
-  - Event type to stream mapping.
-  - Event type to name mapping.
-  - Event type to ID mapping.
-  - Event context description.
-  - Event fields description.
-
-
-3. Event stream
-
-An event stream can be divided into contiguous event packets of variable
-size. An event packet can contain a certain amount of padding at the end.
-The stream header is repeated at the beginning of each event packet. The
-rationale for the event stream design choices is explained in Appendix B.
-Stream Header Rationale.
-
-The event stream header will therefore be referred to as the "event packet
-header" throughout the rest of this document.
-
-
-4. Types
-
-Types are organized as type classes. Each type class belongs to one of two
-kinds of types: basic types or compound types.
-
-4.1 Basic types
-
-A basic type is a scalar type, as described in this section. It includes
-integers, GNU/C bitfields, enumerations, and floating point values.
-
-4.1.1 Type inheritance
-
-Type specifications can be inherited to allow deriving types from a
-type class. For example, see the uint32_t named type derived from the "integer"
-type class below ("Integers" section). Types have a precise binary
-representation in the trace. A type class has methods to read and write these
-types, but must be derived into a type to be usable in an event field.
-
-4.1.2 Alignment
-
-We define "byte-packed" types as aligned on the byte size, namely 8-bit.
-We define "bit-packed" types as following on the next bit, as defined by the
-"Integers" section.
-
-Each basic type must specify its alignment, in bits. Examples of
-possible alignments are: bit-packed (align = 1), byte-packed (align =
-8), or word-aligned (e.g. align = 32 or align = 64). The choice depends
-on the architecture preference and compactness vs performance trade-offs
-of the implementation.  Architectures providing fast unaligned writes
-byte-pack basic types to save space, aligning each type on byte
-boundaries (8-bit). Architectures with slow unaligned writes align types
-on specific alignment values. If no specific alignment is declared for a
-type, it is assumed to be bit-packed for integers whose size is not a
-multiple of 8 bits and for gcc bitfields. All other basic types are
-byte-packed by default. It is however recommended to always specify the
-alignment explicitly. Alignment values must be a power of two. Compound
-types are aligned as specified in their individual specification.
-
-TSDL meta-data attribute representation of a specific alignment:
-
-  align = value;                                /* value in bits */
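As an illustration of the alignment rule above, the following minimal sketch rounds a bit offset up to the next boundary of a power-of-two alignment. The helper name ctf_align_up is hypothetical and not part of any CTF or BabelTrace API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round bit_offset up to the next multiple of align_bits.
 * align_bits is expressed in bits and must be a power of two,
 * as required for CTF alignment values.
 * (ctf_align_up is a hypothetical name used for illustration only.)
 */
static uint64_t ctf_align_up(uint64_t bit_offset, uint64_t align_bits)
{
    assert(align_bits != 0 && (align_bits & (align_bits - 1)) == 0);
    return (bit_offset + align_bits - 1) & ~(align_bits - 1);
}
```

With align = 1 (bit-packed) the offset is returned unchanged; with align = 8 (byte-packed) an offset of 13 bits is moved to bit 16.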
-
-4.1.3 Byte order
-
-By default, the native endianness of the source architecture is used.
-Byte order can be overridden for a basic type by specifying a "byte_order"
-attribute. Typical use-case is to specify the network byte order (big endian:
-"be") to save data captured from the network into the trace without conversion.
-If not specified, the byte order is native.
-
-TSDL meta-data representation:
-
-  byte_order = native OR network OR be OR le;  /* network and be are aliases */
-
-4.1.4 Size
-
-Type size, in bits, for integers and floats is that returned by "sizeof()" in C
-multiplied by CHAR_BIT.
-We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed
-to 8 bits for cross-endianness compatibility.
-
-TSDL meta-data representation:
-
-  size = value;                                 /* value in bits */
-
-4.1.5 Integers
-
-Signed integers are represented in two's complement. Integer alignment,
-size, signedness and byte ordering are defined in the TSDL meta-data.
-Integers aligned on byte size (8-bit) and with length multiple of byte
-size (8-bit) correspond to the C99 standard integers. In addition,
-integers with alignment and/or size that are _not_ a multiple of the
-byte size are permitted; these correspond to the C99 standard bitfields,
-with the added specification that the CTF integer bitfields have a fixed
-binary representation. A MIT-licensed reference implementation of the
-CTF portable bitfields is available at:
-
-  http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h
-
-Binary representation of integers:
-
-- On little and big endian:
-  - Within a byte, high bits correspond to the integer's high bits, and low
-    bits correspond to its low bits.
-- On little endian:
-  - Integers spanning multiple bytes are placed from the least significant
-    byte to the most significant byte.
-  - Consecutive integers are placed from lower bits to higher bits (even within
-    a byte).
-- On big endian:
-  - Integers spanning multiple bytes are placed from the most significant
-    byte to the least significant byte.
-  - Consecutive integers are placed from higher bits to lower bits (even within
-    a byte).
-
-This binary representation is derived from the bitfield implementation in GCC
-for little and big endian. However, contrary to what GCC does, integers can
-cross unit boundaries (no padding is required). Padding can be explicitly
-added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed.
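The binary representation above can be sketched with two bit-by-bit readers. This is an unoptimized illustration of our reading of the layout rules, not the reference bitfield.h implementation; the function names read_bits_le and read_bits_be are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Little-endian layout: stream bit i lives in byte i/8 at bit position
 * i%8 (0 = least significant bit of the byte), and the first stream bit
 * of an integer is its least significant bit.
 */
static uint64_t read_bits_le(const uint8_t *buf, size_t off, unsigned size)
{
    uint64_t v = 0;

    assert(size >= 1 && size <= 64);
    for (unsigned k = 0; k < size; k++) {
        size_t i = off + k;
        uint64_t bit = (buf[i / 8] >> (i % 8)) & 1u;

        v |= bit << k;
    }
    return v;
}

/*
 * Big-endian layout: stream bit i lives in byte i/8 at bit position
 * 7 - i%8, and the first stream bit of an integer is its most
 * significant bit.
 */
static uint64_t read_bits_be(const uint8_t *buf, size_t off, unsigned size)
{
    uint64_t v = 0;

    assert(size >= 1 && size <= 64);
    for (unsigned k = 0; k < size; k++) {
        size_t i = off + k;
        uint64_t bit = (buf[i / 8] >> (7 - i % 8)) & 1u;

        v = (v << 1) | bit;
    }
    return v;
}
```

Both readers accept integers that cross byte boundaries, which is the point where the CTF layout departs from the GCC bitfield layout.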
-
-TSDL meta-data representation:
-
-  integer {
-    signed = true OR false;                     /* default false */
-    byte_order = native OR network OR be OR le; /* default native */
-    size = value;                               /* value in bits, no default */
-    align = value;                              /* value in bits */
-    /* base used for pretty-printing output, default: decimal. */
-    base = decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex OR x OR X OR p OR 16
-           OR octal OR oct OR o OR 8 OR binary OR b OR 2;
-    /* character encoding, default: none */
-    encoding = none OR UTF8 OR ASCII;
-  }
-
-Example of type inheritance (creation of a uint32_t named type):
-
-typealias integer {
-  size = 32;
-  signed = false;
-  align = 32;
-} := uint32_t;
-
-Definition of a named 5-bit signed bitfield:
-
-typealias integer {
-  size = 5;
-  signed = true;
-  align = 1;
-} := int5_t;
-
-The character encoding field can be used to specify that the integer
-must be printed as a text character when read. e.g.:
-
-typealias integer {
-  size = 8;
-  align = 8;
-  signed = false;
-  encoding = UTF8;
-} := utf_char;
-
-
-4.1.6 GNU/C bitfields
-
-The GNU/C bitfields follow closely the integer representation, with a
-particularity on alignment: if a bitfield cannot fit in the current unit, the
-unit is padded and the bitfield starts at the following unit. The unit size is
-defined by the size of the type "unit_type".
-
-TSDL meta-data representation:
-
-  unit_type name:size;
-
-As an example, the following structure declared in C compiled by GCC:
-
-struct example {
-  short a:12;
-  short b:5;
-};
-
-The example structure is aligned on the largest element (short). The second
-bitfield would be aligned on the next unit boundary, because it would not fit in
-the current unit.
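The unit-padding rule described above can be sketched as follows. The helper name bitfield_start is hypothetical; it only computes where a bitfield would begin and assumes the bitfield is no wider than its unit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the bit offset at which a bitfield of `width` bits starts,
 * given the current bit offset and the unit size in bits. If the
 * bitfield does not fit in what remains of the current unit, the unit
 * is padded and the bitfield starts at the following unit.
 * (bitfield_start is a hypothetical name used for illustration only.)
 */
static uint64_t bitfield_start(uint64_t cur_bit, unsigned width,
                               unsigned unit_bits)
{
    uint64_t used;

    assert(unit_bits != 0 && width <= unit_bits);
    used = cur_bit % unit_bits;        /* bits already used in this unit */
    if (used + width > unit_bits)
        cur_bit += unit_bits - used;   /* pad up to the next unit */
    return cur_bit;
}
```

For the structure above, assuming a 16-bit short, "a" starts at bit 0 and "b" (5 bits, with only 4 bits left in the unit) starts at bit 16.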
-
-4.1.7 Floating point
-
-The floating point values byte ordering is defined in the TSDL meta-data.
-
-Floating point values follow the IEEE 754-2008 standard interchange formats.
-The description of floating point values includes the exponent and mantissa
-size in bits. Some requirements are imposed on the floating point values:
-
-- FLT_RADIX must be 2.
-- mant_dig is the number of digits represented in the mantissa. It is specified
-  by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and
-  LDBL_MANT_DIG as defined by <float.h>.
-- exp_dig is the number of digits represented in the exponent. Given that
-  mant_dig is one bit more than its actual size in bits (leading 1 is not
-  needed) and also given that the sign bit always takes one bit, exp_dig can be
-  specified as:
-
-  - sizeof(float) * CHAR_BIT - FLT_MANT_DIG
-  - sizeof(double) * CHAR_BIT - DBL_MANT_DIG
-  - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG
-
-TSDL meta-data representation:
-
-floating_point {
-  exp_dig = value;
-  mant_dig = value;
-  byte_order = native OR network OR be OR le;
-  align = value;
-}
-
-Example of type inheritance:
-
-typealias floating_point {
-  exp_dig = 8;         /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
-  mant_dig = 24;       /* FLT_MANT_DIG */
-  byte_order = native;
-  align = 32;
-} := float;
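The exp_dig formula can be evaluated directly from <float.h>. The sketch below assumes a platform where float and double use the IEEE 754 binary32 and binary64 formats; the function names are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stddef.h>

/* exp_dig for "float": total size in bits minus FLT_MANT_DIG. */
static size_t float_exp_dig(void)
{
    assert(FLT_RADIX == 2);   /* requirement stated by the text above */
    return sizeof(float) * CHAR_BIT - FLT_MANT_DIG;
}

/* exp_dig for "double": total size in bits minus DBL_MANT_DIG. */
static size_t double_exp_dig(void)
{
    assert(FLT_RADIX == 2);
    return sizeof(double) * CHAR_BIT - DBL_MANT_DIG;
}
```

On such a platform this yields exp_dig = 8 with mant_dig = 24 for float, matching the typealias above, and exp_dig = 11 with mant_dig = 53 for double.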
-
-TODO: define NaN, +inf, -inf behavior.
-
-Bit-packed, byte-packed or larger alignments can be used for floating
-point values, similarly to integers.
-
-4.1.8 Enumerations
-
-Enumerations are a mapping between an integer type and a table of strings. The
-numerical representation of the enumeration follows the integer type specified
-by the meta-data. The enumeration mapping table is detailed in the enumeration
-description within the meta-data. The mapping table maps inclusive value
-ranges (or single values) to strings. Instead of being limited to simple
-"value -> string" mappings, these enumerations map
-"[ start_value ... end_value ] -> string", which map inclusive ranges of
-values to strings.  An enumeration from the C language can be represented in
-this format by having the same start_value and end_value for each element, which
-is in fact a range of size 1. This single-value range is supported without
-repeating the start and end values with the value = string declaration.
-
-enum name : integer_type {
-  somestring          = start_value1 ... end_value1,
-  "other string"      = start_value2 ... end_value2,
-  yet_another_string,  /* will be assigned to end_value2 + 1 */
-  "some other string" = value,
-  ...
-};
-
-If the values are omitted, the enumeration starts at 0 and increments by 1 for
-each entry:
-
-enum name : unsigned int {
-  ZERO,
-  ONE,
-  TWO,
-  TEN = 10,
-  ELEVEN,
-};
-
-Overlapping ranges within a single enumeration are implementation defined.
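A reader could resolve an enumeration value with a table of inclusive ranges, as in the sketch below. The struct and function names are hypothetical, and returning the first matching range is only one possible choice, given that the handling of overlapping ranges is implementation defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One "[ start ... end ] -> string" mapping; both bounds are inclusive. */
struct enum_range {
    int64_t start;
    int64_t end;
    const char *label;
};

/* Illustrative table: a range of 31 values and a single-value range. */
static const struct enum_range example_ranges[] = {
    { 0, 30, "compact" },
    { 31, 31, "extended" },
};

/* Return the label of the first range containing v, or NULL if none. */
static const char *enum_lookup(const struct enum_range *tab, size_t n,
                               int64_t v)
{
    for (size_t i = 0; i < n; i++) {
        assert(tab[i].start <= tab[i].end);
        if (v >= tab[i].start && v <= tab[i].end)
            return tab[i].label;
    }
    return NULL;
}
```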
-
-A nameless enumeration can be declared as a field type or as part of a typedef:
-
-enum : integer_type {
-  ...
-}
-
-Enumerations omitting the container type ": integer_type" use the "int"
-type (for compatibility with C99). The "int" type must be previously
-declared. E.g.:
-
-typealias integer { size = 32; align = 32; signed = true } := int;
-
-enum {
-  ...
-}
-
-
-4.2 Compound types
-
-Compound types are aggregations of type declarations. Compound types include
-structures, variants, arrays, sequences, and strings.
-
-4.2.1 Structures
-
-Structures are aligned on the largest alignment required by basic types
-contained within the structure. (This follows the ISO/C standard for structures)
-
-TSDL meta-data representation of a named structure:
-
-struct name {
-  field_type field_name;
-  field_type field_name;
-  ...
-}; 
-
-Example:
-
-struct example {
-  integer {                       /* Nameless type */
-    size = 16;
-    signed = true;
-    align = 16;
-  } first_field_name;
-  uint64_t second_field_name;  /* Named type declared in the meta-data */
-};
-
-The fields are placed in a sequence next to each other. They each possess a
-field name, which is a unique identifier within the structure.
-
-A nameless structure can be declared as a field type or as part of a typedef:
-
-struct {
-  ...
-}
-
-Alignment for a structure compound type can be forced to a minimum value
-by adding an "align" specifier after the declaration of a structure
-body. This attribute is read as: align(value). The value is specified in
-bits. The structure will be aligned on the maximum value between this
-attribute and the alignment required by the basic types contained within
-the structure. e.g.
-
-struct {
-  ...
-} align(32)
-
-4.2.2 Variants (Discriminated/Tagged Unions)
-
-A CTF variant is a selection between different types. A CTF variant must
-always be defined within the scope of a structure or within fields
-contained within a structure (defined recursively). A "tag" enumeration
-field must appear in either the same lexical scope, prior to the variant
-field (in field declaration order), in an upper lexical scope (see
-Section 7.3.1), or in an upper dynamic scope (see Section 7.3.2). The
-type selection is indicated by the mapping from the enumeration value to
-the string used as variant type selector. The field to use as tag is
-specified by the "tag_field", specified between "< >" after the
-"variant" keyword for unnamed variants, and after "variant name" for
-named variants.
-
-The alignment of the variant is the alignment of the type as selected by the tag
-value for the specific instance of the variant. The alignment of the type
-containing the variant is independent of the variant alignment.  The size of the
-variant is the size as selected by the tag value for the specific instance of
-the variant.
-
-A named variant declaration followed by its definition within a structure
-declaration:
-
-variant name {
-  field_type sel1;
-  field_type sel2;
-  field_type sel3;
-  ...
-};
-
-struct {
-  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
-  ...
-  variant name <tag_field> v;
-}
-
-An unnamed variant definition within a structure is expressed by the following
-TSDL meta-data:
-
-struct {
-  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
-  ...
-  variant <tag_field> {
-    field_type sel1;
-    field_type sel2;
-    field_type sel3;
-    ...
-  } v;
-}
-
-Example of a named variant within a sequence that refers to a single tag field:
-
-variant example {
-  uint32_t a;
-  uint64_t b;
-  short c;
-};
-
-struct {
-  enum : uint2_t { a, b, c } choice;
-  unsigned int seqlen;
-  variant example <choice> v[seqlen];
-}
-
-Example of an unnamed variant:
-
-struct {
-  enum : uint2_t { a, b, c, d } choice;
-  /* Unrelated fields can be added between the variant and its tag */
-  int32_t somevalue;
-  variant <choice> {
-    uint32_t a;
-    uint64_t b;
-    short c;
-    struct {
-      unsigned int field1;
-      uint64_t field2;
-    } d;
-  } s;
-}
-
-Example of an unnamed variant within an array:
-
-struct {
-  enum : uint2_t { a, b, c } choice;
-  variant <choice> {
-    uint32_t a;
-    uint64_t b;
-    short c;
-  } v[10];
-}
-
-Example of a variant type definition within a structure, where the defined type
-is then declared within an array of structures. This variant refers to a tag
-located in an upper lexical scope. This example clearly shows that a variant
-type definition referring to the tag "x" uses the closest preceding field from
-the lexical scope of the type definition.
-
-struct {
-  enum : uint2_t { a, b, c, d } x;
-
-  /*
-   * "x" refers to the preceding "x" enumeration in the lexical scope of
-   * the type definition.
-   */
-  typedef variant <x> {
-    uint32_t a;
-    uint64_t b;
-    short c;
-  } example_variant;
-
-  struct {
-    enum : int { x, y, z } x;  /* This enumeration is not used by "v". */
-    /* "v" uses the "enum : uint2_t { a, b, c, d }" tag. */
-    example_variant v;
-  } a[10];
-}
-
-4.2.3 Arrays
-
-Arrays are fixed-length. Their length is declared in the type
-declaration within the meta-data. They contain an array of "inner type"
-elements, which can refer to any type not containing the type of the
-array being declared (no circular dependency). The length is the number
-of elements in an array.
-
-TSDL meta-data representation of a named array:
-
-typedef elem_type name[length];
-
-A nameless array can be declared as a field type within a structure, e.g.:
-
-  uint8_t field_name[10];
-
-Arrays are always aligned on their element alignment requirement.
-
-4.2.4 Sequences
-
-Sequences are dynamically-sized arrays. They refer to a "length"
-unsigned integer field, which must appear in either the same lexical scope,
-prior to the sequence field (in field declaration order), in an upper
-lexical scope (see Section 7.3.1), or in an upper dynamic scope (see
-Section 7.3.2). This length field represents the number of elements in
-the sequence. The sequence per se is an array of "inner type" elements.
-
-TSDL meta-data representation for a sequence type definition:
-
-struct {
-  unsigned int length_field;
-  typedef elem_type typename[length_field];
-  typename seq_field_name;
-}
-
-A sequence can also be declared as a field type, e.g.:
-
-struct {
-  unsigned int length_field;
-  long seq_field_name[length_field];
-}
-
-Multiple sequences can refer to the same length field, and these length
-fields can be in a different upper dynamic scope:
-
-e.g., assuming the stream.event.header defines:
-
-stream {
-  ...
-  id = 1;
-  event.header := struct {
-    uint16_t seq_len;
-  };
-};
-
-event {
-  ...
-  stream_id = 1;
-  fields := struct {
-    long seq_a[stream.event.header.seq_len];
-    char seq_b[stream.event.header.seq_len];
-  };
-};
-
-The sequence elements follow the "array" specifications.
-
-4.2.5 Strings
-
-Strings are an array of bytes of variable size and are terminated by a '\0'
-"NUL" character.  Their encoding is described in the TSDL meta-data. In
-absence of encoding attribute information, the default encoding is
-UTF-8.
-
-TSDL meta-data representation of a named string type:
-
-typealias string {
-  encoding = UTF8 OR ASCII;
-} := name;
-
-A nameless string type can be declared as a field type:
-
-string field_name;     /* Use default UTF8 encoding */
-
-Strings are always aligned on byte size.
-
-5. Event Packet Header
-
-The event packet header consists of two parts: the "event packet header"
-is the same for all streams of a trace. The second part, the "event
-packet context", is described on a per-stream basis. Both are described
-in the TSDL meta-data. The packets are aligned on architecture-page-sized
-addresses.
-
-Event packet header (all fields are optional, specified by TSDL meta-data):
-
-- Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a
-  CTF packet. This magic number is optional, but when present, it should
-  come at the very beginning of the packet.
-- Trace UUID, used to ensure the event packet matches the meta-data used.
-  (note: we cannot use a meta-data checksum in every case instead of a
-   UUID because meta-data can be appended to while tracing is active)
-  This field is optional.
-- Stream ID, used as reference to stream description in meta-data.
-  This field is optional if there is only one stream description in the
-  meta-data, but becomes required if there is more than one stream in
-  the TSDL meta-data description.
-
-Event packet context (all fields are optional, specified by TSDL meta-data):
-
-- Event packet content size (in bytes).
-- Event packet size (in bytes, includes padding).
-- Event packet content checksum (optional). Checksum excludes the event packet
-  header.
-- Per-stream event packet sequence count (to deal with UDP packet loss). The
-  number of significant sequence counter bits should also be present, so
-  wrap-arounds are dealt with correctly.
-- Time-stamp at the beginning and time-stamp at the end of the event packet.
-  Both timestamps are written in the packet header, but sampled respectively
-  while (or before) writing the first event and while (or after) writing the
-  last event in the packet. The inclusive range between these timestamps should
-  include all event timestamps assigned to events contained within the packet.
-- Events discarded count
-  - Snapshot of a per-stream free-running counter, counting the number of
-    events discarded that were supposed to be written in the stream prior to
-    the first event in the event packet.
-    * Note: producer-consumer buffer full condition should fill the current
-            event packet with padding so we know exactly where events have been
-            discarded.
-- Lossless compression scheme used for the event packet content. Applied
-  directly to raw data. New types of compression can be added in following
-  versions of the format.
-  0: no compression scheme
-  1: bzip2
-  2: gzip
-  3: xz
-- Cypher used for the event packet content. Applied after compression.
-  0: no encryption
-  1: AES
-- Checksum scheme used for the event packet content. Applied after encryption.
-  0: no checksum
-  1: md5
-  2: sha1
-  3: crc32
-
-5.1 Event Packet Header Description
-
-The event packet header layout is indicated by the trace packet.header
-field. Here is a recommended structure type for the packet header with
-the fields typically expected (although these fields are each optional):
-
-struct event_packet_header {
-  uint32_t magic;
-  uint8_t  uuid[16];
-  uint32_t stream_id;
-};
-
-trace {
-  ...
-  packet.header := struct event_packet_header;
-};
-
-If the magic number is not present, tools such as "file" will have no
-means to discover the file type.
-
-If the uuid is not present, no validation that the meta-data actually
-corresponds to the stream is performed.
-
-If the stream_id packet header field is missing, the trace can only
-contain a single stream. Its "id" field can be left out, and its events
-don't need to declare a "stream_id" field.
-
-
-5.2 Event Packet Context Description
-
-Event packet context example. These are declared within the stream declaration
-in the meta-data. All these fields are optional. If the packet size field is
-missing, the whole stream only contains a single packet. If the content
-size field is missing, the packet is filled (no padding). The content
-and packet sizes include all headers.
-
-An example event packet context type:
-
-struct event_packet_context {
-  uint64_t timestamp_begin;
-  uint64_t timestamp_end;
-  uint32_t checksum;
-  uint32_t stream_packet_count;
-  uint32_t events_discarded;
-  uint32_t cpu_id;
-  uint32_t/uint16_t content_size;
-  uint32_t/uint16_t packet_size;
-  uint8_t  stream_packet_count_bits;   /* Significant counter bits */
-  uint8_t  compression_scheme;
-  uint8_t  encryption_scheme;
-  uint8_t  checksum_scheme;
-};
-
-
-6. Event Structure
-
-The overall structure of an event is:
-
-1 - Stream Packet Context (as specified by the stream meta-data)
- 2 - Event Header (as specified by the stream meta-data)
-  3 - Stream Event Context (as specified by the stream meta-data)
-   4 - Event Context (as specified by the event meta-data)
-    5 - Event Payload (as specified by the event meta-data)
-
-This structure defines an implicit dynamic scoping, where variants
-located in inner structures (those with a higher number in the listing
-above) can refer to the fields of outer structures (with lower number in
-the listing above). See Section 7.3 TSDL Scopes for more detail.
-
-6.1 Event Header
-
-Event headers can be described within the meta-data. We hereby propose, as an
-example, two types of event headers. Type 1 accommodates streams with fewer than
-31 event IDs. Type 2 accommodates streams with 31 or more event IDs.
-
-One major factor can vary between streams: the number of event IDs assigned to
-a stream. Luckily, this information tends to stay relatively constant (modulo
-event registration while trace is being recorded), so we can specify different
-representations for streams containing few event IDs and streams containing
-many event IDs, so we end up representing the event ID and time-stamp as
-densely as possible in each case.
-
-The header is extended in the rare occasions where the information cannot be
-represented in the ranges available in the standard event header. Extended
-headers are also used in the rare occasions where the data required for a field
-could not be collected: the flag corresponding to the missing field within the
-missing_fields array is then set to 1.
-
-Types uintX_t represent an X-bit unsigned integer, as declared with
-either:
-
-  typealias integer { size = X; align = X; signed = false } := uintX_t;
-
-    or
-
-  typealias integer { size = X; align = 1; signed = false } := uintX_t;
-
-6.1.1 Type 1 - Few event IDs
-
-  - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture
-    preference).
-  - Native architecture byte ordering.
-  - For "compact" selection
-    - Fixed size: 32 bits.
-  - For "extended" selection
-    - Size depends on the architecture and variant alignment.
-
-struct event_header_1 {
-  /*
-   * id: range: 0 - 30.
-   * id 31 is reserved to indicate an extended header.
-   */
-  enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
-  variant <id> {
-    struct {
-      uint27_t timestamp;
-    } compact;
-    struct {
-      uint32_t id;                      /* 32-bit event IDs */
-      uint64_t timestamp;               /* 64-bit timestamps */
-    } extended;
-  } v;
-} align(32);   /* or align(8) */
-
-
-6.1.2 Type 2 - Many event IDs
-
-  - Aligned on 16-bit (or 8-bit if byte-packed, depending on the architecture
-    preference).
-  - Native architecture byte ordering.
-  - For "compact" selection
-    - Size depends on the architecture and variant alignment.
-  - For "extended" selection
-    - Size depends on the architecture and variant alignment.
-
-struct event_header_2 {
-  /*
-   * id: range: 0 - 65534.
-   * id 65535 is reserved to indicate an extended header.
-   */
-  enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
-  variant <id> {
-    struct {
-      uint32_t timestamp;
-    } compact;
-    struct {
-      uint32_t id;                      /* 32-bit event IDs */
-      uint64_t timestamp;               /* 64-bit timestamps */ 
-    } extended;
-  } v;
-} align(16);   /* or align(8) */
-
-
-6.2 Event Context
-
-The event context contains information relative to the current event.
-The choice and meaning of this information is specified by the TSDL
-stream and event meta-data descriptions. The stream context is applied
-to all events within the stream. The stream context structure follows
-the event header. The event context is applied to specific events. Its
-structure follows the stream context structure.
-
-An example of stream-level event context is to save the event payload size with
-each event, or to save the current PID with each event.  These are declared
-within the stream declaration within the meta-data:
-
-  stream {
-    ...
-    event.context := struct {
-        uint pid;
-        uint16_t payload_size;
-    };
-  };
-
-An example of event-specific event context is to declare a bitmap of missing
-fields, only appended after the stream event context if the extended event
-header is selected. NR_FIELDS is the number of fields within the event (a
-numeric value).
-
-  event {
-    context := struct {
-      variant <id> {
-        struct { } compact;
-        struct {
-          uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */
-        } extended;
-      } v;
-    };
-    ...
-  }
-
-6.3 Event Payload
-
-An event payload contains fields specific to a given event type. The fields
-belonging to an event type are described in the event-specific meta-data
-within a structure type.
-
-6.3.1 Padding
-
-No padding at the end of the event payload. This differs from the ISO/C standard
-for structures, but follows the CTF standard for structures. In a trace, even
-though it makes sense to align the beginning of a structure, it really makes no
-sense to add padding at the end of the structure, because structures are usually
-not followed by a structure of the same type.
-
-This trick can be done by adding a zero-length "end" field at the end of the C
-structures, and by using the offset of this field rather than using sizeof()
-when calculating the size of a structure (see Appendix "A. Helper macros").
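A minimal sketch of this trick follows. It uses a C99 flexible array member in place of the zero-length "end" field (GCC also accepts "char end[0]" as an extension); the structure and its field names are hypothetical and are not the helper macros of Appendix A.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event payload with a trailing "end" marker. */
struct example_event {
    uint64_t timestamp;
    uint8_t  cpu;
    char     end[];   /* flexible array member: occupies no storage */
};

/*
 * Payload size without trailing padding: the offset of "end" rather
 * than sizeof(), which would round up to the structure alignment.
 */
static size_t example_event_payload_size(void)
{
    size_t size = offsetof(struct example_event, end);

    assert(size <= sizeof(struct example_event));
    return size;
}
```

Here the payload size is 9 bytes, whereas sizeof(struct example_event) is typically 16 on platforms that align uint64_t on 8 bytes.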
-
-6.3.2 Alignment
-
-The event payload is aligned on the largest alignment required by types
-contained within the payload. (This follows the ISO/C standard for structures)
-
-
-7. Trace Stream Description Language (TSDL)
-
-The Trace Stream Description Language (TSDL) allows expression of the
-binary trace streams layout in a C99-like Domain Specific Language
-(DSL).
-
-
-7.1 Meta-data
-
-The trace stream layout description is located in the trace meta-data.
-The meta-data is itself located in a stream identified by its name:
-"metadata".
-
-The meta-data description can be expressed in two different formats:
-text-only and packet-based. The text-only description facilitates
-generation of meta-data and provides a convenient way to enter the
-meta-data information by hand. The packet-based meta-data provides the
-CTF stream packet facilities (checksumming, compression, encryption,
-network-readiness) for meta-data stream generated and transported by a
-tracer.
-
-The text-only meta-data file is a plain text TSDL description.
-
-The packet-based meta-data is made of "meta-data packets", which each
-start with a meta-data packet header. The packet-based meta-data
-description is detected by reading the magic number "0x75D11D57" at the
-beginning of the file. This magic number is also used to detect the
-endianness of the architecture by trying to read the CTF magic number
-and its counterpart in reversed endianness. The events within the
-meta-data stream have no event header nor event context. Each event only
-contains a "sequence" payload, which is a sequence of bits using the
-"trace.packet.header.content_size" field as a placeholder for its length
-(the packet header size should be subtracted). The formatting of this
-sequence of bits is a plain-text representation of the TSDL description.
-Each meta-data packet starts with a special packet header, specific to
-the meta-data stream, which contains, exactly:
-
-struct metadata_packet_header {
-  uint32_t magic;                      /* 0x75D11D57 */
-  uint8_t  uuid[16];                   /* Universally Unique Identifier */
-  uint32_t checksum;                   /* 0 if unused */
-  uint32_t content_size;               /* in bits */
-  uint32_t packet_size;                /* in bits */
-  uint8_t  compression_scheme;         /* 0 if unused */
-  uint8_t  encryption_scheme;          /* 0 if unused */
-  uint8_t  checksum_scheme;            /* 0 if unused */
-};
-
-The packet-based meta-data can be converted to text-only meta-data by
-concatenating all the strings it contains.
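The magic-number check described above can be sketched in C as follows. This is only an illustration, not the reference implementation: the names classify_metadata, swap32 and the metadata_kind enumerators are hypothetical, and the sketch assumes the first four bytes of the file are available in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define METADATA_MAGIC 0x75D11D57u      /* magic number given in the text */

/* Hypothetical result codes, introduced for this sketch only. */
enum metadata_kind {
    METADATA_TEXT,              /* no magic found: plain-text TSDL */
    METADATA_PACKET_NATIVE,     /* magic matches in reader byte order */
    METADATA_PACKET_SWAPPED     /* magic matches once byte-swapped */
};

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Classify a meta-data file from its first bytes. */
enum metadata_kind classify_metadata(const unsigned char *buf, size_t len)
{
    uint32_t magic;

    assert(buf != NULL);
    if (len < sizeof(magic))
        return METADATA_TEXT;
    memcpy(&magic, buf, sizeof(magic)); /* avoids unaligned access */
    if (magic == METADATA_MAGIC)
        return METADATA_PACKET_NATIVE;
    if (swap32(magic) == METADATA_MAGIC)
        return METADATA_PACKET_SWAPPED;
    return METADATA_TEXT;
}
```

A reader that gets METADATA_PACKET_SWAPPED would then byte-swap the remaining multi-byte header fields before using them.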
-
-In the textual representation of the meta-data, the text contained
-within "/*" and "*/", as well as within "//" and end of line, are
-treated as comments.  Boolean values can be represented as true, TRUE,
-or 1 for true, and false, FALSE, or 0 for false. Within the string-based
-meta-data description, the trace UUID is represented as a string of
-hexadecimal digits and dashes "-". In the event packet header, the trace
-UUID is represented as an array of bytes.
-
-
-7.2 Declaration vs Definition
-
-A declaration associates a layout to a type, without specifying where
-this type is located in the event structure hierarchy (see Section 6).
-This therefore includes typedef, typealias, as well as all type
-specifiers. In certain circumstances (typedef, structure field and
-variant field), a declaration is followed by a declarator, which specifies
-the newly defined type name (for typedef), or the field name (for
-declarations located within structures and variants). Arrays and sequences,
-declared with square brackets ("[" "]"), are part of the declarator,
-similarly to C99. The enumeration base type is specified by
-": enum_base", which is part of the type specifier. The variant tag
-name, specified between "<" ">", is also part of the type specifier.
-
-A definition associates a type to a location in the event structure
-hierarchy (see Section 6). This association is denoted by ":=", as shown
-in Section 7.3.
-
-
-7.3 TSDL Scopes
-
-TSDL uses two different types of scoping: a lexical scope is used for
-declarations and type definitions, and a dynamic scope is used for
-variant references to tag fields and for sequence references to length
-fields.
-
-7.3.1 Lexical Scope
-
-Each of "trace", "stream", "event", "struct" and "variant" have their own
-nestable declaration scope, within which types can be declared using "typedef"
-and "typealias". A root declaration scope also contains all declarations
-located outside of any of the aforementioned declarations. An inner
-declaration scope can refer to a type declared within its container
-lexical scope prior to the inner declaration scope. Redefinition of a
-typedef or typealias is not valid, although hiding an upper scope
-typedef or typealias is allowed within a sub-scope.
-
-7.3.2 Dynamic Scope
-
-A dynamic scope consists of the lexical scope augmented with the
-implicit event structure definition hierarchy presented in Section 6.
-The dynamic scope is used for variant tag and sequence length
-definitions. It is used at definition time to look up the location of
-the tag field associated with a variant, and to look up the location
-of the length field associated with a sequence.
-
-Therefore, variants (or sequences) in lower levels in the dynamic scope
-(e.g. event context) can refer to a tag (or length) field located in
-upper levels (e.g. in the event header) by specifying, in this case, the
-associated tag with <header.field_name>. This allows, for instance, the
-event context to define a variant referring to the "id" field of the
-event header as selector.
-
-The target dynamic scope must be specified explicitly when referring to
-a field outside of the local static scope. The dynamic scope prefixes
-are thus:
-
- - Trace Packet Header: <trace.packet.header. >,
- - Stream Packet Context: <stream.packet.context. >,
- - Event Header: <stream.event.header. >,
- - Stream Event Context: <stream.event.context. >,
- - Event Context: <event.context. >,
- - Event Payload: <event.fields. >.
-
-Multiple declarations of the same field name within a single scope are
-not valid. It is however valid to re-use the same field name in
-different scopes. There is no possible conflict, because the dynamic
-scope must be specified when a variant refers to a tag field located in
-a different dynamic scope.
-
-The information available in the dynamic scopes can be thought of as the
-current tracing context. At trace production, information about the
-current context is saved into the specified scope field levels. At trace
-consumption, for each event, the current trace context is therefore
-readable by accessing the upper dynamic scopes.
-
-
-7.4 TSDL Examples
-
-The grammar representing the TSDL meta-data is presented in Appendix C.
-TSDL Grammar. This section offers lighter reading: examples of TSDL
-meta-data, with template values.
-
-The stream "id" can be left out if there is only one stream in the
-trace. The event "id" field can be left out if there is only one event
-in a stream.
-
-trace {
-  major = value;                               /* Trace format version */
-  minor = value;
-  uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa";       /* Trace UUID */
-  byte_order = be OR le;                       /* Endianness (required) */
-  packet.header := struct {
-    uint32_t magic;
-    uint8_t  uuid[16];
-    uint32_t stream_id;
-  };
-};
-
-stream {
-  id = stream_id;
-  /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
-  event.header := event_header_1 OR event_header_2;
-  event.context := struct {
-    ...
-  };
-  packet.context := struct {
-    ...
-  };
-};
-
-event {
-  name = event_name;
-  id = value;                  /* Numeric identifier within the stream */
-  stream_id = stream_id;
-  context := struct {
-    ...
-  };
-  fields := struct {
-    ...
-  };
-};
-
-/* More detail on types in section 4. Types */
-
-/*
- * Named types:
- *
- * Type declarations behave similarly to the C standard.
- */
-
-typedef aliased_type_specifiers new_type_declarators;
-
-/* e.g.: typedef struct example new_type_name[10]; */
-
-/*
- * typealias
- *
- * The "typealias" declaration can be used to give a name (including
- * pointer declarator specifier) to a type. It should also be used to
- * map basic C types (float, int, unsigned long, ...) to a CTF type.
- * Typealias is a superset of "typedef": it also allows assignment of a
- * simple variable identifier to a type.
- */
-
-typealias type_class {
-  ...
-} := type_specifiers type_declarator;
-
-/*
- * e.g.: 
- * typealias integer {
- *   size = 32;
- *   align = 32;
- *   signed = false;
- * } := struct page *;
- *
- * typealias integer {
- *  size = 32;
- *  align = 32;
- *  signed = true;
- * } := int;
- */
-
-struct name {
-  ...
-};
-
-variant name {
-  ...
-};
-
-enum name : integer_type {
-  ...
-};
-
-
-/*
- * Unnamed types, contained within compound type fields, typedef or typealias.
- */
-
-struct {
-  ...
-}
-
-struct {
-  ...
-} align(value)
-
-variant {
-  ...
-}
-
-enum : integer_type {
-  ...
-}
-
-typedef type new_type[length];
-
-struct {
-  type field_name[length];
-}
-
-typedef type new_type[length_type];
-
-struct {
-  type field_name[length_type];
-}
-
-integer {
-  ...
-}
-
-floating_point {
-  ...
-}
-
-struct {
-  integer_type field_name:size;                /* GNU/C bitfield */
-}
-
-struct {
-  string field_name;
-}
-
-
-A. Helper macros
-
-The two following macros keep track of the size of a GNU/C structure without
-padding at the end by placing HEADER_END as the last field. A one byte end field
-is used for C90 compatibility (C99 flexible arrays could be used here). Note
-that this does not affect the effective structure size, which should always be
-calculated with the header_sizeof() helper.
-
-#define HEADER_END             char end_field
-#define header_sizeof(type)    offsetof(typeof(type), end_field)
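As a usage sketch of the two macros above: the structure example_header and its fields are hypothetical, and header_sizeof() is written here without typeof() so the block compiles as ISO C (the original definition relies on the GNU typeof extension).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_END              char end_field
#define header_sizeof(type)     offsetof(type, end_field)

/* Hypothetical header: 4-byte timestamp followed by a 1-byte ID. */
struct example_header {
    uint32_t timestamp;
    uint8_t  id;
    HEADER_END;                 /* marks the end of the useful data */
};

/* Size of the header without trailing padding. */
size_t example_header_size(void)
{
    /* sizeof() also counts end_field and any tail padding. */
    assert(header_sizeof(struct example_header)
           < sizeof(struct example_header));
    return header_sizeof(struct example_header);
}
```

On common ABIs sizeof(struct example_header) is 8 because of tail padding, while header_sizeof() gives 5, the number of bytes actually written to the trace.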
-
-
-B. Stream Header Rationale
-
-An event stream is divided into contiguous event packets of variable size. These
-subdivisions allow the trace analyzer to perform a fast binary search by time
-within the stream (typically indexing only the event packet headers)
-without reading the whole stream. These subdivisions have a variable size to
-eliminate the need to transfer the event packet padding when partially filled
-event packets must be sent when streaming a trace for live viewing/analysis.
-An event packet can contain a certain amount of padding at the end. Dividing
-streams into event packets is also useful for network streaming over UDP and
-flight recorder mode tracing (a whole event packet can be swapped out of the
-buffer atomically for reading).
-
-The stream header is repeated at the beginning of each event packet to allow
-flexibility in terms of:
-
-  - streaming support,
-  - allowing arbitrary buffers to be discarded without making the trace
-    unreadable,
-  - handling UDP packet loss, either by dealing with missing event packets
-    or by asking for re-transmission,
-  - transparently supporting flight recorder mode,
-  - transparently supporting crash dumps.
-
-
-C. TSDL Grammar
-
-/*
- * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
- *
- * Inspired from the C99 grammar:
- * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
- * and c++1x grammar (draft)
- * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
- *
- * Specialized for CTF needs by including only constants and declarations from
- * C99 (excluding function declarations), and by adding support for variants,
- * sequences and CTF-specific specifiers. The semantics of enumeration
- * container types are inspired by the c++1x enum-base.
- */
-
-1) Lexical grammar
-
-1.1) Lexical elements
-
-token:
-       keyword
-       identifier
-       constant
-       string-literal
-       punctuator
-
-1.2) Keywords
-
-keyword: is one of
-
-align
-const
-char
-double
-enum
-event
-floating_point
-float
-integer
-int
-long
-short
-signed
-stream
-string
-struct
-trace
-typealias
-typedef
-unsigned
-variant
-void
-_Bool
-_Complex
-_Imaginary
-
-
-1.3) Identifiers
-
-identifier:
-       identifier-nondigit
-       identifier identifier-nondigit
-       identifier digit
-
-identifier-nondigit:
-       nondigit
-       universal-character-name
-       any other implementation-defined characters
-
-nondigit:
-       _
-       [a-zA-Z]        /* regular expression */
-
-digit:
-       [0-9]           /* regular expression */
-
-1.4) Universal character names
-
-universal-character-name:
-       \u hex-quad
-       \U hex-quad hex-quad
-
-hex-quad:
-       hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
-
-1.5) Constants
-
-constant:
-       integer-constant
-       enumeration-constant
-       character-constant
-
-integer-constant:
-       decimal-constant integer-suffix-opt
-       octal-constant integer-suffix-opt
-       hexadecimal-constant integer-suffix-opt
-
-decimal-constant:
-       nonzero-digit
-       decimal-constant digit
-
-octal-constant:
-       0
-       octal-constant octal-digit
-
-hexadecimal-constant:
-       hexadecimal-prefix hexadecimal-digit
-       hexadecimal-constant hexadecimal-digit
-
-hexadecimal-prefix:
-       0x
-       0X
-
-nonzero-digit:
-       [1-9]
-
-integer-suffix:
-       unsigned-suffix long-suffix-opt
-       unsigned-suffix long-long-suffix
-       long-suffix unsigned-suffix-opt
-       long-long-suffix unsigned-suffix-opt
-
-unsigned-suffix:
-       u
-       U
-
-long-suffix:
-       l
-       L
-
-long-long-suffix:
-       ll
-       LL
-
-enumeration-constant:
-       identifier
-       string-literal
-
-character-constant:
-       ' c-char-sequence '
-       L' c-char-sequence '
-
-c-char-sequence:
-       c-char
-       c-char-sequence c-char
-
-c-char:
-       any member of source charset except single-quote ('), backslash
-       (\), or new-line character.
-       escape-sequence
-
-escape-sequence:
-       simple-escape-sequence
-       octal-escape-sequence
-       hexadecimal-escape-sequence
-       universal-character-name
-
-simple-escape-sequence: one of
-       \' \" \? \\ \a \b \f \n \r \t \v
-
-octal-escape-sequence:
-       \ octal-digit
-       \ octal-digit octal-digit
-       \ octal-digit octal-digit octal-digit
-
-hexadecimal-escape-sequence:
-       \x hexadecimal-digit
-       hexadecimal-escape-sequence hexadecimal-digit
-
-1.6) String literals
-
-string-literal:
-       " s-char-sequence-opt "
-       L" s-char-sequence-opt "
-
-s-char-sequence:
-       s-char
-       s-char-sequence s-char
-
-s-char:
-       any member of source charset except double-quote ("), backslash
-       (\), or new-line character.
-       escape-sequence
-
-1.7) Punctuators
-
-punctuator: one of
-       [ ] ( ) { } . -> * + - < > : ; ... = ,
-
-
-2) Phrase structure grammar
-
-primary-expression:
-       identifier
-       constant
-       string-literal
-       ( unary-expression )
-
-postfix-expression:
-       primary-expression
-       postfix-expression [ unary-expression ]
-       postfix-expression . identifier
-       postfix-expression -> identifier
-
-unary-expression:
-       postfix-expression
-       unary-operator postfix-expression
-
-unary-operator: one of
-       + -
-
-assignment-operator:
-       =
-
-type-assignment-operator:
-       :=
-
-constant-expression-range:
-       unary-expression ... unary-expression
-
-2.2) Declarations:
-
-declaration:
-       declaration-specifiers declarator-list-opt ;
-       ctf-specifier ;
-
-declaration-specifiers:
-       storage-class-specifier declaration-specifiers-opt
-       type-specifier declaration-specifiers-opt
-       type-qualifier declaration-specifiers-opt
-
-declarator-list:
-       declarator
-       declarator-list , declarator
-
-abstract-declarator-list:
-       abstract-declarator
-       abstract-declarator-list , abstract-declarator
-
-storage-class-specifier:
-       typedef
-
-type-specifier:
-       void
-       char
-       short
-       int
-       long
-       float
-       double
-       signed
-       unsigned
-       _Bool
-       _Complex
-       _Imaginary
-       struct-specifier
-       variant-specifier
-       enum-specifier
-       typedef-name
-       ctf-type-specifier
-
-align-attribute:
-       align ( unary-expression )
-
-struct-specifier:
-       struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt
-       struct identifier align-attribute-opt
-
-struct-or-variant-declaration-list:
-       struct-or-variant-declaration
-       struct-or-variant-declaration-list struct-or-variant-declaration
-
-struct-or-variant-declaration:
-       specifier-qualifier-list struct-or-variant-declarator-list ;
-       declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ;
-       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ;
-       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ;
-
-specifier-qualifier-list:
-       type-specifier specifier-qualifier-list-opt
-       type-qualifier specifier-qualifier-list-opt
-
-struct-or-variant-declarator-list:
-       struct-or-variant-declarator
-       struct-or-variant-declarator-list , struct-or-variant-declarator
-
-struct-or-variant-declarator:
-       declarator
-       declarator-opt : unary-expression
-
-variant-specifier:
-       variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
-       variant identifier variant-tag
-
-variant-tag:
-       < identifier >
-
-enum-specifier:
-       enum identifier-opt { enumerator-list }
-       enum identifier-opt { enumerator-list , }
-       enum identifier
-       enum identifier-opt : declaration-specifiers { enumerator-list }
-       enum identifier-opt : declaration-specifiers { enumerator-list , }
-
-enumerator-list:
-       enumerator
-       enumerator-list , enumerator
-
-enumerator:
-       enumeration-constant
-       enumeration-constant assignment-operator unary-expression
-       enumeration-constant assignment-operator constant-expression-range
-
-type-qualifier:
-       const
-
-declarator:
-       pointer-opt direct-declarator
-
-direct-declarator:
-       identifier
-       ( declarator )
-       direct-declarator [ unary-expression ]
-
-abstract-declarator:
-       pointer-opt direct-abstract-declarator
-
-direct-abstract-declarator:
-       identifier-opt
-       ( abstract-declarator )
-       direct-abstract-declarator [ unary-expression ]
-       direct-abstract-declarator [ ]
-
-pointer:
-       * type-qualifier-list-opt
-       * type-qualifier-list-opt pointer
-
-type-qualifier-list:
-       type-qualifier
-       type-qualifier-list type-qualifier
-
-typedef-name:
-       identifier
-
-2.3) CTF-specific declarations
-
-ctf-specifier:
-       event { ctf-assignment-expression-list-opt }
-       stream { ctf-assignment-expression-list-opt }
-       trace { ctf-assignment-expression-list-opt }
-       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
-       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
-
-ctf-type-specifier:
-       floating_point { ctf-assignment-expression-list-opt }
-       integer { ctf-assignment-expression-list-opt }
-       string { ctf-assignment-expression-list-opt }
-       string
-
-ctf-assignment-expression-list:
-       ctf-assignment-expression ;
-       ctf-assignment-expression-list ctf-assignment-expression ;
-
-ctf-assignment-expression:
-       unary-expression assignment-operator unary-expression
-       unary-expression type-assignment-operator type-specifier
-       declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list
-       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
-       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
diff --git a/common-trace-format-specification.txt b/common-trace-format-specification.txt

new file mode 100644 (file)

index 0000000..86060da
--- /dev/null
+++ b/common-trace-format-specification.txt
@@ -0,0 +1,1607 @@
+Common Trace Format (CTF) Specification (v1.7)
+
+Mathieu Desnoyers, EfficiOS Inc.
+
+The goal of the present document is to specify a trace format that suits the
+needs of the embedded, telecom, high-performance and kernel communities. It is
+based on the Common Trace Format Requirements (v1.4) document. It is designed to
+allow traces to be natively generated by the Linux kernel, Linux user-space
+applications written in C/C++, and hardware components. One major element of
+CTF is the Trace Stream Description Language (TSDL), whose flexibility
+enables description of various binary trace stream layouts.
+
+The latest version of this document can be found at:
+
+  git tree:   git://git.efficios.com/ctf.git
+  gitweb:     http://git.efficios.com/?p=ctf.git
+
+A reference implementation of a library to read and write this trace format is
+being implemented within the BabelTrace project, a converter between trace
+formats. The development tree is available at:
+
+  git tree:   git://git.efficios.com/babeltrace.git
+  gitweb:     http://git.efficios.com/?p=babeltrace.git
+
+The CE Workgroup of the Linux Foundation, Ericsson, and EfficiOS have
+sponsored this work.
+
+
+Table of Contents
+
+1. Preliminary definitions
+2. High-level representation of a trace
+3. Event stream
+4. Types
+   4.1 Basic types
+       4.1.1 Type inheritance
+       4.1.2 Alignment
+       4.1.3 Byte order
+       4.1.4 Size
+       4.1.5 Integers
+       4.1.6 GNU/C bitfields
+       4.1.7 Floating point
+       4.1.8 Enumerations
+   4.2 Compound types
+       4.2.1 Structures
+       4.2.2 Variants (Discriminated/Tagged Unions)
+       4.2.3 Arrays
+       4.2.4 Sequences
+       4.2.5 Strings
+5. Event Packet Header
+   5.1 Event Packet Header Description
+   5.2 Event Packet Context Description
+6. Event Structure
+   6.1 Event Header
+       6.1.1 Type 1 - Few event IDs
+       6.1.2 Type 2 - Many event IDs
+   6.2 Event Context
+   6.3 Event Payload
+       6.3.1 Padding
+       6.3.2 Alignment
+7. Trace Stream Description Language (TSDL)
+   7.1 Meta-data
+   7.2 Declaration vs Definition
+   7.3 TSDL Scopes
+       7.3.1 Lexical Scope
+       7.3.2 Dynamic Scope
+   7.4 TSDL Examples
+
+
+1. Preliminary definitions
+
+  - Event Trace: An ordered sequence of events.
+  - Event Stream: An ordered sequence of events, containing a subset of the
+                  trace event types.
+  - Event Packet: A sequence of physically contiguous events within an event
+                  stream.
+  - Event: This is the basic entry in a trace. (aka: a trace record).
+    - An event identifier (ID) relates to the class (a type) of event within
+      an event stream.
+        e.g. event: irq_entry.
+    - An event (or event record) relates to a specific instance of an event
+      class.
+        e.g. event: irq_entry, at time X, on CPU Y
+  - Source Architecture: Architecture writing the trace.
+  - Reader Architecture: Architecture reading the trace.
+
+
+2. High-level representation of a trace
+
+A trace is divided into multiple event streams. Each event stream contains a
+subset of the trace event types.
+
+The final output of the trace, after its generation and optional transport over
+the network, is expected to be either on permanent or temporary storage in a
+virtual file system. Because each event stream is appended to while a trace is
+being recorded, each is associated with a separate file for output.  Therefore,
+a stored trace can be represented as a directory containing one file per stream.
+
+Meta-data description associated with the trace contains information on
+trace event types expressed in the Trace Stream Description Language
+(TSDL). This language describes:
+
+- Trace version.
+- Types available.
+- Per-trace event header description.
+- Per-stream event header description.
+- Per-stream event context description.
+- Per-event
+  - Event type to stream mapping.
+  - Event type to name mapping.
+  - Event type to ID mapping.
+  - Event context description.
+  - Event fields description.
+
+
+3. Event stream
+
+An event stream can be divided into contiguous event packets of variable
+size. An event packet can contain a certain amount of padding at the end.
+The stream header is repeated at the beginning of each event packet. The
+rationale for the event stream design choices is explained in Appendix B.
+Stream Header Rationale.
+
+The event stream header will therefore be referred to as the "event packet
+header" throughout the rest of this document.
+
+
+4. Types
+
+Types are organized as type classes. Each type class belongs to one of two
+kinds of types: basic types or compound types.
+
+4.1 Basic types
+
+A basic type is a scalar type, as described in this section. It includes
+integers, GNU/C bitfields, enumerations, and floating point values.
+
+4.1.1 Type inheritance
+
+Type specifications can be inherited to allow deriving types from a
+type class. For example, see the uint32_t named type derived from the "integer"
+type class below ("Integers" section). Types have a precise binary
+representation in the trace. A type class has methods to read and write these
+types, but must be derived into a type to be usable in an event field.
+
+4.1.2 Alignment
+
+We define "byte-packed" types as aligned on the byte size, namely 8-bit.
+We define "bit-packed" types as following on the next bit, as defined by the
+"Integers" section.
+
+Each basic type must specify its alignment, in bits. Examples of
+possible alignments are: bit-packed (align = 1), byte-packed (align =
+8), or word-aligned (e.g. align = 32 or align = 64). The choice depends
+on the architecture preference and compactness vs performance trade-offs
+of the implementation.  Architectures providing fast unaligned writes
+can byte-pack basic types to save space, aligning each type on byte
+boundaries (8-bit). Architectures with slow unaligned writes align types
+on specific alignment values. If no specific alignment is declared for a
+type, it is assumed to be bit-packed for integers whose size is not a
+multiple of 8 bits and for GCC bitfields. All other basic types are
+byte-packed by default. It is however recommended to always specify the
+alignment explicitly. Alignment values must be powers of two. Compound
+types are aligned as specified in their individual specification.
+
+TSDL meta-data attribute representation of a specific alignment:
+
+  align = value;                                /* value in bits */
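Because alignment values are powers of two expressed in bits, a reader or writer can compute the position of the next field with a mask. A minimal sketch, in which the function name align_bit_offset is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a bit offset up to the next multiple of align_bits.
 * align_bits must be a non-zero power of two, as required for
 * CTF alignment values.
 */
uint64_t align_bit_offset(uint64_t offset_bits, uint64_t align_bits)
{
    assert(align_bits != 0 && (align_bits & (align_bits - 1)) == 0);
    return (offset_bits + align_bits - 1) & ~(align_bits - 1);
}
```

With align = 1 (bit-packed) the offset is returned unchanged; with align = 8 an offset of 13 bits becomes 16; with align = 32 an offset of 17 bits becomes 32.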
+
+4.1.3 Byte order
+
+By default, the native endianness of the source architecture is used.
+Byte order can be overridden for a basic type by specifying a "byte_order"
+attribute. A typical use case is to specify the network byte order (big endian:
+"be") to save data captured from the network into the trace without conversion.
+If not specified, the byte order is native.
+
+TSDL meta-data representation:
+
+  byte_order = native OR network OR be OR le;  /* network and be are aliases */
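To illustrate how a reader might honour a declared byte order independently of its own architecture, here is a sketch that assembles a byte-aligned 32-bit integer from individual bytes. The enum ctf_byte_order and the function read_u32 are hypothetical names; "native" is assumed to have been resolved to "le" or "be" beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical representation of a resolved byte_order attribute. */
enum ctf_byte_order { CTF_BO_LE, CTF_BO_BE };

/* Read a byte-aligned unsigned 32-bit integer stored with byte order bo. */
uint32_t read_u32(const unsigned char *p, enum ctf_byte_order bo)
{
    assert(p != NULL);
    if (bo == CTF_BO_BE)        /* "be" and "network" are aliases */
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Building the value byte by byte keeps the result the same on little and big endian reader architectures.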
+
+4.1.4 Size
+
+Type size, in bits, for integers and floats is that returned by "sizeof()" in C
+multiplied by CHAR_BIT.
+We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed
+to 8 bits for cross-endianness compatibility.
+
+TSDL meta-data representation:
+
+  size = value;    (value is in bits)
+
+4.1.5 Integers
+
+Signed integers are represented in two's complement. Integer alignment,
+size, signedness and byte ordering are defined in the TSDL meta-data.
+Integers aligned on byte size (8-bit) and with length multiple of byte
+size (8-bit) correspond to the C99 standard integers. In addition,
+integers with alignment and/or size that are _not_ a multiple of the
+byte size are permitted; these correspond to the C99 standard bitfields,
+with the added specification that the CTF integer bitfields have a fixed
+binary representation. An MIT-licensed reference implementation of the
+CTF portable bitfields is available at:
+
+  http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h
+
+Binary representation of integers:
+
+- On little and big endian:
+  - Within a byte, high bits correspond to an integer's high bits, and low
+    bits correspond to low bits.
+- On little endian:
+  - Integers spanning multiple bytes are placed from the least significant
+    byte to the most significant.
+  - Consecutive integers are placed from lower bits to higher bits (even within
+    a byte).
+- On big endian:
+  - Integers spanning multiple bytes are placed from the most significant
+    byte to the least significant.
+  - Consecutive integers are placed from higher bits to lower bits (even within
+    a byte).
+
+This binary representation is derived from the bitfield implementation in GCC
+for little and big endian. However, contrary to what GCC does, integers can
+cross unit boundaries (no padding is required). Padding can be explicitly
+added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed.
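The bit placement rules above can be sketched with two bit-at-a-time readers for unsigned integers. This is an interpretation of the rules for illustration, not the bitfield.h reference implementation; the names read_bits_le and read_bits_be are hypothetical, and bit_offset counts bits from the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Little endian: integers fill each byte from its low bits upward, and
 * the integer's least significant bit comes first.
 */
uint64_t read_bits_le(const unsigned char *buf, size_t bit_offset,
                      unsigned size_bits)
{
    uint64_t v = 0;

    assert(buf != NULL && size_bits >= 1 && size_bits <= 64);
    for (unsigned k = 0; k < size_bits; k++) {
        size_t bit = bit_offset + k;
        uint64_t b = (buf[bit / 8] >> (bit % 8)) & 1u;

        v |= b << k;            /* bit k of the result */
    }
    return v;
}

/*
 * Big endian: integers fill each byte from its high bits downward, and
 * the integer's most significant bit comes first.
 */
uint64_t read_bits_be(const unsigned char *buf, size_t bit_offset,
                      unsigned size_bits)
{
    uint64_t v = 0;

    assert(buf != NULL && size_bits >= 1 && size_bits <= 64);
    for (unsigned k = 0; k < size_bits; k++) {
        size_t bit = bit_offset + k;
        uint64_t b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;

        v = (v << 1) | b;       /* shift in the next lower bit */
    }
    return v;
}
```

Both readers let an integer cross byte (and unit) boundaries without padding. Signed integers would additionally need sign extension from bit size_bits - 1.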
+
+TSDL meta-data representation:
+
+  integer {
+    signed = true OR false;                     /* default false */
+    byte_order = native OR network OR be OR le; /* default native */
+    size = value;                               /* value in bits, no default */
+    align = value;                              /* value in bits */
+    /* base used for pretty-printing output, default: decimal. */
+    base = decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex OR x OR X OR p OR 16
+           OR octal OR oct OR o OR 8 OR binary OR b OR 2;
+    /* character encoding, default: none */
+    encoding = none OR UTF8 OR ASCII;
+  }
+
+Example of type inheritance (creation of a uint32_t named type):
+
+typealias integer {
+  size = 32;
+  signed = false;
+  align = 32;
+} := uint32_t;
+
+Definition of a named 5-bit signed bitfield:
+
+typealias integer {
+  size = 5;
+  signed = true;
+  align = 1;
+} := int5_t;
+
+The character encoding field can be used to specify that the integer
+must be printed as a text character when read. e.g.:
+
+typealias integer {
+  size = 8;
+  align = 8;
+  signed = false;
+  encoding = UTF8;
+} := utf_char;
+
+
+4.1.6 GNU/C bitfields
+
+The GNU/C bitfields follow closely the integer representation, with a
+particularity on alignment: if a bitfield cannot fit in the current unit, the
+unit is padded and the bitfield starts at the following unit. The unit size is
+defined by the size of the type "unit_type".
+
+TSDL meta-data representation:
+
+  unit_type name:size;
+
+As an example, consider the following structure, declared in C and compiled by GCC:
+
+struct example {
+  short a:12;
+  short b:5;
+};
+
+The example structure is aligned on the largest element (short). The second
+bitfield would be aligned on the next unit boundary, because it would not fit in
+the current unit.
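The unit padding rule can be expressed as a small placement function. A sketch under the assumption that offsets are counted in bits from the start of the structure; the name place_bitfield is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the bit offset at which a bitfield of field_bits starts, given
 * the current bit offset and the size in bits of its unit_type. If the
 * field does not fit in what remains of the current unit, it starts at
 * the next unit boundary.
 */
size_t place_bitfield(size_t cur_bits, unsigned field_bits,
                      unsigned unit_bits)
{
    size_t used;

    assert(unit_bits > 0 && field_bits >= 1 && field_bits <= unit_bits);
    used = cur_bits % unit_bits;
    if (used + field_bits > unit_bits)
        cur_bits += unit_bits - used;   /* pad to the next unit */
    return cur_bits;
}
```

For the example structure with a 16-bit short, "a" is placed at bit 0 and occupies bits 0 to 11; "b" needs 5 bits but only 4 remain in the unit, so it is placed at bit 16.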
+
+4.1.7 Floating point
+
+The floating point values byte ordering is defined in the TSDL meta-data.
+
+Floating point values follow the IEEE 754-2008 standard interchange formats.
+Description of the floating point values includes the exponent and mantissa size
+in bits. Some requirements are imposed on the floating point values:
+
+- FLT_RADIX must be 2.
+- mant_dig is the number of digits represented in the mantissa. It is specified
+  by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and
+  LDBL_MANT_DIG as defined by <float.h>.
+- exp_dig is the number of digits represented in the exponent. Given that
+  mant_dig is one bit more than its actual size in bits (leading 1 is not
+  needed) and also given that the sign bit always takes one bit, exp_dig can be
+  specified as:
+
+  - sizeof(float) * CHAR_BIT - FLT_MANT_DIG
+  - sizeof(double) * CHAR_BIT - DBL_MANT_DIG
+  - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG
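+
+The formulas above can be evaluated directly with <float.h> and <limits.h>.
+The sketch below is illustrative only; the function names are hypothetical,
+and the values in the usage note assume a platform where float and double
+are the IEEE 754 binary32 and binary64 formats.
+
```c
#include <float.h>
#include <limits.h>

/* exp_dig for "float": total size in bits minus FLT_MANT_DIG. */
int float_exp_dig(void)
{
    return (int)(sizeof(float) * CHAR_BIT) - FLT_MANT_DIG;
}

/* exp_dig for "double": total size in bits minus DBL_MANT_DIG. */
int double_exp_dig(void)
{
    return (int)(sizeof(double) * CHAR_BIT) - DBL_MANT_DIG;
}
```
+
+Under that assumption, float_exp_dig() yields 8 (32 - 24) and
+double_exp_dig() yields 11 (64 - 53).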
+
+TSDL meta-data representation:
+
+floating_point {
+  exp_dig = value;
+  mant_dig = value;
+  byte_order = native OR network OR be OR le;
+  align = value;
+}
+
+Example of type inheritance:
+
+typealias floating_point {
+  exp_dig = 8;         /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
+  mant_dig = 24;       /* FLT_MANT_DIG */
+  byte_order = native;
+  align = 32;
+} := float;
+
+TODO: define NaN, +inf, -inf behavior.
+
+Bit-packed, byte-packed or larger alignments can be used for floating
+point values, similarly to integers.
+
+4.1.8 Enumerations
+
+Enumerations are a mapping between an integer type and a table of strings. The
+numerical representation of the enumeration follows the integer type specified
+by the meta-data. The enumeration mapping table is detailed in the enumeration
+description within the meta-data. The mapping table maps inclusive value
+ranges (or single values) to strings. Instead of being limited to simple
+"value -> string" mappings, these enumerations map
+"[ start_value ... end_value ] -> string", which map inclusive ranges of
+values to strings.  An enumeration from the C language can be represented in
+this format by having the same start_value and end_value for each element, which
+is in fact a range of size 1. This single-value range is supported without
+repeating the start and end values with the value = string declaration.
+
+enum name : integer_type {
+  somestring          = start_value1 ... end_value1,
+  "other string"      = start_value2 ... end_value2,
+  yet_another_string,  /* will be assigned to end_value2 + 1 */
+  "some other string" = value,
+  ...
+};
+
+If the values are omitted, the enumeration starts at 0 and increments by 1 for
+each entry:
+
+enum name : unsigned int {
+  ZERO,
+  ONE,
+  TWO,
+  TEN = 10,
+  ELEVEN,
+};
+
+Overlapping ranges within a single enumeration are implementation defined.
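+
+A reader could resolve an enumeration value with a table of inclusive
+ranges, as sketched below. The struct and function names are hypothetical.
+Returning the first matching range is only one possible choice for
+overlapping ranges, which this document leaves implementation defined.
+
```c
#include <stddef.h>
#include <stdint.h>

/* One "[ start ... end ] -> string" mapping entry (hypothetical type). */
struct enum_range {
    int64_t start;      /* inclusive */
    int64_t end;        /* inclusive */
    const char *label;
};

/* Returns the label of the first range containing value, or NULL. */
const char *enum_lookup(const struct enum_range *map, size_t n, int64_t value)
{
    for (size_t i = 0; i < n; i++) {
        if (value >= map[i].start && value <= map[i].end)
            return map[i].label;
    }
    return NULL;
}
```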
+
+A nameless enumeration can be declared as a field type or as part of a typedef:
+
+enum : integer_type {
+  ...
+}
+
+Enumerations omitting the container type ": integer_type" use the "int"
+type (for compatibility with C99). The "int" type must be previously
+declared. E.g.:
+
+typealias integer { size = 32; align = 32; signed = true } := int;
+
+enum {
+  ...
+}
+
+
+4.2 Compound types
+
+Compound types are aggregations of type declarations. Compound types include
+structures, variants, arrays, sequences, and strings.
+
+4.2.1 Structures
+
+Structures are aligned on the largest alignment required by basic types
+contained within the structure. (This follows the ISO/C standard for structures)
+
+TSDL meta-data representation of a named structure:
+
+struct name {
+  field_type field_name;
+  field_type field_name;
+  ...
+};
+
+Example:
+
+struct example {
+  integer {                       /* Nameless type */
+    size = 16;
+    signed = true;
+    align = 16;
+  } first_field_name;
+  uint64_t second_field_name;  /* Named type declared in the meta-data */
+};
+
+The fields are placed in a sequence next to each other. They each possess a
+field name, which is a unique identifier within the structure.
+
+A nameless structure can be declared as a field type or as part of a typedef:
+
+struct {
+  ...
+}
+
+Alignment for a structure compound type can be forced to a minimum value
+by adding an "align" specifier after the declaration of a structure
+body. This attribute is read as: align(value). The value is specified in
+bits. The structure will be aligned on the maximum value between this
+attribute and the alignment required by the basic types contained within
+the structure. e.g.
+
+struct {
+  ...
+} align(32)
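+
+The alignment rule can be sketched in C as follows. Both helpers are
+hypothetical; they assume that alignments are expressed in bits and that
+the caller has already collected the alignments of the basic types
+contained in the structure.
+
```c
#include <stddef.h>

/*
 * Effective structure alignment: the maximum of the align() attribute
 * (0 when absent) and the alignment of every contained basic type.
 */
size_t struct_alignment(const size_t *member_align, size_t n,
                        size_t forced_align)
{
    size_t align = forced_align ? forced_align : 1;

    for (size_t i = 0; i < n; i++) {
        if (member_align[i] > align)
            align = member_align[i];
    }
    return align;
}

/* Rounds a bit offset up to the next multiple of align (align >= 1). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}
```
+
+For instance, members aligned on 16 and 64 bits with align(32) give a
+64-bit structure alignment, while a single 8-bit member with align(32)
+gives 32.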
+
+4.2.2 Variants (Discriminated/Tagged Unions)
+
+A CTF variant is a selection between different types. A CTF variant must
+always be defined within the scope of a structure or within fields
+contained within a structure (defined recursively). A "tag" enumeration
+field must appear in either the same lexical scope, prior to the variant
+field (in field declaration order), in an upper lexical scope (see
+Section 7.3.1), or in an upper dynamic scope (see Section 7.3.2). The
+type selection is indicated by the mapping from the enumeration value to
+the string used as variant type selector. The field to use as tag is
+specified by the "tag_field", specified between "< >" after the
+"variant" keyword for unnamed variants, and after "variant name" for
+named variants.
+
+The alignment of the variant is the alignment of the type as selected by the tag
+value for the specific instance of the variant. The alignment of the type
+containing the variant is independent of the variant alignment.  The size of the
+variant is the size as selected by the tag value for the specific instance of
+the variant.
+
+A named variant declaration followed by its definition within a structure
+declaration:
+
+variant name {
+  field_type sel1;
+  field_type sel2;
+  field_type sel3;
+  ...
+};
+
+struct {
+  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
+  ...
+  variant name <tag_field> v;
+}
+
+An unnamed variant definition within a structure is expressed by the following
+TSDL meta-data:
+
+struct {
+  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
+  ...
+  variant <tag_field> {
+    field_type sel1;
+    field_type sel2;
+    field_type sel3;
+    ...
+  } v;
+}
+
+Example of a named variant within a sequence that refers to a single tag field:
+
+variant example {
+  uint32_t a;
+  uint64_t b;
+  short c;
+};
+
+struct {
+  enum : uint2_t { a, b, c } choice;
+  unsigned int seqlen;
+  variant example <choice> v[seqlen];
+}
+
+Example of an unnamed variant:
+
+struct {
+  enum : uint2_t { a, b, c, d } choice;
+  /* Unrelated fields can be added between the variant and its tag */
+  int32_t somevalue;
+  variant <choice> {
+    uint32_t a;
+    uint64_t b;
+    short c;
+    struct {
+      unsigned int field1;
+      uint64_t field2;
+    } d;
+  } s;
+}
+
+Example of an unnamed variant within an array:
+
+struct {
+  enum : uint2_t { a, b, c } choice;
+  variant <choice> {
+    uint32_t a;
+    uint64_t b;
+    short c;
+  } v[10];
+}
+
+Example of a variant type definition within a structure, where the defined type
+is then declared within an array of structures. This variant refers to a tag
+located in an upper lexical scope. This example clearly shows that a variant
+type definition referring to the tag "x" uses the closest preceding field from
+the lexical scope of the type definition.
+
+struct {
+  enum : uint2_t { a, b, c, d } x;
+
+  /*
+   * "x" refers to the preceding "x" enumeration in the lexical scope of
+   * the type definition.
+   */
+  typedef variant <x> {
+    uint32_t a;
+    uint64_t b;
+    short c;
+  } example_variant;
+
+  struct {
+    enum : int { x, y, z } x;  /* This enumeration is not used by "v". */
+    /* "v" uses the "enum : uint2_t { a, b, c, d }" tag. */
+    example_variant v;
+  } a[10];
+}
+
+4.2.3 Arrays
+
+Arrays are fixed-length. Their length is declared in the type
+declaration within the meta-data. They contain an array of "inner type"
+elements, which can refer to any type not containing the type of the
+array being declared (no circular dependency). The length is the number
+of elements in an array.
+
+TSDL meta-data representation of a named array:
+
+typedef elem_type name[length];
+
+A nameless array can be declared as a field type within a structure, e.g.:
+
+  uint8_t field_name[10];
+
+Arrays are always aligned on their element alignment requirement.
+
+4.2.4 Sequences
+
+Sequences are dynamically-sized arrays. They refer to a "length"
+unsigned integer field, which must appear in either the same lexical scope,
+prior to the sequence field (in field declaration order), in an upper
+lexical scope (see Section 7.3.1), or in an upper dynamic scope (see
+Section 7.3.2). This length field represents the number of elements in
+the sequence. The sequence per se is an array of "inner type" elements.
+
+TSDL meta-data representation for a sequence type definition:
+
+struct {
+  unsigned int length_field;
+  typedef elem_type typename[length_field];
+  typename seq_field_name;
+}
+
+A sequence can also be declared as a field type, e.g.:
+
+struct {
+  unsigned int length_field;
+  long seq_field_name[length_field];
+}
+
+Multiple sequences can refer to the same length field, and these length
+fields can be in a different upper dynamic scope:
+
+e.g., assuming the stream.event.header defines:
+
+stream {
+  ...
+  id = 1;
+  event.header := struct {
+    uint16_t seq_len;
+  };
+};
+
+event {
+  ...
+  stream_id = 1;
+  fields := struct {
+    long seq_a[stream.event.header.seq_len];
+    char seq_b[stream.event.header.seq_len];
+  };
+};
+
+The sequence elements follow the "array" specifications.
+
+4.2.5 Strings
+
+Strings are arrays of bytes of variable size, terminated by a '\0' "NUL"
+character. Their encoding is described in the TSDL meta-data. In the absence
+of encoding attribute information, the default encoding is UTF-8.
+
+TSDL meta-data representation of a named string type:
+
+typealias string {
+  encoding = UTF8 OR ASCII;
+} := name;
+
+A nameless string type can be declared as a field type:
+
+string field_name;     /* Use default UTF8 encoding */
+
+Strings are always aligned on byte size.
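+
+Because strings carry no length field, a reader has to scan for the
+terminator. A minimal sketch in C, assuming the reader knows how many bytes
+remain in the packet content (string_field_size() is a hypothetical
+helper):
+
```c
#include <stddef.h>
#include <string.h>

/*
 * Number of bytes occupied by a string field, terminator included,
 * or 0 if no terminator is found within the avail remaining bytes.
 */
size_t string_field_size(const unsigned char *buf, size_t avail)
{
    const unsigned char *nul = memchr(buf, '\0', avail);

    return nul ? (size_t)(nul - buf) + 1 : 0;
}
```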
+
+5. Event Packet Header
+
+The event packet header consists of two parts. The first part, the "event
+packet header", is the same for all streams of a trace. The second part, the
+"event packet context", is described on a per-stream basis. Both are described
+in the TSDL meta-data. The packets are aligned on architecture-page-sized
+addresses.
+
+Event packet header (all fields are optional, specified by TSDL meta-data):
+
+- Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a
+  CTF packet. This magic number is optional, but when present, it should
+  come at the very beginning of the packet.
+- Trace UUID, used to ensure the event packet matches the meta-data used.
+  (note: we cannot use a meta-data checksum in every case instead of a
+   UUID because meta-data can be appended to while tracing is active)
+  This field is optional.
+- Stream ID, used as reference to stream description in meta-data.
+  This field is optional if there is only one stream description in the
+  meta-data, but becomes required if there is more than one stream in
+  the TSDL meta-data description.
+
+Event packet context (all fields are optional, specified by TSDL meta-data):
+
+- Event packet content size (in bytes).
+- Event packet size (in bytes, includes padding).
+- Event packet content checksum (optional). Checksum excludes the event packet
+  header.
+- Per-stream event packet sequence count (to deal with UDP packet loss). The
+  number of significant sequence counter bits should also be present, so
+  wrap-arounds are dealt with correctly.
+- Time-stamp at the beginning and time-stamp at the end of the event packet.
+  Both timestamps are written in the packet header, but sampled respectively
+  while (or before) writing the first event and while (or after) writing the
+  last event in the packet. The inclusive range between these timestamps should
+  include all event timestamps assigned to events contained within the packet.
+- Events discarded count
+  - Snapshot of a per-stream free-running counter, counting the number of
+    events discarded that were supposed to be written in the stream prior to
+    the first event in the event packet.
+    * Note: producer-consumer buffer full condition should fill the current
+            event packet with padding so we know exactly where events have been
+            discarded.
+- Lossless compression scheme used for the event packet content. Applied
+  directly to raw data. New types of compression can be added in following
+  versions of the format.
+  0: no compression scheme
+  1: bzip2
+  2: gzip
+  3: xz
+- Cipher used for the event packet content. Applied after compression.
+  0: no encryption
+  1: AES
+- Checksum scheme used for the event packet content. Applied after encryption.
+  0: no checksum
+  1: md5
+  2: sha1
+  3: crc32
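+
+The role of the significant sequence counter bits can be illustrated with
+the following C sketch. packets_lost() is a hypothetical helper; it assumes
+that fewer than 2^bits packets were lost between the two packets received,
+since larger losses cannot be distinguished after a wrap-around.
+
```c
#include <stdint.h>

/*
 * Number of packets missing between two consecutively received packets,
 * given their sequence counts and the number of significant counter bits.
 * The subtraction is done modulo 2^bits so that wrap-arounds are handled.
 */
uint64_t packets_lost(uint64_t prev, uint64_t cur, unsigned bits)
{
    uint64_t mask = bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;

    return (cur - prev - 1) & mask;
}
```
+
+With an 8-bit counter, receiving count 1 right after count 254 means that
+two packets (255 and 0) were lost.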
+
+5.1 Event Packet Header Description
+
+The event packet header layout is indicated by the trace packet.header
+field. Here is a recommended structure type for the packet header with
+the fields typically expected (although these fields are each optional):
+
+struct event_packet_header {
+  uint32_t magic;
+  uint8_t  uuid[16];
+  uint32_t stream_id;
+};
+
+trace {
+  ...
+  packet.header := struct event_packet_header;
+};
+
+If the magic number is not present, tools such as "file" will have no
+means to discover the file type.
+
+If the uuid is not present, no validation that the meta-data actually
+corresponds to the stream is performed.
+
+If the stream_id packet header field is missing, the trace can only
+contain a single stream. Its "id" field can be left out, and its events
+don't need to declare a "stream_id" field.
+
+
+5.2 Event Packet Context Description
+
+Event packet context example. These are declared within the stream declaration
+in the meta-data. All these fields are optional. If the packet size field is
+missing, the whole stream only contains a single packet. If the content
+size field is missing, the packet is filled (no padding). The content
+and packet sizes include all headers.
+
+An example event packet context type:
+
+struct event_packet_context {
+  uint64_t timestamp_begin;
+  uint64_t timestamp_end;
+  uint32_t checksum;
+  uint32_t stream_packet_count;
+  uint32_t events_discarded;
+  uint32_t cpu_id;
+  uint32_t/uint16_t content_size;
+  uint32_t/uint16_t packet_size;
+  uint8_t  stream_packet_count_bits;   /* Significant counter bits */
+  uint8_t  compression_scheme;
+  uint8_t  encryption_scheme;
+  uint8_t  checksum_scheme;
+};
+
+
+6. Event Structure
+
+The overall structure of an event is:
+
+1 - Stream Packet Context (as specified by the stream meta-data)
+ 2 - Event Header (as specified by the stream meta-data)
+  3 - Stream Event Context (as specified by the stream meta-data)
+   4 - Event Context (as specified by the event meta-data)
+    5 - Event Payload (as specified by the event meta-data)
+
+This structure defines an implicit dynamic scoping, where variants
+located in inner structures (those with a higher number in the listing
+above) can refer to the fields of outer structures (with lower number in
+the listing above). See Section 7.3 TSDL Scopes for more detail.
+
+6.1 Event Header
+
+Event headers can be described within the meta-data. We hereby propose, as an
+example, two types of event headers. Type 1 accommodates streams with fewer than
+31 event IDs. Type 2 accommodates streams with 31 or more event IDs.
+
+One major factor can vary between streams: the number of event IDs assigned to
+a stream. Luckily, this information tends to stay relatively constant (modulo
+event registration while trace is being recorded), so we can specify different
+representations for streams containing few event IDs and streams containing
+many event IDs, so we end up representing the event ID and time-stamp as
+densely as possible in each case.
+
+The header is extended in the rare occasions where the information cannot be
+represented in the ranges available in the standard event header. Extended
+headers are also used in the rare occasions where the data required for a field
+could not be collected: the flag corresponding to the missing field within the
+missing_fields array is then set to 1.
+
+Types uintX_t represent an X-bit unsigned integer, as declared with
+either:
+
+  typealias integer { size = X; align = X; signed = false } := uintX_t;
+
+    or
+
+  typealias integer { size = X; align = 1; signed = false } := uintX_t;
+
+6.1.1 Type 1 - Few event IDs
+
+  - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture
+    preference).
+  - Native architecture byte ordering.
+  - For "compact" selection
+    - Fixed size: 32 bits.
+  - For "extended" selection
+    - Size depends on the architecture and variant alignment.
+
+struct event_header_1 {
+  /*
+   * id: range: 0 - 30.
+   * id 31 is reserved to indicate an extended header.
+   */
+  enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
+  variant <id> {
+    struct {
+      uint27_t timestamp;
+    } compact;
+    struct {
+      uint32_t id;                      /* 32-bit event IDs */
+      uint64_t timestamp;               /* 64-bit timestamps */
+    } extended;
+  } v;
+} align(32);   /* or align(8) */
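+
+Decoding the compact form of this header could look like the C sketch
+below. It is an illustration under explicit assumptions that this section
+does not state: the 32-bit word is stored in little-endian byte order, and
+the first declared field ("id") occupies the least significant bits. The
+struct and function names are hypothetical, and the extended form is not
+decoded because its layout depends on the alignment in use.
+
```c
#include <stdint.h>

/* Decoded compact header (hypothetical type). */
struct compact_header {
    uint32_t id;         /* 0 - 30 */
    uint32_t timestamp;  /* 27 significant bits */
};

/*
 * Decodes a compact event_header_1 from 4 bytes.
 * ASSUMPTION: little-endian word, "id" in the 5 least significant bits.
 * Returns 1 on success, 0 if id 31 selects the extended header.
 */
int decode_header1_compact_le(const uint8_t buf[4], struct compact_header *out)
{
    uint32_t word = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
                    (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
    uint32_t id = word & 0x1f;

    if (id == 31)
        return 0;               /* extended header follows */
    out->id = id;
    out->timestamp = word >> 5; /* remaining 27 bits */
    return 1;
}
```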
+
+
+6.1.2 Type 2 - Many event IDs
+
+  - Aligned on 16-bit (or 8-bit if byte-packed, depending on the architecture
+    preference).
+  - Native architecture byte ordering.
+  - For "compact" selection
+    - Size depends on the architecture and variant alignment.
+  - For "extended" selection
+    - Size depends on the architecture and variant alignment.
+
+struct event_header_2 {
+  /*
+   * id: range: 0 - 65534.
+   * id 65535 is reserved to indicate an extended header.
+   */
+  enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
+  variant <id> {
+    struct {
+      uint32_t timestamp;
+    } compact;
+    struct {
+      uint32_t id;                      /* 32-bit event IDs */
+      uint64_t timestamp;               /* 64-bit timestamps */
+    } extended;
+  } v;
+} align(16);   /* or align(8) */
+
+
+6.2 Event Context
+
+The event context contains information relative to the current event.
+The choice and meaning of this information is specified by the TSDL
+stream and event meta-data descriptions. The stream context is applied
+to all events within the stream. The stream context structure follows
+the event header. The event context is applied to specific events. Its
+structure follows the stream context structure.
+
+An example of stream-level event context is to save the event payload size with
+each event, or to save the current PID with each event.  These are declared
+within the stream declaration within the meta-data:
+
+  stream {
+    ...
+    event.context := struct {
+        uint pid;
+        uint16_t payload_size;
+    };
+  };
+
+An example of event-specific event context is to declare a bitmap of missing
+fields, only appended after the stream event context if the extended event
+header is selected. NR_FIELDS is the number of fields within the event (a
+numeric value).
+
+  event {
+    context := struct {
+      variant <id> {
+        struct { } compact;
+        struct {
+          uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */
+        } extended;
+      } v;
+    };
+    ...
+  }
+
+6.3 Event Payload
+
+An event payload contains fields specific to a given event type. The fields
+belonging to an event type are described in the event-specific meta-data
+within a structure type.
+
+6.3.1 Padding
+
+There is no padding at the end of the event payload. This differs from the
+ISO/C standard for structures, but follows the CTF standard for structures. In
+a trace, even though it makes sense to align the beginning of a structure, it
+makes no sense to add padding at the end of the structure, because structures
+are usually not followed by a structure of the same type.
+
+This trick can be done by adding a zero-length "end" field at the end of the C
+structures, and by using the offset of this field rather than using sizeof()
+when calculating the size of a structure (see Appendix "A. Helper macros").
+
+6.3.2 Alignment
+
+The event payload is aligned on the largest alignment required by types
+contained within the payload. (This follows the ISO/C standard for structures)
+
+
+7. Trace Stream Description Language (TSDL)
+
+The Trace Stream Description Language (TSDL) allows expression of the
+binary trace streams layout in a C99-like Domain Specific Language
+(DSL).
+
+
+7.1 Meta-data
+
+The trace stream layout description is located in the trace meta-data.
+The meta-data is itself located in a stream identified by its name:
+"metadata".
+
+The meta-data description can be expressed in two different formats:
+text-only and packet-based. The text-only description facilitates
+generation of meta-data and provides a convenient way to enter the
+meta-data information by hand. The packet-based meta-data provides the
+CTF stream packet facilities (checksumming, compression, encryption,
+network-readiness) for meta-data stream generated and transported by a
+tracer.
+
+The text-only meta-data file is a plain text TSDL description.
+
+The packet-based meta-data is made of "meta-data packets", which each
+start with a meta-data packet header. The packet-based meta-data
+description is detected by reading the magic number "0x75D11D57" at the
+beginning of the file. This magic number is also used to detect the
+endianness of the architecture by trying to read the CTF magic number
+and its counterpart in reversed endianness. The events within the
+meta-data stream have no event header nor event context. Each event only
+contains a "sequence" payload, which is a sequence of bits using the
+"trace.packet.header.content_size" field as a placeholder for its length
+(the packet header size should be subtracted). The formatting of this
+sequence of bits is a plain-text representation of the TSDL description.
+Each meta-data packet starts with a special packet header, specific to
+the meta-data stream, which contains, exactly:
+
+struct metadata_packet_header {
+  uint32_t magic;                      /* 0x75D11D57 */
+  uint8_t  uuid[16];                   /* Universally Unique Identifier */
+  uint32_t checksum;                   /* 0 if unused */
+  uint32_t content_size;               /* in bits */
+  uint32_t packet_size;                        /* in bits */
+  uint8_t  compression_scheme;         /* 0 if unused */
+  uint8_t  encryption_scheme;          /* 0 if unused */
+  uint8_t  checksum_scheme;            /* 0 if unused */
+};
+
+The packet-based meta-data can be converted to a text-only meta-data by
+concatenating all the strings it contains.
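+
+Detecting the packet-based format and its byte order from the magic number
+can be sketched in C as follows. The enum, macro and function names are
+hypothetical; only the magic value 0x75D11D57 comes from this document.
+
```c
#include <stdint.h>
#include <string.h>

#define TSDL_MAGIC UINT32_C(0x75D11D57)  /* hypothetical macro name */

enum md_kind { MD_TEXT, MD_PACKET_NATIVE, MD_PACKET_SWAPPED };

/* Reverses the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Classifies a meta-data file from its first four bytes. */
enum md_kind metadata_kind(const uint8_t first4[4])
{
    uint32_t magic;

    memcpy(&magic, first4, sizeof(magic));  /* read in host byte order */
    if (magic == TSDL_MAGIC)
        return MD_PACKET_NATIVE;
    if (magic == swap32(TSDL_MAGIC))
        return MD_PACKET_SWAPPED;           /* reversed endianness */
    return MD_TEXT;                         /* no magic: plain-text TSDL */
}
```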
+
+In the textual representation of the meta-data, the text contained
+within "/*" and "*/", as well as within "//" and end of line, is
+treated as a comment.  Boolean values can be represented as true, TRUE,
+or 1 for true, and false, FALSE, or 0 for false. Within the string-based
+meta-data description, the trace UUID is represented as a string of
+hexadecimal digits and dashes "-". In the event packet header, the trace
+UUID is represented as an array of bytes.
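+
+Converting between the two UUID representations can be sketched as below.
+uuid_text_to_bytes() is a hypothetical helper, and it assumes the usual
+8-4-4-4-12 grouping of hexadecimal digits, as in the example UUID of
+Section 7.4.
+
```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if c is not one. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/*
 * Converts "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into 16 bytes.
 * Returns 0 on success, -1 on malformed input.
 */
int uuid_text_to_bytes(const char *text, uint8_t out[16])
{
    size_t i = 0, n = 0;

    if (strlen(text) != 36)
        return -1;
    while (i < 36) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (text[i] != '-')
                return -1;
            i++;
            continue;
        }
        int hi = hex_val(text[i]);
        int lo = hex_val(text[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[n++] = (uint8_t)(hi << 4 | lo);
        i += 2;
    }
    return 0;
}
```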
+
+
+7.2 Declaration vs Definition
+
+A declaration associates a layout to a type, without specifying where
+this type is located in the event structure hierarchy (see Section 6).
+This therefore includes typedef, typealias, as well as all type
+specifiers. In certain circumstances (typedef, structure field and
+variant field), a declaration is followed by a declarator, which specifies
+the newly defined type name (for typedef), or the field name (for
+declarations located within structure and variants). Array and sequence,
+declared with square brackets ("[" "]"), are part of the declarator,
+similarly to C99. The enumeration base type is specified by
+": enum_base", which is part of the type specifier. The variant tag
+name, specified between "<" ">", is also part of the type specifier.
+
+A definition associates a type to a location in the event structure
+hierarchy (see Section 6). This association is denoted by ":=", as shown
+in Section 7.3.
+
+
+7.3 TSDL Scopes
+
+TSDL uses two different types of scoping: a lexical scope is used for
+declarations and type definitions, and a dynamic scope is used for
+variant references to tag fields and for sequence references to length
+fields.
+
+7.3.1 Lexical Scope
+
+Each of "trace", "stream", "event", "struct" and "variant" has its own
+nestable declaration scope, within which types can be declared using "typedef"
+and "typealias". A root declaration scope also contains all declarations
+located outside of any of the aforementioned declarations. An inner
+declaration scope can refer to types declared within its container
+lexical scope prior to the inner declaration scope. Redefinition of a
+typedef or typealias is not valid, although hiding an upper scope
+typedef or typealias is allowed within a sub-scope.
+
+7.3.2 Dynamic Scope
+
+A dynamic scope consists of the lexical scope augmented with the
+implicit event structure definition hierarchy presented in Section 6.
+The dynamic scope is used for variant tag and sequence length
+definitions. It is used at definition time to look up the location of
+the tag field associated with a variant, and to look up the location
+of the length field associated with a sequence.
+
+Therefore, variants (or sequences) in lower levels in the dynamic scope
+(e.g. event context) can refer to a tag (or length) field located in
+upper levels (e.g. in the event header) by specifying, in this case, the
+associated tag with <header.field_name>. This allows, for instance, the
+event context to define a variant referring to the "id" field of the
+event header as selector.
+
+The target dynamic scope must be specified explicitly when referring to
+a field outside of the local static scope. The dynamic scope prefixes
+are thus:
+
+ - Trace Packet Header: <trace.packet.header. >,
+ - Stream Packet Context: <stream.packet.context. >,
+ - Event Header: <stream.event.header. >,
+ - Stream Event Context: <stream.event.context. >,
+ - Event Context: <event.context. >,
+ - Event Payload: <event.fields. >.
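+
+A parser could separate the dynamic scope prefix from the field name as
+sketched below. The function and table names are hypothetical, and the
+scope index simply follows the order of the list above; a reference with no
+known prefix is treated as local.
+
```c
#include <stddef.h>
#include <string.h>

/* Dynamic scope prefixes, in the order listed above. */
static const char *const scope_prefixes[] = {
    "trace.packet.header.",
    "stream.packet.context.",
    "stream.event.header.",
    "stream.event.context.",
    "event.context.",
    "event.fields.",
};

/*
 * Splits a tag or length reference into a scope index and a field name.
 * Sets *scope to the index in scope_prefixes, or -1 for a local reference,
 * and returns a pointer to the field name within ref.
 */
const char *split_scope(const char *ref, int *scope)
{
    size_t count = sizeof(scope_prefixes) / sizeof(scope_prefixes[0]);

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(scope_prefixes[i]);

        if (strncmp(ref, scope_prefixes[i], len) == 0) {
            *scope = (int)i;
            return ref + len;
        }
    }
    *scope = -1;
    return ref;
}
```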
+
+Multiple declarations of the same field name within a single scope are
+not valid. It is however valid to re-use the same field name in
+different scopes. There is no possible conflict, because the dynamic
+scope must be specified when a variant refers to a tag field located in
+a different dynamic scope.
+
+The information available in the dynamic scopes can be thought of as the
+current tracing context. At trace production, information about the
+current context is saved into the specified scope field levels. At trace
+consumption, for each event, the current trace context is therefore
+readable by accessing the upper dynamic scopes.
+
+
+7.4 TSDL Examples
+
+The grammar representing the TSDL meta-data is presented in Appendix C.
+TSDL Grammar. This section offers lighter reading, consisting of
+examples of TSDL meta-data with template values.
+
+The stream "id" can be left out if there is only one stream in the
+trace. The event "id" field can be left out if there is only one event
+in a stream.
+
+trace {
+  major = value;                               /* Trace format version */
+  minor = value;
+  uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa";       /* Trace UUID */
+  byte_order = be OR le;                       /* Endianness (required) */
+  packet.header := struct {
+    uint32_t magic;
+    uint8_t  uuid[16];
+    uint32_t stream_id;
+  };
+};
+
+stream {
+  id = stream_id;
+  /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
+  event.header := event_header_1 OR event_header_2;
+  event.context := struct {
+    ...
+  };
+  packet.context := struct {
+    ...
+  };
+};
+
+event {
+  name = event_name;
+  id = value;                  /* Numeric identifier within the stream */
+  stream_id = stream_id;
+  context := struct {
+    ...
+  };
+  fields := struct {
+    ...
+  };
+};
+
+/* More detail on types in section 4. Types */
+
+/*
+ * Named types:
+ *
+ * Type declarations behave similarly to the C standard.
+ */
+
+typedef aliased_type_specifiers new_type_declarators;
+
+/* e.g.: typedef struct example new_type_name[10]; */
+
+/*
+ * typealias
+ *
+ * The "typealias" declaration can be used to give a name (including
+ * pointer declarator specifier) to a type. It should also be used to
+ * map basic C types (float, int, unsigned long, ...) to a CTF type.
+ * Typealias is a superset of "typedef": it also allows assignment of a
+ * simple variable identifier to a type.
+ */
+
+typealias type_class {
+  ...
+} := type_specifiers type_declarator;
+
+/*
+ * e.g.:
+ * typealias integer {
+ *   size = 32;
+ *   align = 32;
+ *   signed = false;
+ * } := struct page *;
+ *
+ * typealias integer {
+ *  size = 32;
+ *  align = 32;
+ *  signed = true;
+ * } := int;
+ */
+
+struct name {
+  ...
+};
+
+variant name {
+  ...
+};
+
+enum name : integer_type {
+  ...
+};
+
+
+/*
+ * Unnamed types, contained within compound type fields, typedef or typealias.
+ */
+
+struct {
+  ...
+}
+
+struct {
+  ...
+} align(value)
+
+variant {
+  ...
+}
+
+enum : integer_type {
+  ...
+}
+
+typedef type new_type[length];
+
+struct {
+  type field_name[length];
+}
+
+typedef type new_type[length_type];
+
+struct {
+  type field_name[length_type];
+}
+
+integer {
+  ...
+}
+
+floating_point {
+  ...
+}
+
+struct {
+  integer_type field_name:size;                /* GNU/C bitfield */
+}
+
+struct {
+  string field_name;
+}
+
+
+A. Helper macros
+
+The two following macros keep track of the size of a GNU/C structure without
+padding at the end by placing HEADER_END as the last field. A one byte end field
+is used for C90 compatibility (C99 flexible arrays could be used here). Note
+that this does not affect the effective structure size, which should always be
+calculated with the header_sizeof() helper.
+
+#define HEADER_END             char end_field
+#define header_sizeof(type)    offsetof(typeof(type), end_field)
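+
+A possible use of these macros is sketched below. The structure and the
+accessor function are hypothetical. The macros are restated with
+__typeof__, the spelling GCC also accepts in strict ISO modes, so that the
+block compiles on its own.
+
```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_END             char end_field
#define header_sizeof(type)    offsetof(__typeof__(type), end_field)

/* Hypothetical header: 8 + 2 bytes of data, then the end marker. */
struct example_header {
    uint64_t timestamp;
    uint16_t id;
    HEADER_END;
};

/* Size of the header without the trailing padding added by sizeof(). */
size_t example_header_size(void)
{
    return header_sizeof(struct example_header);
}
```
+
+Here example_header_size() returns 10, the offset of end_field, whereas
+sizeof(struct example_header) is larger because it counts the end field and
+the trailing padding.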
+
+
+B. Stream Header Rationale
+
+An event stream is divided into contiguous event packets of variable size.
+These subdivisions allow the trace analyzer to perform a fast binary search by
+time within the stream (typically requiring only the event packet headers to be
+indexed) without reading the whole stream. These subdivisions have a variable
+size to eliminate the need to transfer the event packet padding when partially
+filled event packets must be sent when streaming a trace for live
+viewing/analysis.
+An event packet can contain a certain amount of padding at the end. Dividing
+streams into event packets is also useful for network streaming over UDP and
+flight recorder mode tracing (a whole event packet can be swapped out of the
+buffer atomically for reading).
+
+The stream header is repeated at the beginning of each event packet to allow
+flexibility in terms of:
+
+  - streaming support,
+  - allowing arbitrary buffers to be discarded without making the trace
+    unreadable,
+  - handling UDP packet loss, by either dealing with missing event packets
+    or asking for re-transmission,
+  - transparently supporting flight recorder mode,
+  - transparently supporting crash dump.
+
+
+C. TSDL Grammar
+
+/*
+ * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
+ *
+ * Inspired by the C99 grammar:
+ * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
+ * and c++1x grammar (draft)
+ * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
+ *
+ * Specialized for CTF needs by including only constants and declarations from
+ * C99 (excluding function declarations), and by adding support for variants,
+ * sequences and CTF-specific specifiers. The semantics of enumeration
+ * container types are inspired by the c++1x enum-base.
+ */
+
+1) Lexical grammar
+
+1.1) Lexical elements
+
+token:
+       keyword
+       identifier
+       constant
+       string-literal
+       punctuator
+
+1.2) Keywords
+
+keyword: is one of
+
+align
+const
+char
+double
+enum
+event
+floating_point
+float
+integer
+int
+long
+short
+signed
+stream
+string
+struct
+trace
+typealias
+typedef
+unsigned
+variant
+void
+_Bool
+_Complex
+_Imaginary
+
+
+1.3) Identifiers
+
+identifier:
+       identifier-nondigit
+       identifier identifier-nondigit
+       identifier digit
+
+identifier-nondigit:
+       nondigit
+       universal-character-name
+       any other implementation-defined characters
+
+nondigit:
+       _
+       [a-zA-Z]        /* regular expression */
+
+digit:
+       [0-9]           /* regular expression */
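
The keyword list and the identifier rules above can be sketched as a small word
classifier. This is a minimal sketch under two assumptions: it handles only
ASCII (universal-character-names and implementation-defined characters are left
out), and the name `classify_word` is invented for the example.

```python
import re

# Keywords copied from section 1.2.
KEYWORDS = frozenset("""
align const char double enum event floating_point float integer int long
short signed stream string struct trace typealias typedef unsigned variant
void _Bool _Complex _Imaginary
""".split())

# A nondigit followed by any mix of nondigits and digits (ASCII only).
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify_word(word):
    """Return "keyword", "identifier", or None if word is neither."""
    if word in KEYWORDS:
        return "keyword"
    if IDENTIFIER_RE.fullmatch(word):
        return "identifier"
    return None
```

Checking the keyword set first matters: every keyword also matches the
identifier pattern, so the order of the two tests decides the result.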
+
+1.4) Universal character names
+
+universal-character-name:
+       \u hex-quad
+       \U hex-quad hex-quad
+
+hex-quad:
+       hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
+
+1.5) Constants
+
+constant:
+       integer-constant
+       enumeration-constant
+       character-constant
+
+integer-constant:
+       decimal-constant integer-suffix-opt
+       octal-constant integer-suffix-opt
+       hexadecimal-constant integer-suffix-opt
+
+decimal-constant:
+       nonzero-digit
+       decimal-constant digit
+
+octal-constant:
+       0
+       octal-constant octal-digit
+
+hexadecimal-constant:
+       hexadecimal-prefix hexadecimal-digit
+       hexadecimal-constant hexadecimal-digit
+
+hexadecimal-prefix:
+       0x
+       0X
+
+nonzero-digit:
+       [1-9]
+
+integer-suffix:
+       unsigned-suffix long-suffix-opt
+       unsigned-suffix long-long-suffix
+       long-suffix unsigned-suffix-opt
+       long-long-suffix unsigned-suffix-opt
+
+unsigned-suffix:
+       u
+       U
+
+long-suffix:
+       l
+       L
+
+long-long-suffix:
+       ll
+       LL
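
The integer-constant productions above can be checked with a single regular
expression. The sketch below assumes that octal-digit and hexadecimal-digit,
which this grammar does not define, have their C99 meaning ([0-7] and
[0-9a-fA-F]); the name `parse_integer_constant` is invented for the example.

```python
import re

# One alternative per integer-constant production; the suffix is optional.
# An octal-constant starts with 0, so a lone "0" is octal, as in C.
INTEGER_RE = re.compile(
    r"(?:(?P<hex>0[xX][0-9a-fA-F]+)"
    r"|(?P<oct>0[0-7]*)"
    r"|(?P<dec>[1-9][0-9]*))"
    r"(?P<suffix>[uU](?:ll|LL|[lL])?|(?:ll|LL|[lL])[uU]?)?"
)


def parse_integer_constant(text):
    """Return (value, is_unsigned) or raise ValueError."""
    m = INTEGER_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not an integer-constant: {text!r}")
    if m.group("hex"):
        value = int(m.group("hex"), 16)
    elif m.group("oct"):
        value = int(m.group("oct"), 8)
    else:
        value = int(m.group("dec"), 10)
    suffix = m.group("suffix") or ""
    return value, "u" in suffix.lower()
```

Note that the long-long-suffix is either `ll` or `LL`, so a mixed-case suffix
such as `lL` is rejected, and a leading 0 followed by 8 or 9 is not a valid
constant.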
+
+enumeration-constant:
+       identifier
+       string-literal
+
+character-constant:
+       ' c-char-sequence '
+       L' c-char-sequence '
+
+c-char-sequence:
+       c-char
+       c-char-sequence c-char
+
+c-char:
+       any member of source charset except single-quote ('), backslash
+       (\), or new-line character.
+       escape-sequence
+
+escape-sequence:
+       simple-escape-sequence
+       octal-escape-sequence
+       hexadecimal-escape-sequence
+       universal-character-name
+
+simple-escape-sequence: one of
+       \' \" \? \\ \a \b \f \n \r \t \v
+
+octal-escape-sequence:
+       \ octal-digit
+       \ octal-digit octal-digit
+       \ octal-digit octal-digit octal-digit
+
+hexadecimal-escape-sequence:
+       \x hexadecimal-digit
+       hexadecimal-escape-sequence hexadecimal-digit
+
+1.6) String literals
+
+string-literal:
+       " s-char-sequence-opt "
+       L" s-char-sequence-opt "
+
+s-char-sequence:
+       s-char
+       s-char-sequence s-char
+
+s-char:
+       any member of source charset except double-quote ("), backslash
+       (\), or new-line character.
+       escape-sequence
+
+1.7) Punctuators
+
+punctuator: one of
+       [ ] ( ) { } . -> * + - < > : ; ... = ,
+
+
+2) Phrase structure grammar
+
+2.1) Expressions
+
+primary-expression:
+       identifier
+       constant
+       string-literal
+       ( unary-expression )
+
+postfix-expression:
+       primary-expression
+       postfix-expression [ unary-expression ]
+       postfix-expression . identifier
+       postfix-expression -> identifier
+
+unary-expression:
+       postfix-expression
+       unary-operator postfix-expression
+
+unary-operator: one of
+       + -
+
+assignment-operator:
+       =
+
+type-assignment-operator:
+       :=
+
+constant-expression-range:
+       unary-expression ... unary-expression
+
+2.2) Declarations
+
+declaration:
+       declaration-specifiers declarator-list-opt ;
+       ctf-specifier ;
+
+declaration-specifiers:
+       storage-class-specifier declaration-specifiers-opt
+       type-specifier declaration-specifiers-opt
+       type-qualifier declaration-specifiers-opt
+
+declarator-list:
+       declarator
+       declarator-list , declarator
+
+abstract-declarator-list:
+       abstract-declarator
+       abstract-declarator-list , abstract-declarator
+
+storage-class-specifier:
+       typedef
+
+type-specifier:
+       void
+       char
+       short
+       int
+       long
+       float
+       double
+       signed
+       unsigned
+       _Bool
+       _Complex
+       _Imaginary
+       struct-specifier
+       variant-specifier
+       enum-specifier
+       typedef-name
+       ctf-type-specifier
+
+align-attribute:
+       align ( unary-expression )
+
+struct-specifier:
+       struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt
+       struct identifier align-attribute-opt
+
+struct-or-variant-declaration-list:
+       struct-or-variant-declaration
+       struct-or-variant-declaration-list struct-or-variant-declaration
+
+struct-or-variant-declaration:
+       specifier-qualifier-list struct-or-variant-declarator-list ;
+       declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ;
+       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ;
+       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ;
+
+specifier-qualifier-list:
+       type-specifier specifier-qualifier-list-opt
+       type-qualifier specifier-qualifier-list-opt
+
+struct-or-variant-declarator-list:
+       struct-or-variant-declarator
+       struct-or-variant-declarator-list , struct-or-variant-declarator
+
+struct-or-variant-declarator:
+       declarator
+       declarator-opt : unary-expression
+
+variant-specifier:
+       variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
+       variant identifier variant-tag
+
+variant-tag:
+       < identifier >
+
+enum-specifier:
+       enum identifier-opt { enumerator-list }
+       enum identifier-opt { enumerator-list , }
+       enum identifier
+       enum identifier-opt : declaration-specifiers { enumerator-list }
+       enum identifier-opt : declaration-specifiers { enumerator-list , }
+
+enumerator-list:
+       enumerator
+       enumerator-list , enumerator
+
+enumerator:
+       enumeration-constant
+       enumeration-constant assignment-operator unary-expression
+       enumeration-constant assignment-operator constant-expression-range
+
+type-qualifier:
+       const
+
+declarator:
+       pointer-opt direct-declarator
+
+direct-declarator:
+       identifier
+       ( declarator )
+       direct-declarator [ unary-expression ]
+
+abstract-declarator:
+       pointer-opt direct-abstract-declarator
+
+direct-abstract-declarator:
+       identifier-opt
+       ( abstract-declarator )
+       direct-abstract-declarator [ unary-expression ]
+       direct-abstract-declarator [ ]
+
+pointer:
+       * type-qualifier-list-opt
+       * type-qualifier-list-opt pointer
+
+type-qualifier-list:
+       type-qualifier
+       type-qualifier-list type-qualifier
+
+typedef-name:
+       identifier
+
+2.3) CTF-specific declarations
+
+ctf-specifier:
+       event { ctf-assignment-expression-list-opt }
+       stream { ctf-assignment-expression-list-opt }
+       trace { ctf-assignment-expression-list-opt }
+       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
+       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
+
+ctf-type-specifier:
+       floating_point { ctf-assignment-expression-list-opt }
+       integer { ctf-assignment-expression-list-opt }
+       string { ctf-assignment-expression-list-opt }
+       string
+
+ctf-assignment-expression-list:
+       ctf-assignment-expression ;
+       ctf-assignment-expression-list ctf-assignment-expression ;
+
+ctf-assignment-expression:
+       unary-expression assignment-operator unary-expression
+       unary-expression type-assignment-operator type-specifier
+       declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list
+       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
+       typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list