
   1 # Common Trace Format (CTF) Specification (v1.8.2)
   2
   3 **Author**: Mathieu Desnoyers, [EfficiOS Inc.](http://www.efficios.com/)
   4
   5 The goal of the present document is to specify a trace format that suits
   6 the needs of the embedded, telecom, high-performance and kernel
   7 communities. It is based on the
   8 [Common Trace Format Requirements (v1.4)](http://git.efficios.com/?p=ctf.git;a=blob_plain;f=common-trace-format-reqs.txt;hb=master)
   9 document. It is designed to allow traces to be natively generated by the
  10 Linux kernel, Linux user space applications written in C/C++, and
  11 hardware components. One major element of CTF is the Trace Stream
  12 Description Language (TSDL), whose flexibility enables the description
  13 of various binary trace stream layouts.
  14
  15 The latest version of this document can be found at:
  16
  17   * Git: `git clone git://git.efficios.com/ctf.git`
  18   * [Gitweb](http://git.efficios.com/?p=ctf.git)
  19
  20 A reference implementation of a library to read and write this trace
  21 format is being implemented within the
  22 [Babeltrace](http://www.efficios.com/babeltrace) project, a converter
  23 between trace formats. The development tree is available at:
  24
  25   * Git: `git clone git://git.efficios.com/babeltrace.git`
  26   * [Gitweb](http://git.efficios.com/?p=babeltrace.git)
  27
  28 The [CE Workgroup](http://www.linuxfoundation.org/collaborate/workgroups/celf)
  29 of the Linux Foundation, [Ericsson](http://www.ericsson.com/), and
  30 [EfficiOS](http://www.efficios.com/) have sponsored this work.
  31
  32 **Contents**:
  33
  34     1. Preliminary definitions
  35     2. High-level representation of a trace
  36     3. Event stream
  37     4. Types
  38       4.1 Basic types
  39         4.1.1 Type inheritance
  40         4.1.2 Alignment
  41         4.1.3 Byte order
  42         4.1.4 Size
  43         4.1.5 Integers
  44         4.1.6 GNU/C bitfields
  45         4.1.7 Floating point
  46         4.1.8 Enumerations
  47       4.2 Compound types
  48         4.2.1 Structures
  49         4.2.2 Variants (discriminated/tagged unions)
  50         4.2.3 Arrays
  51         4.2.4 Sequences
  52         4.2.5 Strings
  53     5. Event packet header
  54       5.1 Event packet header description
  55       5.2 Event packet context description
  56     6. Event structure
  57       6.1 Event header
  58         6.1.1 Type 1: few event IDs
  59         6.1.2 Type 2: many event IDs
  60       6.2 Stream event context and event context
  61       6.3 Event payload
  62         6.3.1 Padding
  63         6.3.2 Alignment
  64     7. Trace Stream Description Language (TSDL)
  65       7.1 Metadata
  66       7.2 Declaration vs definition
  67       7.3 TSDL scopes
  68         7.3.1 Lexical scope
  69         7.3.2 Static and dynamic scopes
  70       7.4 TSDL examples
  71     8. Clocks
  72     A. Helper macros
  73     B. Stream header rationale
  74     C. TSDL Grammar
  75       C.1 Lexical grammar
  76         C.1.1 Lexical elements
  77         C.1.2 Keywords
  78         C.1.3 Identifiers
  79         C.1.4 Universal character names
  80         C.1.5 Constants
  81         C.1.6 String literals
  82         C.1.7 Punctuators
  83       C.2 Phrase structure grammar
  84         C.2.2 Declarations:
  85         C.2.3 CTF-specific declarations
  86
  87
  88 ## 1. Preliminary definitions
  89
  90   * **Event trace**: an ordered sequence of events.
  91   * **Event stream**: an ordered sequence of events, containing a
  92     subset of the trace event types.
  93   * **Event packet**: a sequence of physically contiguous events within
  94     an event stream.
  95   * **Event**: this is the basic entry in a trace. Also known as
  96     a _trace record_.
  97     * An **event identifier** (ID) relates to the class (a type) of
  98       event within an event stream, e.g. event `irq_entry`.
  99     * An **event** (or event record) relates to a specific instance of
 100       an event class, e.g. event `irq_entry`, at time _X_, on CPU _Y_.
 101   * **Source architecture**: architecture writing the trace.
 102   * **Reader architecture**: architecture reading the trace.
 103
 104
 105 ## 2. High-level representation of a trace
 106
 107 A _trace_ is divided into multiple event streams. Each event stream
 108 contains a subset of the trace event types.
 109
 110 The final output of the trace, after its generation and optional
 111 transport over the network, is expected to be either on permanent or
 112 temporary storage in a virtual file system. Because each event stream
 113 is appended to while a trace is being recorded, each is associated with
 114 a distinct set of files for output. Therefore, a stored trace can be
 115 represented as a directory containing zero, one or more files
 116 per stream.
 117
 118 Metadata description associated with the trace contains information on
 119 trace event types expressed in the _Trace Stream Description Language_
 120 (TSDL). This language describes:
 121
 122   * Trace version
 123   * Types available
 124   * Per-trace event header description
 125   * Per-stream event header description
 126   * Per-stream event context description
 127   * Per-event
 128     * Event type to stream mapping
 129     * Event type to name mapping
 130     * Event type to ID mapping
 131     * Event context description
 132     * Event fields description
 133
 134
 135 ## 3. Event stream
 136
 137 An _event stream_ can be divided into contiguous event packets of
 138 variable size. An event packet can contain a certain amount of padding
 139 at the end. The stream header is repeated at the beginning of each
 140 event packet. The rationale for the event stream design choices is
 141 explained in [Stream header rationale](#specB).
 142
 143 The event stream header will therefore be referred to as the
 144 _event packet header_ throughout the rest of this document.
 145
 146
 147 ## 4. Types
 148
 149 Types are organized as type classes. Each type class belongs to one
 150 of two kinds of types: _basic types_ or _compound types_.
 151
 152
 153 ### 4.1 Basic types
 154
 155 A basic type is a scalar type, as described in this section. It
 156 includes integers, GNU/C bitfields, enumerations, and floating
 157 point values.
 158
 159
 160 #### 4.1.1 Type inheritance
 161
 162 Type specifications can be inherited to allow deriving types from a
 163 type class. For example, see the uint32_t named type derived from the
 164 [_integer_ type](#spec4.1.5) class. Types have a precise binary
 165 representation in the trace. A type class has methods to read and write
 166 these types, but must be derived into a type to be usable in an event
 167 field.
 168
 169
 170 #### 4.1.2 Alignment
 171
 172 We define _byte-packed_ types as aligned on the byte size, namely 8-bit.
 173 We define _bit-packed_ types as following on the next bit, as defined
 174 by the [Integers](#spec4.1.5) section.
 175
 176 Each basic type must specify its alignment, in bits. Examples of
 177 possible alignments are: bit-packed (`align = 1`), byte-packed
 178 (`align = 8`), or word-aligned (e.g. `align = 32` or `align = 64`).
 179 The choice depends on the architecture preference and compactness vs
 180 performance trade-offs of the implementation. Architectures providing
 181 fast unaligned writes can byte-pack basic types to save space, aligning
 182 each type on byte boundaries (8-bit). Architectures with slow unaligned
 183 writes align types on specific alignment values. If no specific
 184 alignment is declared for a type, it is assumed to be bit-packed for
 185 integers whose size is not a multiple of 8 bits and for GCC bitfields.
 186 All other basic types are byte-packed by default. It is however
 187 recommended to always specify the alignment explicitly. Alignment values
 188 must be a power of two. Compound types are aligned as specified in
 189 their individual specification.
 190
 191 The base offset used for field alignment is the start of the packet
 192 containing the field. For instance, a field aligned on 32-bit needs to
 193 be at an offset multiple of 32-bit from the start of the packet that
 194 contains it.
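
The alignment rule above can be sketched as a small helper. This is only an illustration: the function name `align_offset` is hypothetical and not part of CTF, and offsets are assumed to be counted in bits from the start of the packet.

```python
def align_offset(offset_bits: int, align_bits: int) -> int:
    """Round a bit offset (from the packet start) up to the next
    multiple of align_bits. Hypothetical helper, not part of CTF."""
    # The specification requires alignment values to be a power of two.
    if align_bits <= 0 or align_bits & (align_bits - 1):
        raise ValueError("alignment must be a power of two")
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)


print(align_offset(33, 32))  # 64: next 32-bit boundary after bit 33
print(align_offset(64, 32))  # 64: already aligned
print(align_offset(5, 1))    # 5: bit-packed, no padding
print(align_offset(5, 8))    # 8: byte-packed
```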
 195
 196 TSDL metadata attribute representation of a specific alignment:
 197
 198 ~~~ tsdl
 199 align = /* value in bits */;
 200 ~~~
 201
 202 #### 4.1.3 Byte order
 203
 204 By default, byte order of a basic type is the byte order described in
 205 the trace description. It can be overridden by specifying a
 206 `byte_order` attribute for a basic type.  Typical use-case is to specify
 207 the network byte order (big endian: `be`) to save data captured from
 208 the network into the trace without conversion.
 209
 210 TSDL metadata representation:
 211
 212 ~~~ tsdl
 213 /* network and be are aliases */
 214 byte_order = /* native OR network OR be OR le */;
 215 ~~~
 216
 217 The `native` keyword selects the byte order described in the trace
 218 description. The `network` byte order is an alias for big endian.
 219
 220 Even though the trace description section is not per se a type, for
 221 the sake of clarity, it should be noted that `native` and `network` byte
 222 orders are only allowed within type declarations. The `byte_order`
 223 specified in the trace description section only accepts `be` or `le`
 224 values.
 225
 226
 227 #### 4.1.4 Size
 228
 229 Type size, in bits, for integers and floats is that returned by
 230 `sizeof()` in C multiplied by `CHAR_BIT`. We require the size of `char`
 231 and `unsigned char` types (`CHAR_BIT`) to be fixed to 8 bits for
 232 cross-endianness compatibility.
 233
 234 TSDL metadata representation:
 235
 236 ~~~ tsdl
 237 size = /* value is in bits */;
 238 ~~~
 239
 240
 241 #### 4.1.5 Integers
 242
 243 Signed integers are represented in two's complement. Integer alignment,
 244 size, signedness and byte ordering are defined in the TSDL metadata.
 245 Integers aligned on byte size (8-bit) and with length multiple of byte
 246 size (8-bit) correspond to the C99 standard integers. In addition,
 247 integers with alignment and/or size that are _not_ a multiple of the
 248 byte size are permitted; these correspond to the C99 standard bitfields,
 249 with the added specification that the CTF integer bitfields have a fixed
 250 binary representation. Integer size needs to be a positive integer.
 251 Integers of size 0 are **forbidden**. An MIT-licensed reference
 252 implementation of the CTF portable bitfields is available
 253 [here](http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h).
 254
 255 Binary representation of integers:
 256
 257   * On little and big endian:
 258     * Within a byte, high bits correspond to an integer high bits, and
 259       low bits correspond to low bits
 260   * On little endian:
 261     * Integers spanning multiple bytes are placed from the least
 262       significant byte to the most significant byte
 263     * Consecutive integers are placed from lower bits to higher bits
 264       (even within a byte)
 265   * On big endian:
 266     * Integers spanning multiple bytes are placed from the most
 267       significant byte to the least significant byte
 268     * Consecutive integers are placed from higher bits to lower bits
 269       (even within a byte)
 270
 271 This binary representation is derived from the bitfield implementation
 272 in GCC for little and big endian. However, contrary to what GCC does,
 273 integers can cross unit boundaries (no padding is required). Padding
 274 can be [explicitly added](#spec4.1.6) to follow the GCC layout if needed.
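
As a rough illustration of the bit placement rules above, the following sketch reads one integer of arbitrary size from an arbitrary bit offset. The function `read_integer` and its parameters are hypothetical; it is a slow, bit-by-bit reading aid under the stated rules, not the reference implementation.

```python
def read_integer(buf: bytes, offset_bits: int, size_bits: int,
                 byte_order: str = "le", signed: bool = False) -> int:
    """Read a CTF-style integer of size_bits starting at offset_bits.
    Hypothetical helper illustrating the layout rules only."""
    if size_bits <= 0:
        raise ValueError("integer size must be a positive number of bits")
    if offset_bits < 0 or offset_bits + size_bits > len(buf) * 8:
        raise ValueError("integer lies outside the buffer")
    value = 0
    for k in range(size_bits):
        pos = offset_bits + k
        byte = buf[pos // 8]
        if byte_order == "le":
            # Little endian: the stream fills each byte from its low bit
            # upward, and the first bit read is the least significant.
            value |= ((byte >> (pos % 8)) & 1) << k
        elif byte_order == "be":
            # Big endian: the stream fills each byte from its high bit
            # downward, and the first bit read is the most significant.
            value = (value << 1) | ((byte >> (7 - pos % 8)) & 1)
        else:
            raise ValueError("byte_order must be 'le' or 'be'")
    if signed and value >> (size_bits - 1):
        value -= 1 << size_bits  # two's complement sign extension
    return value


# Two consecutive 4-bit integers packed in the single byte 0xAB:
print(read_integer(b"\xab", 0, 4, "le"), read_integer(b"\xab", 4, 4, "le"))  # 11 10
print(read_integer(b"\xab", 0, 4, "be"), read_integer(b"\xab", 4, 4, "be"))  # 10 11
```

Integers that cross byte boundaries need no special case in this sketch, which matches the statement that no padding is required between units.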
 275
 276 TSDL metadata representation:
 277
 278 ~~~ tsdl
 279 integer {
 280     signed = /* true OR false */;                     /* default: false */
 281     byte_order = /* native OR network OR be OR le */; /* default: native */
 282     size = /* value in bits */;                       /* no default */
 283     align = /* value in bits */;
 284
 285     /* base used for pretty-printing output; default: decimal */
 286     base = /* decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex
 287               OR x OR X OR p OR 16 OR octal OR oct OR o OR 8 OR binary
 288               OR b OR 2 */;
 289
 290     /* character encoding */
 291     encoding = /* none OR UTF8 OR ASCII */;           /* default: none */
 292 }
 293 ~~~
 294
 295 Example of type inheritance (creation of a `uint32_t` named type):
 296
 297 ~~~ tsdl
 298 typealias integer {
 299     size = 32;
 300     signed = false;
 301     align = 32;
 302 } := uint32_t;
 303 ~~~
 304
 305 Definition of a named 5-bit signed bitfield:
 306
 307 ~~~ tsdl
 308 typealias integer {
 309     size = 5;
 310     signed = true;
 311     align = 1;
 312 } := int5_t;
 313 ~~~
 314
 315 The character encoding field can be used to specify that the integer
 316 must be printed as a text character when read, e.g.:
 317
 318 ~~~ tsdl
 319 typealias integer {
 320     size = 8;
 321     align = 8;
 322     signed = false;
 323     encoding = UTF8;
 324 } := utf_char;
 325 ~~~
 326
 327 #### 4.1.6 GNU/C bitfields
 328
 329 The GNU/C bitfields follow closely the integer representation, with a
 330 particularity on alignment: if a bitfield cannot fit in the current
 331 unit, the unit is padded and the bitfield starts at the following unit.
 332 The unit size is defined by the size of the type `unit_type`.
 333
 334 TSDL metadata representation:
 335
 336 ~~~ tsdl
 337 unit_type name:size;
 338 ~~~
 339
 340 As an example, consider the following C structure, compiled by GCC:
 341
 342 ~~~ tsdl
 343 struct example {
 344     short a:12;
 345     short b:5;
 346 };
 347 ~~~
 348
 349 The example structure is aligned on the largest element (short). The
 350 second bitfield would be aligned on the next unit boundary, because it
 351 would not fit in the current unit.
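
The unit padding rule can be sketched as follows. The helper `layout_bitfields` is hypothetical, and the example assumes a 16-bit `short`, which is common but not guaranteed by C.

```python
def layout_bitfields(unit_bits: int, sizes: list[int]) -> list[int]:
    """Return the starting bit offset of each bitfield when a field that
    does not fit in the current unit starts at the following unit.
    Hypothetical helper illustrating the GNU/C rule only."""
    offsets = []
    pos = 0
    for size in sizes:
        if not 0 < size <= unit_bits:
            raise ValueError("bitfield must fit within one unit")
        room = unit_bits - pos % unit_bits  # bits left in the current unit
        if size > room:
            pos += room  # pad the rest of the current unit
        offsets.append(pos)
        pos += size
    return offsets


# struct example { short a:12; short b:5; } with an assumed 16-bit short:
print(layout_bitfields(16, [12, 5]))  # [0, 16]
```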
 352
 353
 354 #### 4.1.7 Floating point
 355
 356 The byte order of floating point values is defined in the TSDL metadata.
 357
 358 Floating point values follow the IEEE 754-2008 standard interchange
 359 formats. The description of floating point values includes the exponent
 360 and mantissa size in bits. Some requirements are imposed on the
 361 floating point values:
 362
 363 * `FLT_RADIX` must be 2.
 364 * `mant_dig` is the number of digits represented in the mantissa. It is
 365   specified by the ISO C99 standard, section 5.2.4, as `FLT_MANT_DIG`,
 366   `DBL_MANT_DIG` and `LDBL_MANT_DIG` as defined by `<float.h>`.
 367 * `exp_dig` is the number of digits represented in the exponent. Given
 368   that `mant_dig` is one bit more than its actual size in bits (leading
 369   1 is not needed) and also given that the sign bit always takes one
 370   bit, `exp_dig` can be specified as:
 371   * `sizeof(float) * CHAR_BIT - FLT_MANT_DIG`
 372   * `sizeof(double) * CHAR_BIT - DBL_MANT_DIG`
 373   * `sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG`
 374
 375 TSDL metadata representation:
 376
 377 ~~~ tsdl
 378 floating_point {
 379     exp_dig = /* value */;
 380     mant_dig = /* value */;
 381     byte_order = /* native OR network OR be OR le */;
 382     align = /* value */;
 383 }
 384 ~~~
 385
 386 Example of type inheritance:
 387
 388 ~~~ tsdl
 389 typealias floating_point {
 390     exp_dig = 8;         /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
 391     mant_dig = 24;       /* FLT_MANT_DIG */
 392     byte_order = native;
 393     align = 32;
 394 } := float;
 395 ~~~
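
The `exp_dig` formula can be checked with a short calculation. This sketch assumes an IEEE 754 platform; since Python does not expose `FLT_MANT_DIG`, the value 24 for binary32 is hard-coded as an assumption, while the double-precision value comes from `sys.float_info`.

```python
import struct
import sys

CHAR_BIT = 8  # the specification fixes CHAR_BIT to 8 bits


def exp_dig(size_bytes: int, mant_dig: int) -> int:
    """sizeof(type) * CHAR_BIT - *_MANT_DIG, as given in the list above."""
    return size_bytes * CHAR_BIT - mant_dig


FLT_MANT_DIG = 24  # assumption: IEEE 754 binary32 single precision

float_exp_dig = exp_dig(struct.calcsize("=f"), FLT_MANT_DIG)
double_exp_dig = exp_dig(struct.calcsize("=d"), sys.float_info.mant_dig)

# On IEEE 754 platforms this prints "8 11".
print(float_exp_dig, double_exp_dig)
```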
 396
 397 TODO: define NaN, +inf, -inf behavior.
 398
 399 Bit-packed, byte-packed or larger alignments can be used for floating
 400 point values, similarly to integers.
 401
 402
 403 #### 4.1.8 Enumerations
 404
 405 Enumerations are a mapping between an integer type and a table of
 406 strings. The numerical representation of the enumeration follows the
 407 integer type specified by the metadata. The enumeration mapping table
 408 is detailed in the enumeration description within the metadata. The
 409 mapping table maps inclusive value ranges (or single values) to strings.
 410 Instead of being limited to simple `value -> string` mappings, these
 411 enumerations map `[ start_value ... end_value ] -> string`, which map
 412 inclusive ranges of values to strings. An enumeration from the C
 413 language can be represented in this format by having the same
 414 `start_value` and `end_value` for each mapping, which is in fact a
 415 range of size 1. This single-value range is supported without repeating
 416 the start and end values with the `value = string` declaration.
 417 Enumerations need to contain at least one entry.
 418
 419 ~~~ tsdl
 420 enum name : integer_type {
 421     somestring          = /* start_value1 */ ... /* end_value1 */,
 422     "other string"      = /* start_value2 */ ... /* end_value2 */,
 423     yet_another_string,   /* will be assigned to end_value2 + 1 */
 424     "some other string" = /* value */,
 425     /* ... */
 426 }
 427 ~~~
 428
 429 If the values are omitted, the enumeration starts at 0 and increments
 430 by 1 for each entry. An entry with omitted value that follows a range
 431 entry takes as value the `end_value` of the previous range + 1:
 432
 433 ~~~ tsdl
 434 enum name : unsigned int {
 435     ZERO,
 436     ONE,
 437     TWO,
 438     TEN = 10,
 439     ELEVEN,
 440 }
 441 ~~~
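
The value assignment rules can be sketched as a small resolver. The names `resolve_enum` and `lookup` are hypothetical, as is the tuple representation of entries. `lookup` returns every matching label because the handling of overlapping ranges is left to the implementation.

```python
def resolve_enum(entries):
    """entries: list of (label, start, end); start and end may be None.
    Returns a list of (label, start, end) with inclusive ranges filled in.
    Hypothetical helper illustrating the assignment rules only."""
    if not entries:
        raise ValueError("enumerations need at least one entry")
    resolved = []
    next_value = 0  # an enumeration with omitted values starts at 0
    for label, start, end in entries:
        if start is None:
            start = end = next_value  # previous end_value + 1
        elif end is None:
            end = start  # `value = string` form: a range of size 1
        if end < start:
            raise ValueError(f"empty range for {label!r}")
        resolved.append((label, start, end))
        next_value = end + 1
    return resolved


def lookup(resolved, value):
    """Return all labels whose inclusive range contains value."""
    return [label for label, start, end in resolved if start <= value <= end]


table = resolve_enum([("ZERO", None, None), ("ONE", None, None),
                      ("TWO", None, None), ("TEN", 10, None),
                      ("ELEVEN", None, None)])
print(lookup(table, 11))  # ['ELEVEN']
```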
 442
 443 The handling of overlapping ranges within a single enumeration is
 444 implementation-defined.
 445
 446 A nameless enumeration can be declared as a field type or as part of
 447 a `typedef`:
 448
 449 ~~~ tsdl
 450 enum : integer_type {
 451     /* ... */
 452 }
 453 ~~~
 454
 455 Enumerations omitting the container type `: integer_type` use the `int`
 456 type (for compatibility with C99). The `int` type _must be_ previously
 457 declared, e.g.:
 458
 459 ~~~ tsdl
 460 typealias integer { size = 32; align = 32; signed = true; } := int;
 461
 462 enum {
 463     /* ... */
 464 }
 465 ~~~
 466
 467 ### 4.2 Compound types
 468
 469 Compound types are aggregations of type declarations. Compound types
 470 include structures, variants, arrays, sequences, and strings.
 471
 472
 473 #### 4.2.1 Structures
 474
 475 Structures are aligned on the largest alignment required by basic types
 476 contained within the structure. (This follows the ISO/C standard for
 477 structures.)
 478
 479 TSDL metadata representation of a named structure:
 480
 481 ~~~ tsdl
 482 struct name {
 483     field_type field_name;
 484     field_type field_name;
 485     /* ... */
 486 };
 487 ~~~
 488
 489 Example:
 490
 491 ~~~ tsdl
 492 struct example {
 493     integer {                   /* nameless type */
 494         size = 16;
 495         signed = true;
 496         align = 16;
 497     } first_field_name;
 498     uint64_t second_field_name; /* named type declared in the metadata */
 499 };
 500 ~~~
 501
 502 The fields are placed in a sequence next to each other. They each
 503 possess a field name, which is a unique identifier within the structure.
 504 The identifier is not allowed to use any [reserved keyword](#specC.1.2).
 505 Replacing reserved keywords with underscore-prefixed field names is
 506 **recommended**. Fields starting with an underscore should have their
 507 leading underscore removed by the CTF trace readers.
 508
 509 A nameless structure can be declared as a field type or as part of
 510 a `typedef`:
 511
 512 ~~~ tsdl
 513 struct {
 514     /* ... */
 515 }
 516 ~~~
 517
 518 Alignment for a structure compound type can be forced to a minimum
 519 value by adding an `align` specifier after the declaration of a
 520 structure body. This attribute is read as: `align(value)`. The value is
 521 specified in bits. The structure will be aligned on the maximum value
 522 between this attribute and the alignment required by the basic types
 523 contained within the structure, e.g.:
 524
 525 ~~~ tsdl
 526 struct {
 527     /* ... */
 528 } align(32)
 529 ~~~
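
The resulting structure alignment is a simple maximum, sketched below. The function `struct_alignment` is hypothetical, and the example assumes that `uint64_t` is declared with `align = 64`.

```python
def struct_alignment(field_alignments, min_align: int = 1) -> int:
    """Alignment of a structure, in bits: the largest alignment required
    by its basic-type fields, raised to min_align when an align(value)
    specifier is present. Hypothetical helper, not part of CTF."""
    return max([min_align, *field_alignments])


# 16-bit aligned field followed by a uint64_t assumed to be 64-bit aligned:
print(struct_alignment([16, 64]))              # 64
# Small fields with a forced minimum of align(32):
print(struct_alignment([8, 16], min_align=32))  # 32
```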
 530
 531 #### 4.2.2 Variants (discriminated/tagged unions)
 532
 533 A CTF variant is a selection between different types. A CTF variant must
 534 always be defined within the scope of a structure or within fields
 535 contained within a structure (defined recursively). A _tag_ enumeration
 536 field must appear in either the same static scope, prior to the variant
 537 field (in field declaration order), in an upper static scope, or in an
 538 upper dynamic scope (see [Static and dynamic scopes](#spec7.3.2)).
 539 The type selection is indicated by the mapping from the enumeration
 540 value to the string used as variant type selector. The field to use as
 541 tag is specified by the `tag_field`, specified between `< >` after the
 542 `variant` keyword for unnamed variants, and after _variant name_ for
 543 named variants. It is not required that each enumeration mapping appear
 544 as a variant type tag field. It is also not required that each variant
 545 type tag appear as an enumeration mapping. However, it is required that
 546 any enumeration mapping encountered within a stream has a matching
 547 variant type tag field.
 548
 549 The alignment of the variant is the alignment of the type as selected
 550 by the tag value for the specific instance of the variant. The size of
 551 the variant is the size as selected by the tag value for the specific
 552 instance of the variant.
 553
 554 The alignment of the type containing the variant is independent of the
 555 variant alignment. For instance, if a structure contains two fields, a
 556 32-bit integer, aligned on 32 bits, and a variant, which contains two
 557 choices: either a 32-bit field, aligned on 32 bits, or a 64-bit field,
 558 aligned on 64 bits, the alignment of the outermost structure will be
 559 32-bit (the alignment of its largest field, disregarding the alignment
 560 of the variant). The alignment of the variant will depend on the
 561 selector: if the variant's 32-bit field is selected, its alignment will
 562 be 32-bit, or 64-bit otherwise. It is important to note that variants
 563 are specifically tailored for compactness in a stream. Therefore, the
 564 relative offsets of compound type fields can vary depending on the
 565 offset at which the compound type starts if it contains a variant
 566 that itself contains a type with alignment larger than the largest field
 567 contained within the compound type. This is caused by the fact that the
 568 compound type may contain the enumeration that selects the variant's
 569 choice, and therefore the alignment to be applied to the compound type
 570 cannot be determined before encountering the enumeration.
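
The effect described above can be made concrete with the two-field structure from this paragraph: a 32-bit integer aligned on 32 bits, followed by a variant. The helper names below are hypothetical, and offsets are in bits from the start of the packet.

```python
def align_offset(offset_bits: int, align_bits: int) -> int:
    """Round offset_bits up to a multiple of align_bits (a power of two)."""
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)


def variant_relative_offset(struct_start: int, selected_align: int) -> int:
    """Offset of the variant relative to the start of a structure holding
    a 32-bit integer (aligned on 32 bits) followed by the variant.
    Hypothetical helper illustrating the paragraph above."""
    int_offset = align_offset(struct_start, 32)
    variant_offset = align_offset(int_offset + 32, selected_align)
    return variant_offset - struct_start


# The 64-bit choice lands at a different relative offset depending on
# where the 32-bit aligned structure starts within the packet:
print(variant_relative_offset(0, 64))   # 64
print(variant_relative_offset(32, 64))  # 32
# The 32-bit choice does not show this variation:
print(variant_relative_offset(0, 32), variant_relative_offset(32, 32))  # 32 32
```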
 571
 572 Each variant type selector possesses a field name, which is a unique
 573 identifier within the variant. The identifier is not allowed to use any
 574 [reserved keyword](#specC.1.2). Replacing reserved keywords with
 575 underscore-prefixed field names is recommended. Fields starting with an
 576 underscore should have their leading underscore removed by the CTF trace
 577 readers.
 578
 579 A named variant declaration followed by its definition within a
 580 structure declaration:
 581
 582 ~~~ tsdl
 583 variant name {
 584     field_type sel1;
 585     field_type sel2;
 586     field_type sel3;
 587     /* ... */
 588 };
 589
 590 struct {
 591     enum : integer_type { sel1, sel2, sel3, /* ... */ } tag_field;
 592     /* ... */
 593     variant name <tag_field> v;
 594 }
 595 ~~~
 596
 597 An unnamed variant definition within a structure is expressed by the
 598 following TSDL metadata:
 599
 600 ~~~ tsdl
 601 struct {
 602     enum : integer_type { sel1, sel2, sel3, /* ... */ } tag_field;
 603     /* ... */
 604     variant <tag_field> {
 605         field_type sel1;
 606         field_type sel2;
 607         field_type sel3;
 608         /* ... */
 609     } v;
 610 }
 611 ~~~
 612
 613 Example of a named variant within a sequence that refers to a single
 614 tag field:
 615
 616 ~~~ tsdl
 617 variant example {
 618     uint32_t a;
 619     uint64_t b;
 620     short c;
 621 };
 622
 623 struct {
 624     enum : uint2_t { a, b, c } choice;
 625     unsigned int seqlen;
 626     variant example <choice> v[seqlen];
 627 }
 628 ~~~
 629
 630 Example of an unnamed variant:
 631
 632 ~~~ tsdl
 633 struct {
 634     enum : uint2_t { a, b, c, d } choice;
 635
 636     /* Unrelated fields can be added between the variant and its tag */
 637     int32_t somevalue;
 638     variant <choice> {
 639         uint32_t a;
 640         uint64_t b;
 641         short c;
 642         struct {
 643             unsigned int field1;
 644             uint64_t field2;
 645         } d;
 646     } s;
 647 }
 648 ~~~
 649
 650 Example of an unnamed variant within an array:
 651
 652 ~~~ tsdl
 653 struct {
 654     enum : uint2_t { a, b, c } choice;
 655     variant <choice> {
 656         uint32_t a;
 657         uint64_t b;
 658         short c;
 659     } v[10];
 660 }
 661 ~~~
 662
 663 Example of a variant type definition within a structure, where the
 664 defined type is then declared within an array of structures. This
 665 variant refers to a tag located in an upper static scope. This example
 666 clearly shows that a variant type definition referring to the tag `x`
 667 uses the closest preceding field from the static scope of the type
 668 definition.
 669
 670 ~~~ tsdl
 671 struct {
 672     enum : uint2_t { a, b, c, d } x;
 673
 674     /*
 675      * "x" refers to the preceding "x" enumeration in the
 676      * static scope of the type definition.
 677      */
 678     typedef variant <x> {
 679       uint32_t a;
 680       uint64_t b;
 681       short c;
 682     } example_variant;
 683
 684     struct {
 685       enum : int { x, y, z } x; /* This enumeration is not used by "v". */
 686
 687       /* "v" uses the "enum : uint2_t { a, b, c, d }" tag. */
 688       example_variant v;
 689     } a[10];
 690 }
 691 ~~~
 692
 693
 694 #### 4.2.3 Arrays
 695
 696 Arrays are fixed-length. Their length is declared in the type
 697 declaration within the metadata. They contain an array of _inner type_
 698 elements, which can refer to any type not containing the type of the
 699 array being declared (no circular dependency). The length is the number
 700 of elements in an array.
 701
 702 TSDL metadata representation of a named array:
 703
 704 ~~~ tsdl
 705 typedef elem_type name[/* length */];
 706 ~~~
 707
 708 A nameless array can be declared as a field type within a
 709 structure, e.g.:
 710
 711 ~~~ tsdl
 712 uint8_t field_name[10];
 713 ~~~
 714
 715 Arrays are always aligned on their element alignment requirement.
 716
 717
 718 #### 4.2.4 Sequences
 719
 720 Sequences are dynamically-sized arrays. They refer to a _length_
 721 unsigned integer field, which must appear in either the same static
 722 scope, prior to the sequence field (in field declaration order),
 723 in an upper static scope, or in an upper dynamic scope
 724 (see [Static and dynamic scopes](#spec7.3.2)). This length field represents
 725 the number of elements in the sequence. The sequence per se is an
 726 array of _inner type_ elements.
 727
 728 TSDL metadata representation for a sequence type definition:
 729
 730 ~~~ tsdl
 731 struct {
 732     unsigned int length_field;
 733     typedef elem_type typename[length_field];
 734     typename seq_field_name;
 735 }
 736 ~~~
 737
 738 A sequence can also be declared as a field type, e.g.:
 739
 740 ~~~ tsdl
 741 struct {
 742     unsigned int length_field;
 743     long seq_field_name[length_field];
 744 }
 745 ~~~
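
A reader for the structure above might look like the following sketch. It assumes, purely for illustration, that `unsigned int` is a 32-bit integer aligned on 32 bits, that `long` is a 64-bit signed integer aligned on 64 bits, and that the trace is little endian. The function name `read_sequence` is hypothetical, and offsets are in bytes from the start of the packet.

```python
import struct


def read_sequence(buf: bytes, offset: int = 0):
    """Decode { unsigned int length_field; long seq[length_field]; }.
    Returns the list of elements and the byte offset following them.
    Hypothetical helper under the type assumptions stated above."""
    (length,) = struct.unpack_from("<I", buf, offset)
    # Elements follow the array rules: align on the element alignment
    # (64 bits, i.e. 8 bytes, under the stated assumption).
    pos = (offset + 4 + 7) & ~7
    elements = struct.unpack_from(f"<{length}q", buf, pos)
    return list(elements), pos + 8 * length


# Length 3, four bytes of alignment padding, then three 64-bit elements.
data = struct.pack("<I4x3q", 3, 10, -20, 30)
print(read_sequence(data))  # ([10, -20, 30], 32)
```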
 746
 747 Multiple sequences can refer to the same length field, and these length
 748 fields can be in a different upper dynamic scope, e.g., assuming the
 749 `stream.event.header` defines:
 750
 751 ~~~ tsdl
 752 stream {
 753     /* ... */
 754     id = 1;
 755     event.header := struct {
 756         uint16_t seq_len;
 757     };
 758 };
 759
 760 event {
 761     /* ... */
 762     stream_id = 1;
 763     fields := struct {
 764         long seq_a[stream.event.header.seq_len];
 765         char seq_b[stream.event.header.seq_len];
 766     };
 767 };
 768 ~~~
 769
 770 The sequence elements follow the [array](#spec4.2.3) specifications.
 771
 772
 773 #### 4.2.5 Strings
 774
 775 Strings are an array of _bytes_ of variable size and are terminated by
 776 a `'\0'` "NUL" character. Their encoding is described in the TSDL
 777 metadata. In the absence of encoding attribute information, the default
 778 encoding is UTF-8.
 779
 780 TSDL metadata representation of a named string type:
 781
 782 ~~~ tsdl
 783 typealias string {
 784     encoding = /* UTF8 OR ASCII */;
 785 } := name;
 786 ~~~
 787
 788 A nameless string type can be declared as a field type:
 789
 790 ~~~ tsdl
 791 string field_name; /* use default UTF8 encoding */
 792 ~~~
 793
 794 Strings are always aligned on byte size.
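
Reading such a string can be sketched as follows. The function `read_string` is hypothetical; it takes a bit offset from the start of the packet, rounds it up to a byte boundary, and returns the decoded text together with the bit offset following the terminator.

```python
def read_string(buf: bytes, offset_bits: int, encoding: str = "utf-8"):
    """Read a NUL-terminated string aligned on a byte boundary.
    Hypothetical helper illustrating the string layout only."""
    start = (offset_bits + 7) // 8     # strings are aligned on byte size
    end = buf.index(b"\x00", start)    # raises ValueError if unterminated
    return buf[start:end].decode(encoding), (end + 1) * 8


data = b"\x01irq_entry\x00rest"
print(read_string(data, 8))  # ('irq_entry', 88)
```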
 795
 796
 797 ## 5. Event packet header
 798
 799 The event packet header consists of two parts. The first part, the
 800 _event packet header_, is the same for all streams of a trace. The
 801 second part, the _event packet context_, is described on a per-stream
 802 basis. Both are described in the TSDL metadata.
 803
 804 Event packet header (all fields are optional, specified by
 805 TSDL metadata):
 806
 807   * **Magic number** (CTF magic number: 0xC1FC1FC1) specifies that this is
 808     a CTF packet. This magic number is optional, but when present, it
 809     should come at the very beginning of the packet.
 810   * **Trace UUID**, used to ensure the event packet matches the metadata used.
 811     Note: we cannot use a metadata checksum in every case instead of a
 812     UUID because metadata can be appended to while tracing is active.
 813     This field is optional.
 814   * **Stream ID**, used as reference to stream description in metadata.
 815     This field is optional if there is only one stream description in
 816     the metadata, but becomes required if there is more than one
 817     stream in the TSDL metadata description.
 818
 819 Event packet context (all fields are optional, specified by
 820 TSDL metadata):
 821
 822   * Event packet **content size** (in bits).
 823   * Event packet **size** (in bits, includes padding).
 824   * Event packet content checksum. Checksum excludes the event packet
 825     header.
 826   * Per-stream event **packet sequence count** (to deal with UDP packet
 827     loss). The number of significant sequence counter bits should also
 828     be present, so wrap-arounds are dealt with correctly.
 829   * Timestamp at the beginning and timestamp at the end of the event
 830     packet. Both timestamps are written in the packet header, but
 831     sampled respectively while (or before) writing the first event and
 832     while (or after) writing the last event in the packet. The inclusive
 833     range between these timestamps should include all event timestamps
 834     assigned to events contained within the packet. The timestamp at the
 835     beginning of an event packet is guaranteed to be less than or equal
 836     to the timestamp at the end of that event packet. The timestamp at
 837     the end of an event packet is guaranteed to be less than or equal to
 838     the timestamp at the end of any following packet within the same stream.
 839     See [Clocks](#spec8) for more detail.
 840   * **Events discarded count**. Snapshot of a per-stream
 841     free-running counter, counting the number of events discarded that
 842     were supposed to be written in the stream after the last event in
 843     the event packet. Note: producer-consumer buffer full condition can
 844     fill the current event packet with padding so we know exactly where
 845     events have been discarded. However, if the buffer full condition
 846     chooses not to fill the current event packet with padding, all we
 847     know about the timestamp range in which the events have been
 848     discarded is that it is somewhere between the beginning and the end
 849     of the packet.
 850   * Lossless **compression scheme** used for the event packet content.
 851     Applied directly to raw data. New types of compression can be added
 852     in following versions of the format.
 853     * 0: no compression scheme
 854     * 1: bzip2
 855     * 2: gzip
 856     * 3: xz
 857   * **Cipher** used for the event packet content. Applied after
 858     compression.
 859     * 0: no encryption
 860     * 1: AES
 861   * **Checksum scheme** used for the event packet content. Applied after
 862     encryption.
 863     * 0: no checksum
 864     * 1: md5
 865     * 2: sha1
 866     * 3: crc32
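
The remark about sequence counter wrap-around can be illustrated with a short C sketch. It assumes a hypothetical reader that knows the number of significant counter bits and has the sequence counts of two consecutively received packets of the same stream. The name `ctf_lost_packets()` is illustrative only and is not defined by CTF.

```c
#include <stdint.h>

/*
 * Number of event packets lost between two consecutively received
 * packets of the same stream.
 *
 * prev_seq, cur_seq: packet sequence counts read from the packet contexts
 * bits: number of significant sequence counter bits (1 to 64)
 *
 * A loss of 2^bits packets or more cannot be distinguished from a
 * smaller loss.
 */
uint64_t ctf_lost_packets(uint64_t prev_seq, uint64_t cur_seq,
                          unsigned int bits)
{
    uint64_t mask = (bits >= 64) ? UINT64_MAX
                                 : ((UINT64_C(1) << bits) - 1);

    /* Unsigned modular subtraction absorbs the counter wrap-around. */
    return (cur_seq - prev_seq - 1) & mask;
}
```

With a 16-bit counter, receiving sequence count 1 right after 65534 means that packets 65535 and 0 are missing, which is why the number of significant bits has to be known to the reader.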
 867
 868
 869 ### 5.1 Event packet header description
 870
 871 The event packet header layout is indicated by the
 872 `trace.packet.header` field. Here is a recommended structure type for
 873 the packet header with the fields typically expected (although these
 874 fields are each optional):
 875
 876 ~~~ tsdl
 877 struct event_packet_header {
 878     uint32_t magic;
 879     uint8_t  uuid[16];
 880     uint32_t stream_id;
 881 };
 882
 883 trace {
 884     /* ... */
 885     packet.header := struct event_packet_header;
 886 };
 887 ~~~
 888
 889 If the magic number (`magic` field) is not present,
 890 tools such as `file` will have no means to discover the file type.
 891
 892 If the `uuid` field is not present, no validation that the metadata
 893 actually corresponds to the stream is performed.
 894
 895 If the `stream_id` packet header field is missing, the trace can only
 896 contain a single stream. Its `id` field can be left out, and its events
 897 don't need to declare a `stream_id` field.
 898
 899
 900 ### 5.2 Event packet context description
 901
 902 The event packet context is declared within the stream declaration in
 903 the metadata. All of its fields are optional. If the packet size field
 904 is missing, the whole stream only contains a single packet. If the
 905 content size field is missing, the packet is filled (no padding). The
 906 content and packet sizes include all headers.
 907
 908 An example event packet context type:
 909
 910 ~~~ tsdl
 911 struct event_packet_context {
 912     uint64_t timestamp_begin;
 913     uint64_t timestamp_end;
 914     uint32_t checksum;
 915     uint32_t stream_packet_count;
 916     uint32_t events_discarded;
 917     uint32_t cpu_id;
 918     uint64_t content_size;
 919     uint64_t packet_size;
 920     uint8_t  compression_scheme;
 921     uint8_t  encryption_scheme;
 922     uint8_t  checksum_scheme;
 923 };
 924 ~~~
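
As a minimal sketch of how the two size fields relate, the following C function derives the amount of trailing padding in a packet. It assumes that both fields are present and have already been decoded into 64-bit integers; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Compute the number of padding bits at the end of an event packet.
 * Both sizes are expressed in bits and include all headers.
 *
 * Returns 0 on success, or -1 if the content size exceeds the packet
 * size, which indicates a corrupted or inconsistent packet context.
 */
int ctf_packet_padding_bits(uint64_t content_size, uint64_t packet_size,
                            uint64_t *padding_bits)
{
    if (content_size > packet_size)
        return -1;
    *padding_bits = packet_size - content_size;
    return 0;
}
```

A reader would typically stop decoding events once `content_size` bits have been consumed, then skip `padding_bits` bits to reach the next packet.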
 925
 926
 927 ## 6. Event Structure
 928
 929 The overall structure of an event is:
 930
 931   1. Event header (as specified by the stream metadata)
 932   2. Stream event context (as specified by the stream metadata)
 933   3. Event context (as specified by the event metadata)
 934   4. Event payload (as specified by the event metadata)
 935
 936 This structure defines an implicit dynamic scoping, where variants
 937 located in inner structures (those with a higher number in the listing
 938 above) can refer to the fields of outer structures (with lower number
 939 in the listing above). See [TSDL scopes](#spec7.3) for more detail.
 940
 941 The total length of an event is defined as the difference between the
 942 end of its event payload and the end of the previous event's event
 943 payload. Therefore, it includes the event header alignment padding, and
 944 all its fields and their respective alignment padding. Events of length
 945 0 are forbidden.
 946
 947
 948 ### 6.1 Event header
 949
 950 Event headers can be described within the metadata. We hereby propose,
 951 as an example, two types of event headers. Type 1 accommodates streams
 952 with fewer than 31 event IDs. Type 2 accommodates streams with 31 or
 953 more event IDs.
 954
 955 One major factor can vary between streams: the number of event IDs
 956 assigned to a stream. Luckily, this information tends to stay
 957 relatively constant (modulo event registration while the trace is
 958 being recorded), so we can specify different representations for
 959 streams containing few event IDs and streams containing many event
 960 IDs. The event ID and timestamp are thus represented as densely as
 961 possible in each case.
 962
 963 The header is extended in the rare occasions where the information
 964 cannot be represented in the ranges available in the standard event
 965 header. Extended headers are also used in the rare occasions where the
 966 data required for a field could not be collected: the flag corresponding
 967 to the missing field within the `missing_fields` array is then set to 1.
 968
 969 Types `uintX_t` represent an `X`-bit unsigned integer, as declared with
 970 either:
 971
 972 ~~~ tsdl
 973 typealias integer {
 974     size = /* X */;
 975     align = /* X */;
 976     signed = false;
 977 } := uintX_t;
 978 ~~~
 979
 980 or
 981
 982 ~~~ tsdl
 983 typealias integer {
 984     size = /* X */;
 985     align = 1;
 986     signed = false;
 987 } := uintX_t;
 988 ~~~
 989
 990 For more information about timestamp fields, see [Clocks](#spec8).
 991
 992
 993 #### 6.1.1 Type 1: few event IDs
 994
 995   * Aligned on 32-bit (or 8-bit if byte-packed, depending on the
 996     architecture preference)
 997   * Native architecture byte ordering
 998   * For `compact` selection, fixed size of 32 bits
 999   * For `extended` selection, size depends on the architecture and
1000     variant alignment
1001
1002 ~~~ tsdl
1003 struct event_header_1 {
1004     /*
1005      * id: range: 0 - 30.
1006      * id 31 is reserved to indicate an extended header.
1007      */
1008     enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
1009     variant <id> {
1010         struct {
1011             uint27_t timestamp;
1012         } compact;
1013         struct {
1014             uint32_t id;        /* 32-bit event IDs */
1015             uint64_t timestamp; /* 64-bit timestamps */
1016         } extended;
1017     } v;
1018 } align(32); /* or align(8) */
1019 ~~~
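
The compact form of this header can be decoded with a few bit operations. The sketch below is only an illustration: it assumes a little-endian trace in which bit-packed fields are laid out starting from the least significant bit, so that `id` occupies the 5 low-order bits of the 32-bit word and `timestamp` the remaining 27 bits. The structure and function names are hypothetical.

```c
#include <stdint.h>

#define CTF_HDR1_EXTENDED_ID 31u   /* reserved id selecting the extended header */

struct ctf_hdr1_compact {
    uint32_t id;          /* event ID, 0 to 30 */
    uint32_t timestamp;   /* 27 low-order bits of the clock value */
};

/*
 * Decode a type 1 event header from its first 32-bit word (already
 * converted to host byte order).
 *
 * Returns 1 if the header is compact and was decoded into *out, or 0 if
 * id is 31, in which case the extended fields follow and *out is left
 * untouched.
 */
int ctf_decode_hdr1_compact(uint32_t word, struct ctf_hdr1_compact *out)
{
    uint32_t id = word & 0x1Fu;   /* assumed: id is in the 5 low-order bits */

    if (id == CTF_HDR1_EXTENDED_ID)
        return 0;
    out->id = id;
    out->timestamp = word >> 5;   /* remaining 27 bits */
    return 1;
}
```

For a big-endian trace the field positions within the word would differ, so a real reader should derive them from the trace byte order rather than hard-code them.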
1020
1021
1022 #### 6.1.2 Type 2: many event IDs
1023
1024   * Aligned on 16-bit (or 8-bit if byte-packed, depending on the
1025     architecture preference)
1026   * Native architecture byte ordering
1027   * For `compact` selection, size depends on the architecture and
1028     variant alignment
1029   * For `extended` selection, size depends on the architecture and
1030     variant alignment
1031
1032 ~~~ tsdl
1033 struct event_header_2 {
1034     /*
1035      * id: range: 0 - 65534.
1036      * id 65535 is reserved to indicate an extended header.
1037      */
1038     enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
1039     variant <id> {
1040         struct {
1041             uint32_t timestamp;
1042         } compact;
1043         struct {
1044             uint32_t id;        /* 32-bit event IDs */
1045             uint64_t timestamp; /* 64-bit timestamps */
1046         } extended;
1047     } v;
1048 } align(16); /* or align(8) */
1049 ~~~
1050
1051
1052 ### 6.2 Stream event context and event context
1053
1054 The event context contains information relative to the current event.
1055 The choice and meaning of this information is specified by the TSDL
1056 stream and event metadata descriptions. The stream context is applied
1057 to all events within the stream. The stream context structure follows
1058 the event header. The event context is applied to specific events. Its
1059 structure follows the stream context structure.
1060
1061 An example of stream-level event context is to save the event payload
1062 size with each event, or to save the current PID with each event.
1063 These are declared within the stream declaration within the metadata:
1064
1065 ~~~ tsdl
1066 stream {
1067     /* ... */
1068     event.context := struct {
1069         uint pid;
1070         uint16_t payload_size;
1071     };
1072 };
1073 ~~~
1074
1075 An example of event-specific event context is to declare a bitmap of
1076 missing fields, only appended after the stream event context if the
1077 extended event header is selected. `NR_FIELDS` is the number of fields
1078 within the event (a numeric value).
1079
1080 ~~~ tsdl
1081 event {
1082     context := struct {
1083         variant <id> {
1084             struct { } compact;
1085             struct {
1086                 /* missing event fields bitmap */
1087                 uint1_t missing_fields[NR_FIELDS];
1088             } extended;
1089         } v;
1090     };
1091     /* ... */
1092 };
1093 ~~~
1094
1095
1096 ### 6.3 Event payload
1097
1098 An event payload contains fields specific to a given event type. The
1099 fields belonging to an event type are described in the event-specific
1100 metadata within a structure type.
1101
1102
1103 #### 6.3.1 Padding
1104
1105 There is no padding at the end of the event payload. This differs from
1106 the ISO/C standard for structures, but follows the CTF standard for
1107 structures. In a trace, even though it makes sense to align the
1108 beginning of a structure, it makes no sense to add padding at the end
1109 of the structure, because structures are usually not followed by a
1110 structure of the same type.
1111
1112 This trick can be done by adding a zero-length `end` field at the end
1113 of the C structures, and by using the offset of this field rather than
1114 using `sizeof()` when calculating the size of a structure
1115 (see [Helper macros](#specA)).
1116
1117
1118 #### 6.3.2 Alignment
1119
1120 The event payload is aligned on the largest alignment required by types
1121 contained within the payload. This follows the ISO/C standard for
1122 structures.
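
A minimal C sketch of this rule follows, assuming that the alignment (in bits) of each payload field is already known to the reader. Both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Round a bit offset up to the next multiple of align_bits (non-zero). */
uint64_t ctf_align_up(uint64_t offset_bits, uint64_t align_bits)
{
    return (offset_bits + align_bits - 1) / align_bits * align_bits;
}

/*
 * Alignment of an event payload: the largest alignment required by the
 * types it contains (1 bit if the payload has no field).
 */
uint64_t ctf_payload_alignment(const uint64_t *field_aligns, size_t count)
{
    uint64_t max = 1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (field_aligns[i] > max)
            max = field_aligns[i];
    }
    return max;
}
```

For instance, a payload whose fields require 8-bit, 32-bit and 16-bit alignment starts on a 32-bit boundary: if the preceding event context ends at bit offset 33, the payload begins at bit offset 64.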
1123
1124
1125 ## 7. Trace Stream Description Language (TSDL)
1126
1127 The Trace Stream Description Language (TSDL) allows expression of the
1128 binary trace streams layout in a C99-like Domain Specific Language
1129 (DSL).
1130
1131
1132 ### 7.1 Metadata
1133
1134 The trace stream layout description is located in the trace metadata.
1135 The metadata is itself located in a stream identified by its name:
1136 `metadata`.
1137
1138 The metadata description can be expressed in two different formats:
1139 text-only and packet-based. The text-only description facilitates
1140 generation of metadata and provides a convenient way to enter the
1141 metadata information by hand. The packet-based metadata provides the
1142 CTF stream packet facilities (checksumming, compression, encryption,
1143 network-readiness) for metadata stream generated and transported by a
1144 tracer.
1145
1146 The text-only metadata file is a plain-text TSDL description. This file
1147 must begin with the following characters to identify the file as a CTF
1148 TSDL text-based metadata file (without the double-quotes):
1149
1150 ~~~ text
1151 "/* CTF"
1152 ~~~
1153
1154 It must be followed by a space, and the version of the specification
1155 followed by the CTF trace, e.g.:
1156
1157 ~~~ text
1158 " 1.8"
1159 ~~~
1160
1161 These characters allow automated discovery of file type and CTF
1162 specification version. They are interpreted as the beginning of a
1163 comment by the TSDL metadata parser. The comment can be continued to
1164 contain extra commented characters before it is closed.
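
The signature check described above can be sketched in C as follows. This is an illustrative helper, not part of any reference implementation: the function name is hypothetical, and it assumes the beginning of the metadata file has been read into a NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Check that a buffer starts with the text-only metadata signature
 * followed by a "major.minor" version number.
 *
 * Returns 1 and stores the version on success, 0 otherwise.
 */
int ctf_parse_text_signature(const char *buf, unsigned int *major,
                             unsigned int *minor)
{
    static const char prefix[] = "/* CTF ";
    const size_t prefix_len = sizeof(prefix) - 1;

    if (strncmp(buf, prefix, prefix_len) != 0)
        return 0;
    return sscanf(buf + prefix_len, "%u.%u", major, minor) == 2;
}
```

Because the signature opens a comment, anything that follows the version number up to the closing comment delimiter is ignored by the TSDL parser.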
1165
1166 The packet-based metadata is made of _metadata packets_, which each
1167 start with a metadata packet header. The packet-based metadata
1168 description is detected by reading the magic number 0x75D11D57 at the
1169 beginning of the file. This magic number is also used to detect the
1170 endianness of the architecture by trying to read the CTF magic number
1171 and its counterpart in reversed endianness. The events within the
1172 metadata stream have neither event header nor event context. Each event
1173 only contains a special _sequence_ payload, which is a sequence of bits
1174 whose length is implicitly calculated by using the
1175 `trace.packet.header.content_size` field, minus the packet header size.
1176 The formatting of this sequence of bits is a plain-text representation
1177 of the TSDL description. Each metadata packet starts with a special
1178 packet header, specific to the metadata stream, which contains,
1179 exactly:
1180
1181 ~~~ tsdl
1182 struct metadata_packet_header {
1183     uint32_t magic;              /* 0x75D11D57 */
1184     uint8_t  uuid[16];           /* Universally Unique Identifier */
1185     uint32_t checksum;           /* 0 if unused */
1186     uint32_t content_size;       /* in bits */
1187     uint32_t packet_size;        /* in bits */
1188     uint8_t  compression_scheme; /* 0 if unused */
1189     uint8_t  encryption_scheme;  /* 0 if unused */
1190     uint8_t  checksum_scheme;    /* 0 if unused */
1191     uint8_t  major;              /* CTF spec version major number */
1192     uint8_t  minor;              /* CTF spec version minor number */
1193 };
1194 ~~~
1195
1196 The packet-based metadata can be converted to a text-only metadata by
1197 concatenating all the strings it contains.
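
Detection of packet-based metadata and of its byte order, as described above, might look like the following C sketch. It assumes the first four bytes of the metadata file are available in memory; the enumeration and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define CTF_METADATA_MAGIC UINT32_C(0x75D11D57)

enum ctf_magic_result {
    CTF_MAGIC_NATIVE,     /* packet-based metadata, host byte order */
    CTF_MAGIC_REVERSED,   /* packet-based metadata, reversed byte order */
    CTF_MAGIC_NONE        /* not packet-based metadata */
};

/* Reverse the byte order of a 32-bit value. */
uint32_t ctf_swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & UINT32_C(0x0000FF00)) |
           ((v << 8) & UINT32_C(0x00FF0000)) | (v << 24);
}

/* Inspect the first 4 bytes of a metadata file. */
enum ctf_magic_result ctf_check_metadata_magic(const unsigned char *buf)
{
    uint32_t magic;

    memcpy(&magic, buf, sizeof(magic));   /* avoids unaligned access */
    if (magic == CTF_METADATA_MAGIC)
        return CTF_MAGIC_NATIVE;
    if (ctf_swap32(magic) == CTF_METADATA_MAGIC)
        return CTF_MAGIC_REVERSED;
    return CTF_MAGIC_NONE;
}
```

When the result is `CTF_MAGIC_NONE`, a reader can fall back to checking for the text-only signature.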
1198
1199 In the textual representation of the metadata, the text contained
1200 within `/*` and `*/`, as well as within `//` and end of line, are
1201 treated as comments. Boolean values can be represented as `true`,
1202 `TRUE`, or `1` for true, and `false`, `FALSE`, or `0` for false. Within
1203 the string-based metadata description, the trace UUID is represented as
1204 a string of hexadecimal digits and dashes `-`. In the event packet
1205 header, the trace UUID is represented as an array of bytes.
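
To compare the two UUID representations, a reader can convert the byte array found in a packet header to its textual form. The C sketch below assumes the usual 8-4-4-4-12 grouping of hexadecimal digits, as used by the UUID strings in this document; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Format a 16-byte UUID as lowercase hexadecimal digits and dashes.
 * The output buffer must hold 37 bytes (36 characters and a NUL).
 */
void ctf_uuid_to_string(const uint8_t uuid[16], char out[37])
{
    static const char hex[] = "0123456789abcdef";
    char *p = out;
    int i;

    for (i = 0; i < 16; i++) {
        /* Dashes separate the 4-, 2-, 2-, 2- and 6-byte groups. */
        if (i == 4 || i == 6 || i == 8 || i == 10)
            *p++ = '-';
        *p++ = hex[uuid[i] >> 4];
        *p++ = hex[uuid[i] & 0x0F];
    }
    *p = '\0';
}
```

The resulting string can then be compared, ignoring case, with the `uuid` attribute of the `trace` block.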
1206
1207
1208 ### 7.2 Declaration vs definition
1209
1210 A declaration associates a layout to a type, without specifying where
1211 this type is located in the event [structure hierarchy](#spec6).
1212 This therefore includes `typedef`, `typealias`, as well as all type
1213 specifiers. In certain circumstances (`typedef`, structure field and
1214 variant field), a declaration is followed by a declarator, which specifies
1215 the newly defined type name (for `typedef`), or the field name (for
1216 declarations located within structures and variants). Arrays and sequences,
1217 declared with square brackets (`[` `]`), are part of the declarator,
1218 similarly to C99. The enumeration base type is specified by
1219 `: enum_base`, which is part of the type specifier. The variant tag
1220 name, specified between `<` `>`, is also part of the type specifier.
1221
1222 A definition associates a type to a location in the event
1223 [structure hierarchy](#spec6). This association is denoted by `:=`,
1224 as shown in [TSDL scopes](#spec7.3).
1225
1226
1227 ### 7.3 TSDL scopes
1228
1229 TSDL uses three different types of scoping: a lexical scope is used for
1230 declarations and type definitions, and static and dynamic scopes are
1231 used for variant references to tag fields (with relative and absolute
1232 path lookups) and for sequence references to length fields.
1233
1234
1235 #### 7.3.1 Lexical Scope
1236
1237 Each of `trace`, `env`, `stream`, `event`, `struct` and `variant` has
1238 its own nestable declaration scope, within which types can be declared
1239 using `typedef` and `typealias`. A root declaration scope also contains
1240 all declarations located outside of any of the aforementioned
1241 declarations. An inner declaration scope can refer to a type declared
1242 within its container lexical scope prior to the inner declaration scope.
1243 Redefinition of a typedef or typealias is not valid, although hiding an
1244 upper scope typedef or typealias is allowed within a sub-scope.
1245
1246
1247 #### 7.3.2 Static and dynamic scopes
1248
1249 A local static scope consists of the scope generated by the declaration
1250 of fields within a compound type. A static scope is a local static scope
1251 augmented with the nested sub-static-scopes it contains.
1252
1253 A dynamic scope consists of the static scope augmented with the
1254 implicit [event structure](#spec6) definition hierarchy.
1255
1256 Multiple declarations of the same field name within a local static scope
1257 are not valid. It is however valid to re-use the same field name in
1258 different local scopes.
1259
1260 Nested static and dynamic scopes form lookup paths. These are used for
1261 variant tag and sequence length references. They are used at the variant
1262 and sequence definition site to look up the location of the tag field
1263 associated with a variant, and to look up the location of the length
1264 field associated with a sequence.
1265
1266 Variants and sequences can refer to a tag (or length) field using either
1267 a relative path or an absolute path. The relative path is relative to
1268 the scope in which the variant or sequence performing the lookup is
1269 located. Relative paths are only allowed to look up fields within the
1270 same static scope, which includes its nested static scopes. Lookups
1271 targeting parent static scopes need to be performed with an absolute path.
1272
1273 Absolute path lookups use the full path including the dynamic scope
1274 followed by a `.` and then the static scope. Therefore, variants (or
1275 sequences) in lower levels in the dynamic scope (e.g., event context)
1276 can refer to a tag (or length) field located in upper levels
1277 (e.g., in the event header) by specifying, in this case, the associated
1278 tag with `<stream.event.header.field_name>`. This allows, for instance,
1279 the event context to define a variant referring to the `id` field of
1280 the event header as selector.
1281
1282 The dynamic scope prefixes are thus:
1283
1284   * Trace environment: `<env. >`
1285   * Trace packet header: `<trace.packet.header. >`
1286   * Stream packet context: `<stream.packet.context. >`
1287   * Event header: `<stream.event.header. >`
1288   * Stream event context: `<stream.event.context. >`
1289   * Event context: `<event.context. >`
1290   * Event payload: `<event.fields. >`
1291
1292 The target dynamic scope must be specified explicitly when referring to
1293 a field outside of the static scope (absolute scope reference). No
1294 conflict can occur between relative and dynamic paths, because the
1295 keywords `trace`, `stream`, and `event` are reserved, and thus not
1296 permitted as field names. It is recommended that field names clashing
1297 with CTF and C99 reserved keywords use an underscore prefix to
1298 eliminate the risk of generating a description containing an invalid
1299 field name. Consequently, fields starting with an underscore should have
1300 their leading underscore removed by the CTF trace readers.
1301
1302 The information available in the dynamic scopes can be thought of as the
1303 current tracing context. At trace production, information about the
1304 current context is saved into the specified scope field levels. At trace
1305 consumption, for each event, the current trace context is therefore
1306 readable by accessing the upper dynamic scopes.
1307
1308
1309 ### 7.4 TSDL examples
1310
1311 The grammar representing the TSDL metadata is presented in
1312 [TSDL grammar](#specC). This section presents lighter reading that
1313 consists of examples of TSDL metadata, with template values.
1314
1315 The stream ID can be left out if there is only one stream in the
1316 trace. The event `id` field can be left out if there is only one event
1317 in a stream.
1318
1319 ~~~ tsdl
1320 trace {
1321     major = /* value */;            /* CTF spec version major number */
1322     minor = /* value */;            /* CTF spec version minor number */
1323     uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa";  /* Trace UUID */
1324     byte_order = /* be OR le */;    /* Endianness (required) */
1325     packet.header := struct {
1326         uint32_t magic;
1327         uint8_t  uuid[16];
1328         uint32_t stream_id;
1329     };
1330 };
1331
1332 /*
1333  * The "env" (environment) scope contains assignment expressions. The
1334  * field names and content are implementation-defined.
1335  */
1336 env {
1337     pid = /* value */;    /* example */
1338     proc_name = "name";   /* example */
1339     /* ... */
1340 };
1341
1342 stream {
1343     id = /* stream_id */;
1344     /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
1345     event.header := /* event_header_1 OR event_header_2 */;
1346     event.context := struct {
1347         /* ... */
1348     };
1349     packet.context := struct {
1350         /* ... */
1351     };
1352 };
1353
1354 event {
1355     name = "event_name";
1356     id = /* value */;            /* Numeric identifier within the stream */
1357     stream_id = /* stream_id */;
1358     loglevel = /* value */;
1359     model.emf.uri = "string";
1360     context := struct {
1361         /* ... */
1362     };
1363     fields := struct {
1364         /* ... */
1365     };
1366 };
1367
1368 callsite {
1369     name = "event_name";
1370     func = "func_name";
1371     file = "myfile.c";
1372     line = 39;
1373     ip = 0x40096c;
1374 };
1375 ~~~
1376
1377 More detail on [types](#spec4):
1378
1379 ~~~ tsdl
1380 /*
1381  * Named types:
1382  *
1383  * Type declarations behave similarly to the C standard.
1384  */
1385
1386 typedef aliased_type_specifiers new_type_declarators;
1387
1388 /* e.g.: typedef struct example new_type_name[10]; */
1389
1390 /*
1391  * typealias
1392  *
1393  * The "typealias" declaration can be used to give a name (including
1394  * pointer declarator specifier) to a type. It should also be used to
1395  * map basic C types (float, int, unsigned long, ...) to a CTF type.
1396  * Typealias is a superset of "typedef": it also allows assignment of a
1397  * simple variable identifier to a type.
1398  */
1399
1400 typealias type_class {
1401     /* ... */
1402 } := type_specifiers type_declarator;
1403
1404 /*
1405  * e.g.:
1406  * typealias integer {
1407  *   size = 32;
1408  *   align = 32;
1409  *   signed = false;
1410  * } := struct page *;
1411  *
1412  * typealias integer {
1413  *  size = 32;
1414  *  align = 32;
1415  *  signed = true;
1416  * } := int;
1417  */
1418
1419 struct name {
1420     /* ... */
1421 };
1422
1423 variant name {
1424     /* ... */
1425 };
1426
1427 enum name : integer_type {
1428     /* ... */
1429 };
1430 ~~~
1431
1432 Unnamed types, contained within compound type fields, `typedef` or
1433 `typealias`:
1434
1435 ~~~ tsdl
1436 struct {
1437     /* ... */
1438 }
1439 ~~~
1440
1441 ~~~ tsdl
1442 struct {
1443     /* ... */
1444 } align(value)
1445 ~~~
1446
1447 ~~~ tsdl
1448 variant {
1449     /* ... */
1450 }
1451 ~~~
1452
1453 ~~~ tsdl
1454 enum : integer_type {
1455     /* ... */
1456 }
1457 ~~~
1458
1459 ~~~ tsdl
1460 typedef type new_type[length];
1461
1462 struct {
1463     type field_name[length];
1464 }
1465 ~~~
1466
1467 ~~~ tsdl
1468 typedef type new_type[length_type];
1469
1470 struct {
1471     type field_name[length_type];
1472 }
1473 ~~~
1474
1475 ~~~ tsdl
1476 integer {
1477     /* ... */
1478 }
1479 ~~~
1480
1481 ~~~ tsdl
1482 floating_point {
1483     /* ... */
1484 }
1485 ~~~
1486
1487 ~~~ tsdl
1488 struct {
1489     integer_type field_name:size;   /* GNU/C bitfield */
1490 }
1491 ~~~
1492
1493 ~~~ tsdl
1494 struct {
1495     string field_name;
1496 }
1497 ~~~
1498
1499
1500 ## 8. Clocks
1501
1502 Clock metadata describes the clock topology of the system and
1503 details each clock parameter. In the absence of a clock description,
1504 it is assumed that all fields named `timestamp` use the same clock
1505 source, which increments once per nanosecond.
1506
1507 Describing a clock and how it is used by streams is threefold: first,
1508 the clock and clock topology should be described in a `clock`
1509 description block, e.g.:
1510
1511 ~~~ tsdl
1512 clock {
1513     name = cycle_counter_sync;
1514     uuid = "62189bee-96dc-11e0-91a8-cfa3d89f3923";
1515     description = "Cycle counter synchronized across CPUs";
1516     freq = 1000000000;           /* frequency, in Hz */
1517     /* precision in seconds is: 1000 * (1/freq) */
1518     precision = 1000;
1519     /*
1520      * clock value offset from Epoch is:
1521      * offset_s + (offset * (1/freq))
1522      */
1523     offset_s = 1326476837;
1524     offset = 897235420;
1525     absolute = FALSE;
1526 };
1527 ~~~
1528
1529 The mandatory `name` field specifies the name of the clock identifier,
1530 which can later be used as a reference. The optional field `uuid` is
1531 the unique identifier of the clock. It can be used to correlate
1532 different traces that use the same clock. An optional textual
1533 description string can be added with the `description` field. The
1534 `freq` field is the initial frequency of the clock, in Hz. If the
1535 `freq` field is not present, the frequency is assumed to be 1000000000
1536 (providing clock increment of 1 ns). The optional `precision` field
1537 details the uncertainty on the clock measurements, in (1/freq) units.
1538 The `offset_s` and `offset` fields indicate the offset from
1539 POSIX.1 Epoch, 1970-01-01 00:00:00 +0000 (UTC), to the zero value
1540 of the clock. The `offset_s` field is in seconds. The `offset` field is
1541 in (1/freq) units. If either the `offset_s` or `offset` field is not
1542 present, it is assigned the value 0. The field `absolute` is `TRUE` if
1543 the clock is a global reference across different clock UUIDs
1544 (e.g. NTP time). Otherwise, `absolute` is `FALSE`, and the clock can
1545 be considered as synchronized only with other clocks that have the same
1546 UUID.
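
The offset formula given in the `clock` block comment can be applied to a clock value as in the following C sketch. It is a simplified illustration under stated assumptions: offsets are non-negative, `freq` is non-zero and no larger than about 1.8e10 Hz (so that the intermediate product fits in 64 bits), and the result fits in an unsigned 64-bit nanosecond count. The structure and function names are hypothetical.

```c
#include <stdint.h>

#define CTF_NS_PER_S UINT64_C(1000000000)

struct ctf_clock_desc {
    uint64_t freq;       /* Hz; 1000000000 when the field is absent */
    uint64_t offset_s;   /* seconds; 0 when the field is absent */
    uint64_t offset;     /* (1/freq) units; 0 when the field is absent */
};

/*
 * Convert a count of clock cycles to nanoseconds. Whole seconds and the
 * sub-second remainder are scaled separately to limit overflow.
 */
uint64_t ctf_cycles_to_ns(uint64_t cycles, uint64_t freq)
{
    return (cycles / freq) * CTF_NS_PER_S +
           (cycles % freq) * CTF_NS_PER_S / freq;
}

/*
 * Nanoseconds since the POSIX.1 Epoch for a given clock value:
 * offset_s + ((offset + value) * (1/freq)).
 */
uint64_t ctf_clock_value_to_epoch_ns(const struct ctf_clock_desc *clk,
                                     uint64_t value)
{
    return clk->offset_s * CTF_NS_PER_S +
           ctf_cycles_to_ns(clk->offset + value, clk->freq);
}
```

With the example clock above (`freq = 1000000000`, `offset_s = 1326476837`, `offset = 897235420`), a clock value of 0 corresponds to 1326476837.897235420 seconds after the Epoch.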
1547
1548 Secondly, a reference to this clock should be added within an integer
1549 type:
1550
1551 ~~~ tsdl
1552 typealias integer {
1553     size = 64; align = 1; signed = false;
1554     map = clock.cycle_counter_sync.value;
1555 } := uint64_ccnt_t;
1556 ~~~
1557
1558 Thirdly, stream declarations can reference the clock they use as a
1559 timestamp source:
1560
1561 ~~~ tsdl
1562 struct packet_context {
1563     uint64_ccnt_t ccnt_begin;
1564     uint64_ccnt_t ccnt_end;
1565     /* ... */
1566 };
1567
1568 stream {
1569     /* ... */
1570     event.header := struct {
1571         uint64_ccnt_t timestamp;
1572         /* ... */
1573     };
1574     packet.context := struct packet_context;
1575 };
1576 ~~~
1577
1578 For an N-bit integer type referring to a clock, if the integer overflows
1579 compared to the N low order bits of the clock's prior value found in the
1580 same stream, then it is assumed that one, and only one, overflow
1581 occurred. It is therefore important that events encoding time on a small
1582 number of bits happen frequently enough that no more than one N-bit
1583 overflow occurs between two consecutive events.
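
The single-overflow rule can be sketched in C as follows. The function name is hypothetical, and the code assumes that the reader keeps the full 64-bit clock value of the previous event of the same stream.

```c
#include <stdint.h>

/*
 * Rebuild a full 64-bit clock value from an N-bit timestamp field.
 *
 * prev:  full clock value of the previous event in the same stream
 * value: N-bit timestamp read from the current event
 * nbits: size of the timestamp field in bits (1 to 64)
 *
 * If the N-bit value is smaller than the N low-order bits of prev, one
 * (and only one) overflow is assumed to have occurred.
 */
uint64_t ctf_extend_timestamp(uint64_t prev, uint64_t value,
                              unsigned int nbits)
{
    uint64_t mask, full;

    if (nbits >= 64)
        return value;
    mask = (UINT64_C(1) << nbits) - 1;
    full = (prev & ~mask) | (value & mask);
    if ((value & mask) < (prev & mask))
        full += mask + 1;   /* carry into the high-order bits */
    return full;
}
```

With the 27-bit compact timestamp of the type 1 event header, a previous value of `0x7FFFFF0` followed by a field value of `0x10` yields `0x8000010`.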
1584
1585 In a packet context, clock field names ending with `_begin` and `_end`
1586 have a special meaning: this refers to the timestamps at, respectively,
1587 the beginning and the end of each packet.
1588
1589
1590 ## A. Helper macros
1591
1592 The two following macros keep track of the size of a GNU/C structure
1593 without padding at the end by placing HEADER_END as the last field.
1594 A one byte end field is used for C90 compatibility (C99 flexible arrays
1595 could be used here). Note that this does not affect the effective
1596 structure size, which should always be calculated with the
1597 `header_sizeof()` helper.
1598
1599 ~~~ c
1600 #define HEADER_END          char end_field
1601 #define header_sizeof(type) offsetof(typeof(type), end_field)
1602 ~~~
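
As a usage sketch, the structure below (a hypothetical example, not a CTF-defined header) shows the difference between `sizeof()` and `header_sizeof()`. The macro is written here with `__typeof__`, the spelling that GCC and Clang also accept in strict ISO C modes; this is an adaptation for the example rather than a change to the macros above.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_END          char end_field
#define header_sizeof(type) offsetof(__typeof__(type), end_field)

/* Hypothetical header: 9 bytes of data, then the end marker. */
struct example_header {
    uint64_t timestamp;
    uint8_t  id;
    HEADER_END;
};

/* Size of the header as written to the trace, without tail padding. */
size_t example_header_trace_size(void)
{
    return header_sizeof(struct example_header);
}
```

On common ABIs, `sizeof(struct example_header)` is larger than 9 because the compiler pads the end of the structure, whereas `header_sizeof()` stops at the offset of `end_field`.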
1603
1604 ## B. Stream header rationale
1605
1606 An event stream is divided into contiguous event packets of variable
1607 size. These subdivisions allow the trace analyzer to perform a fast
1608 binary search by time within the stream (typically requiring indexing
1609 only the event packet headers) without reading the whole stream. These
1610 subdivisions have a variable size to eliminate the need to transfer the
1611 event packet padding when partially filled event packets must be sent
1612 when streaming a trace for live viewing/analysis. An event packet can
1613 contain a certain amount of padding at the end. Dividing streams into
1614 event packets is also useful for network streaming over UDP and flight
1615 recorder mode tracing (a whole event packet can be swapped out of the
1616 buffer atomically for reading).
1617
1618 The stream header is repeated at the beginning of each event packet to
1619 allow flexibility in terms of:
1620
1621   * supporting streaming
1622   * allowing arbitrary buffers to be discarded without making the trace
1623     unreadable
1624   * allowing UDP packet loss handling by either dealing with missing event
1625     packets or asking for re-transmission
1626   * transparently supporting flight recorder mode
1627   * transparently supporting crash dumps
1628
1629
1630 ## C. TSDL Grammar
1631
1632 ~~~ c
1633 /*
1634  * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
1635  *
1636  * Inspired by the C99 grammar:
1637  * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
1638  * and c++1x grammar (draft)
1639  * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
1640  *
1641  * Specialized for CTF needs by including only constants and declarations
1642  * from C99 (excluding function declarations), and by adding support for
1643  * variants, sequences and CTF-specific specifiers. The semantics of
1644  * enumeration container types are inspired by the c++1x enum-base.
1645  */
1646 ~~~
1647
1648
1649 ### C.1 Lexical grammar
1650
1651
1652 #### C.1.1 Lexical elements
1653
1654 ~~~ text
1655 token:
1656     keyword
1657     identifier
1658     constant
1659     string-literal
1660     punctuator
1661 ~~~
1662
1663 #### C.1.2 Keywords
1664
1665 ~~~ text
1666 keyword: is one of
1667
1668 align
1669 callsite
1670 const
1671 char
1672 clock
1673 double
1674 enum
1675 env
1676 event
1677 floating_point
1678 float
1679 integer
1680 int
1681 long
1682 short
1683 signed
1684 stream
1685 string
1686 struct
1687 trace
1688 typealias
1689 typedef
1690 unsigned
1691 variant
1692 void
1693 _Bool
1694 _Complex
1695 _Imaginary
1696 ~~~
1697
1698
1699 #### C.1.3 Identifiers
1700
1701 ~~~ text
1702 identifier:
1703     identifier-nondigit
1704     identifier identifier-nondigit
1705     identifier digit
1706
1707 identifier-nondigit:
1708     nondigit
1709     universal-character-name
1710     any other implementation-defined characters
1711
1712 nondigit:
1713     _
1714     [a-zA-Z]    /* regular expression */
1715
1716 digit:
1717     [0-9]        /* regular expression */
1718 ~~~
1719
1720
1721 #### C.1.4 Universal character names
1722
1723 ~~~ text
1724 universal-character-name:
1725     \u hex-quad
1726     \U hex-quad hex-quad
1727
1728 hex-quad:
1729     hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
1730 ~~~
1731
1732
1733 #### C.1.5 Constants
1734
1735 ~~~ text
1736 constant:
1737     integer-constant
1738     enumeration-constant
1739     character-constant
1740
1741 integer-constant:
1742     decimal-constant integer-suffix-opt
1743     octal-constant integer-suffix-opt
1744     hexadecimal-constant integer-suffix-opt
1745
1746 decimal-constant:
1747     nonzero-digit
1748     decimal-constant digit
1749
1750 octal-constant:
1751     0
1752     octal-constant octal-digit
1753
1754 hexadecimal-constant:
1755     hexadecimal-prefix hexadecimal-digit
1756     hexadecimal-constant hexadecimal-digit
1757
1758 hexadecimal-prefix:
1759     0x
1760     0X
1761
1762 nonzero-digit:
1763     [1-9]
1764
1765 integer-suffix:
1766     unsigned-suffix long-suffix-opt
1767     unsigned-suffix long-long-suffix
1768     long-suffix unsigned-suffix-opt
1769     long-long-suffix unsigned-suffix-opt
1770
1771 unsigned-suffix:
1772     u
1773     U
1774
1775 long-suffix:
1776     l
1777     L
1778
1779 long-long-suffix:
1780     ll
1781     LL
1782
1783 enumeration-constant:
1784     identifier
1785     string-literal
1786
1787 character-constant:
1788     ' c-char-sequence '
1789     L' c-char-sequence '
1790
1791 c-char-sequence:
1792     c-char
1793     c-char-sequence c-char
1794
1795 c-char:
1796     any member of source charset except single-quote ('), backslash
1797     (\), or new-line character.
1798     escape-sequence
1799
1800 escape-sequence:
1801     simple-escape-sequence
1802     octal-escape-sequence
1803     hexadecimal-escape-sequence
1804     universal-character-name
1805
1806 simple-escape-sequence: one of
1807     \' \" \? \\ \a \b \f \n \r \t \v
1808
1809 octal-escape-sequence:
1810     \ octal-digit
1811     \ octal-digit octal-digit
1812     \ octal-digit octal-digit octal-digit
1813
1814 hexadecimal-escape-sequence:
1815     \x hexadecimal-digit
1816     hexadecimal-escape-sequence hexadecimal-digit
1817 ~~~
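
To make the `integer-constant` productions concrete, here is a minimal sketch of a recognizer for them. It is an illustration, not a reference implementation: the names `INTEGER_RE` and `parse_integer_constant` are hypothetical, and `octal-digit` and `hexadecimal-digit`, which this appendix does not define, are assumed to be `[0-7]` and `[0-9a-fA-F]` as in C99.

```python
import re

# Assumption: octal-digit is [0-7] and hexadecimal-digit is [0-9a-fA-F] (C99).
INTEGER_RE = re.compile(
    r"""
    (?P<digits>
        0[xX][0-9a-fA-F]+      # hexadecimal-constant
      | 0[0-7]*                # octal-constant (a lone 0 is octal)
      | [1-9][0-9]*            # decimal-constant
    )
    (?P<suffix>
        [uU](?:ll|LL|[lL])?    # unsigned-suffix, then optional long or long-long
      | (?:ll|LL|[lL])[uU]?    # long or long-long, then optional unsigned
    )?
    """,
    re.VERBOSE,
)


def parse_integer_constant(text: str) -> tuple[int, str]:
    """Return (value, suffix) for a token matching integer-constant."""
    match = INTEGER_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer-constant: {text!r}")
    digits = match.group("digits")
    if digits[:2] in ("0x", "0X"):
        base = 16
    elif digits[0] == "0":
        base = 8
    else:
        base = 10
    return int(digits, base), match.group("suffix") or ""
```

Note that mixed-case long-long suffixes such as `lL` are rejected, since `long-long-suffix` only lists `ll` and `LL`.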


#### C.1.6 String literals

~~~ text
string-literal:
    " s-char-sequence-opt "
    L" s-char-sequence-opt "

s-char-sequence:
    s-char
    s-char-sequence s-char

s-char:
    any member of source charset except double-quote ("), backslash
    (\), or new-line character.
    escape-sequence
~~~
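
The following sketch shows one possible way to decode a `string-literal` token, including the `escape-sequence` alternatives from C.1.5. The names `ESCAPE_RE`, `SIMPLE_ESCAPES`, and `decode_string_literal` are hypothetical. It assumes that `octal-digit` and `hexadecimal-digit` follow C99, that each escape maps to a single Unicode code point, and that the `L` prefix can be ignored; the specification itself does not mandate any of these choices.

```python
import re

# simple-escape-sequence: the character after the backslash -> decoded value.
SIMPLE_ESCAPES = {
    "'": "'", '"': '"', "?": "?", "\\": "\\", "a": "\a", "b": "\b",
    "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v",
}

# Assumption: octal-digit is [0-7], hexadecimal-digit is [0-9a-fA-F] (C99).
ESCAPE_RE = re.compile(
    r"""\\(?:
        (?P<simple>['"?\\abfnrtv])    # simple-escape-sequence
      | (?P<octal>[0-7]{1,3})         # octal-escape-sequence
      | x(?P<hex>[0-9a-fA-F]+)        # hexadecimal-escape-sequence
      | u(?P<u4>[0-9a-fA-F]{4})       # \u hex-quad
      | U(?P<u8>[0-9a-fA-F]{8})       # \U hex-quad hex-quad
    )""",
    re.VERBOSE,
)


def _decode_escape(match: re.Match) -> str:
    if match.group("simple") is not None:
        return SIMPLE_ESCAPES[match.group("simple")]
    if match.group("octal") is not None:
        return chr(int(match.group("octal"), 8))
    digits = match.group("hex") or match.group("u4") or match.group("u8")
    return chr(int(digits, 16))


def decode_string_literal(literal: str) -> str:
    """Decode a quoted string-literal token into its character content."""
    if literal.startswith("L"):
        literal = literal[1:]  # the wide prefix is accepted and ignored here
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("string-literal must be enclosed in double quotes")
    body = literal[1:-1]
    out = []
    pos = 0
    while pos < len(body):
        char = body[pos]
        if char == "\\":
            match = ESCAPE_RE.match(body, pos)
            if match is None:
                raise ValueError(f"invalid escape sequence at offset {pos}")
            out.append(_decode_escape(match))
            pos = match.end()
        elif char in '"\n':
            # s-char excludes unescaped double-quote and new-line.
            raise ValueError(f"unescaped {char!r} at offset {pos}")
        else:
            out.append(char)
            pos += 1
    return "".join(out)
```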


#### C.1.7 Punctuators

~~~ text
punctuator: one of
    [ ] ( ) { } . -> * + - < > : ; ... = ,
~~~


### C.2 Phrase structure grammar

~~~ text
primary-expression:
    identifier
    constant
    string-literal
    ( unary-expression )

postfix-expression:
    primary-expression
    postfix-expression [ unary-expression ]
    postfix-expression . identifier
    postfix-expression -> identifier

unary-expression:
    postfix-expression
    unary-operator postfix-expression

unary-operator: one of
    + -

assignment-operator:
    =

type-assignment-operator:
    :=

constant-expression-range:
    unary-expression ... unary-expression
~~~


#### C.2.2 Declarations

~~~ text
declaration:
    declaration-specifiers declarator-list-opt ;
    ctf-specifier ;

declaration-specifiers:
    storage-class-specifier declaration-specifiers-opt
    type-specifier declaration-specifiers-opt
    type-qualifier declaration-specifiers-opt

declarator-list:
    declarator
    declarator-list , declarator

abstract-declarator-list:
    abstract-declarator
    abstract-declarator-list , abstract-declarator

storage-class-specifier:
    typedef

type-specifier:
    void
    char
    short
    int
    long
    float
    double
    signed
    unsigned
    _Bool
    _Complex
    _Imaginary
    struct-specifier
    variant-specifier
    enum-specifier
    typedef-name
    ctf-type-specifier

align-attribute:
    align ( unary-expression )

struct-specifier:
    struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt
    struct identifier align-attribute-opt

struct-or-variant-declaration-list:
    struct-or-variant-declaration
    struct-or-variant-declaration-list struct-or-variant-declaration

struct-or-variant-declaration:
    specifier-qualifier-list struct-or-variant-declarator-list ;
    declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ;
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ;
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ;

specifier-qualifier-list:
    type-specifier specifier-qualifier-list-opt
    type-qualifier specifier-qualifier-list-opt

struct-or-variant-declarator-list:
    struct-or-variant-declarator
    struct-or-variant-declarator-list , struct-or-variant-declarator

struct-or-variant-declarator:
    declarator
    declarator-opt : unary-expression

variant-specifier:
    variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
    variant identifier variant-tag

variant-tag:
    < unary-expression >

enum-specifier:
    enum identifier-opt { enumerator-list }
    enum identifier-opt { enumerator-list , }
    enum identifier
    enum identifier-opt : declaration-specifiers { enumerator-list }
    enum identifier-opt : declaration-specifiers { enumerator-list , }

enumerator-list:
    enumerator
    enumerator-list , enumerator

enumerator:
    enumeration-constant
    enumeration-constant assignment-operator unary-expression
    enumeration-constant assignment-operator constant-expression-range

type-qualifier:
    const

declarator:
    pointer-opt direct-declarator

direct-declarator:
    identifier
    ( declarator )
    direct-declarator [ unary-expression ]

abstract-declarator:
    pointer-opt direct-abstract-declarator

direct-abstract-declarator:
    identifier-opt
    ( abstract-declarator )
    direct-abstract-declarator [ unary-expression ]
    direct-abstract-declarator [ ]

pointer:
    * type-qualifier-list-opt
    * type-qualifier-list-opt pointer

type-qualifier-list:
    type-qualifier
    type-qualifier-list type-qualifier

typedef-name:
    identifier
~~~


#### C.2.3 CTF-specific declarations

~~~ text
ctf-specifier:
    clock { ctf-assignment-expression-list-opt }
    event { ctf-assignment-expression-list-opt }
    stream { ctf-assignment-expression-list-opt }
    env { ctf-assignment-expression-list-opt }
    trace { ctf-assignment-expression-list-opt }
    callsite { ctf-assignment-expression-list-opt }
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list

ctf-type-specifier:
    floating_point { ctf-assignment-expression-list-opt }
    integer { ctf-assignment-expression-list-opt }
    string { ctf-assignment-expression-list-opt }
    string

ctf-assignment-expression-list:
    ctf-assignment-expression ;
    ctf-assignment-expression-list ctf-assignment-expression ;

ctf-assignment-expression:
    unary-expression assignment-operator unary-expression
    unary-expression type-assignment-operator type-specifier
    declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
~~~
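
The simplest form of `ctf-assignment-expression-list`, a flat sequence of `name = value ;` entries, can be read with a few lines of code. The sketch below covers only that first alternative (`unary-expression assignment-operator unary-expression`) with a dotted identifier on the left and an identifier, number, or string literal on the right. Type assignments (`:=`), `typedef`, `typealias`, and nested scopes are out of scope, and the names `ASSIGNMENT_RE` and `parse_assignments` are hypothetical.

```python
import re

# One "key = value ;" entry.
# Assumption: the key is a dotted ASCII identifier and the value is either a
# string-literal or a run of characters without whitespace, ';' or '"'.
ASSIGNMENT_RE = re.compile(
    r"""
    \s*
    (?P<key>[_a-zA-Z][_a-zA-Z0-9]*(?:\.[_a-zA-Z][_a-zA-Z0-9]*)*)
    \s*=\s*
    (?P<value>"(?:[^"\\\n]|\\.)*"|[^\s;"]+)
    \s*;
    """,
    re.VERBOSE,
)


def parse_assignments(body: str) -> dict[str, str]:
    """Map each key to its raw value token for the text between the braces."""
    result = {}
    pos = 0
    while body[pos:].strip():
        match = ASSIGNMENT_RE.match(body, pos)
        if match is None:
            raise ValueError(f"expected 'key = value;' at offset {pos}")
        result[match.group("key")] = match.group("value")
        pos = match.end()
    return result
```

Given the illustrative input `major = 1; minor = 8; byte_order = le;`, the function returns a dictionary with the three keys mapped to the raw tokens `1`, `8`, and `le`. An entry that uses `:=` raises `ValueError`, since this sketch does not handle type assignments.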