   1 # Common Trace Format (CTF) Specification (v1.8.2)
   2
   3 **Author**: Mathieu Desnoyers, [EfficiOS Inc.](http://www.efficios.com/)
   4
The goal of the present document is to specify a trace format that suits
the needs of the embedded, telecom, high-performance and kernel
communities. It is based on the
[Common Trace Format Requirements (v1.4)](http://git.efficios.com/?p=ctf.git;a=blob_plain;f=common-trace-format-reqs.txt;hb=master)
document. It is designed to allow traces to be natively generated by the
Linux kernel, Linux user space applications written in C/C++, and
hardware components. One major element of CTF is the Trace Stream
Description Language (TSDL), whose flexibility enables the description of
various binary trace stream layouts.
  14
  15 The latest version of this document can be found at:
  16
  17   * Git: `git clone git://git.efficios.com/ctf.git`
  18   * [Gitweb](http://git.efficios.com/?p=ctf.git)
  19
  20 A reference implementation of a library to read and write this trace
  21 format is being implemented within the
  22 [Babeltrace](http://www.efficios.com/babeltrace) project, a converter
  23 between trace formats. The development tree is available at:
  24
  25   * Git: `git clone git://git.efficios.com/babeltrace.git`
  26   * [Gitweb](http://git.efficios.com/?p=babeltrace.git)
  27
  28 The [CE Workgroup](http://www.linuxfoundation.org/collaborate/workgroups/celf)
  29 of the Linux Foundation, [Ericsson](http://www.ericsson.com/), and
  30 [EfficiOS](http://www.efficios.com/) have sponsored this work.
  31
  32 **Contents**:
  33
  34     1. Preliminary definitions
  35     2. High-level representation of a trace
  36     3. Event stream
  37     4. Types
  38       4.1 Basic types
  39         4.1.1 Type inheritance
  40         4.1.2 Alignment
  41         4.1.3 Byte order
  42         4.1.4 Size
  43         4.1.5 Integers
  44         4.1.6 GNU/C bitfields
  45         4.1.7 Floating point
  46         4.1.8 Enumerations
  47       4.2 Compound types
  48         4.2.1 Structures
  49         4.2.2 Variants (discriminated/tagged unions)
  50         4.2.3 Arrays
  51         4.2.4 Sequences
  52         4.2.5 Strings
  53     5. Event packet header
  54       5.1 Event packet header description
  55       5.2 Event packet context description
  56     6. Event structure
  57       6.1 Event header
  58         6.1.1 Type 1: few event IDs
  59         6.1.2 Type 2: many event IDs
  60       6.2 Stream event context and event context
  61       6.3 Event payload
  62         6.3.1 Padding
  63         6.3.2 Alignment
  64     7. Trace Stream Description Language (TSDL)
  65       7.1 Metadata
  66       7.2 Declaration vs definition
  67       7.3 TSDL scopes
  68         7.3.1 Lexical scope
  69         7.3.2 Static and dynamic scopes
  70       7.4 TSDL examples
  71     8. Clocks
  72     A. Helper macros
  73     B. Stream header rationale
  74     C. TSDL Grammar
  75       C.1 Lexical grammar
  76         C.1.1 Lexical elements
  77         C.1.2 Keywords
  78         C.1.3 Identifiers
  79         C.1.4 Universal character names
  80         C.1.5 Constants
  81         C.1.6 String literals
  82         C.1.7 Punctuators
  83       C.2 Phrase structure grammar
  84         C.2.2 Declarations:
  85         C.2.3 CTF-specific declarations
  86
  87
  88 ## 1. Preliminary definitions
  89
  90   * **Event trace**: an ordered sequence of events.
  91   * **Event stream**: an ordered sequence of events, containing a
  92     subset of the trace event types.
  93   * **Event packet**: a sequence of physically contiguous events within
  94     an event stream.
  95   * **Event**: this is the basic entry in a trace. Also known as
  96     a _trace record_.
  97     * An **event identifier** (ID) relates to the class (a type) of
  98       event within an event stream, e.g. event `irq_entry`.
  99     * An **event** (or event record) relates to a specific instance of
 100       an event class, e.g. event `irq_entry`, at time _X_, on CPU _Y_.
  * **Source architecture**: architecture writing the trace.
  * **Reader architecture**: architecture reading the trace.
 103
 104
 105 ## 2. High-level representation of a trace
 106
 107 A _trace_ is divided into multiple event streams. Each event stream
 108 contains a subset of the trace event types.
 109
 110 The final output of the trace, after its generation and optional
 111 transport over the network, is expected to be either on permanent or
 112 temporary storage in a virtual file system. Because each event stream
 113 is appended to while a trace is being recorded, each is associated with
 114 a distinct set of files for output. Therefore, a stored trace can be
 115 represented as a directory containing zero, one or more files
 116 per stream.
 117
 118 Metadata description associated with the trace contains information on
 119 trace event types expressed in the _Trace Stream Description Language_
 120 (TSDL). This language describes:
 121
 122   * Trace version
 123   * Types available
 124   * Per-trace event header description
 125   * Per-stream event header description
 126   * Per-stream event context description
 127   * Per-event
 128     * Event type to stream mapping
 129     * Event type to name mapping
 130     * Event type to ID mapping
 131     * Event context description
 132     * Event fields description
 133
 134
 135 ## 3. Event stream
 136
 137 An _event stream_ can be divided into contiguous event packets of
 138 variable size. An event packet can contain a certain amount of padding
 139 at the end. The stream header is repeated at the beginning of each
 140 event packet. The rationale for the event stream design choices is
 141 explained in [Stream header rationale](#specB).
 142
 143 The event stream header will therefore be referred to as the
 144 _event packet header_ throughout the rest of this document.
 145
 146
 147 ## 4. Types
 148
Types are organized as type classes. Each type class belongs to one of
two kinds of types: _basic types_ or _compound types_.
 151
 152
 153 ### 4.1 Basic types
 154
 155 A basic type is a scalar type, as described in this section. It
 156 includes integers, GNU/C bitfields, enumerations, and floating
 157 point values.
 158
 159
 160 #### 4.1.1 Type inheritance
 161
Type specifications can be inherited to allow deriving types from a
type class. For example, see the `uint32_t` named type derived from the
[_integer_ type](#spec4.1.5) class. Types have a precise binary
representation in the trace. A type class has methods to read and write
these types, but must be derived into a type to be usable in an event
field.
 168
 169
 170 #### 4.1.2 Alignment
 171
 172 We define _byte-packed_ types as aligned on the byte size, namely 8-bit.
 173 We define _bit-packed_ types as following on the next bit, as defined
 174 by the [Integers](#spec4.1.5) section.
 175
Each basic type must specify its alignment, in bits. Examples of
possible alignments are: bit-packed (`align = 1`), byte-packed
(`align = 8`), or word-aligned (e.g. `align = 32` or `align = 64`).
The choice depends on the architecture preference and on the compactness
vs performance trade-offs of the implementation. Architectures providing
fast unaligned writes can byte-pack basic types to save space, aligning
each type on byte boundaries (8-bit). Architectures with slow unaligned
writes align types on specific alignment values. If no specific
alignment is declared for a type, it is assumed to be bit-packed for
integers with a size that is not a multiple of 8 bits and for GCC
bitfields. All other basic types are byte-packed by default. It is,
however, recommended to always specify the alignment explicitly.
Alignment values must be a power of two. Compound types are aligned as
specified in their individual specification.
 190
 191 The base offset used for field alignment is the start of the packet
 192 containing the field. For instance, a field aligned on 32-bit needs to
 193 be at an offset multiple of 32-bit from the start of the packet that
 194 contains it.
 195
 196 TSDL metadata attribute representation of a specific alignment:
 197
 198 ~~~ tsdl
 199 align = /* value in bits */;
 200 ~~~
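
The alignment rule can be sketched as a small helper that rounds a bit offset, measured from the start of the packet, up to the next multiple of the alignment. This is a minimal illustration, not part of the specification; the name `align_offset` is hypothetical.

```python
def align_offset(offset_bits: int, align_bits: int) -> int:
    """Round a bit offset (counted from the packet start) up to align_bits.

    Hypothetical helper, for illustration only. It relies on the rule that
    alignment values are powers of two.
    """
    if align_bits <= 0 or align_bits & (align_bits - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)


# A field with align = 32 whose natural position is bit 40 of the packet
# is moved to bit 64; a bit-packed field (align = 1) stays where it is.
aligned_32 = align_offset(40, 32)
aligned_1 = align_offset(40, 1)
```

Because the base offset is the start of the packet, the same field can need a different amount of padding depending on what precedes it in the packet.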
 201
 202 #### 4.1.3 Byte order
 203
By default, the byte order of a basic type is the byte order described in
the trace description. It can be overridden by specifying a
`byte_order` attribute for a basic type. A typical use case is to specify
the network byte order (big endian: `be`) to save data captured from
the network into the trace without conversion.
 209
 210 TSDL metadata representation:
 211
 212 ~~~ tsdl
 213 /* network and be are aliases */
 214 byte_order = /* native OR network OR be OR le */;
 215 ~~~
 216
 217 The `native` keyword selects the byte order described in the trace
 218 description. The `network` byte order is an alias for big endian.
 219
Even though the trace description section is not per se a type, for
the sake of clarity, it should be noted that `native` and `network` byte
orders are only allowed within type declarations. The `byte_order`
specified in the trace description section only accepts `be` or `le`
values.
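
As an illustration of how a reader might resolve the `byte_order` attribute, the sketch below decodes a 32-bit unsigned integer. The function name `decode_u32` and its parameters are hypothetical; `trace_byte_order` stands for the value given in the trace description.

```python
def decode_u32(raw: bytes, byte_order: str = "native",
               trace_byte_order: str = "le") -> int:
    """Decode a 32-bit unsigned integer according to a byte_order value.

    byte_order is one of "native", "network", "be" or "le".
    trace_byte_order stands for the byte order given in the trace
    description, which only accepts "be" or "le".
    """
    if trace_byte_order not in ("be", "le"):
        raise ValueError("trace byte order must be 'be' or 'le'")
    if byte_order == "native":
        byte_order = trace_byte_order  # fall back to the trace description
    elif byte_order == "network":
        byte_order = "be"  # network and be are aliases
    if byte_order not in ("be", "le"):
        raise ValueError("unknown byte order: %r" % byte_order)
    return int.from_bytes(raw[:4], "big" if byte_order == "be" else "little")


raw = bytes([0x12, 0x34, 0x56, 0x78])
as_network = decode_u32(raw, "network")
as_little = decode_u32(raw, "le")
```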
 225
 226
 227 #### 4.1.4 Size
 228
 229 Type size, in bits, for integers and floats is that returned by
 230 `sizeof()` in C multiplied by `CHAR_BIT`. We require the size of `char`
 231 and `unsigned char` types (`CHAR_BIT`) to be fixed to 8 bits for
 232 cross-endianness compatibility.
 233
 234 TSDL metadata representation:
 235
 236 ~~~ tsdl
 237 size = /* value is in bits */;
 238 ~~~
 239
 240
 241 #### 4.1.5 Integers
 242
Signed integers are represented in two's complement. Integer alignment,
size, signedness and byte ordering are defined in the TSDL metadata.
Integers aligned on byte size (8-bit) and with a length that is a
multiple of the byte size (8-bit) correspond to the C99 standard
integers. In addition, integers with alignment and/or size that are
_not_ a multiple of the byte size are permitted; these correspond to the
C99 standard bitfields, with the added specification that the CTF
integer bitfields have a fixed binary representation. Integer size needs
to be a positive integer. Integers of size 0 are **forbidden**. An
MIT-licensed reference implementation of the CTF portable bitfields is
available
[here](http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h).
 254
 255 Binary representation of integers:
 256
  * On little and big endian:
    * Within a byte, high bits correspond to an integer's high bits, and
      low bits correspond to low bits
  * On little endian:
    * Integers spanning multiple bytes are placed from the least
      significant byte to the most significant byte
    * Consecutive integers are placed from lower bits to higher bits
      (even within a byte)
  * On big endian:
    * Integers spanning multiple bytes are placed from the most
      significant byte to the least significant byte
    * Consecutive integers are placed from higher bits to lower bits
      (even within a byte)

This binary representation is derived from the bitfield implementation
in GCC for little and big endian. However, contrary to what GCC does,
integers can cross unit boundaries (no padding is required). Padding
can be [explicitly added](#spec4.1.6) to follow the GCC layout if needed.
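
The bit layout rules can be sketched as a reader that extracts one integer from a byte buffer. This is a minimal illustration under one reading of the rules (bit offset 0 is the lowest bit of the first byte on little endian, and the highest bit of the first byte on big endian). It is not the reference `bitfield.h` implementation, and the name `read_bits` is hypothetical.

```python
def read_bits(buf: bytes, offset_bits: int, size_bits: int,
              byte_order: str, signed: bool = False) -> int:
    """Read a size_bits-wide integer starting at offset_bits in buf."""
    if size_bits <= 0:
        raise ValueError("integers of size 0 are forbidden")
    total_bits = len(buf) * 8
    if offset_bits < 0 or offset_bits + size_bits > total_bits:
        raise ValueError("field does not fit in the buffer")
    mask = (1 << size_bits) - 1
    if byte_order == "le":
        # Consecutive integers go from lower bits to higher bits.
        value = (int.from_bytes(buf, "little") >> offset_bits) & mask
    elif byte_order == "be":
        # Consecutive integers go from higher bits to lower bits.
        shift = total_bits - offset_bits - size_bits
        value = (int.from_bytes(buf, "big") >> shift) & mask
    else:
        raise ValueError("byte_order must be 'le' or 'be'")
    if signed and value & (1 << (size_bits - 1)):
        value -= 1 << size_bits  # two's complement sign extension
    return value


data = bytes([0b10110101, 0b00000001])
first_le = read_bits(data, 0, 5, "le")    # low 5 bits of the first byte
second_le = read_bits(data, 5, 5, "le")   # crosses the byte boundary
first_be = read_bits(data, 0, 5, "be")    # high 5 bits of the first byte
second_be = read_bits(data, 5, 5, "be")   # crosses the byte boundary
```

Note that the second 5-bit integer crosses the byte boundary in both byte orders without any padding, which is where the CTF layout departs from GCC.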
 275
 276 TSDL metadata representation:
 277
~~~ tsdl
integer {
    signed = /* true OR false */;                     /* default: false */
    byte_order = /* native OR network OR be OR le */; /* default: native */
    size = /* value in bits */;                       /* no default */
    align = /* value in bits */;

    /* base used for pretty-printing output; default: decimal */
    base = /* decimal OR dec OR d OR i OR u OR 10 OR hexadecimal OR hex
              OR x OR X OR p OR 16 OR octal OR oct OR o OR 8 OR binary
              OR b OR 2 */;

    /* character encoding */
    encoding = /* none OR UTF8 OR ASCII */;           /* default: none */
}
~~~
 294
 295 Example of type inheritance (creation of a `uint32_t` named type):
 296
 297 ~~~ tsdl
 298 typealias integer {
 299     size = 32;
 300     signed = false;
 301     align = 32;
 302 } := uint32_t;
 303 ~~~
 304
 305 Definition of a named 5-bit signed bitfield:
 306
 307 ~~~ tsdl
 308 typealias integer {
 309     size = 5;
 310     signed = true;
 311     align = 1;
 312 } := int5_t;
 313 ~~~
 314
The character encoding field can be used to specify that the integer
must be printed as a text character when read, e.g.:
 317
 318 ~~~ tsdl
 319 typealias integer {
 320     size = 8;
 321     align = 8;
 322     signed = false;
 323     encoding = UTF8;
 324 } := utf_char;
 325 ~~~
 326
 327 #### 4.1.6 GNU/C bitfields
 328
 329 The GNU/C bitfields follow closely the integer representation, with a
 330 particularity on alignment: if a bitfield cannot fit in the current
 331 unit, the unit is padded and the bitfield starts at the following unit.
 332 The unit size is defined by the size of the type `unit_type`.
 333
 334 TSDL metadata representation:
 335
 336 ~~~ tsdl
 337 unit_type name:size;
 338 ~~~
 339
As an example, consider the following structure, declared in C and
compiled by GCC:

~~~ c
struct example {
    short a:12;
    short b:5;
};
~~~
 348
 349 The example structure is aligned on the largest element (short). The
 350 second bitfield would be aligned on the next unit boundary, because it
 351 would not fit in the current unit.
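
The unit padding rule can be sketched as follows. This is a simplified model that assumes all bitfields share one unit size, given in bits; the name `gcc_bitfield_offsets` is hypothetical.

```python
from typing import List


def gcc_bitfield_offsets(unit_bits: int, field_sizes: List[int]) -> List[int]:
    """Return the bit offset of each bitfield, padding units as needed."""
    offsets = []
    position = 0
    for size in field_sizes:
        if size <= 0 or size > unit_bits:
            raise ValueError("bitfield size must be in 1..unit_bits")
        room = unit_bits - (position % unit_bits)
        if size > room:
            # The field cannot fit in the current unit: pad the unit and
            # start the field at the following unit.
            position += room
        offsets.append(position)
        position += size
    return offsets


# struct example { short a:12; short b:5; } with 16-bit units.
example_offsets = gcc_bitfield_offsets(16, [12, 5])
```

Here `a` starts at bit 0, and `b` starts at bit 16 because only 4 bits remain in the first unit.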
 352
 353
 354 #### 4.1.7 Floating point
 355
 356 The floating point values byte ordering is defined in the TSDL metadata.
 357
 358 Floating point values follow the IEEE 754-2008 standard interchange
 359 formats. Description of the floating point values include the exponent
 360 and mantissa size in bits. Some requirements are imposed on the
 361 floating point values:
 362
 363 * `FLT_RADIX` must be 2.
 364 * `mant_dig` is the number of digits represented in the mantissa. It is
 365   specified by the ISO C99 standard, section 5.2.4, as `FLT_MANT_DIG`,
 366   `DBL_MANT_DIG` and `LDBL_MANT_DIG` as defined by `<float.h>`.
 367 * `exp_dig` is the number of digits represented in the exponent. Given
 368   that `mant_dig` is one bit more than its actual size in bits (leading
 369   1 is not needed) and also given that the sign bit always takes one
 370   bit, `exp_dig` can be specified as:
 371   * `sizeof(float) * CHAR_BIT - FLT_MANT_DIG`
 372   * `sizeof(double) * CHAR_BIT - DBL_MANT_DIG`
 373   * `sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG`
 374
 375 TSDL metadata representation:
 376
 377 ~~~ tsdl
 378 floating_point {
 379     exp_dig = /* value */;
 380     mant_dig = /* value */;
 381     byte_order = /* native OR network OR be OR le */;
 382     align = /* value */;
 383 }
 384 ~~~
 385
 386 Example of type inheritance:
 387
 388 ~~~ tsdl
 389 typealias floating_point {
 390     exp_dig = 8;         /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
 391     mant_dig = 24;       /* FLT_MANT_DIG */
 392     byte_order = native;
 393     align = 32;
 394 } := float;
 395 ~~~
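
The `exp_dig` formula can be checked with a short calculation. The sketch below assumes an IEEE 754 platform; `sys.float_info.mant_dig` plays the role of `DBL_MANT_DIG` because Python floats are C doubles, and the helper name `exp_dig` simply mirrors the TSDL attribute.

```python
import struct
import sys

CHAR_BIT = 8  # fixed to 8 bits by the Size section


def exp_dig(size_bytes: int, mant_dig: int) -> int:
    """Exponent digits: total size in bits minus mant_dig."""
    return size_bytes * CHAR_BIT - mant_dig


# Single precision: FLT_MANT_DIG is 24 for IEEE 754 binary32.
float_exp_dig = exp_dig(struct.calcsize("<f"), 24)

# Double precision: DBL_MANT_DIG is 53 for IEEE 754 binary64.
double_exp_dig = exp_dig(struct.calcsize("<d"), sys.float_info.mant_dig)
```

The single precision result matches the `exp_dig = 8` value of the `float` type alias.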
 396
 397 TODO: define NaN, +inf, -inf behavior.
 398
 399 Bit-packed, byte-packed or larger alignments can be used for floating
 400 point values, similarly to integers.
 401
 402
 403 #### 4.1.8 Enumerations
 404
 405 Enumerations are a mapping between an integer type and a table of
 406 strings. The numerical representation of the enumeration follows the
 407 integer type specified by the metadata. The enumeration mapping table
 408 is detailed in the enumeration description within the metadata. The
 409 mapping table maps inclusive value ranges (or single values) to strings.
 410 Instead of being limited to simple `value -> string` mappings, these
 411 enumerations map `[ start_value ... end_value ] -> string`, which map
 412 inclusive ranges of values to strings. An enumeration from the C
 413 language can be represented in this format by having the same
 414 `start_value` and `end_value` for each mapping, which is in fact a
 415 range of size 1. This single-value range is supported without repeating
 416 the start and end values with the `value = string` declaration.
 417 Enumerations need to contain at least one entry.
 418
 419 ~~~ tsdl
 420 enum name : integer_type {
 421     somestring          = /* start_value1 */ ... /* end_value1 */,
 422     "other string"      = /* start_value2 */ ... /* end_value2 */,
 423     yet_another_string,   /* will be assigned to end_value2 + 1 */
 424     "some other string" = /* value */,
 425     /* ... */
 426 }
 427 ~~~
 428
If the values are omitted, the enumeration starts at 0 and increments
by 1 for each entry. An entry with an omitted value that follows a range
entry takes as its value the `end_value` of the previous range + 1:
 432
 433 ~~~ tsdl
 434 enum name : unsigned int {
 435     ZERO,
 436     ONE,
 437     TWO,
 438     TEN = 10,
 439     ELEVEN,
 440 }
 441 ~~~
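
The value assignment rules can be sketched as a resolver that turns enumeration entries into inclusive ranges. This is an illustrative model only: entries are given as hypothetical `(label, start, end)` tuples, where `None` marks an omitted value.

```python
from typing import List, Optional, Tuple

Entry = Tuple[str, Optional[int], Optional[int]]


def resolve_enum(entries: List[Entry]) -> List[Tuple[str, int, int]]:
    """Assign an inclusive [start, end] range to every entry."""
    resolved = []
    next_value = 0  # an enumeration with omitted values starts at 0
    for label, start, end in entries:
        if start is None:
            start = end = next_value  # previous end_value + 1
        elif end is None:
            end = start  # single value: a range of size 1
        resolved.append((label, start, end))
        next_value = end + 1
    return resolved


def labels_for(resolved: List[Tuple[str, int, int]], value: int) -> List[str]:
    """Return every label whose range contains value (possibly none)."""
    return [label for label, start, end in resolved if start <= value <= end]


numbers = resolve_enum([
    ("ZERO", None, None),
    ("ONE", None, None),
    ("TWO", None, None),
    ("TEN", 10, None),
    ("ELEVEN", None, None),
])
```

`labels_for` returns a list because a value may have no mapping at all, and because the handling of overlapping ranges is left to the implementation.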
 442
The handling of overlapping ranges within a single enumeration is
implementation-defined.
 445
 446 A nameless enumeration can be declared as a field type or as part of
 447 a `typedef`:
 448
 449 ~~~ tsdl
 450 enum : integer_type {
 451     /* ... */
 452 }
 453 ~~~
 454
 455 Enumerations omitting the container type `: integer_type` use the `int`
 456 type (for compatibility with C99). The `int` type _must be_ previously
 457 declared, e.g.:
 458
 459 ~~~ tsdl
 460 typealias integer { size = 32; align = 32; signed = true; } := int;
 461
 462 enum {
 463     /* ... */
 464 }
 465 ~~~
 466
 467 An enumeration field can have an integral value for which the associated
 468 enumeration type does not map to a string.
 469
 470 ### 4.2 Compound types
 471
Compound types are aggregations of type declarations. Compound types
include structures, variants, arrays, sequences, and strings.
 474
 475
 476 #### 4.2.1 Structures
 477
Structures are aligned on the largest alignment required by basic types
contained within the structure. (This follows the ISO/C standard for
structures.)
 481
 482 TSDL metadata representation of a named structure:
 483
 484 ~~~ tsdl
 485 struct name {
 486     field_type field_name;
 487     field_type field_name;
 488     /* ... */
 489 };
 490 ~~~
 491
 492 Example:
 493
 494 ~~~ tsdl
 495 struct example {
 496     integer {                   /* nameless type */
 497         size = 16;
 498         signed = true;
 499         align = 16;
 500     } first_field_name;
 501     uint64_t second_field_name; /* named type declared in the metadata */
 502 };
 503 ~~~
 504
 505 The fields are placed in a sequence next to each other. They each
 506 possess a field name, which is a unique identifier within the structure.
 507 The identifier is not allowed to use any [reserved keyword](#specC.1.2).
 508 Replacing reserved keywords with underscore-prefixed field names is
 509 **recommended**. Fields starting with an underscore should have their
 510 leading underscore removed by the CTF trace readers.
 511
 512 A nameless structure can be declared as a field type or as part of
 513 a `typedef`:
 514
 515 ~~~ tsdl
 516 struct {
 517     /* ... */
 518 }
 519 ~~~
 520
Alignment for a structure compound type can be forced to a minimum
value by adding an `align` specifier after the declaration of a
structure body. This attribute is read as: `align(value)`. The value is
specified in bits. The structure will be aligned on the maximum value
between this attribute and the alignment required by the basic types
contained within the structure, e.g.:
 527
 528 ~~~ tsdl
 529 struct {
 530     /* ... */
 531 } align(32)
 532 ~~~
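
The structure alignment rule can be sketched as a layout helper. It assumes every field is a basic type described by a hypothetical `(name, size_bits, align_bits)` tuple, and it ignores variants, whose alignment is handled separately.

```python
from typing import Dict, List, Tuple


def align_up(offset_bits: int, align_bits: int) -> int:
    """Round offset_bits up to a power-of-two alignment."""
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)


def layout_struct(fields: List[Tuple[str, int, int]],
                  start_bits: int = 0,
                  min_align: int = 1) -> Tuple[int, Dict[str, int]]:
    """Return the structure alignment and the bit offset of each field."""
    # Maximum of the align(value) attribute and the field alignments.
    struct_align = max([min_align] + [align for _, _, align in fields])
    position = align_up(start_bits, struct_align)
    offsets = {}
    for name, size, align in fields:
        position = align_up(position, align)
        offsets[name] = position
        position += size
    return struct_align, offsets


# A 16-bit integer aligned on 16 bits followed by a 64-bit integer
# (assumed here to be aligned on 64 bits).
example = [("first_field_name", 16, 16), ("second_field_name", 64, 64)]
example_layout = layout_struct(example)
```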
 533
 534 #### 4.2.2 Variants (discriminated/tagged unions)
 535
 536 A CTF variant is a selection between different types. A CTF variant must
 537 always be defined within the scope of a structure or within fields
 538 contained within a structure (defined recursively). A _tag_ enumeration
 539 field must appear in either the same static scope, prior to the variant
 540 field (in field declaration order), in an upper static scope, or in an
 541 upper dynamic scope (see [Static and dynamic scopes](#spec7.3.2)).
 542 The type selection is indicated by the mapping from the enumeration
 543 value to the string used as variant type selector. The field to use as
 544 tag is specified by the `tag_field`, specified between `< >` after the
 545 `variant` keyword for unnamed variants, and after _variant name_ for
 546 named variants. It is not required that each enumeration mapping appears
 547 as variant type tag field. It is also not required that each variant
 548 type tag appears as enumeration mapping. However, it is required that
 549 any enumeration mapping encountered within a stream has a matching
 550 variant type tag field.
 551
 552 The alignment of the variant is the alignment of the type as selected
 553 by the tag value for the specific instance of the variant. The size of
 554 the variant is the size as selected by the tag value for the specific
 555 instance of the variant.
 556
The alignment of the type containing the variant is independent of the
variant alignment. For instance, if a structure contains two fields, a
32-bit integer, aligned on 32 bits, and a variant, which contains two
choices: either a 32-bit field, aligned on 32 bits, or a 64-bit field,
aligned on 64 bits, the alignment of the outermost structure will be
32-bit (the alignment of its largest field, disregarding the alignment
of the variant). The alignment of the variant will depend on the
selector: if the variant's 32-bit field is selected, its alignment will
be 32-bit, or 64-bit otherwise. It is important to note that variants
are specifically tailored for compactness in a stream. Therefore, the
relative offsets of compound type fields can vary depending on the
offset at which the compound type starts if it contains a variant
that itself contains a type with alignment larger than the largest field
contained within the compound type. This is caused by the fact that the
compound type may contain the enumeration that selects the variant's
choice, and therefore the alignment to be applied to the compound type
cannot be determined before encountering the enumeration.
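
The effect on relative offsets can be worked through with the structure described in this section: a 32-bit tag aligned on 32 bits, followed by a variant whose selected field is aligned on either 32 or 64 bits. The helper names below are hypothetical and the sketch only models this one layout.

```python
def align_up(offset_bits: int, align_bits: int) -> int:
    """Round offset_bits up to a power-of-two alignment."""
    return (offset_bits + align_bits - 1) & ~(align_bits - 1)


def variant_relative_offset(struct_start_bits: int, selected_align: int,
                            tag_size: int = 32, tag_align: int = 32) -> int:
    """Offset of the variant payload relative to the structure start."""
    # The structure is aligned on 32 bits, disregarding the variant.
    start = align_up(struct_start_bits, tag_align)
    after_tag = start + tag_size
    return align_up(after_tag, selected_align) - start


# Same structure, 64-bit choice selected, two different starting offsets.
at_bit_0 = variant_relative_offset(0, 64)
at_bit_32 = variant_relative_offset(32, 64)
```

When the structure starts at bit 0 of the packet, 32 bits of padding separate the tag from the 64-bit choice. When it starts at bit 32, the tag already ends on a 64-bit boundary and no padding is inserted.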
 574
Each variant type selector possesses a field name, which is a unique
identifier within the variant. The identifier is not allowed to use any
[reserved keyword](#specC.1.2). Replacing reserved keywords with
underscore-prefixed field names is recommended. Fields starting with an
underscore should have their leading underscore removed by the CTF trace
readers.
 581
 582 A named variant declaration followed by its definition within a
 583 structure declaration:
 584
 585 ~~~ tsdl
 586 variant name {
 587     field_type sel1;
 588     field_type sel2;
 589     field_type sel3;
 590     /* ... */
 591 };
 592
 593 struct {
 594     enum : integer_type { sel1, sel2, sel3, /* ... */ } tag_field;
 595     /* ... */
 596     variant name <tag_field> v;
 597 }
 598 ~~~
 599
 600 An unnamed variant definition within a structure is expressed by the
 601 following TSDL metadata:
 602
 603 ~~~ tsdl
 604 struct {
 605     enum : integer_type { sel1, sel2, sel3, /* ... */ } tag_field;
 606     /* ... */
 607     variant <tag_field> {
 608         field_type sel1;
 609         field_type sel2;
 610         field_type sel3;
 611         /* ... */
 612     } v;
 613 }
 614 ~~~
 615
 616 Example of a named variant within a sequence that refers to a single
 617 tag field:
 618
 619 ~~~ tsdl
 620 variant example {
 621     uint32_t a;
 622     uint64_t b;
 623     short c;
 624 };
 625
 626 struct {
 627     enum : uint2_t { a, b, c } choice;
 628     unsigned int seqlen;
 629     variant example <choice> v[seqlen];
 630 }
 631 ~~~
 632
 633 Example of an unnamed variant:
 634
 635 ~~~ tsdl
 636 struct {
 637     enum : uint2_t { a, b, c, d } choice;
 638
 639     /* Unrelated fields can be added between the variant and its tag */
 640     int32_t somevalue;
 641     variant <choice> {
 642         uint32_t a;
 643         uint64_t b;
 644         short c;
 645         struct {
 646             unsigned int field1;
 647             uint64_t field2;
 648         } d;
 649     } s;
 650 }
 651 ~~~
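
Type selection from the tag can be sketched as two lookups: the tag value maps to an enumeration label, and the label names the variant field to decode. The dictionaries below are a hypothetical in-memory model of the `choice` enumeration and of the variant `s`, with type names kept as plain strings.

```python
from typing import Dict, Tuple


def select_variant_field(enum_labels: Dict[int, str],
                         variant_fields: Dict[str, str],
                         tag_value: int) -> Tuple[str, str]:
    """Return the (field name, field type) selected by tag_value."""
    label = enum_labels.get(tag_value)
    if label is None:
        raise ValueError("tag value %d has no enumeration mapping" % tag_value)
    if label not in variant_fields:
        # Any mapping encountered in a stream must have a matching field.
        raise ValueError("no variant field named %r" % label)
    return label, variant_fields[label]


choice_labels = {0: "a", 1: "b", 2: "c", 3: "d"}
variant_types = {"a": "uint32_t", "b": "uint64_t", "c": "short", "d": "struct"}
selected = select_variant_field(choice_labels, variant_types, 1)
```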
 652
 653 Example of an unnamed variant within an array:
 654
 655 ~~~ tsdl
 656 struct {
 657     enum : uint2_t { a, b, c } choice;
 658     variant <choice> {
 659         uint32_t a;
 660         uint64_t b;
 661         short c;
 662     } v[10];
 663 }
 664 ~~~
 665
 666 Example of a variant type definition within a structure, where the
 667 defined type is then declared within an array of structures. This
 668 variant refers to a tag located in an upper static scope. This example
 669 clearly shows that a variant type definition referring to the tag `x`
 670 uses the closest preceding field from the static scope of the type
 671 definition.
 672
 673 ~~~ tsdl
 674 struct {
 675     enum : uint2_t { a, b, c, d } x;
 676
 677     /*
 678      * "x" refers to the preceding "x" enumeration in the
 679      * static scope of the type definition.
 680      */
 681     typedef variant <x> {
 682       uint32_t a;
 683       uint64_t b;
 684       short c;
 685     } example_variant;
 686
 687     struct {
 688       enum : int { x, y, z } x; /* This enumeration is not used by "v". */
 689
 690       /* "v" uses the "enum : uint2_t { a, b, c, d }" tag. */
 691       example_variant v;
 692     } a[10];
 693 }
 694 ~~~
 695
 696
 697 #### 4.2.3 Arrays
 698
 699 Arrays are fixed-length. Their length is declared in the type
 700 declaration within the metadata. They contain an array of _inner type_
 701 elements, which can refer to any type not containing the type of the
 702 array being declared (no circular dependency). The length is the number
 703 of elements in an array.
 704
 705 TSDL metadata representation of a named array:
 706
 707 ~~~ tsdl
 708 typedef elem_type name[/* length */];
 709 ~~~
 710
 711 A nameless array can be declared as a field type within a
 712 structure, e.g.:
 713
 714 ~~~ tsdl
 715 uint8_t field_name[10];
 716 ~~~
 717
 718 Arrays are always aligned on their element alignment requirement.
 719
 720
 721 #### 4.2.4 Sequences
 722
 723 Sequences are dynamically-sized arrays. They refer to a _length_
 724 unsigned integer field, which must appear in either the same static
 725 scope, prior to the sequence field (in field declaration order),
 726 in an upper static scope, or in an upper dynamic scope
 727 (see [Static and dynamic scopes](#spec7.3.2)). This length field represents
 728 the number of elements in the sequence. The sequence per se is an
 729 array of _inner type_ elements.
 730
 731 TSDL metadata representation for a sequence type definition:
 732
 733 ~~~ tsdl
 734 struct {
 735     unsigned int length_field;
 736     typedef elem_type typename[length_field];
 737     typename seq_field_name;
 738 }
 739 ~~~
 740
 741 A sequence can also be declared as a field type, e.g.:
 742
 743 ~~~ tsdl
 744 struct {
 745     unsigned int length_field;
 746     long seq_field_name[length_field];
 747 }
 748 ~~~
 749
 750 Multiple sequences can refer to the same length field, and these length
 751 fields can be in a different upper dynamic scope, e.g., assuming the
 752 `stream.event.header` defines:
 753
 754 ~~~ tsdl
 755 stream {
 756     /* ... */
 757     id = 1;
 758     event.header := struct {
 759         uint16_t seq_len;
 760     };
 761 };
 762
 763 event {
 764     /* ... */
 765     stream_id = 1;
 766     fields := struct {
 767         long seq_a[stream.event.header.seq_len];
 768         char seq_b[stream.event.header.seq_len];
 769     };
 770 };
 771 ~~~
 772
 773 The sequence elements follow the [array](#spec4.2.3) specifications.
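
Reading a sequence can be sketched as reading the length field first, then that many elements. The example assumes a little-endian stream, a 16-bit unsigned length field immediately followed by byte-packed 32-bit unsigned elements; these choices and the name `read_sequence` are illustrative assumptions, not requirements of the format.

```python
import struct
from typing import List, Tuple


def read_sequence(buf: bytes, offset: int = 0) -> Tuple[List[int], int]:
    """Return the decoded elements and the byte offset after the sequence."""
    # The length field gives the number of elements, not a size in bytes.
    (length,) = struct.unpack_from("<H", buf, offset)
    offset += 2
    elements = struct.unpack_from("<%dI" % length, buf, offset)
    return list(elements), offset + 4 * length


packet = struct.pack("<H3I", 3, 10, 20, 30)
values, end_offset = read_sequence(packet)
```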
 774
 775
 776 #### 4.2.5 Strings
 777
Strings are arrays of _bytes_ of variable size, terminated by a `'\0'`
(NUL) character. Their encoding is described in the TSDL metadata. In
the absence of encoding attribute information, the default encoding is
UTF-8.
 782
 783 TSDL metadata representation of a named string type:
 784
 785 ~~~ tsdl
 786 typealias string {
 787     encoding = /* UTF8 OR ASCII */;
 788 } := name;
 789 ~~~
 790
 791 A nameless string type can be declared as a field type:
 792
 793 ~~~ tsdl
 794 string field_name; /* use default UTF8 encoding */
 795 ~~~
 796
 797 Strings are always aligned on byte size.
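
A string field can be read by scanning for the terminating byte. This is a minimal sketch; the name `read_string` is hypothetical, and the `encoding` argument takes Python codec names (`"utf-8"`, `"ascii"`) rather than the TSDL spellings.

```python
from typing import Tuple


def read_string(buf: bytes, offset: int = 0,
                encoding: str = "utf-8") -> Tuple[str, int]:
    """Return the decoded string and the byte offset after its terminator."""
    end = buf.index(b"\x00", offset)  # raises ValueError if unterminated
    return buf[offset:end].decode(encoding), end + 1


name, next_offset = read_string(b"irq_entry\x00rest of the payload")
```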
 798
 799
 800 ## 5. Event packet header
 801
The event packet header consists of two parts. The first part, the
_event packet header_, is the same for all streams of a trace. The
second part, the _event packet context_, is described on a per-stream
basis. Both are described in the TSDL metadata.
 806
 807 Event packet header (all fields are optional, specified by
 808 TSDL metadata):
 809
  * **Magic number** (CTF magic number: 0xC1FC1FC1) specifies that this is
    a CTF packet. This magic number is optional, but when present, it
    should come at the very beginning of the packet.
  * **Trace UUID**, used to ensure the event packet matches the metadata
    used. Note: we cannot use a metadata checksum in every case instead
    of a UUID because metadata can be appended to while tracing is active.
    This field is optional.
  * **Stream ID**, used as reference to stream description in metadata.
    This field is optional if there is only one stream description in
    the metadata, but becomes required if there is more than one
    stream in the TSDL metadata description.
 821
 822 Event packet context (all fields are optional, specified by
 823 TSDL metadata):
 824
 825   * Event packet **content size** (in bits).
 826   * Event packet **size** (in bits, includes padding).
 827   * Event packet content checksum. Checksum excludes the event packet
 828     header.
 829   * Per-stream event **packet sequence count** (to deal with UDP packet
 830     loss). The number of significant sequence counter bits should also
 831     be present, so wrap-arounds are dealt with correctly.
  * Timestamp at the beginning and timestamp at the end of the event
    packet. Both timestamps are written in the packet header, but
    sampled respectively while (or before) writing the first event and
    while (or after) writing the last event in the packet. The inclusive
    range between these timestamps should include all event timestamps
    assigned to events contained within the packet. The timestamp at the
    beginning of an event packet is guaranteed to be less than or equal
    to the timestamp at the end of that event packet. The timestamp at
    the beginning of an event packet is guaranteed to be greater than or
    equal to the timestamps at the beginning of any prior packet within
    the same stream. The timestamp at the end of an event packet is
    guaranteed to be less than or equal to the timestamps at the end of
    any following packet within the same stream. See [Clocks](#spec8)
    for more detail.
  * **Events discarded count**. Snapshot of a per-stream free-running
    counter that counts the events discarded that were supposed to be
    written to the stream after the last event in the event packet.
    Note: on a producer-consumer buffer-full condition, the tracer can
    fill the current event packet with padding, so we know exactly where
    events have been discarded. However, if the tracer does not fill the
    current event packet with padding on a buffer-full condition, all we
    know about the timestamp range in which the events have been
    discarded is that it lies somewhere between the beginning and the
    end of the packet.
 855   * Lossless **compression scheme** used for the event packet content.
 856     Applied directly to raw data. New types of compression can be added
 857     in following versions of the format.
 858     * 0: no compression scheme
 859     * 1: bzip2
 860     * 2: gzip
 861     * 3: xz
  * **Cipher** used for the event packet content. Applied after
    compression.
 864     * 0: no encryption
 865     * 1: AES
 866   * **Checksum scheme** used for the event packet content. Applied after
 867     encryption.
 868     * 0: no checksum
 869     * 1: md5
 870     * 2: sha1
 871     * 3: crc32
 872
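The wrap-around handling mentioned for the packet sequence count, and the free-running nature of the events discarded counter, can be sketched in C as follows. This is only an illustration: the function names `counter_delta` and `lost_packets` are hypothetical, and the sketch assumes that the true distance between two samples is smaller than 2^bits.

```c
#include <stdint.h>

/*
 * Distance from `prev` to `cur` for a free-running counter of which only
 * the low `bits` bits are significant. The result is correct as long as
 * the true distance is smaller than 2^bits (assumption of this sketch).
 */
uint64_t counter_delta(uint64_t prev, uint64_t cur, unsigned bits)
{
    uint64_t mask = (bits >= 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);

    return (cur - prev) & mask;
}

/*
 * Number of packets missing between two consecutively received packets
 * of the same stream, given their packet sequence counts.
 */
uint64_t lost_packets(uint64_t prev_seq, uint64_t cur_seq, unsigned bits)
{
    uint64_t delta = counter_delta(prev_seq, cur_seq, bits);

    return (delta == 0) ? 0 : delta - 1;
}
```

The same `counter_delta()` helper applies to two successive snapshots of the events discarded counter: the difference is the number of events discarded between the two packets.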
 873
 874 ### 5.1 Event packet header description
 875
 876 The event packet header layout is indicated by the
 877 `trace.packet.header` field. Here is a recommended structure type for
 878 the packet header with the fields typically expected (although these
 879 fields are each optional):
 880
 881 ~~~ tsdl
 882 struct event_packet_header {
 883     uint32_t magic;
 884     uint8_t  uuid[16];
 885     uint32_t stream_id;
 886 };
 887
 888 trace {
 889     /* ... */
 890     packet.header := struct event_packet_header;
 891 };
 892 ~~~
 893
If the magic number (`magic` field) is not present,
tools such as `file` will have no means to discover the file type.
 896
 897 If the `uuid` field is not present, no validation that the metadata
 898 actually corresponds to the stream is performed.
 899
 900 If the `stream_id` packet header field is missing, the trace can only
 901 contain a single stream. Its `id` field can be left out, and its events
 902 don't need to declare a `stream_id` field.
 903
 904
 905 ### 5.2 Event packet context description
 906
The event packet context fields are declared within the stream
declaration in the metadata. All these fields are optional. If the
packet size field is missing, the whole stream only contains a single
packet. If the content size field is missing, the packet is entirely
filled (no padding). The content and packet sizes include all headers.
 912
 913 An example event packet context type:
 914
 915 ~~~ tsdl
 916 struct event_packet_context {
 917     uint64_t timestamp_begin;
 918     uint64_t timestamp_end;
 919     uint32_t checksum;
 920     uint32_t stream_packet_count;
 921     uint32_t events_discarded;
 922     uint32_t cpu_id;
 923     uint64_t content_size;
 924     uint64_t packet_size;
 925     uint8_t  compression_scheme;
 926     uint8_t  encryption_scheme;
 927     uint8_t  checksum_scheme;
 928 };
 929 ~~~
 930
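As an illustration of how a reader might use these fields, the sketch below computes the padding of a packet and checks the timestamp guarantees listed for the event packet context. The `struct packet_bounds` type and the function names are hypothetical and only cover the four fields needed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Subset of the example packet context used by the checks below. */
struct packet_bounds {
    uint64_t timestamp_begin;
    uint64_t timestamp_end;
    uint64_t content_size; /* in bits, headers included */
    uint64_t packet_size;  /* in bits, headers and padding included */
};

/* Number of padding bits at the end of the packet. */
uint64_t packet_padding_bits(const struct packet_bounds *p)
{
    return p->packet_size - p->content_size;
}

/* Checks that hold within a single packet. */
bool packet_is_consistent(const struct packet_bounds *p)
{
    return p->content_size <= p->packet_size &&
           p->timestamp_begin <= p->timestamp_end;
}

/* Checks that hold between two successive packets of the same stream. */
bool packets_are_ordered(const struct packet_bounds *prev,
                         const struct packet_bounds *next)
{
    return prev->timestamp_begin <= next->timestamp_begin &&
           prev->timestamp_end <= next->timestamp_end;
}
```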
 931
 932 ## 6. Event Structure
 933
 934 The overall structure of an event is:
 935
 936   1. Event header (as specified by the stream metadata)
 937   2. Stream event context (as specified by the stream metadata)
 938   3. Event context (as specified by the event metadata)
 939   4. Event payload (as specified by the event metadata)
 940
 941 This structure defines an implicit dynamic scoping, where variants
 942 located in inner structures (those with a higher number in the listing
 943 above) can refer to the fields of outer structures (with lower number
 944 in the listing above). See [TSDL scopes](#spec7.3) for more detail.
 945
 946 The total length of an event is defined as the difference between the
 947 end of its event payload and the end of the previous event's event
 948 payload. Therefore, it includes the event header alignment padding, and
 949 all its fields and their respective alignment padding. Events of length
 950 0 are forbidden.
 951
 952
 953 ### 6.1 Event header
 954
Event headers can be described within the metadata. We hereby propose,
as an example, two types of event headers. Type 1 accommodates streams
with fewer than 31 event IDs. Type 2 accommodates streams with 31 or
more event IDs.

One major factor can vary between streams: the number of event IDs
assigned to a stream. Luckily, this information tends to stay
relatively constant (modulo event registration while the trace is being
recorded), so we can specify different representations for streams
containing few event IDs and streams containing many event IDs. This
way, the event ID and timestamp are represented as densely as possible
in each case.
 967
The header is extended in the rare occasions where the information
cannot be represented in the ranges available in the standard event
header. Extended headers are also used in the rare occasions where the
data required for a field could not be collected: the flag
corresponding to the missing field within the `missing_fields` array is
then set to 1.
 973
 974 Types `uintX_t` represent an `X`-bit unsigned integer, as declared with
 975 either:
 976
 977 ~~~ tsdl
 978 typealias integer {
 979     size = /* X */;
 980     align = /* X */;
 981     signed = false;
 982 } := uintX_t;
 983 ~~~
 984
 985 or
 986
 987 ~~~ tsdl
 988 typealias integer {
 989     size = /* X */;
 990     align = 1;
 991     signed = false;
 992 } := uintX_t;
 993 ~~~
 994
 995 For more information about timestamp fields, see [Clocks](#spec8).
 996
 997
 998 #### 6.1.1 Type 1: few event IDs
 999
1000   * Aligned on 32-bit (or 8-bit if byte-packed, depending on the
1001     architecture preference)
1002   * Native architecture byte ordering
1003   * For `compact` selection, fixed size of 32 bits
  * For `extended` selection, size depends on the architecture and
    variant alignment
1006
1007 ~~~ tsdl
1008 struct event_header_1 {
1009     /*
1010      * id: range: 0 - 30.
1011      * id 31 is reserved to indicate an extended header.
1012      */
1013     enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
1014     variant <id> {
1015         struct {
1016             uint27_t timestamp;
1017         } compact;
1018         struct {
1019             uint32_t id;        /* 32-bit event IDs */
1020             uint64_t timestamp; /* 64-bit timestamps */
1021         } extended;
1022     } v;
1023 } align(32); /* or align(8) */
1024 ~~~
1025
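To make the compact layout concrete, here is a minimal decoding sketch in C. It assumes a little-endian trace in which bit-packed fields are laid out starting from the least significant bit of the 32-bit word, so that `id` occupies bits 0 to 4 and `timestamp` bits 5 to 31. The type and function names are hypothetical, and the extended case is only detected, not decoded.

```c
#include <stdbool.h>
#include <stdint.h>

struct header_1_fields {
    bool     extended;  /* true when id == 31: an extended header follows */
    uint8_t  id;        /* 0..30 for a compact header, 31 otherwise */
    uint32_t timestamp; /* 27 low-order clock bits, valid when !extended */
};

/*
 * Decodes the first 32-bit word of a type 1 event header, already
 * converted to host byte order. Assumes the first declared field sits in
 * the least significant bits (little-endian bit packing).
 */
struct header_1_fields decode_header_1_word(uint32_t word)
{
    struct header_1_fields h;

    h.id = (uint8_t)(word & 0x1F);
    h.extended = (h.id == 31);
    h.timestamp = h.extended ? 0 : (word >> 5);
    return h;
}
```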
1026
1027 #### 6.1.2 Type 2: many event IDs
1028
1029   * Aligned on 16-bit (or 8-bit if byte-packed, depending on the
1030     architecture preference)
1031   * Native architecture byte ordering
1032   * For `compact` selection, size depends on the architecture and
1033     variant alignment
1034   * For `extended` selection, size depends on the architecture and
1035     variant alignment
1036
1037 ~~~ tsdl
1038 struct event_header_2 {
1039     /*
1040      * id: range: 0 - 65534.
1041      * id 65535 is reserved to indicate an extended header.
1042      */
1043     enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
1044     variant <id> {
1045         struct {
1046             uint32_t timestamp;
1047         } compact;
1048         struct {
1049             uint32_t id;        /* 32-bit event IDs */
1050             uint64_t timestamp; /* 64-bit timestamps */
1051         } extended;
1052     } v;
1053 } align(16); /* or align(8) */
1054 ~~~
1055
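The variant selection on `id` can be sketched as a small C reader. This sketch assumes the byte-packed flavor (`align(8)` with `uintX_t` types declared with `align = 1`) and a little-endian trace, so the fields follow each other without padding. All names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct header_2_fields {
    uint32_t id;
    uint64_t timestamp;
    unsigned timestamp_bits; /* 32 for compact, 64 for extended */
};

/* Reads an unsigned little-endian integer of `nbytes` bytes. */
static uint64_t read_le(const uint8_t *p, size_t nbytes)
{
    uint64_t v = 0;

    for (size_t i = 0; i < nbytes; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/*
 * Decodes a byte-packed, little-endian type 2 event header.
 * Returns the number of bytes consumed, or 0 if `buf` is too short.
 */
size_t decode_header_2(const uint8_t *buf, size_t len,
                       struct header_2_fields *out)
{
    if (len < 2)
        return 0;

    uint16_t tag = (uint16_t)read_le(buf, 2);

    if (tag != 65535) {                /* compact selection */
        if (len < 6)
            return 0;
        out->id = tag;
        out->timestamp = read_le(buf + 2, 4);
        out->timestamp_bits = 32;
        return 6;
    }
    if (len < 14)                      /* extended selection */
        return 0;
    out->id = (uint32_t)read_le(buf + 2, 4);
    out->timestamp = read_le(buf + 6, 8);
    out->timestamp_bits = 64;
    return 14;
}
```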
1056
1057 ### 6.2 Stream event context and event context
1058
1059 The event context contains information relative to the current event.
1060 The choice and meaning of this information is specified by the TSDL
1061 stream and event metadata descriptions. The stream context is applied
1062 to all events within the stream. The stream context structure follows
1063 the event header. The event context is applied to specific events. Its
1064 structure follows the stream context structure.
1065
1066 An example of stream-level event context is to save the event payload
1067 size with each event, or to save the current PID with each event.
1068 These are declared within the stream declaration within the metadata:
1069
1070 ~~~ tsdl
1071 stream {
1072     /* ... */
1073     event.context := struct {
1074         uint pid;
1075         uint16_t payload_size;
1076     };
1077 };
1078 ~~~
1079
1080 An example of event-specific event context is to declare a bitmap of
1081 missing fields, only appended after the stream event context if the
1082 extended event header is selected. `NR_FIELDS` is the number of fields
1083 within the event (a numeric value).
1084
1085 ~~~ tsdl
1086 event {
1087     context := struct {
1088         variant <id> {
1089             struct { } compact;
1090             struct {
1091                 /* missing event fields bitmap */
1092                 uint1_t missing_fields[NR_FIELDS];
1093             } extended;
1094         } v;
1095     };
1096     /* ... */
};
1098 ~~~
1099
1100
1101 ### 6.3 Event payload
1102
1103 An event payload contains fields specific to a given event type. The
1104 fields belonging to an event type are described in the event-specific
1105 metadata within a structure type.
1106
1107
1108 #### 6.3.1 Padding
1109
There is no padding at the end of the event payload. This differs from
the ISO/C standard for structures, but follows the CTF standard for
structures. In a trace, even though it makes sense to align the
beginning of a structure, it makes no sense to add padding at the end
of the structure, because structures are usually not followed by a
structure of the same type.
1116
1117 This trick can be done by adding a zero-length `end` field at the end
1118 of the C structures, and by using the offset of this field rather than
1119 using `sizeof()` when calculating the size of a structure
1120 (see [Helper macros](#specA)).
1121
1122
1123 #### 6.3.2 Alignment
1124
1125 The event payload is aligned on the largest alignment required by types
1126 contained within the payload. This follows the ISO/C standard for
1127 structures.
1128
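The alignment rule can be illustrated with two small helpers, written here as a sketch: one picks the largest alignment among the payload field types, the other rounds a bit offset up to that alignment. The function names are hypothetical, and alignments are expressed in bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounds `offset` (in bits) up to the next multiple of `align` (> 0). */
uint64_t align_bit_offset(uint64_t offset, uint64_t align)
{
    uint64_t rem = offset % align;

    return rem ? offset + (align - rem) : offset;
}

/*
 * Alignment of a payload: the largest alignment required by the types it
 * contains. An empty payload gets an alignment of 1 bit in this sketch.
 */
uint64_t payload_alignment(const uint64_t *field_aligns, size_t n)
{
    uint64_t max = 1;

    for (size_t i = 0; i < n; i++)
        if (field_aligns[i] > max)
            max = field_aligns[i];
    return max;
}
```

For example, a payload holding fields aligned on 8, 32 and 16 bits starts at the next multiple of 32 bits after the event context.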
1129
1130 ## 7. Trace Stream Description Language (TSDL)
1131
1132 The Trace Stream Description Language (TSDL) allows expression of the
1133 binary trace streams layout in a C99-like Domain Specific Language
1134 (DSL).
1135
1136
1137 ### 7.1 Meta-data
1138
1139 The trace stream layout description is located in the trace metadata.
1140 The metadata is itself located in a stream identified by its name:
1141 `metadata`.
1142
1143 The metadata description can be expressed in two different formats:
1144 text-only and packet-based. The text-only description facilitates
1145 generation of metadata and provides a convenient way to enter the
1146 metadata information by hand. The packet-based metadata provides the
1147 CTF stream packet facilities (checksumming, compression, encryption,
1148 network-readiness) for metadata stream generated and transported by a
1149 tracer.
1150
1151 The text-only metadata file is a plain-text TSDL description. This file
1152 must begin with the following characters to identify the file as a CTF
1153 TSDL text-based metadata file (without the double-quotes):
1154
1155 ~~~ text
1156 "/* CTF"
1157 ~~~
1158
It must be followed by a space, and then by the version of the
specification that the CTF trace follows, e.g.:
1161
1162 ~~~ text
1163 " 1.8"
1164 ~~~
1165
These characters allow automated discovery of file type and CTF
specification version. They are interpreted as the beginning of a
comment by the TSDL metadata parser. The comment can be continued to
contain extra commented characters before it is closed.
1170
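A reader could detect this signature along the following lines. This is a sketch under the assumption that the version is written as two decimal numbers separated by a dot, as in the `1.8` example above. The function name is hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 and stores the version numbers when `text` starts with the
 * text-only metadata signature followed by "<major>.<minor>".
 * Returns 0 otherwise.
 */
int parse_text_metadata_signature(const char *text,
                                  unsigned *major, unsigned *minor)
{
    static const char prefix[] = "/* CTF ";
    const char *p;
    char *end;

    if (strncmp(text, prefix, sizeof(prefix) - 1) != 0)
        return 0;
    p = text + sizeof(prefix) - 1;
    if (!isdigit((unsigned char)*p))
        return 0;
    *major = (unsigned)strtoul(p, &end, 10);
    if (*end != '.' || !isdigit((unsigned char)end[1]))
        return 0;
    *minor = (unsigned)strtoul(end + 1, &end, 10);
    return 1;
}
```
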
The packet-based metadata is made of _metadata packets_, each of which
starts with a metadata packet header. The packet-based metadata
description is detected by reading the magic number 0x75D11D57 at the
beginning of the file. This magic number is also used to detect the
endianness of the architecture by trying to read the CTF magic number
and its counterpart in reversed endianness. The events within the
metadata stream have neither event header nor event context. Each event
only contains a special _sequence_ payload, which is a sequence of bits
whose length is implicitly calculated by using the
`trace.packet.header.content_size` field, minus the packet header size.
The formatting of this sequence of bits is a plain-text representation
of the TSDL description. Each metadata packet starts with a special
packet header, specific to the metadata stream, which contains,
exactly:
1185
1186 ~~~ tsdl
1187 struct metadata_packet_header {
1188     uint32_t magic;              /* 0x75D11D57 */
    uint8_t  uuid[16];           /* Universally Unique Identifier */
1190     uint32_t checksum;           /* 0 if unused */
1191     uint32_t content_size;       /* in bits */
1192     uint32_t packet_size;        /* in bits */
1193     uint8_t  compression_scheme; /* 0 if unused */
1194     uint8_t  encryption_scheme;  /* 0 if unused */
1195     uint8_t  checksum_scheme;    /* 0 if unused */
1196     uint8_t  major;              /* CTF spec version major number */
1197     uint8_t  minor;              /* CTF spec version minor number */
1198 };
1199 ~~~
1200
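The detection of packet-based metadata and of its byte order from the `magic` field can be sketched as follows. The enumeration and function names are hypothetical. The sketch reads the first four bytes in both byte orders, so it does not depend on the endianness of the host.

```c
#include <stddef.h>
#include <stdint.h>

#define METADATA_MAGIC UINT32_C(0x75D11D57)

enum metadata_kind {
    METADATA_NOT_PACKETIZED = 0, /* e.g. text-only metadata */
    METADATA_LITTLE_ENDIAN,
    METADATA_BIG_ENDIAN
};

/* Classifies a metadata file from its first bytes. */
enum metadata_kind detect_metadata_kind(const uint8_t *buf, size_t len)
{
    uint32_t le, be;

    if (len < 4)
        return METADATA_NOT_PACKETIZED;

    le = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
         ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    be = (uint32_t)buf[3] | ((uint32_t)buf[2] << 8) |
         ((uint32_t)buf[1] << 16) | ((uint32_t)buf[0] << 24);

    if (le == METADATA_MAGIC)
        return METADATA_LITTLE_ENDIAN;
    if (be == METADATA_MAGIC)
        return METADATA_BIG_ENDIAN;
    return METADATA_NOT_PACKETIZED;
}
```
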
1201 The packet-based metadata can be converted to a text-only metadata by
1202 concatenating all the strings it contains.
1203
In the textual representation of the metadata, the text contained
within `/*` and `*/`, as well as within `//` and the end of the line, is
treated as a comment. Boolean values can be represented as `true`,
`TRUE`, or `1` for true, and `false`, `FALSE`, or `0` for false. Within
the string-based metadata description, the trace UUID is represented as
a string of hexadecimal digits and dashes `-`. In the event packet
header, the trace UUID is represented as an array of bytes.
1211
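Comparing the two UUID representations requires converting the textual form to bytes. The sketch below does this under a deliberately lenient assumption: dashes may appear between any two bytes, since the text above only requires hexadecimal digits and dashes. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Converts a textual UUID (32 hexadecimal digits, with dashes between
 * bytes) to its 16-byte form. Returns 1 on success, 0 on malformed input.
 */
int parse_uuid_string(const char *s, uint8_t out[16])
{
    size_t n = 0;
    int high = -1; /* pending high nibble, or -1 if none */

    for (; *s; s++) {
        int v;

        if (*s == '-') {
            if (high >= 0)
                return 0; /* dash in the middle of a byte */
            continue;
        }
        v = hex_value(*s);
        if (v < 0 || n == 16)
            return 0;
        if (high < 0) {
            high = v;
        } else {
            out[n++] = (uint8_t)((high << 4) | v);
            high = -1;
        }
    }
    return n == 16 && high < 0;
}

/* Returns 1 when the metadata UUID matches the packet header UUID. */
int uuid_matches(const char *metadata_uuid, const uint8_t header_uuid[16])
{
    uint8_t bytes[16];

    return parse_uuid_string(metadata_uuid, bytes) &&
           memcmp(bytes, header_uuid, sizeof(bytes)) == 0;
}
```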
1212
1213 ### 7.2 Declaration vs definition
1214
A declaration associates a layout to a type, without specifying where
this type is located in the event [structure hierarchy](#spec6).
This therefore includes `typedef`, `typealias`, as well as all type
specifiers. In certain circumstances (`typedef`, structure field and
variant field), a declaration is followed by a declarator, which
specifies the newly defined type name (for `typedef`), or the field name
(for declarations located within structures and variants). Arrays and
sequences, declared with square brackets (`[` `]`), are part of the
declarator, similarly to C99. The enumeration base type is specified by
`: enum_base`, which is part of the type specifier. The variant tag
name, specified between `<` `>`, is also part of the type specifier.
1226
1227 A definition associates a type to a location in the event
1228 [structure hierarchy](#spec6). This association is denoted by `:=`,
1229 as shown in [TSDL scopes](#spec7.3).
1230
1231
1232 ### 7.3 TSDL scopes
1233
TSDL uses three different types of scoping: a lexical scope is used for
declarations and type definitions, and static and dynamic scopes are
used for variant references to tag fields (with relative and absolute
path lookups) and for sequence references to length fields.
1238
1239
1240 #### 7.3.1 Lexical Scope
1241
Each of `trace`, `env`, `stream`, `event`, `struct` and `variant` has
its own nestable declaration scope, within which types can be declared
using `typedef` and `typealias`. A root declaration scope also contains
all declarations located outside of any of the aforementioned
declarations. An inner declaration scope can refer to types declared
within its container lexical scope prior to the inner declaration scope.
Redefinition of a typedef or typealias is not valid, although hiding an
upper scope typedef or typealias is allowed within a sub-scope.
1250
1251
1252 #### 7.3.2 Static and dynamic scopes
1253
A local static scope consists of the scope generated by the declaration
of fields within a compound type. A static scope is a local static scope
augmented with the nested sub-static-scopes it contains.

A dynamic scope consists of the static scope augmented with the
implicit [event structure](#spec6) definition hierarchy.

Multiple declarations of the same field name within a local static scope
are not valid. It is however valid to re-use the same field name in
different local scopes.

Nested static and dynamic scopes form lookup paths. These are used for
variant tag and sequence length references. They are used at the variant
and sequence definition site to look up the location of the tag field
associated with a variant, and to look up the location of the length
field associated with a sequence.
1270
Variants and sequences can refer to their tag or length field either
using a relative path or an absolute path. The relative path is relative
to the scope in which the variant or sequence performing the lookup is
located. Relative paths are only allowed to look up fields within the
same static scope, which includes its nested static scopes. Lookups
targeting parent static scopes need to be performed with an absolute
path.
1277
1278 Absolute path lookups use the full path including the dynamic scope
1279 followed by a `.` and then the static scope. Therefore, variants (or
1280 sequences) in lower levels in the dynamic scope (e.g., event context)
1281 can refer to a tag (or length) field located in upper levels
1282 (e.g., in the event header) by specifying, in this case, the associated
1283 tag with `<stream.event.header.field_name>`. This allows, for instance,
1284 the event context to define a variant referring to the `id` field of
1285 the event header as selector.
1286
1287 The dynamic scope prefixes are thus:
1288
1289   * Trace environment: `<env. >`
1290   * Trace packet header: `<trace.packet.header. >`
1291   * Stream packet context: `<stream.packet.context. >`
1292   * Event header: `<stream.event.header. >`
1293   * Stream event context: `<stream.event.context. >`
1294   * Event context: `<event.context. >`
1295   * Event payload: `<event.fields. >`
1296
1297 The target dynamic scope must be specified explicitly when referring to
1298 a field outside of the static scope (absolute scope reference). No
1299 conflict can occur between relative and dynamic paths, because the
1300 keywords `trace`, `stream`, and `event` are reserved, and thus not
1301 permitted as field names. It is recommended that field names clashing
1302 with CTF and C99 reserved keywords use an underscore prefix to
1303 eliminate the risk of generating a description containing an invalid
1304 field name. Consequently, fields starting with an underscore should have
1305 their leading underscore removed by the CTF trace readers.
1306
1307 The information available in the dynamic scopes can be thought of as the
1308 current tracing context. At trace production, information about the
1309 current context is saved into the specified scope field levels. At trace
1310 consumption, for each event, the current trace context is therefore
1311 readable by accessing the upper dynamic scopes.
1312
1313
1314 ### 7.4 TSDL examples
1315
The grammar representing the TSDL metadata is presented in
[TSDL grammar](#specC). This section offers lighter reading: examples
of TSDL metadata, with template values.
1319
1320 The stream ID can be left out if there is only one stream in the
1321 trace. The event `id` field can be left out if there is only one event
1322 in a stream.
1323
1324 ~~~ tsdl
1325 trace {
1326     major = /* value */;            /* CTF spec version major number */
1327     minor = /* value */;            /* CTF spec version minor number */
1328     uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa";  /* Trace UUID */
1329     byte_order = /* be OR le */;    /* Endianness (required) */
1330     packet.header := struct {
1331         uint32_t magic;
1332         uint8_t  uuid[16];
1333         uint32_t stream_id;
1334     };
1335 };
1336
1337 /*
1338  * The "env" (environment) scope contains assignment expressions. The
1339  * field names and content are implementation-defined.
1340  */
1341 env {
1342     pid = /* value */;    /* example */
1343     proc_name = "name";   /* example */
1344     /* ... */
1345 };
1346
1347 stream {
1348     id = /* stream_id */;
1349     /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
1350     event.header := /* event_header_1 OR event_header_2 */;
1351     event.context := struct {
1352         /* ... */
1353     };
1354     packet.context := struct {
1355         /* ... */
1356     };
1357 };
1358
1359 event {
1360     name = "event_name";
1361     id = /* value */;            /* Numeric identifier within the stream */
1362     stream_id = /* stream_id */;
1363     loglevel = /* value */;
1364     model.emf.uri = "string";
1365     context := struct {
1366         /* ... */
1367     };
1368     fields := struct {
1369         /* ... */
1370     };
1371 };
1372
1373 callsite {
1374     name = "event_name";
1375     func = "func_name";
1376     file = "myfile.c";
1377     line = 39;
1378     ip = 0x40096c;
1379 };
1380 ~~~
1381
1382 More detail on [types](#spec4):
1383
1384 ~~~ tsdl
1385 /*
1386  * Named types:
1387  *
1388  * Type declarations behave similarly to the C standard.
1389  */
1390
1391 typedef aliased_type_specifiers new_type_declarators;
1392
1393 /* e.g.: typedef struct example new_type_name[10]; */
1394
1395 /*
1396  * typealias
1397  *
1398  * The "typealias" declaration can be used to give a name (including
1399  * pointer declarator specifier) to a type. It should also be used to
1400  * map basic C types (float, int, unsigned long, ...) to a CTF type.
1401  * Typealias is a superset of "typedef": it also allows assignment of a
1402  * simple variable identifier to a type.
1403  */
1404
1405 typealias type_class {
1406     /* ... */
1407 } := type_specifiers type_declarator;
1408
1409 /*
1410  * e.g.:
1411  * typealias integer {
1412  *   size = 32;
1413  *   align = 32;
1414  *   signed = false;
1415  * } := struct page *;
1416  *
1417  * typealias integer {
1418  *  size = 32;
1419  *  align = 32;
1420  *  signed = true;
1421  * } := int;
1422  */
1423
1424 struct name {
1425     /* ... */
1426 };
1427
1428 variant name {
1429     /* ... */
1430 };
1431
1432 enum name : integer_type {
1433     /* ... */
1434 };
1435 ~~~
1436
1437 Unnamed types, contained within compound type fields, `typedef` or
1438 `typealias`:
1439
1440 ~~~ tsdl
1441 struct {
1442     /* ... */
1443 }
1444 ~~~
1445
1446 ~~~ tsdl
1447 struct {
1448     /* ... */
1449 } align(value)
1450 ~~~
1451
1452 ~~~ tsdl
1453 variant {
1454     /* ... */
1455 }
1456 ~~~
1457
1458 ~~~ tsdl
1459 enum : integer_type {
1460     /* ... */
1461 }
1462 ~~~
1463
1464 ~~~ tsdl
1465 typedef type new_type[length];
1466
1467 struct {
1468     type field_name[length];
1469 }
1470 ~~~
1471
1472 ~~~ tsdl
1473 typedef type new_type[length_type];
1474
1475 struct {
1476     type field_name[length_type];
1477 }
1478 ~~~
1479
1480 ~~~ tsdl
1481 integer {
1482     /* ... */
1483 }
1484 ~~~
1485
1486 ~~~ tsdl
1487 floating_point {
1488     /* ... */
1489 }
1490 ~~~
1491
1492 ~~~ tsdl
1493 struct {
1494     integer_type field_name:size;   /* GNU/C bitfield */
1495 }
1496 ~~~
1497
1498 ~~~ tsdl
1499 struct {
1500     string field_name;
1501 }
1502 ~~~
1503
1504
1505 ## 8. Clocks
1506
Clock metadata makes it possible to describe the clock topology of the
system, as well as to detail each clock parameter. In the absence of a
clock description, it is assumed that all fields named `timestamp` use
the same clock source, which increments once per nanosecond.
1511
1512 Describing a clock and how it is used by streams is threefold: first,
1513 the clock and clock topology should be described in a `clock`
1514 description block, e.g.:
1515
1516 ~~~ tsdl
1517 clock {
1518     name = cycle_counter_sync;
1519     uuid = "62189bee-96dc-11e0-91a8-cfa3d89f3923";
1520     description = "Cycle counter synchronized across CPUs";
1521     freq = 1000000000;           /* frequency, in Hz */
1522     /* precision in seconds is: 1000 * (1/freq) */
1523     precision = 1000;
1524     /*
1525      * clock value offset from Epoch is:
1526      * offset_s + (offset * (1/freq))
1527      */
1528     offset_s = 1326476837;
1529     offset = 897235420;
1530     absolute = FALSE;
1531 };
1532 ~~~
1533
The mandatory `name` field specifies the name of the clock identifier,
which can later be used as a reference. The optional field `uuid` is
the unique identifier of the clock. It can be used to correlate
different traces that use the same clock. An optional textual
description string can be added with the `description` field. The
`freq` field is the initial frequency of the clock, in Hz. If the
`freq` field is not present, the frequency is assumed to be 1000000000
(providing a clock increment of 1 ns). The optional `precision` field
details the uncertainty on the clock measurements, in (1/freq) units.
The `offset_s` and `offset` fields indicate the offset from
POSIX.1 Epoch, 1970-01-01 00:00:00 +0000 (UTC), to the zero value
of the clock. The `offset_s` field is in seconds. The `offset` field is
in (1/freq) units. If either the `offset_s` or the `offset` field is not
present, it is assigned the value 0. The field `absolute` is `TRUE` if
the clock is a global reference across different clock UUIDs
(e.g. NTP time). Otherwise, `absolute` is `FALSE`, and the clock can
be considered as synchronized only with other clocks that have the same
UUID.
1552
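As a worked illustration of the `freq`, `offset_s` and `offset` fields, the sketch below converts a raw clock value to nanoseconds since the Epoch. The `struct clock_desc` type and the function name are hypothetical. The sketch assumes non-negative offsets and a frequency low enough that `freq * 1000000000` fits in 64 bits.

```c
#include <stdint.h>

#define NS_PER_SEC UINT64_C(1000000000)

/* Numeric fields of a TSDL `clock` block (absent fields: see defaults). */
struct clock_desc {
    uint64_t freq;     /* in Hz, 1000000000 if not present */
    uint64_t offset_s; /* in seconds, 0 if not present */
    uint64_t offset;   /* in (1/freq) units, 0 if not present */
};

/*
 * Converts a clock value, in (1/freq) units, to nanoseconds since the
 * POSIX.1 Epoch: offset_s + (offset + value) * (1/freq).
 */
uint64_t clock_value_to_ns(const struct clock_desc *c, uint64_t value)
{
    uint64_t cycles = c->offset + value;
    uint64_t sec = cycles / c->freq;
    uint64_t rem = cycles % c->freq; /* always lower than freq */

    return (c->offset_s + sec) * NS_PER_SEC + (rem * NS_PER_SEC) / c->freq;
}
```
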
1553 Secondly, a reference to this clock should be added within an integer
1554 type:
1555
1556 ~~~ tsdl
1557 typealias integer {
1558     size = 64; align = 1; signed = false;
1559     map = clock.cycle_counter_sync.value;
1560 } := uint64_ccnt_t;
1561 ~~~
1562
1563 Thirdly, stream declarations can reference the clock they use as a
1564 timestamp source:
1565
1566 ~~~ tsdl
1567 struct packet_context {
1568     uint64_ccnt_t ccnt_begin;
1569     uint64_ccnt_t ccnt_end;
1570     /* ... */
1571 };
1572
1573 stream {
1574     /* ... */
1575     event.header := struct {
1576         uint64_ccnt_t timestamp;
1577         /* ... */
1578     };
1579     packet.context := struct packet_context;
1580 };
1581 ~~~
1582
For an N-bit integer type referring to a clock, if the integer value is
lower than the N low-order bits of the prior clock value found in the
same stream, then it is assumed that one, and only one, overflow
occurred. It is therefore important that events encoding time on a small
number of bits happen frequently enough that no more than one N-bit
overflow occurs between two consecutive events, since additional
overflows cannot be detected.
1589
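The single-overflow rule can be sketched as a function that rebuilds a full 64-bit clock value from an N-bit field and the previous full value of the same stream. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Extends an N-bit clock field to 64 bits using the previous full clock
 * value seen in the same stream. If the new low-order bits are lower
 * than the previous ones, exactly one N-bit overflow is assumed.
 */
uint64_t extend_clock_value(uint64_t prev, uint64_t value, unsigned nbits)
{
    uint64_t mask, low, result;

    if (nbits >= 64)
        return value;

    mask = (UINT64_C(1) << nbits) - 1;
    low = value & mask;
    result = (prev & ~mask) | low;
    if (low < (prev & mask))
        result += mask + 1; /* one overflow of the N-bit field */
    return result;
}
```
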
In a packet context, clock field names ending with `_begin` and `_end`
have a special meaning: they refer to the timestamps at, respectively,
the beginning and the end of each packet.
1593
1594
1595 ## A. Helper macros
1596
The two following macros keep track of the size of a GNU/C structure
without padding at the end by placing `HEADER_END` as the last field.
A one-byte end field is used for C90 compatibility (C99 flexible arrays
could be used here). Note that this does not affect the effective
structure size, which should always be calculated with the
`header_sizeof()` helper.
1603
1604 ~~~ c
1605 #define HEADER_END          char end_field
1606 #define header_sizeof(type) offsetof(typeof(type), end_field)
1607 ~~~
1608
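A possible use of these macros is sketched below. The `struct example_header` type is hypothetical, and `typeof` is spelled `__typeof__` so that GCC and Clang also accept it in strict ISO modes (both spellings are GNU extensions).

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_END          char end_field
#define header_sizeof(type) offsetof(__typeof__(type), end_field)

/* Hypothetical header: 5 bytes of content, then the end marker. */
struct example_header {
    uint32_t id;
    uint8_t  flags;
    HEADER_END;
};

/* Size to write to the trace: excludes the marker and the tail padding. */
size_t example_header_trace_size(void)
{
    return header_sizeof(struct example_header);
}
```

Here `header_sizeof(struct example_header)` is 5, whereas `sizeof(struct example_header)` is larger because it counts the end field and any tail padding added by the compiler.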
1609 ## B. Stream header rationale
1610
An event stream is divided into contiguous event packets of variable
size. These subdivisions allow the trace analyzer to perform a fast
binary search by time within the stream (typically requiring indexing
of only the event packet headers) without reading the whole stream.
These subdivisions have a variable size to eliminate the need to
transfer the event packet padding when partially filled event packets
must be sent when streaming a trace for live viewing/analysis. An event
packet can contain a certain amount of padding at the end. Dividing
streams into event packets is also useful for network streaming over
UDP and flight recorder mode tracing (a whole event packet can be
swapped out of the buffer atomically for reading).
1622
1623 The stream header is repeated at the beginning of each event packet to
1624 allow flexibility in terms of:
1625
  * supporting streaming
  * allowing arbitrary buffers to be discarded without making the trace
    unreadable
  * handling UDP packet loss, by either dealing with missing event
    packets or asking for re-transmission
  * transparently supporting flight recorder mode
  * transparently supporting crash dumps
1633
1634
1635 ## C. TSDL Grammar
1636
1637 ~~~ c
1638 /*
1639  * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
1640  *
 * Inspired by the C99 grammar:
 * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
 * and c++1x grammar (draft)
 * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
 *
 * Specialized for CTF needs by including only constants and declarations
 * from C99 (excluding function declarations), and by adding support for
 * variants, sequences and CTF-specific specifiers. The semantics of
 * enumeration container types are inspired by the c++1x enum-base.
1650  */
1651 ~~~
1652
1653
1654 ### C.1 Lexical grammar
1655
1656
1657 #### C.1.1 Lexical elements
1658
1659 ~~~ text
1660 token:
1661     keyword
1662     identifier
1663     constant
1664     string-literal
1665     punctuator
1666 ~~~
1667
1668 #### C.1.2 Keywords
1669
1670 ~~~ text
1671 keyword: is one of
1672
1673 align
1674 callsite
1675 const
1676 char
1677 clock
1678 double
1679 enum
1680 env
1681 event
1682 floating_point
1683 float
1684 integer
1685 int
1686 long
1687 short
1688 signed
1689 stream
1690 string
1691 struct
1692 trace
1693 typealias
1694 typedef
1695 unsigned
1696 variant
1697 void
1698 _Bool
1699 _Complex
1700 _Imaginary
1701 ~~~
1702
1703
1704 #### C.1.3 Identifiers
1705
1706 ~~~ text
1707 identifier:
1708     identifier-nondigit
1709     identifier identifier-nondigit
1710     identifier digit
1711
1712 identifier-nondigit:
1713     nondigit
1714     universal-character-name
1715     any other implementation-defined characters
1716
1717 nondigit:
1718     _
1719     [a-zA-Z]    /* regular expression */
1720
1721 digit:
1722     [0-9]        /* regular expression */
1723 ~~~
1724
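The keyword list and the identifier rule can be combined into a small checker, sketched below. It only handles the `nondigit` and `digit` productions: universal character names and implementation-defined characters are left out of this sketch. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `s` is one of the TSDL keywords listed in C.1.2. */
bool is_tsdl_keyword(const char *s)
{
    static const char *const keywords[] = {
        "align", "callsite", "const", "char", "clock", "double", "enum",
        "env", "event", "floating_point", "float", "integer", "int",
        "long", "short", "signed", "stream", "string", "struct", "trace",
        "typealias", "typedef", "unsigned", "variant", "void", "_Bool",
        "_Complex", "_Imaginary",
    };

    for (size_t i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
        if (strcmp(s, keywords[i]) == 0)
            return true;
    return false;
}

static bool is_nondigit(char c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Returns true if `s` matches [a-zA-Z_][a-zA-Z0-9_]*. */
bool is_basic_identifier(const char *s)
{
    if (!is_nondigit(*s))
        return false;
    for (s++; *s; s++)
        if (!is_nondigit(*s) && !(*s >= '0' && *s <= '9'))
            return false;
    return true;
}

/* A field name must be an identifier that is not a reserved keyword. */
bool is_usable_field_name(const char *s)
{
    return is_basic_identifier(s) && !is_tsdl_keyword(s);
}
```

With this check, `stream` is rejected as a field name while `_stream` is accepted, which matches the underscore prefix recommended in [TSDL scopes](#spec7.3).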
1725
1726 #### C.1.4 Universal character names
1727
1728 ~~~ text
1729 universal-character-name:
1730     \u hex-quad
1731     \U hex-quad hex-quad
1732
1733 hex-quad:
1734     hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
1735 ~~~
1736
1737
#### C.1.5 Constants
1739
1740 ~~~ text
1741 constant:
1742     integer-constant
1743     enumeration-constant
1744     character-constant
1745
1746 integer-constant:
1747     decimal-constant integer-suffix-opt
1748     octal-constant integer-suffix-opt
1749     hexadecimal-constant integer-suffix-opt
1750
1751 decimal-constant:
1752     nonzero-digit
1753     decimal-constant digit
1754
1755 octal-constant:
1756     0
1757     octal-constant octal-digit
1758
1759 hexadecimal-constant:
1760     hexadecimal-prefix hexadecimal-digit
1761     hexadecimal-constant hexadecimal-digit
1762
1763 hexadecimal-prefix:
1764     0x
1765     0X
1766
1767 nonzero-digit:
1768     [1-9]
1769
1770 integer-suffix:
1771     unsigned-suffix long-suffix-opt
1772     unsigned-suffix long-long-suffix
1773     long-suffix unsigned-suffix-opt
1774     long-long-suffix unsigned-suffix-opt
1775
1776 unsigned-suffix:
1777     u
1778     U
1779
1780 long-suffix:
1781     l
1782     L
1783
1784 long-long-suffix:
1785     ll
1786     LL
1787
1788 enumeration-constant:
1789     identifier
1790     string-literal
1791
1792 character-constant:
1793     ' c-char-sequence '
1794     L' c-char-sequence '
1795
1796 c-char-sequence:
1797     c-char
1798     c-char-sequence c-char
1799
1800 c-char:
1801     any member of source charset except single-quote ('), backslash
1802     (\), or new-line character.
1803     escape-sequence
1804
1805 escape-sequence:
1806     simple-escape-sequence
1807     octal-escape-sequence
1808     hexadecimal-escape-sequence
1809     universal-character-name
1810
1811 simple-escape-sequence: one of
1812     \' \" \? \\ \a \b \f \n \r \t \v
1813
1814 octal-escape-sequence:
1815     \ octal-digit
1816     \ octal-digit octal-digit
1817     \ octal-digit octal-digit octal-digit
1818
1819 hexadecimal-escape-sequence:
1820     \x hexadecimal-digit
1821     hexadecimal-escape-sequence hexadecimal-digit
1822 ~~~
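
To make the `integer-constant` productions more concrete, here is a small Python sketch that recognizes the three bases and the optional suffix, then returns the numeric value. It is only an illustration: `INTEGER_CONSTANT_RE` and `parse_integer_constant` are hypothetical names, `octal-digit` and `hexadecimal-digit` are assumed to be `[0-7]` and `[0-9a-fA-F]`, and the suffix is validated but otherwise ignored.

```python
import re

# integer-suffix: u/U optionally followed by l, L, ll or LL,
# or l, L, ll or LL optionally followed by u/U.
_SUFFIX = r"(?:[uU](?:ll|LL|[lL])?|(?:ll|LL|[lL])[uU]?)?"

INTEGER_CONSTANT_RE = re.compile(
    r"(?:(?P<hex>0[xX][0-9A-Fa-f]+)"   # hexadecimal-constant
    r"|(?P<oct>0[0-7]*)"               # octal-constant (includes plain 0)
    r"|(?P<dec>[1-9][0-9]*))"          # decimal-constant
    + _SUFFIX
)


def parse_integer_constant(text: str) -> int:
    """Return the value of an integer-constant, ignoring its suffix."""
    match = INTEGER_CONSTANT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer-constant: {text!r}")
    if match.group("hex"):
        return int(match.group("hex")[2:], 16)  # drop the 0x / 0X prefix
    if match.group("oct"):
        return int(match.group("oct"), 8)
    return int(match.group("dec"), 10)
```

As in C, a leading `0` selects octal, so `017` is fifteen and `08` is rejected.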


#### C.1.6 String literals

~~~ text
string-literal:
    " s-char-sequence-opt "
    L" s-char-sequence-opt "

s-char-sequence:
    s-char
    s-char-sequence s-char

s-char:
    any member of source charset except double-quote ("), backslash
    (\), or new-line character.
    escape-sequence
~~~
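
The sketch below shows one way a reader could turn a `string-literal` into its character content by resolving the escape sequences of C.1.5. It is written in Python under several assumptions: `decode_string_literal` and the module-level names are hypothetical, the `L` prefix is accepted and then ignored, and every numeric escape is interpreted as a Unicode code point.

```python
import re

# simple-escape-sequence table; the Python escapes have the same meaning.
SIMPLE_ESCAPES = {
    "'": "'", '"': '"', "?": "?", "\\": "\\",
    "a": "\a", "b": "\b", "f": "\f", "n": "\n",
    "r": "\r", "t": "\t", "v": "\v",
}

ESCAPE_RE = re.compile(
    r"\\(?:"
    r"(?P<simple>['\"?\\abfnrtv])"
    r"|(?P<oct>[0-7]{1,3})"          # octal-escape-sequence
    r"|x(?P<hex>[0-9A-Fa-f]+)"       # hexadecimal-escape-sequence
    r"|u(?P<u4>[0-9A-Fa-f]{4})"      # \u hex-quad
    r"|U(?P<u8>[0-9A-Fa-f]{8})"      # \U hex-quad hex-quad
    r")"
)

# Optional L prefix, then s-chars or backslash pairs between double quotes.
STRING_LITERAL_RE = re.compile(r'L?"((?:[^"\\\n]|\\.)*)"')


def decode_string_literal(text: str) -> str:
    """Return the characters denoted by one string-literal."""
    literal = STRING_LITERAL_RE.fullmatch(text)
    if literal is None:
        raise ValueError(f"not a string-literal: {text!r}")
    body, out, pos = literal.group(1), [], 0
    while pos < len(body):
        if body[pos] != "\\":
            out.append(body[pos])
            pos += 1
            continue
        esc = ESCAPE_RE.match(body, pos)
        if esc is None:
            raise ValueError(f"invalid escape sequence at offset {pos}")
        if esc.group("simple"):
            out.append(SIMPLE_ESCAPES[esc.group("simple")])
        elif esc.group("oct"):
            out.append(chr(int(esc.group("oct"), 8)))
        else:
            digits = esc.group("hex") or esc.group("u4") or esc.group("u8")
            out.append(chr(int(digits, 16)))
        pos = esc.end()
    return "".join(out)
```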


#### C.1.7 Punctuators

~~~ text
punctuator: one of
    [ ] ( ) { } . -> * + - < > : ; ... = ,
~~~
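
Because `...` and `->` share their first character with `.` and `-`, a lexer has to decide how to split adjacent punctuation. The Python sketch below assumes the longest-match rule familiar from C lexers, which this section does not state explicitly; `PUNCTUATORS` and `split_punctuators` are hypothetical names.

```python
# Punctuators from C.1.7, longest first so that "..." wins over "."
# and "->" wins over "-" (assumed longest-match behaviour).
PUNCTUATORS = sorted(
    ["[", "]", "(", ")", "{", "}", ".", "->", "*", "+", "-",
     "<", ">", ":", ";", "...", "=", ","],
    key=len,
    reverse=True,
)


def split_punctuators(text: str) -> list:
    """Split a string made only of punctuators and whitespace into tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for punctuator in PUNCTUATORS:
            if text.startswith(punctuator, pos):
                tokens.append(punctuator)
                pos += len(punctuator)
                break
        else:
            raise ValueError(f"not a punctuator at offset {pos}")
    return tokens
```

With the list as given, `:=` is split into `:` followed by `=`, since only those two appear among the punctuators.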


### C.2 Phrase structure grammar

~~~ text
primary-expression:
    identifier
    constant
    string-literal
    ( unary-expression )

postfix-expression:
    primary-expression
    postfix-expression [ unary-expression ]
    postfix-expression . identifier
    postfix-expression -> identifier

unary-expression:
    postfix-expression
    unary-operator postfix-expression

unary-operator: one of
    + -

assignment-operator:
    =

type-assignment-operator:
    :=

constant-expression-range:
    unary-expression ... unary-expression
~~~
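
The expression productions above map directly onto a small recursive-descent parser. The following Python sketch is one possible reading, not a reference implementation: the names `tokenize`, `Parser` and `parse_unary_expression` are hypothetical, constants are limited to decimal integers, string literals are omitted, and the result is a nested tuple chosen only for this example.

```python
import re

# Assumption: only identifiers, decimal integers and the punctuators
# used by the expression productions are tokenized.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<int>[0-9]+)"
    r"|(?P<punct>->|[\[\]().+-]))"
)


def tokenize(text):
    """Split text into (kind, text) pairs; raise ValueError on anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self, expected=None):
        kind, text = self.peek()
        if kind is None or (expected is not None and text != expected):
            raise ValueError(f"unexpected token {text!r}")
        self.pos += 1
        return kind, text

    def unary(self):
        # unary-expression: postfix-expression | unary-operator postfix-expression
        kind, text = self.peek()
        if kind == "punct" and text in ("+", "-"):
            self.take()
            return ("unary", text, self.postfix())
        return self.postfix()

    def postfix(self):
        # postfix-expression: primary, then any number of [ ], . or -> steps
        node = self.primary()
        while True:
            _, text = self.peek()
            if text == "[":
                self.take()
                index = self.unary()
                self.take("]")
                node = ("index", node, index)
            elif text in (".", "->"):
                self.take()
                kind, name = self.take()
                if kind != "id":
                    raise ValueError(f"expected identifier, got {name!r}")
                node = ("member", text, node, name)
            else:
                return node

    def primary(self):
        kind, text = self.take()
        if kind == "id":
            return ("id", text)
        if kind == "int":
            return ("const", int(text))
        if text == "(":
            node = self.unary()
            self.take(")")
            return node
        raise ValueError(f"unexpected token {text!r}")


def parse_unary_expression(text):
    parser = Parser(tokenize(text))
    node = parser.unary()
    if parser.pos != len(parser.tokens):
        raise ValueError("trailing tokens after expression")
    return node
```

Since a `unary-operator` must be followed by a `postfix-expression`, an input such as `--5` is rejected by this sketch.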


#### C.2.2 Declarations

~~~ text
declaration:
    declaration-specifiers declarator-list-opt ;
    ctf-specifier ;

declaration-specifiers:
    storage-class-specifier declaration-specifiers-opt
    type-specifier declaration-specifiers-opt
    type-qualifier declaration-specifiers-opt

declarator-list:
    declarator
    declarator-list , declarator

abstract-declarator-list:
    abstract-declarator
    abstract-declarator-list , abstract-declarator

storage-class-specifier:
    typedef

type-specifier:
    void
    char
    short
    int
    long
    float
    double
    signed
    unsigned
    _Bool
    _Complex
    _Imaginary
    struct-specifier
    variant-specifier
    enum-specifier
    typedef-name
    ctf-type-specifier

align-attribute:
    align ( unary-expression )

struct-specifier:
    struct identifier-opt { struct-or-variant-declaration-list-opt } align-attribute-opt
    struct identifier align-attribute-opt

struct-or-variant-declaration-list:
    struct-or-variant-declaration
    struct-or-variant-declaration-list struct-or-variant-declaration

struct-or-variant-declaration:
    specifier-qualifier-list struct-or-variant-declarator-list ;
    declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list ;
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list ;
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list ;

specifier-qualifier-list:
    type-specifier specifier-qualifier-list-opt
    type-qualifier specifier-qualifier-list-opt

struct-or-variant-declarator-list:
    struct-or-variant-declarator
    struct-or-variant-declarator-list , struct-or-variant-declarator

struct-or-variant-declarator:
    declarator
    declarator-opt : unary-expression

variant-specifier:
    variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
    variant identifier variant-tag

variant-tag:
    < unary-expression >

enum-specifier:
    enum identifier-opt { enumerator-list }
    enum identifier-opt { enumerator-list , }
    enum identifier
    enum identifier-opt : declaration-specifiers { enumerator-list }
    enum identifier-opt : declaration-specifiers { enumerator-list , }

enumerator-list:
    enumerator
    enumerator-list , enumerator

enumerator:
    enumeration-constant
    enumeration-constant assignment-operator unary-expression
    enumeration-constant assignment-operator constant-expression-range

type-qualifier:
    const

declarator:
    pointer-opt direct-declarator

direct-declarator:
    identifier
    ( declarator )
    direct-declarator [ unary-expression ]

abstract-declarator:
    pointer-opt direct-abstract-declarator

direct-abstract-declarator:
    identifier-opt
    ( abstract-declarator )
    direct-abstract-declarator [ unary-expression ]
    direct-abstract-declarator [ ]

pointer:
    * type-qualifier-list-opt
    * type-qualifier-list-opt pointer

type-qualifier-list:
    type-qualifier
    type-qualifier-list type-qualifier

typedef-name:
    identifier
~~~
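
Of the productions above, `enumerator` is the one that departs most visibly from C, since a label may be a string literal and may map to a range. The Python sketch below parses the inside of an enumerator list into `(label, start, end)` tuples. It is an illustration under stated assumptions: `ENUMERATOR_RE` and `parse_enumerator_list` are hypothetical names, values are restricted to optionally signed decimal integers rather than full unary expressions, string labels may not contain commas or escapes, and omitted values are reported as `None` instead of being computed.

```python
import re

ENUMERATOR_RE = re.compile(
    r"""
    \s*(?P<label>[A-Za-z_][A-Za-z0-9_]*|"[^"\\\n]*")   # identifier or string
    \s*(?:=\s*(?P<start>[+-]?[0-9]+)                    # optional = value
        (?:\s*\.\.\.\s*(?P<end>[+-]?[0-9]+))?           # optional ... end
    )?\s*
    """,
    re.VERBOSE,
)


def parse_enumerator_list(text: str) -> list:
    """Parse the text between the braces of an enum-specifier."""
    parts = text.split(",")
    # enum-specifier allows one trailing comma after the enumerator-list.
    if parts and parts[-1].strip() == "":
        parts.pop()
    if not parts:
        raise ValueError("enumerator-list needs at least one enumerator")
    items = []
    for part in parts:
        match = ENUMERATOR_RE.fullmatch(part)
        if match is None:
            raise ValueError(f"not an enumerator: {part.strip()!r}")
        start, end = match.group("start"), match.group("end")
        items.append((
            match.group("label"),
            int(start) if start is not None else None,
            int(end) if end is not None else None,
        ))
    return items
```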


#### C.2.3 CTF-specific declarations

~~~ text
ctf-specifier:
    clock { ctf-assignment-expression-list-opt }
    event { ctf-assignment-expression-list-opt }
    stream { ctf-assignment-expression-list-opt }
    env { ctf-assignment-expression-list-opt }
    trace { ctf-assignment-expression-list-opt }
    callsite { ctf-assignment-expression-list-opt }
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list

ctf-type-specifier:
    floating_point { ctf-assignment-expression-list-opt }
    integer { ctf-assignment-expression-list-opt }
    string { ctf-assignment-expression-list-opt }
    string

ctf-assignment-expression-list:
    ctf-assignment-expression ;
    ctf-assignment-expression-list ctf-assignment-expression ;

ctf-assignment-expression:
    unary-expression assignment-operator unary-expression
    unary-expression type-assignment-operator type-specifier
    declaration-specifiers-opt storage-class-specifier declaration-specifiers-opt declarator-list
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
    typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
~~~
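
As a closing illustration, the sketch below handles the simplest shape a `ctf-specifier` block can take: a keyword, braces, and a list of `name = value ;` assignments, followed by the `;` required by the `declaration` production. It is a deliberately narrow Python sketch, not a TSDL parser: `parse_ctf_block` and the regular expressions are hypothetical, only the first `ctf-assignment-expression` form is supported, values must be a bare word, a number or a string literal without escapes, and nested braces are not handled. The field names and values in the usage lines are sample input only.

```python
import re

BLOCK_RE = re.compile(
    r"\s*(?P<kind>clock|event|stream|env|trace|callsite)"
    r"\s*\{(?P<body>[^{}]*)\}\s*;\s*"
)

# name (possibly dotted) = value ;  where value is a string literal
# without escapes or a run of characters free of space, quote and ';'.
ASSIGNMENT_RE = re.compile(
    r"\s*(?P<lhs>[A-Za-z_][A-Za-z0-9_.]*)\s*=\s*"
    r"(?P<rhs>\"[^\"\\\n]*\"|[^;\s\"]+)\s*;"
)


def parse_ctf_block(text: str):
    """Return (keyword, {name: raw value text}) for one simple block."""
    block = BLOCK_RE.fullmatch(text)
    if block is None:
        raise ValueError("not a simple ctf-specifier block")
    body, fields, pos = block.group("body"), {}, 0
    while body[pos:].strip():
        assignment = ASSIGNMENT_RE.match(body, pos)
        if assignment is None:
            raise ValueError(f"unsupported assignment at offset {pos}")
        fields[assignment.group("lhs")] = assignment.group("rhs")
        pos = assignment.end()
    return block.group("kind"), fields


# Sample input; the field names and values are illustrative only.
kind, fields = parse_ctf_block("clock { name = monotonic; freq = 1000000000; };")
```

Values are returned as raw text, so a string literal keeps its quotes and a number stays unconverted.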