1
2 RFC: Common Trace Format (CTF) Proposal (pre-v1.7)
3
4 Mathieu Desnoyers, EfficiOS Inc.
5
6 The goal of the present document is to propose a trace format that suits the
7 needs of the embedded, telecom, high-performance and kernel communities. It is
8 based on the Common Trace Format Requirements (v1.4) document. It is designed to
9 allow traces to be natively generated by the Linux kernel, Linux user-space
10 applications written in C/C++, and hardware components. One major element of
11 CTF is the Trace Stream Description Language (TSDL), whose flexibility
12 enables description of various binary trace stream layouts.
13
14 The latest version of this document can be found at:
15
16 git tree: git://git.efficios.com/ctf.git
17 gitweb: http://git.efficios.com/?p=ctf.git
18
19 A reference implementation of a library to read and write this trace format is
20 being implemented within the BabelTrace project, a converter between trace
21 formats. The development tree is available at:
22
23 git tree: git://git.efficios.com/babeltrace.git
24 gitweb: http://git.efficios.com/?p=babeltrace.git
25
26
27 1. Preliminary definitions
28
29 - Event Trace: An ordered sequence of events.
30 - Event Stream: An ordered sequence of events, containing a subset of the
31 trace event types.
32 - Event Packet: A sequence of physically contiguous events within an event
33 stream.
34 - Event: This is the basic entry in a trace. (aka: a trace record).
35 - An event identifier (ID) relates to the class (a type) of event within
36 an event stream.
37 e.g. event: irq_entry.
38 - An event (or event record) relates to a specific instance of an event
39 class.
40 e.g. event: irq_entry, at time X, on CPU Y
41 - Source Architecture: Architecture writing the trace.
42 - Reader Architecture: Architecture reading the trace.
43
44
45 2. High-level representation of a trace
46
47 A trace is divided into multiple event streams. Each event stream contains a
48 subset of the trace event types.
49
50 The final output of the trace, after its generation and optional transport over
51 the network, is expected to be either on permanent or temporary storage in a
52 virtual file system. Because each event stream is appended to while a trace is
53 being recorded, each is associated with a separate file for output. Therefore,
54 a stored trace can be represented as a directory containing one file per stream.
55
56 Meta-data description associated with the trace contains information on
57 trace event types expressed in the Trace Stream Description Language
58 (TSDL). This language describes:
59
60 - Trace version.
61 - Types available.
62 - Per-trace event header description.
63 - Per-stream event header description.
64 - Per-stream event context description.
65 - Per-event
66 - Event type to stream mapping.
67 - Event type to name mapping.
68 - Event type to ID mapping.
69 - Event context description.
70 - Event fields description.
71
72
73 3. Event stream
74
75 An event stream can be divided into contiguous event packets of variable
76 size. An event packet can
77 contain a certain amount of padding at the end. The stream header is
78 repeated at the beginning of each event packet. The rationale for the
79 event stream design choices is explained in Appendix B. Stream Header
80 Rationale.
81
82 The event stream header will therefore be referred to as the "event packet
83 header" throughout the rest of this document.
84
85
86 4. Types
87
88 Types are organized as type classes. Each type class belongs to one of two
89 kinds of types: basic types or compound types.
90
91 4.1 Basic types
92
93 A basic type is a scalar type, as described in this section. It includes
94 integers, GNU/C bitfields, enumerations, and floating point values.
95
96 4.1.1 Type inheritance
97
98 Type specifications can be inherited to allow deriving types from a
99 type class. For example, see the uint32_t named type derived from the "integer"
100 type class below ("Integers" section). Types have a precise binary
101 representation in the trace. A type class has methods to read and write these
102 types, but must be derived into a type to be usable in an event field.
103
104 4.1.2 Alignment
105
106 We define "byte-packed" types as aligned on the byte size, namely 8-bit.
107 We define "bit-packed" types as following on the next bit, as defined by the
108 "Integers" section.
109
110 Each basic type must specify its alignment, in bits. Examples of
111 possible alignments are: bit-packed, byte-packed, or word-aligned. The
112 choice depends on the architecture preference and compactness vs
113 performance trade-offs of the implementation. Architectures providing
114 fast unaligned writes can byte-pack basic types to save space, aligning
115 each type on byte boundaries (8-bit). Architectures with slow unaligned
116 writes align types on specific alignment values. If no specific
117 alignment is declared for a type, it is assumed to be bit-packed for
118 integers with size not multiple of 8 bits and for gcc bitfields. All
119 other types are byte-packed. It is however recommended to always specify
120 the alignment explicitly.
121
122 TSDL meta-data attribute representation of a specific alignment:
123
124 align = value; /* value in bits */
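
As an illustration of how a writer or reader might apply the "align"
attribute, the following minimal sketch rounds a bit offset up to the next
multiple of the alignment. The function name ctf_align_offset is hypothetical
and is not part of CTF or of any reference implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a bit offset up to the next multiple of align_bits.
 * align_bits is the value of the TSDL "align" attribute (in bits).
 * Hypothetical helper, for illustration only.
 */
uint64_t ctf_align_offset(uint64_t offset_bits, uint64_t align_bits)
{
    assert(align_bits > 0);
    return (offset_bits + align_bits - 1) / align_bits * align_bits;
}
```

With align = 1 (bit-packed) the offset is returned unchanged; with align = 8
(byte-packed) an offset of 5 bits becomes 8 bits.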
125
126 4.1.3 Byte order
127
128 By default, the native endianness of the source architecture of the trace is used.
129 Byte order can be overridden for a basic type by specifying a "byte_order"
130 attribute. Typical use-case is to specify the network byte order (big endian:
131 "be") to save data captured from the network into the trace without conversion.
132 If not specified, the byte order is native.
133
134 TSDL meta-data representation:
135
136 byte_order = native OR network OR be OR le; /* network and be are aliases */
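
A reader on an architecture whose endianness differs from the source
architecture has to honour the declared byte order when loading a value. The
sketch below shows one way to do so for a byte-aligned 32-bit unsigned
integer. The enum and function names are hypothetical; "native" is assumed to
have been resolved to either "le" or "be" beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: resolved byte order of a field. */
enum ctf_byte_order { CTF_LE, CTF_BE };

/* Read a byte-aligned 32-bit unsigned integer stored with the given byte order. */
uint32_t ctf_read_u32(const uint8_t *p, enum ctf_byte_order bo)
{
    assert(p != NULL);
    if (bo == CTF_BE)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}
```

Because the bytes are assembled explicitly, the result does not depend on the
endianness of the reader architecture.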
137
138 4.1.4 Size
139
140 Type size, in bits, for integers and floats is that returned by "sizeof()" in C
141 multiplied by CHAR_BIT.
142 We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed
143 to 8 bits for cross-endianness compatibility.
144
145 TSDL meta-data representation:
146
147 size = value; /* value in bits */
148
149 4.1.5 Integers
150
151 Signed integers are represented in two's complement. Integer alignment,
152 size, signedness and byte ordering are defined in the TSDL meta-data.
153 Integers aligned on byte size (8-bit) and with length multiple of byte
154 size (8-bit) correspond to the C99 standard integers. In addition,
155 integers with alignment and/or size that are _not_ a multiple of the
156 byte size are permitted; these correspond to the C99 standard bitfields,
157 with the added specification that the CTF integer bitfields have a fixed
158 binary representation. An MIT-licensed reference implementation of the
159 CTF portable bitfields is available at:
160
161 http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h
162
163 Binary representation of integers:
164
165 - On little and big endian:
166 - Within a byte, high bits correspond to an integer high bits, and low bits
167 correspond to low bits.
168 - On little endian:
169 - The bytes of an integer spanning multiple bytes are placed from the least
170 significant to the most significant.
171 - Consecutive integers are placed from lower bits to higher bits (even within
172 a byte).
173 - On big endian:
174 - The bytes of an integer spanning multiple bytes are placed from the most
175 significant to the least significant.
176 - Consecutive integers are placed from higher bits to lower bits (even within
177 a byte).
178
179 This binary representation is derived from the bitfield implementation in GCC
180 for little and big endian. However, contrary to what GCC does, integers can
181 cross unit boundaries (no padding is required). Padding can be explicitly
182 added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed.
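
The bit placement rules above can be sketched as a pair of bit-by-bit
readers, one per byte order. This is a deliberately simple illustration, not
the bitfield.h reference implementation; the function names are hypothetical.
"start" is the position of the integer's first bit in the stream, counted
from the beginning of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little endian: stream bit p is bit (p % 8) of byte (p / 8). */
uint64_t ctf_read_bits_le(const uint8_t *buf, size_t start, unsigned len)
{
    uint64_t v = 0;

    assert(len >= 1 && len <= 64);
    for (unsigned i = 0; i < len; i++) {
        size_t p = start + i;
        uint64_t bit = (buf[p / 8] >> (p % 8)) & 1u;

        v |= bit << i;      /* first stream bit is the lowest integer bit */
    }
    return v;
}

/* Big endian: stream bit p is bit (7 - p % 8) of byte (p / 8). */
uint64_t ctf_read_bits_be(const uint8_t *buf, size_t start, unsigned len)
{
    uint64_t v = 0;

    assert(len >= 1 && len <= 64);
    for (unsigned i = 0; i < len; i++) {
        size_t p = start + i;
        uint64_t bit = (buf[p / 8] >> (7 - p % 8)) & 1u;

        v = (v << 1) | bit; /* first stream bit is the highest integer bit */
    }
    return v;
}

/* Interpret the low len bits of v as a two's complement value. */
int64_t ctf_sign_extend(uint64_t v, unsigned len)
{
    assert(len >= 1 && len <= 64);
    if (len == 64)
        return (int64_t)v;
    uint64_t sign = (uint64_t)1 << (len - 1);
    return (int64_t)(v ^ sign) - (int64_t)sign;
}
```

For example, a 3-bit integer followed by a 5-bit integer shares a single
byte: on little endian the 3-bit value sits in the low bits of that byte, on
big endian it sits in the high bits.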
183
184 TSDL meta-data representation:
185
186 integer {
187 signed = true OR false; /* default false */
188 byte_order = native OR network OR be OR le; /* default native */
189 size = value; /* value in bits, no default */
190 align = value; /* value in bits */
191 }
192
193 Example of type inheritance (creation of a uint32_t named type):
194
195 typealias integer {
196 size = 32;
197 signed = false;
198 align = 32;
199 } := uint32_t;
200
201 Definition of a named 5-bit signed bitfield:
202
203 typealias integer {
204 size = 5;
205 signed = true;
206 align = 1;
207 } := int5_t;
208
209 4.1.6 GNU/C bitfields
210
211 The GNU/C bitfields follow closely the integer representation, with a
212 particularity on alignment: if a bitfield cannot fit in the current unit, the
213 unit is padded and the bitfield starts at the following unit. The unit size is
214 defined by the size of the type "unit_type".
215
216 TSDL meta-data representation:
217
218 unit_type name:size;
219
220 As an example, the following structure declared in C compiled by GCC:
221
222 struct example {
223 short a:12;
224 short b:5;
225 };
226
227 The example structure is aligned on the largest element (short). The second
228 bitfield would be aligned on the next unit boundary, because it would not fit in
229 the current unit.
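
The placement rule can be sketched as a small helper that, given the current
bit offset, the unit size and the bitfield size, returns the bit offset at
which the bitfield starts. The function name is hypothetical, and the sketch
assumes a 16-bit "short" when applied to the example structure.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the bit offset where a bitfield of size_bits starts, given the
 * current bit offset and the size of its unit type. If the field does not
 * fit in what remains of the current unit, it starts at the next unit.
 * Hypothetical helper, for illustration only.
 */
uint64_t ctf_bitfield_start(uint64_t offset_bits, unsigned unit_bits,
                            unsigned size_bits)
{
    assert(unit_bits > 0 && size_bits <= unit_bits);
    uint64_t used = offset_bits % unit_bits;
    if (used + size_bits > unit_bits)
        return offset_bits + (unit_bits - used);   /* pad to next unit */
    return offset_bits;
}
```

For struct example, "a" starts at bit 0 and occupies 12 bits; "b" needs 5
bits but only 4 remain in the first 16-bit unit, so it starts at bit 16.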
230
231 4.1.7 Floating point
232
233 The floating point values byte ordering is defined in the TSDL meta-data.
234
235 Floating point values follow the IEEE 754-2008 standard interchange formats.
236 Description of the floating point values includes the exponent and mantissa size
237 in bits. Some requirements are imposed on the floating point values:
238
239 - FLT_RADIX must be 2.
240 - mant_dig is the number of digits represented in the mantissa. It is specified
241 by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and
242 LDBL_MANT_DIG as defined by <float.h>.
243 - exp_dig is the number of digits represented in the exponent. Given that
244 mant_dig is one bit more than its actual size in bits (leading 1 is not
245 needed) and also given that the sign bit always takes one bit, exp_dig can be
246 specified as:
247
248 - sizeof(float) * CHAR_BIT - FLT_MANT_DIG
249 - sizeof(double) * CHAR_BIT - DBL_MANT_DIG
250 - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG
251
252 TSDL meta-data representation:
253
254 floating_point {
255 exp_dig = value;
256 mant_dig = value;
257 byte_order = native OR network OR be OR le;
258 }
259
260 Example of type inheritance:
261
262 typealias floating_point {
263 exp_dig = 8; /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
264 mant_dig = 24; /* FLT_MANT_DIG */
265 byte_order = native;
266 } := float;
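
The exp_dig and mant_dig values can be derived on the source architecture
from <float.h> and <limits.h>, following the formulas given earlier in this
section. The sketch below does so for "float" and "double"; the function
names are hypothetical. "long double" is left out on purpose, because on some
ABIs its storage size includes padding bits, in which case the subtraction
would not yield the real exponent size.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* exp_dig for "float": total size in bits minus FLT_MANT_DIG. */
int ctf_float_exp_dig(void)
{
    assert(FLT_RADIX == 2);     /* required by this section */
    return (int)(sizeof(float) * CHAR_BIT) - FLT_MANT_DIG;
}

/* exp_dig for "double": total size in bits minus DBL_MANT_DIG. */
int ctf_double_exp_dig(void)
{
    assert(FLT_RADIX == 2);
    return (int)(sizeof(double) * CHAR_BIT) - DBL_MANT_DIG;
}
```

On an architecture using the IEEE 754 binary32 and binary64 formats, these
return 8 and 11 respectively, matching the exp_dig = 8 value of the "float"
typealias.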
267
268 TODO: define NaN, +inf, -inf behavior.
269
270 4.1.8 Enumerations
271
272 Enumerations are a mapping between an integer type and a table of strings. The
273 numerical representation of the enumeration follows the integer type specified
274 by the meta-data. The enumeration mapping table is detailed in the enumeration
275 description within the meta-data. The mapping table maps inclusive value
276 ranges (or single values) to strings. Instead of being limited to simple
277 "value -> string" mappings, these enumerations map
278 "[ start_value ... end_value ] -> string", which map inclusive ranges of
279 values to strings. An enumeration from the C language can be represented in
280 this format by having the same start_value and end_value for each element, which
281 is in fact a range of size 1. This single-value range is supported without
282 repeating the start and end values with the value = string declaration.
283
284 enum name : integer_type {
285 somestring = start_value1 ... end_value1,
286 "other string" = start_value2 ... end_value2,
287 yet_another_string, /* will be assigned to end_value2 + 1 */
288 "some other string" = value,
289 ...
290 };
291
292 If the values are omitted, the enumeration starts at 0 and increments by 1 for
293 each entry:
294
295 enum name : unsigned int {
296 ZERO,
297 ONE,
298 TWO,
299 TEN = 10,
300 ELEVEN,
301 };
302
303 Overlapping ranges within a single enumeration are implementation defined.
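
A reader could represent the mapping table as an array of inclusive ranges
and resolve a value with a linear search, as in the sketch below. The
structure and function names are hypothetical. Since the behavior for
overlapping ranges is implementation defined, this sketch simply returns the
first matching range, which is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one enumeration mapping. */
struct ctf_enum_range {
    int64_t start;          /* inclusive */
    int64_t end;            /* inclusive */
    const char *label;
};

/* Return the label of the first range containing value, or NULL if none. */
const char *ctf_enum_label(const struct ctf_enum_range *tbl, size_t n,
                           int64_t value)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (value >= tbl[i].start && value <= tbl[i].end)
            return tbl[i].label;
    }
    return NULL;
}
```

A single-value entry such as TEN = 10 is stored with start and end both set
to 10.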
304
305 A nameless enumeration can be declared as a field type or as part of a typedef:
306
307 enum : integer_type {
308 ...
309 }
310
311 Enumerations omitting the container type ": integer_type" use the "int"
312 type (for compatibility with C99). The "int" type must be previously
313 declared. E.g.:
314
315 typealias integer { size = 32; align = 32; signed = true; } := int;
316
317 enum {
318 ...
319 }
320
321
322 4.2 Compound types
323
324 Compound types are aggregations of type declarations. Compound types include
325 structures, variants, arrays, sequences, and strings.
326
327 4.2.1 Structures
328
329 Structures are aligned on the largest alignment required by basic types
330 contained within the structure. (This follows the ISO/C standard for structures)
331
332 TSDL meta-data representation of a named structure:
333
334 struct name {
335 field_type field_name;
336 field_type field_name;
337 ...
338 };
339
340 Example:
341
342 struct example {
343 integer { /* Nameless type */
344 size = 16;
345 signed = true;
346 align = 16;
347 } first_field_name;
348 uint64_t second_field_name; /* Named type declared in the meta-data */
349 };
350
351 The fields are placed in a sequence next to each other. They each possess a
352 field name, which is a unique identifier within the structure.
353
354 A nameless structure can be declared as a field type or as part of a typedef:
355
356 struct {
357 ...
358 }
359
360 4.2.2 Variants (Discriminated/Tagged Unions)
361
362 A CTF variant is a selection between different types. A CTF variant must
363 always be defined within the scope of a structure or within fields
364 contained within a structure (defined recursively). A "tag" enumeration
365 field must appear in either the same lexical scope, prior to the variant
366 field (in field declaration order), in an uppermost lexical scope (see
367 Section 7.3.1), or in an uppermost dynamic scope (see Section 7.3.2).
368 The type selection is indicated by the mapping from the enumeration
369 value to the string used as variant type selector. The field to use as
370 tag is specified by the "tag_field", specified between "< >" after the
371 "variant" keyword for unnamed variants, and after "variant name" for
372 named variants.
373
374 The alignment of the variant is the alignment of the type as selected by the tag
375 value for the specific instance of the variant. The alignment of the type
376 containing the variant is independent of the variant alignment. The size of the
377 variant is the size as selected by the tag value for the specific instance of
378 the variant.
379
380 A named variant declaration followed by its definition within a structure
381 declaration:
382
383 variant name {
384 field_type sel1;
385 field_type sel2;
386 field_type sel3;
387 ...
388 };
389
390 struct {
391 enum : integer_type { sel1, sel2, sel3, ... } tag_field;
392 ...
393 variant name <tag_field> v;
394 }
395
396 An unnamed variant definition within a structure is expressed by the following
397 TSDL meta-data:
398
399 struct {
400 enum : integer_type { sel1, sel2, sel3, ... } tag_field;
401 ...
402 variant <tag_field> {
403 field_type sel1;
404 field_type sel2;
405 field_type sel3;
406 ...
407 } v;
408 }
409
410 Example of a named variant within a sequence that refers to a single tag field:
411
412 variant example {
413 uint32_t a;
414 uint64_t b;
415 short c;
416 };
417
418 struct {
419 enum : uint2_t { a, b, c } choice;
420 variant example <choice> v[unsigned int];
421 }
422
423 Example of an unnamed variant:
424
425 struct {
426 enum : uint2_t { a, b, c, d } choice;
427 /* Unrelated fields can be added between the variant and its tag */
428 int32_t somevalue;
429 variant <choice> {
430 uint32_t a;
431 uint64_t b;
432 short c;
433 struct {
434 unsigned int field1;
435 uint64_t field2;
436 } d;
437 } s;
438 }
439
440 Example of an unnamed variant within an array:
441
442 struct {
443 enum : uint2_t { a, b, c } choice;
444 variant <choice> {
445 uint32_t a;
446 uint64_t b;
447 short c;
448 } v[10];
449 }
450
451 Example of a variant type definition within a structure, where the defined type
452 is then declared within an array of structures. This variant refers to a tag
453 located in an upper lexical scope. This example clearly shows that a variant
454 type definition referring to the tag "x" uses the closest preceding field from
455 the lexical scope of the type definition.
456
457 struct {
458 enum : uint2_t { a, b, c, d } x;
459
460 typedef variant <x> { /*
461 * "x" refers to the preceding "x" enumeration in the
462 * lexical scope of the type definition.
463 */
464 uint32_t a;
465 uint64_t b;
466 short c;
467 } example_variant;
468
469 struct {
470 enum : int { x, y, z } x; /* This enumeration is not used by "v". */
471 example_variant v; /*
472 * "v" uses the "enum : uint2_t { a, b, c, d }"
473 * tag.
474 */
475 } a[10];
476 }
477
478 4.2.3 Arrays
479
480 Arrays are fixed-length. Their length is declared in the type
481 declaration within the meta-data. They contain an array of "inner type"
482 elements, which can refer to any type not containing the type of the
483 array being declared (no circular dependency). The length is the number
484 of elements in an array.
485
486 TSDL meta-data representation of a named array:
487
488 typedef elem_type name[length];
489
490 A nameless array can be declared as a field type within a structure, e.g.:
491
492 uint8_t field_name[10];
493
494
495 4.2.4 Sequences
496
497 Sequences are dynamically-sized arrays. They start with an integer that specifies
498 the length of the sequence, followed by an array of "inner type" elements.
499 The length is the number of elements in the sequence.
500
501 TSDL meta-data representation for a named sequence:
502
503 typedef elem_type name[length_type];
504
505 A nameless sequence can be declared as a field type, e.g.:
506
507 long field_name[int];
508
509 The length type follows the integer types specifications, and the sequence
510 elements follow the "array" specifications.
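
As an illustration, the sketch below parses one concrete sequence layout: a
uint8_t length followed by that many little-endian uint16_t elements, all
byte-packed. This layout and the function name are assumptions made for the
example; a real reader would take the length type and element type from the
meta-data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Parse a sequence laid out as: uint8_t length, then "length" uint16_t
 * little-endian elements, all byte-packed (an assumed layout).
 * Returns the element count, or -1 if the input or output is too small.
 */
int ctf_read_seq_u16le(const uint8_t *buf, size_t buf_len,
                       uint16_t *out, size_t out_cap)
{
    assert(buf != NULL && out != NULL);
    if (buf_len < 1)
        return -1;
    size_t n = buf[0];      /* the length is a number of elements */
    if (n > out_cap || buf_len - 1 < n * 2)
        return -1;
    for (size_t i = 0; i < n; i++)
        out[i] = (uint16_t)(buf[1 + 2 * i] | (buf[2 + 2 * i] << 8));
    return (int)n;
}
```

The bounds checks matter because the length comes from the trace itself and
cannot be trusted to fit the remaining packet content.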
511
512 4.2.5 Strings
513
514 Strings are arrays of bytes of variable size, terminated by a '\0'
515 "NUL" character. Their encoding is described in the TSDL meta-data. In
516 absence of encoding attribute information, the default encoding is
517 UTF-8.
518
519 TSDL meta-data representation of a named string type:
520
521 typealias string {
522 encoding = UTF8 OR ASCII;
523 } := name;
524
525 A nameless string type can be declared as a field type:
526
527 string field_name; /* Use default UTF8 encoding */
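
Because a string carries no length prefix, a reader has to scan for the
terminating '\0' while staying within the packet content. A minimal sketch,
with a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the length in bytes of the string starting at buf, not counting
 * the terminating '\0', or -1 if no terminator is found within buf_len bytes.
 */
long ctf_string_length(const char *buf, size_t buf_len)
{
    assert(buf != NULL);
    const char *end = memchr(buf, '\0', buf_len);
    return end ? (long)(end - buf) : -1;
}
```

The field that follows the string starts right after the terminator, subject
to its own alignment.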
528
529 5. Event Packet Header
530
531 The event packet header consists of two parts. The first part, the
532 "event packet header", is the same for all streams of a trace. The
533 second part, the "event packet context", is described on a per-stream
534 basis. Both are described in the TSDL meta-data. The packets are
535 aligned on architecture-page-sized addresses.
536
537 Event packet header (all fields are optional, specified by TSDL meta-data):
538
539 - Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a
540 CTF packet. This magic number is optional, but when present, it should
541 come at the very beginning of the packet.
542 - Trace UUID, used to ensure the event packet matches the meta-data used.
543 (note: we cannot use a meta-data checksum in every case instead of a
544 UUID because meta-data can be appended to while tracing is active)
545 This field is optional.
546 - Stream ID, used as reference to stream description in meta-data.
547 This field is optional if there is only one stream description in the
548 meta-data, but becomes required if there is more than one stream in
549 the TSDL meta-data description.
550
551 Event packet context (all fields are optional, specified by TSDL meta-data):
552
553 - Event packet content size (in bytes).
554 - Event packet size (in bytes, includes padding).
555 - Event packet content checksum (optional). Checksum excludes the event packet
556 header.
557 - Per-stream event packet sequence count (to deal with UDP packet loss). The
558 number of significant sequence counter bits should also be present, so
559 wrap-arounds are dealt with correctly.
560 - Time-stamp at the beginning and time-stamp at the end of the event packet.
561 Both timestamps are written in the packet header, but sampled respectively
562 while (or before) writing the first event and while (or after) writing the
563 last event in the packet. The inclusive range between these timestamps should
564 include all event timestamps assigned to events contained within the packet.
565 - Events discarded count
566 - Snapshot of a per-stream free-running counter, counting the number of
567 events discarded that were supposed to be written in the stream prior to
568 the first event in the event packet.
569 * Note: producer-consumer buffer full condition should fill the current
570 event packet with padding so we know exactly where events have been
571 discarded.
572 - Lossless compression scheme used for the event packet content. Applied
573 directly to raw data. New types of compression can be added in following
574 versions of the format.
575 0: no compression scheme
576 1: bzip2
577 2: gzip
578 3: xz
579 - Cypher used for the event packet content. Applied after compression.
580 0: no encryption
581 1: AES
582 - Checksum scheme used for the event packet content. Applied after encryption.
583 0: no checksum
584 1: md5
585 2: sha1
586 3: crc32
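
To show why the number of significant sequence counter bits is needed, the
sketch below computes how many packets were lost between two consecutively
received packets of a stream, handling counter wrap-around. The function name
is hypothetical. A loss of 2^bits packets or more cannot be detected this
way.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of packets lost between two consecutively received packets, given
 * their sequence counts and the number of significant counter bits.
 * Hypothetical helper, for illustration only.
 */
uint32_t ctf_packets_lost(uint32_t prev, uint32_t cur, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t mask = (bits == 32) ? UINT32_MAX : (((uint32_t)1 << bits) - 1);
    return (cur - prev - 1) & mask;     /* modular arithmetic on "bits" bits */
}
```

With an 8-bit counter, receiving count 1 right after count 254 means that
packets 255 and 0 were lost.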
587
588 5.1 Event Packet Header Description
589
590 The event packet header layout is indicated by the trace packet.header
591 field. Here is a recommended structure type for the packet header with
592 the fields typically expected (although these fields are each optional):
593
594 struct event_packet_header {
595 uint32_t magic;
596 uint8_t trace_uuid[16];
597 uint32_t stream_id;
598 };
599
600 trace {
601 ...
602 packet.header := struct event_packet_header;
603 };
604
605 If the magic number is not present, tools such as "file" will have no
606 means to discover the file type.
607
608 If the trace_uuid is not present, no validation that the meta-data
609 actually corresponds to the stream is performed.
610
611 If the stream_id packet header field is missing, the trace can only
612 contain a single stream. Its "id" field can be left out, and its events
613 don't need to declare a "stream_id" field.
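
Assuming the recommended struct event_packet_header is used as is, with no
padding between fields (24 bytes in total), a reader could decode and
validate it as sketched below. The decoded structure, the helper names and
the "big_endian" parameter are hypothetical; a real reader would derive the
layout and byte order from the meta-data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CTF_MAGIC 0xC1FC1FC1u

struct packet_header {      /* hypothetical decoded form */
    uint32_t magic;
    uint8_t trace_uuid[16];
    uint32_t stream_id;
};

static uint32_t load_u32(const uint8_t *p, int big_endian)
{
    if (big_endian)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int ctf_parse_packet_header(const uint8_t *buf, size_t len, int big_endian,
                            struct packet_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 24)
        return -1;
    out->magic = load_u32(buf, big_endian);
    if (out->magic != CTF_MAGIC)
        return -1;
    memcpy(out->trace_uuid, buf + 4, 16);
    out->stream_id = load_u32(buf + 20, big_endian);
    return 0;
}
```

The caller would then compare trace_uuid with the UUID declared in the
meta-data and use stream_id to select the stream description.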
614
615
616 5.2 Event Packet Context Description
617
618 Event packet context example. These are declared within the stream declaration
619 in the meta-data. All these fields are optional. If the packet size field is
620 missing, the whole stream only contains a single packet. If the content
621 size field is missing, the packet is filled (no padding). The content
622 and packet sizes include all headers.
623
624 An example event packet context type:
625
626 struct event_packet_context {
627 uint64_t timestamp_begin;
628 uint64_t timestamp_end;
629 uint32_t checksum;
630 uint32_t stream_packet_count;
631 uint32_t events_discarded;
632 uint32_t cpu_id;
633 uint32_t/uint16_t content_size;
634 uint32_t/uint16_t packet_size;
635 uint8_t stream_packet_count_bits; /* Significant counter bits */
636 uint8_t compression_scheme;
637 uint8_t encryption_scheme;
638 uint8_t checksum_scheme;
639 };
640
641
642 6. Event Structure
643
644 The overall structure of an event is:
645
646 1 - Stream Packet Context (as specified by the stream meta-data)
647 2 - Event Header (as specified by the stream meta-data)
648 3 - Stream Event Context (as specified by the stream meta-data)
649 4 - Event Context (as specified by the event meta-data)
650 5 - Event Payload (as specified by the event meta-data)
651
652 This structure defines an implicit dynamic scoping, where variants
653 located in inner structures (those with a higher number in the listing
654 above) can refer to the fields of outer structures (with lower number in
655 the listing above). See Section 7.3 TSDL Scopes for more detail.
656
657 6.1 Event Header
658
659 Event headers can be described within the meta-data. We hereby propose, as an
660 example, two types of event headers. Type 1 accommodates streams with fewer than
661 32 event IDs. Type 2 accommodates streams with 32 or more event IDs.
662
663 One major factor can vary between streams: the number of event IDs assigned to
664 a stream. Luckily, this information tends to stay relatively constant (modulo
665 event registration while trace is being recorded), so we can specify different
666 representations for streams containing few event IDs and streams containing
667 many event IDs, so we end up representing the event ID and time-stamp as
668 densely as possible in each case.
669
670 The header is extended in the rare occasions where the information cannot be
671 represented in the ranges available in the standard event header. It is also
672 extended in the rare occasions where the data required for a field could not be
673 collected: the flag corresponding to the missing field within the missing_fields
674 array is then set to 1.
675
676 Types uintX_t represent an X-bit unsigned integer, as declared with
677 either:
678
679 typealias integer { size = X; align = X; signed = false; } := uintX_t;
680
681 or
682
683 typealias integer { size = X; align = 1; signed = false; } := uintX_t;
684
685 6.1.1 Type 1 - Few event IDs
686
687 - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture
688 preference).
689 - Native architecture byte ordering.
690 - For "compact" selection
691 - Fixed size: 32 bits.
692 - For "extended" selection
693 - Size depends on the architecture and variant alignment.
694
695 struct event_header_1 {
696 /*
697 * id: range: 0 - 30.
698 * id 31 is reserved to indicate an extended header.
699 */
700 enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
701 variant <id> {
702 struct {
703 uint27_t timestamp;
704 } compact;
705 struct {
706 uint32_t id; /* 32-bit event IDs */
707 uint64_t timestamp; /* 64-bit timestamps */
708 } extended;
709 } v;
710 };
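
For the "compact" selection, the 5-bit id and the 27-bit timestamp share one
32-bit word. Following the bit placement rules of Section 4.1.5, the id
occupies the low bits on little endian and the high bits on big endian. The
sketch below decodes such a word once it has been loaded with the stream's
byte order; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct compact_header {     /* hypothetical decoded form */
    unsigned id;            /* 0..30, or 31 for "extended" */
    uint32_t timestamp;     /* 27 bits; meaningful only when id != 31 */
};

/* word: the 32 header bits, loaded using the byte order of the stream. */
struct compact_header ctf_decode_header_1(uint32_t word, int big_endian)
{
    struct compact_header h;

    if (big_endian) {
        h.id = word >> 27;              /* first field in the highest bits */
        h.timestamp = word & 0x07FFFFFFu;
    } else {
        h.id = word & 0x1Fu;            /* first field in the lowest bits */
        h.timestamp = word >> 5;
    }
    assert(h.id <= 31);
    return h;
}
```

When the decoded id is 31, the reader has to continue with the "extended"
structure instead of using the timestamp bits.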
711
712
713 6.1.2 Type 2 - Many event IDs
714
715 - Aligned on 16-bit (or 8-bit if byte-packed, depending on the architecture
716 preference).
717 - Native architecture byte ordering.
718 - For "compact" selection
719 - Size depends on the architecture and variant alignment.
720 - For "extended" selection
721 - Size depends on the architecture and variant alignment.
722
723 struct event_header_2 {
724 /*
725 * id: range: 0 - 65534.
726 * id 65535 is reserved to indicate an extended header.
727 */
728 enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
729 variant <id> {
730 struct {
731 uint32_t timestamp;
732 } compact;
733 struct {
734 uint32_t id; /* 32-bit event IDs */
735 uint64_t timestamp; /* 64-bit timestamps */
736 } extended;
737 } v;
738 };
739
740
741 6.2 Event Context
742
743 The event context contains information relative to the current event.
744 The choice and meaning of this information is specified by the TSDL
745 stream and event meta-data descriptions. The stream context is applied
746 to all events within the stream. The stream context structure follows
747 the event header. The event context is applied to specific events. Its
748 structure follows the stream context structure.
749
750 An example of stream-level event context is to save the event payload size with
751 each event, or to save the current PID with each event. These are declared
752 within the stream declaration within the meta-data:
753
754 stream {
755 ...
756 event.context := struct {
757 uint pid;
758 uint16_t payload_size;
759 };
760 };
761
762 An example of event-specific event context is to declare a bitmap of missing
763 fields, only appended after the stream event context if the extended event
764 header is selected. NR_FIELDS is the number of fields within the event (a
765 numeric value).
766
767 event {
768 context = struct {
769 variant <id> {
770 struct { } compact;
771 struct {
772 uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */
773 } extended;
774 } v;
775 };
776 ...
777 }
778
779 6.3 Event Payload
780
781 An event payload contains fields specific to a given event type. The fields
782 belonging to an event type are described in the event-specific meta-data
783 within a structure type.
784
785 6.3.1 Padding
786
787 No padding at the end of the event payload. This differs from the ISO/C standard
788 for structures, but follows the CTF standard for structures. In a trace, even
789 though it makes sense to align the beginning of a structure, it really makes no
790 sense to add padding at the end of the structure, because structures are usually
791 not followed by a structure of the same type.
792
793 This trick can be done by adding a zero-length "end" field at the end of the C
794 structures, and by using the offset of this field rather than using sizeof()
795 when calculating the size of a structure (see Appendix "A. Helper macros").
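
A minimal sketch of this trick follows. It uses a C99 flexible array member
as the "end" marker instead of a GNU zero-length array, which is a
substitution made here for portability; the structure and function names are
hypothetical and this is not the macro set of Appendix A.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct example_payload {    /* hypothetical event payload */
    uint64_t a;
    uint8_t b;
    char end[];             /* C99 flexible array member used as end marker */
};

/* Payload size without trailing padding: the offset of the "end" marker. */
size_t example_payload_size(void)
{
    assert(offsetof(struct example_payload, end) <=
           sizeof(struct example_payload));
    return offsetof(struct example_payload, end);
}
```

On common ABIs this yields 9 bytes, whereas sizeof() also counts the padding
the compiler adds after "b" to keep arrays of the structure aligned.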
796
797 6.3.2 Alignment
798
799 The event payload is aligned on the largest alignment required by types
800 contained within the payload. (This follows the ISO/C standard for structures)
801
802
803 7. Trace Stream Description Language (TSDL)
804
805 The Trace Stream Description Language (TSDL) allows expression of the
806 binary trace stream layouts in a C99-like Domain Specific Language
807 (DSL).
808
809
810 7.1 Meta-data
811
812 The trace stream layout description is located in the trace meta-data.
813 The meta-data is itself located in a stream identified by its name:
814 "metadata".
815
816 The meta-data description can be expressed in two different formats:
817 text-only and packet-based. The text-only description facilitates
818 generation of meta-data and provides a convenient way to enter the
819 meta-data information by hand. The packet-based meta-data provides the
820 CTF stream packet facilities (checksumming, compression, encryption,
821 network-readiness) for meta-data stream generated and transported by a
822 tracer.
823
824 The text-only meta-data file is a plain text TSDL description.
825
826 The packet-based meta-data is made of "meta-data packets", which each
827 start with a meta-data packet header. The packet-based meta-data
828 description is detected by reading the magic number "0x75D11D57" at the
829 beginning of the file. This magic number is also used to detect the
830 endianness of the architecture by trying to read the CTF magic number
831 and its counterpart in reversed endianness. The events within the
832 meta-data stream have no event header nor event context. Each event only
833 contains a "string" payload. Each meta-data packet starts with a special
834 packet header, specific to the meta-data stream, which contains,
835 exactly:
836
837 struct metadata_packet_header {
838 uint32_t magic; /* 0x75D11D57 */
839 uint8_t trace_uuid[16]; /* Universally Unique Identifier */
840 uint32_t checksum; /* 0 if unused */
841 uint32_t content_size; /* in bits */
842 uint32_t packet_size; /* in bits */
843 uint8_t compression_scheme; /* 0 if unused */
844 uint8_t encryption_scheme; /* 0 if unused */
845 uint8_t checksum_scheme; /* 0 if unused */
846 };
847
848 The packet-based meta-data can be converted to a text-only meta-data by
849 concatenating all the strings it contains.
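
Detection of the meta-data format and of its byte order can be sketched as
follows: the first four bytes are read both as a little-endian and as a
big-endian 32-bit integer and compared with the magic number 0x75D11D57
quoted in the prose of this section. The enum and function names are
hypothetical. Anything that does not match is assumed to be text-only
meta-data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TSDL_MAGIC 0x75D11D57u  /* value quoted in the prose of this section */

/* Hypothetical result type. */
enum metadata_kind { METADATA_TEXT, METADATA_PACKET_LE, METADATA_PACKET_BE };

enum metadata_kind ctf_detect_metadata(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 4)
        return METADATA_TEXT;
    uint32_t le = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
                  ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    uint32_t be = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                  ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (le == TSDL_MAGIC)
        return METADATA_PACKET_LE;
    if (be == TSDL_MAGIC)
        return METADATA_PACKET_BE;
    return METADATA_TEXT;
}
```

Once the byte order is known, the remaining fields of the meta-data packet
header can be read with it.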
850
851 In the textual representation of the meta-data, the text contained
852 within "/*" and "*/", as well as within "//" and end of line, are
853 treated as comments. Boolean values can be represented as true, TRUE,
854 or 1 for true, and false, FALSE, or 0 for false. Within the string-based
855 meta-data description, the trace UUID is represented as a string of
856 hexadecimal digits and dashes "-". In the event packet header, the trace
857 UUID is represented as an array of bytes.
858
859
860 7.2 Declaration vs Definition
861
862 A declaration associates a layout to a type, without specifying where
863 this type is located in the event structure hierarchy (see Section 6).
864 This therefore includes typedef, typealias, as well as all type
865 specifiers. In certain circumstances (typedef, structure field and
866 variant field), a declaration is followed by a declarator, which specifies
867 the newly defined type name (for typedef), or the field name (for
868 declarations located within structures and variants). Arrays and sequences,
869 declared with square brackets ("[" "]"), are part of the declarator,
870 similarly to C99. The enumeration base type is specified by
871 ": enum_base", which is part of the type specifier. The variant tag
872 name, specified between "<" ">", is also part of the type specifier.
873
874 A definition associates a type to a location in the event structure
875 hierarchy (see Section 6). This association is denoted by ":=", as shown
876 in Section 7.3.
877
878
879 7.3 TSDL Scopes
880
881 TSDL uses two different types of scoping: a lexical scope is used for
882 declarations and type definitions, and a dynamic scope is used for
883 variant references to tag fields.
884
885 7.3.1 Lexical Scope
886
887 Each of "trace", "stream", "event", "struct" and "variant" has its own
888 nestable declaration scope, within which types can be declared using "typedef"
889 and "typealias". A root declaration scope also contains all declarations
890 located outside of any of the aforementioned declarations. An inner
891 declaration scope can refer to types declared within its container
892 lexical scope prior to the inner declaration scope. Redefinition of a
893 typedef or typealias is not valid, although hiding an upper scope
894 typedef or typealias is allowed within a sub-scope.
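
The lookup rule above (reject a redefinition within one scope, allow
hiding from a sub-scope) can be sketched with a chain of scope tables.
This is only an illustration: struct decl_scope, scope_declare() and
scope_lookup() are hypothetical names, the table size is arbitrary, and
the "declared prior to the inner scope" ordering constraint is not
modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCOPE_MAX_TYPES 8	/* arbitrary limit for this sketch */

struct decl_scope {
	const struct decl_scope *parent;	/* container scope, NULL for root */
	const char *names[SCOPE_MAX_TYPES];	/* typedef/typealias names */
	size_t count;
};

/*
 * Declare a type name in the given scope.
 * Returns 0 on success, -1 if the name is already declared in this very
 * scope (redefinition) or if the table is full.
 */
int scope_declare(struct decl_scope *scope, const char *name)
{
	size_t i;

	assert(scope != NULL && name != NULL);
	for (i = 0; i < scope->count; i++)
		if (strcmp(scope->names[i], name) == 0)
			return -1;	/* redefinition is not valid */
	if (scope->count == SCOPE_MAX_TYPES)
		return -1;
	scope->names[scope->count++] = name;
	return 0;
}

/*
 * Find the innermost scope declaring name, walking from the given scope
 * up to the root. Returns NULL if the name is not declared anywhere.
 */
const struct decl_scope *scope_lookup(const struct decl_scope *scope,
				      const char *name)
{
	for (; scope != NULL; scope = scope->parent) {
		size_t i;

		for (i = 0; i < scope->count; i++)
			if (strcmp(scope->names[i], name) == 0)
				return scope;
	}
	return NULL;
}
```

Because the walk starts at the innermost scope, a name declared again
in a sub-scope hides the declaration of the upper scope without
modifying it.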
895
896 7.3.2 Dynamic Scope
897
898 A dynamic scope consists of the lexical scope augmented with the
899 implicit event structure definition hierarchy presented in Section 6.
900 The dynamic scope is only used for variant tag definitions. It is used
901 at definition time to look up the location of the tag field associated
902 with a variant.
903
904 Therefore, variants in lower levels in the dynamic scope (e.g. event
905 context) can refer to a tag field located in upper levels (e.g. in the
906 event header) by specifying, in this case, the associated tag with
907 <header.field_name>. This allows, for instance, the event context to
908 define a variant referring to the "id" field of the event header as
909 selector.
910
911 The target dynamic scope must be specified explicitly when referring to
912 a field outside of the local static scope. The dynamic scope prefixes
913 are thus:
914
915 - Trace Packet Header: <trace.packet.header. >,
916 - Stream Packet Context: <stream.packet.context. >,
917 - Event Header: <stream.event.header. >,
918 - Stream Event Context: <stream.event.context. >,
919 - Event Context: <event.context. >,
920 - Event Payload: <event.fields. >.
921
922 Multiple declarations of the same field name within a single scope are
923 not valid. It is however valid to re-use the same field name in
924 different scopes. There is no possible conflict, because the dynamic
925 scope must be specified when a variant refers to a tag field located in
926 a different dynamic scope.
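
As a rough sketch of how a reader might split a variant tag reference
into its dynamic scope and field name, the C fragment below matches the
prefixes listed above. The enum values and the resolve_tag() function
are hypothetical, and a tag without any known prefix is simply assumed
to refer to the local scope.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum dynamic_scope {
	SCOPE_TRACE_PACKET_HEADER,
	SCOPE_STREAM_PACKET_CONTEXT,
	SCOPE_STREAM_EVENT_HEADER,
	SCOPE_STREAM_EVENT_CONTEXT,
	SCOPE_EVENT_CONTEXT,
	SCOPE_EVENT_FIELDS,
	SCOPE_LOCAL		/* no explicit dynamic scope prefix */
};

static const struct {
	const char *prefix;
	enum dynamic_scope scope;
} scope_prefixes[] = {
	{ "trace.packet.header.",	SCOPE_TRACE_PACKET_HEADER },
	{ "stream.packet.context.",	SCOPE_STREAM_PACKET_CONTEXT },
	{ "stream.event.header.",	SCOPE_STREAM_EVENT_HEADER },
	{ "stream.event.context.",	SCOPE_STREAM_EVENT_CONTEXT },
	{ "event.context.",		SCOPE_EVENT_CONTEXT },
	{ "event.fields.",		SCOPE_EVENT_FIELDS },
};

/*
 * Split a tag reference such as "stream.event.header.id" into its
 * dynamic scope and the remaining field path, returned through *field.
 */
enum dynamic_scope resolve_tag(const char *tag, const char **field)
{
	size_t i;

	assert(tag != NULL && field != NULL);
	for (i = 0; i < sizeof(scope_prefixes) / sizeof(scope_prefixes[0]); i++) {
		size_t n = strlen(scope_prefixes[i].prefix);

		if (strncmp(tag, scope_prefixes[i].prefix, n) == 0) {
			*field = tag + n;
			return scope_prefixes[i].scope;
		}
	}
	*field = tag;	/* looked up in the local scope */
	return SCOPE_LOCAL;
}
```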
927
928 The information available in the dynamic scopes can be thought of as the
929 current tracing context. At trace production, information about the
930 current context is saved into the specified scope field levels. At trace
931 consumption, for each event, the current trace context is therefore
932 readable by accessing the upper dynamic scopes.
933
934
935 7.4 TSDL Examples
936
937 The grammar representing the TSDL meta-data is presented in Appendix C,
938 TSDL Grammar. This section presents a rather lighter reading that
939 consists of examples of TSDL meta-data, with template values.
940
941 The stream "id" can be left out if there is only one stream in the
942 trace. The event "id" field can be left out if there is only one event
943 in a stream.
944
945 trace {
946 major = value; /* Trace format version */
947 minor = value;
948 uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"; /* Trace UUID */
949 byte_order = be OR le; /* Endianness (required) */
950 packet.header := struct {
951 uint32_t magic;
952 uint8_t trace_uuid[16];
953 uint32_t stream_id;
954 };
955 };
956
957 stream {
958 id = stream_id;
959 /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
960 event.header := event_header_1 OR event_header_2;
961 event.context := struct {
962 ...
963 };
964 packet.context := struct {
965 ...
966 };
967 };
968
969 event {
970 name = event_name;
971 id = value; /* Numeric identifier within the stream */
972 stream = stream_id;
973 context := struct {
974 ...
975 };
976 fields := struct {
977 ...
978 };
979 };
980
981 /* More detail on types in section 4. Types */
982
983 /*
984 * Named types:
985 *
986 * Type declarations behave similarly to the C standard.
987 */
988
989 typedef aliased_type_specifiers new_type_declarators;
990
991 /* e.g.: typedef struct example new_type_name[10]; */
992
993 /*
994 * typealias
995 *
996 * The "typealias" declaration can be used to give a name (including
997 * pointer declarator specifier) to a type. It should also be used to
998 * map basic C types (float, int, unsigned long, ...) to a CTF type.
999 * Typealias is a superset of "typedef": it also allows assignment of a
1000 * simple variable identifier to a type.
1001 */
1002
1003 typealias type_class {
1004 ...
1005 } := type_specifiers type_declarator;
1006
1007 /*
1008 * e.g.:
1009 * typealias integer {
1010 * size = 32;
1011 * align = 32;
1012 * signed = false;
1013 * } := struct page *;
1014 *
1015 * typealias integer {
1016 * size = 32;
1017 * align = 32;
1018 * signed = true;
1019 * } := int;
1020 */
1021
1022 struct name {
1023 ...
1024 };
1025
1026 variant name {
1027 ...
1028 };
1029
1030 enum name : integer_type {
1031 ...
1032 };
1033
1034
1035 /*
1036 * Unnamed types, contained within compound type fields, typedef or typealias.
1037 */
1038
1039 struct {
1040 ...
1041 }
1042
1043 variant {
1044 ...
1045 }
1046
1047 enum : integer_type {
1048 ...
1049 }
1050
1051 typedef type new_type[length];
1052
1053 struct {
1054 type field_name[length];
1055 }
1056
1057 typedef type new_type[length_type];
1058
1059 struct {
1060 type field_name[length_type];
1061 }
1062
1063 integer {
1064 ...
1065 }
1066
1067 floating_point {
1068 ...
1069 }
1070
1071 struct {
1072 integer_type field_name:size; /* GNU/C bitfield */
1073 }
1074
1075 struct {
1076 string field_name;
1077 }
1078
1079
1080 A. Helper macros
1081
1082 The two following macros keep track of the size of a GNU/C structure without
1083 padding at the end by placing HEADER_END as the last field. A one byte end field
1084 is used for C90 compatibility (C99 flexible arrays could be used here). Note
1085 that this does not affect the effective structure size, which should always be
1086 calculated with the header_sizeof() helper.
1087
1088 #define HEADER_END char end_field
1089 #define header_sizeof(type) offsetof(typeof(type), end_field)
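
A short usage sketch of these macros follows. The structure
example_header and its fields are hypothetical. The sketch spells the
GNU extension as __typeof__ so that it also compiles when the compiler
runs in strict ISO C mode; this is a choice made for the example, not a
change to the macro above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_END		char end_field
#define header_sizeof(type)	offsetof(__typeof__(type), end_field)

/* Hypothetical header: 5 bytes of fields, then the end marker. */
struct example_header {
	uint32_t magic;
	uint8_t flags;
	HEADER_END;
};

/* Size of the header fields, excluding the marker and tail padding. */
size_t example_header_size(void)
{
	/* sizeof() also counts the marker byte and any tail padding. */
	assert(sizeof(struct example_header) > header_sizeof(struct example_header));
	return header_sizeof(struct example_header);
}
```

Here header_sizeof() yields 5, whereas sizeof() is larger because it
includes the end_field byte and any padding the compiler adds after it.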
1090
1091
1092 B. Stream Header Rationale
1093
1094 An event stream is divided into contiguous event packets of variable size. These
1095 subdivisions allow the trace analyzer to perform a fast binary search by time
1096 within the stream (typically requiring only the event packet headers to be indexed)
1097 without reading the whole stream. These subdivisions have a variable size to
1098 eliminate the need to transfer the event packet padding when partially filled
1099 event packets must be sent when streaming a trace for live viewing/analysis.
1100 An event packet can contain a certain amount of padding at the end. Dividing
1101 streams into event packets is also useful for network streaming over UDP and
1102 flight recorder mode tracing (a whole event packet can be swapped out of the
1103 buffer atomically for reading).
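
The binary search by time mentioned above can be sketched as follows,
assuming the analyzer has already built an in-memory index with one
entry per event packet, sorted by begin timestamp. The structure
packet_index_entry, its fields and find_packet() are hypothetical; this
proposal does not define such an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry, one per event packet, sorted by time. */
struct packet_index_entry {
	uint64_t timestamp_begin;	/* time of the first event in the packet */
	uint64_t offset;		/* byte offset of the packet in the stream */
};

/*
 * Return the position of the last packet whose begin timestamp is lower
 * than or equal to ts, or -1 if ts precedes every packet.
 */
long find_packet(const struct packet_index_entry *idx, size_t n, uint64_t ts)
{
	size_t lo = 0, hi = n;	/* search window is [lo, hi) */

	assert(idx != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (idx[mid].timestamp_begin <= ts)
			lo = mid + 1;	/* candidate: look further right */
		else
			hi = mid;
	}
	/* lo now counts the packets that begin at or before ts. */
	return (long)lo - 1;
}
```

Only the index is consulted during the search, so the event payload of
the packets that are skipped never needs to be read.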
1104
1105 The stream header is repeated at the beginning of each event packet to allow
1106 flexibility in terms of:
1107
1108 - streaming support,
1109 - allowing arbitrary buffers to be discarded without making the trace
1110 unreadable,
1111 - allowing UDP packet loss handling by either dealing with missing event packets
1112 or asking for re-transmission,
1113 - transparently supporting flight recorder mode,
1114 - transparently supporting crash dump.
1115
1116
1117 C. TSDL Grammar
1118
1119 /*
1120 * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
1121 *
1122 * Inspired by the C99 grammar:
1123 * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
1124 * and c++1x grammar (draft)
1125 * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
1126 *
1127 * Specialized for CTF needs by including only constants and declarations from
1128 * C99 (excluding function declarations), and by adding support for variants,
1129 * sequences and CTF-specific specifiers. Enumeration container type
1130 * semantics are inspired by the c++1x enum-base.
1131 */
1132
1133 1) Lexical grammar
1134
1135 1.1) Lexical elements
1136
1137 token:
1138 keyword
1139 identifier
1140 constant
1141 string-literal
1142 punctuator
1143
1144 1.2) Keywords
1145
1146 keyword: is one of
1147
1148 const
1149 char
1150 double
1151 enum
1152 event
1153 floating_point
1154 float
1155 integer
1156 int
1157 long
1158 short
1159 signed
1160 stream
1161 string
1162 struct
1163 trace
1164 typealias
1165 typedef
1166 unsigned
1167 variant
1168 void
1169 _Bool
1170 _Complex
1171 _Imaginary
1172
1173
1174 1.3) Identifiers
1175
1176 identifier:
1177 identifier-nondigit
1178 identifier identifier-nondigit
1179 identifier digit
1180
1181 identifier-nondigit:
1182 nondigit
1183 universal-character-name
1184 any other implementation-defined characters
1185
1186 nondigit:
1187 _
1188 [a-zA-Z] /* regular expression */
1189
1190 digit:
1191 [0-9] /* regular expression */
1192
1193 1.4) Universal character names
1194
1195 universal-character-name:
1196 \u hex-quad
1197 \U hex-quad hex-quad
1198
1199 hex-quad:
1200 hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
1201
1202 1.5) Constants
1203
1204 constant:
1205 integer-constant
1206 enumeration-constant
1207 character-constant
1208
1209 integer-constant:
1210 decimal-constant integer-suffix-opt
1211 octal-constant integer-suffix-opt
1212 hexadecimal-constant integer-suffix-opt
1213
1214 decimal-constant:
1215 nonzero-digit
1216 decimal-constant digit
1217
1218 octal-constant:
1219 0
1220 octal-constant octal-digit
1221
1222 hexadecimal-constant:
1223 hexadecimal-prefix hexadecimal-digit
1224 hexadecimal-constant hexadecimal-digit
1225
1226 hexadecimal-prefix:
1227 0x
1228 0X
1229
1230 nonzero-digit:
1231 [1-9]
1232
1233 integer-suffix:
1234 unsigned-suffix long-suffix-opt
1235 unsigned-suffix long-long-suffix
1236 long-suffix unsigned-suffix-opt
1237 long-long-suffix unsigned-suffix-opt
1238
1239 unsigned-suffix:
1240 u
1241 U
1242
1243 long-suffix:
1244 l
1245 L
1246
1247 long-long-suffix:
1248 ll
1249 LL
1250
1251 digit-sequence:
1252 digit
1253 digit-sequence digit
1254
1255 hexadecimal-digit-sequence:
1256 hexadecimal-digit
1257 hexadecimal-digit-sequence hexadecimal-digit
1258
1259 enumeration-constant:
1260 identifier
1261 string-literal
1262
1263 character-constant:
1264 ' c-char-sequence '
1265 L' c-char-sequence '
1266
1267 c-char-sequence:
1268 c-char
1269 c-char-sequence c-char
1270
1271 c-char:
1272 any member of source charset except single-quote ('), backslash
1273 (\), or new-line character.
1274 escape-sequence
1275
1276 escape-sequence:
1277 simple-escape-sequence
1278 octal-escape-sequence
1279 hexadecimal-escape-sequence
1280 universal-character-name
1281
1282 simple-escape-sequence: one of
1283 \' \" \? \\ \a \b \f \n \r \t \v
1284
1285 octal-escape-sequence:
1286 \ octal-digit
1287 \ octal-digit octal-digit
1288 \ octal-digit octal-digit octal-digit
1289
1290 hexadecimal-escape-sequence:
1291 \x hexadecimal-digit
1292 hexadecimal-escape-sequence hexadecimal-digit
1293
1294 1.6) String literals
1295
1296 string-literal:
1297 " s-char-sequence-opt "
1298 L" s-char-sequence-opt "
1299
1300 s-char-sequence:
1301 s-char
1302 s-char-sequence s-char
1303
1304 s-char:
1305 any member of source charset except double-quote ("), backslash
1306 (\), or new-line character.
1307 escape-sequence
1308
1309 1.7) Punctuators
1310
1311 punctuator: one of
1312 [ ] ( ) { } . -> * + - < > : ; ... = ,
1313
1314
1315 2) Phrase structure grammar
1316
1317 primary-expression:
1318 identifier
1319 constant
1320 string-literal
1321 ( unary-expression )
1322
1323 postfix-expression:
1324 primary-expression
1325 postfix-expression [ unary-expression ]
1326 postfix-expression . identifier
1327 postfix-expression -> identifier
1328
1329 unary-expression:
1330 postfix-expression
1331 unary-operator postfix-expression
1332
1333 unary-operator: one of
1334 + -
1335
1336 assignment-operator:
1337 =
1338
1339 type-assignment-operator:
1340 :=
1341
1342 constant-expression:
1343 unary-expression
1344
1345 constant-expression-range:
1346 constant-expression ... constant-expression
1347
1348 2.2) Declarations:
1349
1350 declaration:
1351 declaration-specifiers declarator-list-opt ;
1352 ctf-specifier ;
1353
1354 declaration-specifiers:
1355 storage-class-specifier declaration-specifiers-opt
1356 type-specifier declaration-specifiers-opt
1357 type-qualifier declaration-specifiers-opt
1358
1359 declarator-list:
1360 declarator
1361 declarator-list , declarator
1362
1363 abstract-declarator-list:
1364 abstract-declarator
1365 abstract-declarator-list , abstract-declarator
1366
1367 storage-class-specifier:
1368 typedef
1369
1370 type-specifier:
1371 void
1372 char
1373 short
1374 int
1375 long
1376 float
1377 double
1378 signed
1379 unsigned
1380 _Bool
1381 _Complex
1382 _Imaginary
1383 struct-specifier
1384 variant-specifier
1385 enum-specifier
1386 typedef-name
1387 ctf-type-specifier
1388
1389 struct-specifier:
1390 struct identifier-opt { struct-or-variant-declaration-list-opt }
1391 struct identifier
1392
1393 struct-or-variant-declaration-list:
1394 struct-or-variant-declaration
1395 struct-or-variant-declaration-list struct-or-variant-declaration
1396
1397 struct-or-variant-declaration:
1398 specifier-qualifier-list struct-or-variant-declarator-list ;
1399 declaration-specifiers storage-class-specifier declaration-specifiers declarator-list ;
1400 typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list ;
1401 typealias declaration-specifiers abstract-declarator-list := declarator-list ;
1402
1403 specifier-qualifier-list:
1404 type-specifier specifier-qualifier-list-opt
1405 type-qualifier specifier-qualifier-list-opt
1406
1407 struct-or-variant-declarator-list:
1408 struct-or-variant-declarator
1409 struct-or-variant-declarator-list , struct-or-variant-declarator
1410
1411 struct-or-variant-declarator:
1412 declarator
1413 declarator-opt : constant-expression
1414
1415 variant-specifier:
1416 variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
1417 variant identifier variant-tag
1418
1419 variant-tag:
1420 < identifier >
1421
1422 enum-specifier:
1423 enum identifier-opt { enumerator-list }
1424 enum identifier-opt { enumerator-list , }
1425 enum identifier
1426 enum identifier-opt : declaration-specifiers { enumerator-list }
1427 enum identifier-opt : declaration-specifiers { enumerator-list , }
1428
1429 enumerator-list:
1430 enumerator
1431 enumerator-list , enumerator
1432
1433 enumerator:
1434 enumeration-constant
1435 enumeration-constant = constant-expression
1436 enumeration-constant = constant-expression-range
1437
1438 type-qualifier:
1439 const
1440
1441 declarator:
1442 pointer-opt direct-declarator
1443
1444 direct-declarator:
1445 identifier
1446 ( declarator )
1447 direct-declarator [ type-specifier ]
1448 direct-declarator [ constant-expression ]
1449
1450 abstract-declarator:
1451 pointer-opt direct-abstract-declarator
1452
1453 direct-abstract-declarator:
1454 identifier-opt
1455 ( abstract-declarator )
1456 direct-abstract-declarator [ type-specifier ]
1457 direct-abstract-declarator [ constant-expression ]
1458 direct-abstract-declarator [ ]
1459
1460 pointer:
1461 * type-qualifier-list-opt
1462 * type-qualifier-list-opt pointer
1463
1464 type-qualifier-list:
1465 type-qualifier
1466 type-qualifier-list type-qualifier
1467
1468 typedef-name:
1469 identifier
1470
1471 2.3) CTF-specific declarations
1472
1473 ctf-specifier:
1474 event { ctf-assignment-expression-list-opt }
1475 stream { ctf-assignment-expression-list-opt }
1476 trace { ctf-assignment-expression-list-opt }
1477 typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list ;
1478 typealias declaration-specifiers abstract-declarator-list := declarator-list ;
1479
1480 ctf-type-specifier:
1481 floating_point { ctf-assignment-expression-list-opt }
1482 integer { ctf-assignment-expression-list-opt }
1483 string { ctf-assignment-expression-list-opt }
1484
1485 ctf-assignment-expression-list:
1486 ctf-assignment-expression
1487 ctf-assignment-expression-list ; ctf-assignment-expression
1488
1489 ctf-assignment-expression:
1490 unary-expression assignment-operator unary-expression
1491 unary-expression type-assignment-operator type-specifier
1492 declaration-specifiers storage-class-specifier declaration-specifiers declarator-list
1493 typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list
1494 typealias declaration-specifiers abstract-declarator-list := declarator-list