RFC: Common Trace Format (CTF) Proposal (pre-v1.7)

Mathieu Desnoyers, EfficiOS Inc.

The goal of the present document is to propose a trace format that suits the
needs of the embedded, telecom, high-performance and kernel communities. It is
based on the Common Trace Format Requirements (v1.4) document. It is designed to
allow traces to be natively generated by the Linux kernel, Linux user-space
applications written in C/C++, and hardware components. One major element of
CTF is the Trace Stream Description Language (TSDL), whose flexibility
enables description of various binary trace stream layouts.

The latest version of this document can be found at:

    git tree:   git://git.efficios.com/ctf.git
    gitweb:     http://git.efficios.com/?p=ctf.git

A reference implementation of a library to read and write this trace format is
being implemented within the BabelTrace project, a converter between trace
formats. The development tree is available at:

    git tree:   git://git.efficios.com/babeltrace.git
    gitweb:     http://git.efficios.com/?p=babeltrace.git


1. Preliminary definitions

  - Event Trace: An ordered sequence of events.
  - Event Stream: An ordered sequence of events, containing a subset of the
                  trace event types.
  - Event Packet: A sequence of physically contiguous events within an event
                  stream.
  - Event: This is the basic entry in a trace. (aka: a trace record).
    - An event identifier (ID) relates to the class (a type) of event within
      an event stream.
      e.g. event: irq_entry.
    - An event (or event record) relates to a specific instance of an event
      class.
      e.g. event: irq_entry, at time X, on CPU Y
  - Source Architecture: Architecture writing the trace.
  - Reader Architecture: Architecture reading the trace.


2. High-level representation of a trace

A trace is divided into multiple event streams. Each event stream contains a
subset of the trace event types.

The final output of the trace, after its generation and optional transport over
the network, is expected to be either on permanent or temporary storage in a
virtual file system. Because each event stream is appended to while a trace is
being recorded, each is associated with a separate file for output. Therefore,
a stored trace can be represented as a directory containing one file per stream.

Meta-data description associated with the trace contains information on
trace event types expressed in the Trace Stream Description Language
(TSDL). This language describes:

- Trace version.
- Types available.
- Per-trace event header description.
- Per-stream event header description.
- Per-stream event context description.
- Per-event
  - Event type to stream mapping.
  - Event type to name mapping.
  - Event type to ID mapping.
  - Event context description.
  - Event fields description.


3. Event stream

An event stream can be divided into contiguous event packets of variable
size. An event packet can contain a certain amount of padding at the end.
The stream header is repeated at the beginning of each event packet. The
rationale for the event stream design choices is explained in Appendix B.
Stream Header Rationale.

The event stream header will therefore be referred to as the "event packet
header" throughout the rest of this document.


4. Types

Types are organized as type classes. Each type class belongs to one of two
kinds of types: basic types or compound types.

4.1 Basic types

A basic type is a scalar type, as described in this section. It includes
integers, GNU/C bitfields, enumerations, and floating point values.

4.1.1 Type inheritance

Type specifications can be inherited to allow deriving types from a
type class. For example, see the uint32_t named type derived from the "integer"
type class below ("Integers" section). Types have a precise binary
representation in the trace. A type class has methods to read and write these
types, but must be derived into a type to be usable in an event field.

4.1.2 Alignment

We define "byte-packed" types as aligned on the byte size, namely 8-bit.
We define "bit-packed" types as following on the next bit, as defined by the
"Integers" section.

Each basic type must specify its alignment, in bits. Examples of
possible alignments are: bit-packed, byte-packed, or word-aligned. The
choice depends on the architecture preference and compactness vs
performance trade-offs of the implementation. Architectures providing
fast unaligned writes can byte-pack basic types to save space, aligning
each type on byte boundaries (8-bit). Architectures with slow unaligned
writes align types on specific alignment values. If no specific
alignment is declared for a type, it is assumed to be bit-packed for
integers whose size is not a multiple of 8 bits and for gcc bitfields. All
other types are byte-packed. It is however recommended to always specify
the alignment explicitly.

TSDL meta-data attribute representation of a specific alignment:

  align = value;    /* value in bits */

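As an illustration of how a reader or writer could apply the "align"
attribute, the following minimal Python sketch rounds a bit offset up to the
next multiple of the alignment. The function name align_up is hypothetical and
is not part of TSDL or of any reference implementation.

```python
def align_up(offset_bits: int, align_bits: int) -> int:
    """Return the smallest offset >= offset_bits that is a multiple of align_bits."""
    if align_bits <= 0:
        raise ValueError("alignment must be a positive number of bits")
    return (offset_bits + align_bits - 1) // align_bits * align_bits


# A field declared with align = 32 and reached at bit offset 40 starts at bit 64.
print(align_up(40, 32))
# A bit-packed field (align = 1) never introduces padding.
print(align_up(13, 1))
# A byte-packed field (align = 8) reached at bit offset 13 starts at bit 16.
print(align_up(13, 8))
```
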
4.1.3 Byte order

By default, the native endianness of the source architecture is used.
Byte order can be overridden for a basic type by specifying a "byte_order"
attribute. A typical use case is to specify the network byte order (big endian:
"be") to save data captured from the network into the trace without conversion.
If not specified, the byte order is native.

TSDL meta-data representation:

  byte_order = native OR network OR be OR le;   /* network and be are aliases */

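The effect of the "byte_order" attribute on a 32-bit unsigned integer can be
sketched with Python's standard "struct" module. The mapping table and the
encode_u32 helper are illustrative assumptions; only the four attribute values
come from this document.

```python
import struct
import sys

# Map TSDL byte_order values to struct prefixes ("=" is the host's native order).
BYTE_ORDER_PREFIX = {"native": "=", "network": ">", "be": ">", "le": "<"}


def encode_u32(value: int, byte_order: str = "native") -> bytes:
    """Encode a 32-bit unsigned integer with the requested byte order."""
    return struct.pack(BYTE_ORDER_PREFIX[byte_order] + "I", value)


print(encode_u32(0x12345678, "be").hex())       # most significant byte first
print(encode_u32(0x12345678, "le").hex())       # least significant byte first
print(encode_u32(0x12345678).hex(), sys.byteorder)  # depends on the host
```
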
4.1.4 Size

Type size, in bits, for integers and floats is that returned by "sizeof()" in C
multiplied by CHAR_BIT.
We require the size of "char" and "unsigned char" types (CHAR_BIT) to be fixed
to 8 bits for cross-endianness compatibility.

TSDL meta-data representation:

  size = value;    (value is in bits)

4.1.5 Integers

Signed integers are represented in two's complement. Integer alignment,
size, signedness and byte ordering are defined in the TSDL meta-data.
Integers aligned on byte size (8-bit) and with length multiple of byte
size (8-bit) correspond to the C99 standard integers. In addition,
integers with alignment and/or size that are _not_ a multiple of the
byte size are permitted; these correspond to the C99 standard bitfields,
with the added specification that the CTF integer bitfields have a fixed
binary representation. An MIT-licensed reference implementation of the
CTF portable bitfields is available at:

  http://git.efficios.com/?p=babeltrace.git;a=blob;f=include/babeltrace/bitfield.h

Binary representation of integers:

- On little and big endian:
  - Within a byte, high bits correspond to an integer's high bits, and low
    bits correspond to low bits.
- On little endian:
  - Integers spanning multiple bytes are placed from the least significant
    to the most significant.
  - Consecutive integers are placed from lower bits to higher bits (even within
    a byte).
- On big endian:
  - Integers spanning multiple bytes are placed from the most significant to
    the least significant.
  - Consecutive integers are placed from higher bits to lower bits (even within
    a byte).

This binary representation is derived from the bitfield implementation in GCC
for little and big endian. However, contrary to what GCC does, integers can
cross unit boundaries (no padding is required). Padding can be explicitly
added (see 4.1.6 GNU/C bitfields) to follow the GCC layout if needed.

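The placement rules listed above can be illustrated with a short Python
sketch that packs consecutive bit-packed (align = 1) integers. It is a reading
of the rules stated in this section, not a port of the reference bitfield.h;
the function name pack_bitfields and its argument layout are hypothetical.

```python
def pack_bitfields(fields, byte_order):
    """Pack consecutive bit-packed integers.

    fields: list of (value, size_in_bits) tuples, in declaration order.
    byte_order: "le" or "be".
    """
    total_bits = sum(size for _, size in fields)
    nbytes = (total_bits + 7) // 8
    acc = 0
    if byte_order == "le":
        # First field lands in the lowest bits; bytes go least significant first.
        shift = 0
        for value, size in fields:
            acc |= (value & ((1 << size) - 1)) << shift
            shift += size
        return acc.to_bytes(nbytes, "little")
    if byte_order == "be":
        # First field lands in the highest bits; bytes go most significant first.
        for value, size in fields:
            acc = (acc << size) | (value & ((1 << size) - 1))
        acc <<= nbytes * 8 - total_bits  # unused low bits of the last byte
        return acc.to_bytes(nbytes, "big")
    raise ValueError("byte_order must be 'le' or 'be'")


# A 3-bit field followed by a 5-bit field share one byte.
print(pack_bitfields([(0b101, 3), (0b11111, 5)], "le").hex())
print(pack_bitfields([(0b101, 3), (0b11111, 5)], "be").hex())
# A 12-bit field followed by a 4-bit field: the first one crosses a byte boundary.
print(pack_bitfields([(0xABC, 12), (0x5, 4)], "le").hex())
print(pack_bitfields([(0xABC, 12), (0x5, 4)], "be").hex())
```

Masking each value to its declared size also covers signed fields, since the
masked bits of a negative Python integer are its two's complement
representation.
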
TSDL meta-data representation:

  integer {
    signed = true OR false;                       /* default false */
    byte_order = native OR network OR be OR le;   /* default native */
    size = value;                                 /* value in bits, no default */
    align = value;                                /* value in bits */
  }

Example of type inheritance (creation of a uint32_t named type):

typealias integer {
  size = 32;
  signed = false;
  align = 32;
} := uint32_t;

Definition of a named 5-bit signed bitfield:

typealias integer {
  size = 5;
  signed = true;
  align = 1;
} := int5_t;

4.1.6 GNU/C bitfields

The GNU/C bitfields follow closely the integer representation, with a
particularity on alignment: if a bitfield cannot fit in the current unit, the
unit is padded and the bitfield starts at the following unit. The unit size is
defined by the size of the type "unit_type".

TSDL meta-data representation:

  unit_type name:size;

As an example, the following structure declared in C compiled by GCC:

struct example {
  short a:12;
  short b:5;
};

The example structure is aligned on the largest element (short). The second
bitfield would be aligned on the next unit boundary, because it would not fit in
the current unit.

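The unit-padding rule can be sketched as follows. This Python helper only
models the rule as stated in this section (pad to the next unit when a
bitfield does not fit); it is not a full model of GCC's layout algorithm. The
name gcc_bitfield_offsets is hypothetical, and a 16-bit "short" is assumed.

```python
def gcc_bitfield_offsets(sizes, unit_bits):
    """Return the starting bit offset of each bitfield.

    sizes: bitfield widths in bits, in declaration order.
    unit_bits: size of the unit type in bits (e.g. 16 for a 16-bit short).
    """
    offsets = []
    pos = 0
    for size in sizes:
        if size > unit_bits:
            raise ValueError("a bitfield cannot be wider than its unit type")
        room = unit_bits - pos % unit_bits  # bits left in the current unit
        if size > room:
            pos += room  # pad the current unit, start at the following one
        offsets.append(pos)
        pos += size
    return offsets


# struct example { short a:12; short b:5; } with 16-bit units:
# "b" does not fit in the 4 remaining bits, so it starts at bit 16.
print(gcc_bitfield_offsets([12, 5], 16))
```
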
4.1.7 Floating point

The byte ordering of floating point values is defined in the TSDL meta-data.

Floating point values follow the IEEE 754-2008 standard interchange formats.
The description of a floating point value includes the exponent and mantissa
size in bits. Some requirements are imposed on the floating point values:

- FLT_RADIX must be 2.
- mant_dig is the number of digits represented in the mantissa. It is specified
  by the ISO C99 standard, section 5.2.4, as FLT_MANT_DIG, DBL_MANT_DIG and
  LDBL_MANT_DIG as defined by <float.h>.
- exp_dig is the number of digits represented in the exponent. Given that
  mant_dig is one bit more than its actual size in bits (leading 1 is not
  needed) and also given that the sign bit always takes one bit, exp_dig can be
  specified as:

  - sizeof(float) * CHAR_BIT - FLT_MANT_DIG
  - sizeof(double) * CHAR_BIT - DBL_MANT_DIG
  - sizeof(long double) * CHAR_BIT - LDBL_MANT_DIG

TSDL meta-data representation:

floating_point {
  exp_dig = value;
  mant_dig = value;
  byte_order = native OR network OR be OR le;
}

Example of type inheritance:

typealias floating_point {
  exp_dig = 8;         /* sizeof(float) * CHAR_BIT - FLT_MANT_DIG */
  mant_dig = 24;       /* FLT_MANT_DIG */
  byte_order = native;
} := float;

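The exp_dig formula above can be checked numerically. The sketch below is in
Python and assumes a host whose "double" is an IEEE 754 binary64 value, where
sys.float_info.mant_dig plays the role of DBL_MANT_DIG. The helper name
exp_dig is only illustrative.

```python
import struct
import sys


def exp_dig(size_bits: int, mant_dig: int) -> int:
    """exp_dig = total size in bits - mant_dig (the sign bit replaces the implicit 1)."""
    return size_bits - mant_dig


# float: 32 bits, FLT_MANT_DIG = 24 on IEEE 754 binary32.
print(exp_dig(32, 24))
# double: size and mantissa digits queried from the running interpreter.
double_bits = struct.calcsize("d") * 8
print(exp_dig(double_bits, sys.float_info.mant_dig))
```
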
TODO: define NaN, +inf, -inf behavior.

4.1.8 Enumerations

Enumerations are a mapping between an integer type and a table of strings. The
numerical representation of the enumeration follows the integer type specified
by the meta-data. The enumeration mapping table is detailed in the enumeration
description within the meta-data. The mapping table maps inclusive value
ranges (or single values) to strings. Instead of being limited to simple
"value -> string" mappings, these enumerations map
"[ start_value ... end_value ] -> string", that is, inclusive ranges of
values to strings. An enumeration from the C language can be represented in
this format by having the same start_value and end_value for each element, which
is in fact a range of size 1. This single-value range is supported without
repeating the start and end values, using the "string = value" declaration.

enum name : integer_type {
  somestring          = start_value1 ... end_value1,
  "other string"      = start_value2 ... end_value2,
  yet_another_string,   /* will be assigned to end_value2 + 1 */
  "some other string" = value,
  ...
};

If the values are omitted, the enumeration starts at 0 and increments by 1 for
each entry:

enum name : unsigned int {
  ZERO,
  ONE,
  TWO,
  TEN = 10,
  ELEVEN,
};

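The range mapping and the implicit numbering can be sketched in Python as
follows. The functions build_enum and lookup are hypothetical helpers, and the
"first matching range wins" behavior is an assumption made for this sketch
only, not a rule of the format.

```python
def build_enum(entries):
    """Build a list of (start, end, label) inclusive ranges.

    entries: list of (label, start, end); start and end are None for an
    entry without a value, and end is None for a single-value entry.
    """
    table = []
    next_value = 0  # enumerations start at 0 when values are omitted
    for label, start, end in entries:
        if start is None:
            start = end = next_value  # previous end value + 1
        elif end is None:
            end = start  # single value: a range of size 1
        table.append((start, end, label))
        next_value = end + 1
    return table


def lookup(table, value):
    """Return the label whose inclusive range contains value, or None."""
    for start, end, label in table:
        if start <= value <= end:
            return label
    return None


numbers = build_enum([("ZERO", None, None), ("ONE", None, None),
                      ("TWO", None, None), ("TEN", 10, None),
                      ("ELEVEN", None, None)])
print(lookup(numbers, 11))

header_id = build_enum([("compact", 0, 30), ("extended", 31, None)])
print(lookup(header_id, 17), lookup(header_id, 31))
```
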
Overlapping ranges within a single enumeration are implementation defined.

A nameless enumeration can be declared as a field type or as part of a typedef:

enum : integer_type {
  ...
}

Enumerations omitting the container type ": integer_type" use the "int"
type (for compatibility with C99). The "int" type must be previously
declared. E.g.:

typealias integer { size = 32; align = 32; signed = true } := int;

enum {
  ...
}


4.2 Compound types

Compound types are aggregations of type declarations. Compound types include
structures, variants, arrays, sequences, and strings.

4.2.1 Structures

Structures are aligned on the largest alignment required by basic types
contained within the structure. (This follows the ISO/C standard for structures)

TSDL meta-data representation of a named structure:

struct name {
  field_type field_name;
  field_type field_name;
  ...
};

Example:

struct example {
  integer {                       /* Nameless type */
    size = 16;
    signed = true;
    align = 16;
  } first_field_name;
  uint64_t second_field_name;     /* Named type declared in the meta-data */
};

The fields are placed in a sequence next to each other. They each possess a
field name, which is a unique identifier within the structure.

A nameless structure can be declared as a field type or as part of a typedef:

struct {
  ...
}

4.2.2 Variants (Discriminated/Tagged Unions)

A CTF variant is a selection between different types. A CTF variant must
always be defined within the scope of a structure or within fields
contained within a structure (defined recursively). A "tag" enumeration
field must appear in the same lexical scope, prior to the variant
field (in field declaration order), in an upper lexical scope (see
Section 7.3.1), or in an upper dynamic scope (see Section 7.3.2).
The type selection is indicated by the mapping from the enumeration
value to the string used as variant type selector. The field to use as
tag is specified by the "tag_field", specified between "< >" after the
"variant" keyword for unnamed variants, and after "variant name" for
named variants.

The alignment of the variant is the alignment of the type as selected by the tag
value for the specific instance of the variant. The alignment of the type
containing the variant is independent of the variant alignment. The size of the
variant is the size as selected by the tag value for the specific instance of
the variant.

A named variant declaration followed by its definition within a structure
declaration:

variant name {
  field_type sel1;
  field_type sel2;
  field_type sel3;
  ...
};

struct {
  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
  ...
  variant name <tag_field> v;
}

An unnamed variant definition within a structure is expressed by the following
TSDL meta-data:

struct {
  enum : integer_type { sel1, sel2, sel3, ... } tag_field;
  ...
  variant <tag_field> {
    field_type sel1;
    field_type sel2;
    field_type sel3;
    ...
  } v;
}

Example of a named variant within a sequence that refers to a single tag field:

variant example {
  uint32_t a;
  uint64_t b;
  short c;
};

struct {
  enum : uint2_t { a, b, c } choice;
  variant example <choice> v[unsigned int];
}

Example of an unnamed variant:

struct {
  enum : uint2_t { a, b, c, d } choice;
  /* Unrelated fields can be added between the variant and its tag */
  int32_t somevalue;
  variant <choice> {
    uint32_t a;
    uint64_t b;
    short c;
    struct {
      unsigned int field1;
      uint64_t field2;
    } d;
  } s;
}

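To make the tag-driven selection concrete, here is a minimal Python sketch of
a reader decoding a tag followed by a variant with the "a", "b" and "c"
selectors used above. It makes several simplifying assumptions that are not
part of the examples: the tag is stored in a full byte rather than a uint2_t,
all fields are byte-packed and little-endian, and "short" is 16 bits. The
names CHOICE_LABELS, VARIANT_FORMATS and decode_tagged are hypothetical.

```python
import struct

# Enumeration mapping of the tag: value -> selector string.
CHOICE_LABELS = {0: "a", 1: "b", 2: "c"}
# Variant mapping: selector string -> struct format of the selected type.
VARIANT_FORMATS = {"a": "<I", "b": "<Q", "c": "<h"}


def decode_tagged(buf: bytes):
    """Decode one (tag, variant) pair; return (selector, value, bytes consumed)."""
    tag = buf[0]
    label = CHOICE_LABELS[tag]       # enumeration value -> string
    fmt = VARIANT_FORMATS[label]     # string -> selected field type
    (value,) = struct.unpack_from(fmt, buf, 1)
    return label, value, 1 + struct.calcsize(fmt)


# The size of the variant depends on the tag value of each instance.
print(decode_tagged(bytes([1]) + struct.pack("<Q", 7)))
print(decode_tagged(bytes([2]) + struct.pack("<h", -3)))
```
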
Example of an unnamed variant within an array:

struct {
  enum : uint2_t { a, b, c } choice;
  variant <choice> {
    uint32_t a;
    uint64_t b;
    short c;
  } v[10];
}

Example of a variant type definition within a structure, where the defined type
is then declared within an array of structures. This variant refers to a tag
located in an upper lexical scope. This example clearly shows that a variant
type definition referring to the tag "x" uses the closest preceding field from
the lexical scope of the type definition.

struct {
  enum : uint2_t { a, b, c, d } x;

  typedef variant <x> {   /*
                           * "x" refers to the preceding "x" enumeration in the
                           * lexical scope of the type definition.
                           */
    uint32_t a;
    uint64_t b;
    short c;
  } example_variant;

  struct {
    enum : int { x, y, z } x;   /* This enumeration is not used by "v". */
    example_variant v;          /*
                                 * "v" uses the "enum : uint2_t { a, b, c, d }"
                                 * tag.
                                 */
  } a[10];
}

4.2.3 Arrays

Arrays are fixed-length. Their length is declared in the type
declaration within the meta-data. They contain an array of "inner type"
elements, which can refer to any type not containing the type of the
array being declared (no circular dependency). The length is the number
of elements in an array.

TSDL meta-data representation of a named array:

typedef elem_type name[length];

A nameless array can be declared as a field type within a structure, e.g.:

  uint8_t field_name[10];


4.2.4 Sequences

Sequences are dynamically-sized arrays. They start with an integer that
specifies the length of the sequence, followed by an array of "inner type"
elements. The length is the number of elements in the sequence.

TSDL meta-data representation for a named sequence:

typedef elem_type name[length_type];

A nameless sequence can be declared as a field type, e.g.:

long field_name[int];

The length type follows the integer types specifications, and the sequence
elements follow the "array" specifications.

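The layout described above (a length integer followed by that many elements)
can be read back with a few lines of Python. The sketch assumes byte-packed,
little-endian fields, a 32-bit unsigned length type and 16-bit unsigned
elements; these choices and the name decode_sequence are illustrative only.

```python
import struct


def decode_sequence(buf: bytes, offset: int = 0,
                    length_fmt: str = "<I", elem_fmt: str = "<H"):
    """Decode a sequence; return (list of elements, offset after the sequence)."""
    (length,) = struct.unpack_from(length_fmt, buf, offset)
    offset += struct.calcsize(length_fmt)
    elements = []
    for _ in range(length):  # length counts elements, not bytes
        (elem,) = struct.unpack_from(elem_fmt, buf, offset)
        elements.append(elem)
        offset += struct.calcsize(elem_fmt)
    return elements, offset


data = struct.pack("<I3H", 3, 10, 20, 30)
print(decode_sequence(data))
```
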
4.2.5 Strings

A string is a variable-size array of bytes terminated by a '\0' "NULL"
character. Its encoding is described in the TSDL meta-data. In the
absence of encoding attribute information, the default encoding is
UTF-8.

TSDL meta-data representation of a named string type:

typealias string {
  encoding = UTF8 OR ASCII;
} := name;

A nameless string type can be declared as a field type:

string field_name;      /* Use default UTF8 encoding */

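Reading such a string amounts to scanning for the terminating '\0' byte and
decoding what precedes it. A minimal Python sketch, with the hypothetical
helper name decode_string:

```python
def decode_string(buf: bytes, offset: int = 0, encoding: str = "utf-8"):
    """Decode a '\\0'-terminated string; return (text, offset after the terminator)."""
    end = buf.index(b"\x00", offset)  # raises ValueError if the terminator is missing
    return buf[offset:end].decode(encoding), end + 1


print(decode_string(b"irq_entry\x00remaining payload"))
```
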
5. Event Packet Header

The event packet header consists of two parts. The first part, the "event
packet header", is the same for all streams of a trace. The second part,
the "event packet context", is described on a per-stream basis. Both are
described in the TSDL meta-data. The packets are aligned on
architecture-page-sized addresses.

Event packet header (all fields are optional, specified by TSDL meta-data):

- Magic number (CTF magic number: 0xC1FC1FC1) specifies that this is a
  CTF packet. This magic number is optional, but when present, it should
  come at the very beginning of the packet.
- Trace UUID, used to ensure the event packet matches the meta-data used.
  (note: we cannot use a meta-data checksum in every case instead of a
  UUID because meta-data can be appended to while tracing is active)
  This field is optional.
- Stream ID, used as reference to stream description in meta-data.
  This field is optional if there is only one stream description in the
  meta-data, but becomes required if there is more than one stream in
  the TSDL meta-data description.

Event packet context (all fields are optional, specified by TSDL meta-data):

- Event packet content size (in bytes).
- Event packet size (in bytes, includes padding).
- Event packet content checksum (optional). Checksum excludes the event packet
  header.
- Per-stream event packet sequence count (to deal with UDP packet loss). The
  number of significant sequence counter bits should also be present, so
  wrap-arounds are dealt with correctly.
- Time-stamp at the beginning and time-stamp at the end of the event packet.
  Both timestamps are written in the packet header, but sampled respectively
  while (or before) writing the first event and while (or after) writing the
  last event in the packet. The inclusive range between these timestamps should
  include all event timestamps assigned to events contained within the packet.
- Events discarded count
  - Snapshot of a per-stream free-running counter, counting the number of
    events discarded that were supposed to be written in the stream prior to
    the first event in the event packet.
    * Note: producer-consumer buffer full condition should fill the current
      event packet with padding so we know exactly where events have been
      discarded.
- Lossless compression scheme used for the event packet content. Applied
  directly to raw data. New types of compression can be added in following
  versions of the format.
    0: no compression scheme
    1: bzip2
    2: gzip
    3: xz
- Cipher used for the event packet content. Applied after compression.
    0: no encryption
    1: AES
- Checksum scheme used for the event packet content. Applied after encryption.
    0: no checksum
    1: md5
    2: sha1
    3: crc32

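The reason the number of significant sequence counter bits matters can be
shown with a small computation. The Python sketch below is one possible way
for a reader to count missing packets between two consecutively received
packets; the function name packets_lost is hypothetical.

```python
def packets_lost(prev_count: int, cur_count: int, significant_bits: int) -> int:
    """Number of packets missing between two consecutively received packets.

    The counter wraps around at 2**significant_bits, so the difference is
    taken modulo that value.
    """
    modulus = 1 << significant_bits
    return (cur_count - prev_count - 1) % modulus


# Consecutive packets: nothing lost.
print(packets_lost(5, 6, 8))
# 8 significant bits: after 254 comes 255, then 0, then 1. Two packets are missing.
print(packets_lost(254, 1, 8))
```

Without the bit count, a reader could not tell a wrap-around from a counter
that went backwards.
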
5.1 Event Packet Header Description

The event packet header layout is indicated by the trace packet.header
field. Here is a recommended structure type for the packet header with
the fields typically expected (although these fields are each optional):

struct event_packet_header {
  uint32_t magic;
  uint8_t  trace_uuid[16];
  uint32_t stream_id;
};

trace {
  ...
  packet.header := struct event_packet_header;
};

If the magic number is not present, tools such as "file" will have no
means to discover the file type.

If the trace_uuid is not present, no validation that the meta-data
actually corresponds to the stream is performed.

If the stream_id packet header field is missing, the trace can only
contain a single stream. Its "id" field can be left out, and its events
don't need to declare a "stream_id" field.


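A reader for the recommended event_packet_header structure could look like
the following Python sketch. It assumes that all three fields are present,
that the byte order is already known, and that the layout has no padding
(which holds for this structure with 32-bit alignment). The function name
parse_packet_header and the UUID value used in the example are arbitrary.

```python
import struct
import uuid

CTF_MAGIC = 0xC1FC1FC1


def parse_packet_header(buf: bytes, byte_order: str = "<"):
    """Parse magic, trace_uuid[16] and stream_id; return (UUID, stream_id).

    byte_order is a struct prefix: "<" for little endian, ">" for big endian.
    """
    magic, raw_uuid, stream_id = struct.unpack_from(byte_order + "I16sI", buf, 0)
    if magic != CTF_MAGIC:
        raise ValueError("not a CTF packet (bad magic number)")
    # The UUID is stored as an array of 16 bytes, not as text.
    return uuid.UUID(bytes=raw_uuid), stream_id


example_uuid = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
packet = struct.pack("<I16sI", CTF_MAGIC, example_uuid.bytes, 0)
trace_uuid, stream_id = parse_packet_header(packet)
print(trace_uuid, stream_id)
```
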
5.2 Event Packet Context Description

The event packet context fields are declared within the stream declaration
in the meta-data. All these fields are optional. If the packet size field is
missing, the whole stream only contains a single packet. If the content
size field is missing, the packet is filled (no padding). The content
and packet sizes include all headers.

An example event packet context type:

struct event_packet_context {
  uint64_t timestamp_begin;
  uint64_t timestamp_end;
  uint32_t checksum;
  uint32_t stream_packet_count;
  uint32_t events_discarded;
  uint32_t cpu_id;
  uint32_t/uint16_t content_size;
  uint32_t/uint16_t packet_size;
  uint8_t  stream_packet_count_bits;    /* Significant counter bits */
  uint8_t  compression_scheme;
  uint8_t  encryption_scheme;
  uint8_t  checksum_scheme;
};


6. Event Structure

The overall structure of an event is:

1 - Stream Packet Context (as specified by the stream meta-data)
  2 - Event Header (as specified by the stream meta-data)
    3 - Stream Event Context (as specified by the stream meta-data)
      4 - Event Context (as specified by the event meta-data)
        5 - Event Payload (as specified by the event meta-data)

This structure defines an implicit dynamic scoping, where variants
located in inner structures (those with a higher number in the listing
above) can refer to the fields of outer structures (with lower number in
the listing above). See Section 7.3 TSDL Scopes for more detail.

6.1 Event Header

Event headers can be described within the meta-data. We hereby propose, as an
example, two types of event headers. Type 1 accommodates streams with fewer
than 31 event IDs. Type 2 accommodates streams with 31 or more event IDs.

One major factor can vary between streams: the number of event IDs assigned to
a stream. Luckily, this information tends to stay relatively constant (modulo
event registration while the trace is being recorded), so we can specify
different representations for streams containing few event IDs and streams
containing many event IDs, so we end up representing the event ID and
time-stamp as densely as possible in each case.

The header is extended in the rare occasions where the information cannot be
represented in the ranges available in the standard event header. Extended
headers are also used in the rare occasions where the data required for a
field could not be collected: the flag corresponding to the missing field
within the missing_fields array is then set to 1.

Types uintX_t represent an X-bit unsigned integer, as declared with
either:

  typealias integer { size = X; align = X; signed = false } := uintX_t;

    or

  typealias integer { size = X; align = 1; signed = false } := uintX_t;

6.1.1 Type 1 - Few event IDs

  - Aligned on 32-bit (or 8-bit if byte-packed, depending on the architecture
    preference).
  - Native architecture byte ordering.
  - For "compact" selection
    - Fixed size: 32 bits.
  - For "extended" selection
    - Size depends on the architecture and variant alignment.

struct event_header_1 {
  /*
   * id: range: 0 - 30.
   * id 31 is reserved to indicate an extended header.
   */
  enum : uint5_t { compact = 0 ... 30, extended = 31 } id;
  variant <id> {
    struct {
      uint27_t timestamp;
    } compact;
    struct {
      uint32_t id;                  /* 32-bit event IDs */
      uint64_t timestamp;           /* 64-bit timestamps */
    } extended;
  } v;
};


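As an illustration, the 32-bit "compact" form of event_header_1 can be
decoded as sketched below in Python. The sketch assumes that uint5_t and
uint27_t are declared bit-packed (align = 1), so that the two fields share
one 32-bit word following the integer placement rules of Section 4.1.5. The
extended layout depends on alignment choices and is only detected here, not
decoded. The function name decode_event_header_1 is hypothetical.

```python
def decode_event_header_1(buf: bytes, byte_order: str = "le"):
    """Decode the first 32 bits of an event_header_1."""
    if len(buf) < 4:
        raise ValueError("need at least 4 bytes")
    if byte_order == "le":
        # Little endian: the first field (id) sits in the lowest 5 bits.
        word = int.from_bytes(buf[:4], "little")
        event_id = word & 0x1F
        timestamp = word >> 5
    elif byte_order == "be":
        # Big endian: the first field (id) sits in the highest 5 bits.
        word = int.from_bytes(buf[:4], "big")
        event_id = word >> 27
        timestamp = word & ((1 << 27) - 1)
    else:
        raise ValueError("byte_order must be 'le' or 'be'")
    if event_id == 31:
        # id 31 selects the extended header; the other 27 bits are not a timestamp.
        return {"selection": "extended"}
    return {"selection": "compact", "id": event_id, "timestamp": timestamp}


# Event ID 3 with timestamp 1000, in both byte orders.
print(decode_event_header_1(bytes([0x03, 0x7D, 0x00, 0x00]), "le"))
print(decode_event_header_1(bytes([0x18, 0x00, 0x03, 0xE8]), "be"))
```
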
6.1.2 Type 2 - Many event IDs

  - Aligned on 16-bit (or 8-bit if byte-packed, depending on the architecture
    preference).
  - Native architecture byte ordering.
  - For "compact" selection
    - Size depends on the architecture and variant alignment.
  - For "extended" selection
    - Size depends on the architecture and variant alignment.

struct event_header_2 {
  /*
   * id: range: 0 - 65534.
   * id 65535 is reserved to indicate an extended header.
   */
  enum : uint16_t { compact = 0 ... 65534, extended = 65535 } id;
  variant <id> {
    struct {
      uint32_t timestamp;
    } compact;
    struct {
      uint32_t id;                  /* 32-bit event IDs */
      uint64_t timestamp;           /* 64-bit timestamps */
    } extended;
  } v;
};


6.2 Event Context

The event context contains information relative to the current event.
The choice and meaning of this information is specified by the TSDL
stream and event meta-data descriptions. The stream context is applied
to all events within the stream. The stream context structure follows
the event header. The event context is applied to specific events. Its
structure follows the stream context structure.

An example of stream-level event context is to save the event payload size with
each event, or to save the current PID with each event. These are declared
within the stream declaration within the meta-data:

  stream {
    ...
    event.context := struct {
      uint pid;
      uint16_t payload_size;
    };
  };

An example of event-specific event context is to declare a bitmap of missing
fields, only appended after the stream event context if the extended event
header is selected. NR_FIELDS is the number of fields within the event (a
numeric value).

  event {
    context := struct {
      variant <id> {
        struct { } compact;
        struct {
          uint1_t missing_fields[NR_FIELDS]; /* missing event fields bitmap */
        } extended;
      } v;
    };
    ...
  }

6.3 Event Payload

An event payload contains fields specific to a given event type. The fields
belonging to an event type are described in the event-specific meta-data
within a structure type.

6.3.1 Padding

There is no padding at the end of the event payload. This differs from the
ISO/C standard for structures, but follows the CTF standard for structures. In
a trace, even though it makes sense to align the beginning of a structure, it
really makes no sense to add padding at the end of the structure, because
structures are usually not followed by a structure of the same type.

This trick can be done by adding a zero-length "end" field at the end of the C
structures, and by using the offset of this field rather than using sizeof()
when calculating the size of a structure (see Appendix "A. Helper macros").

6.3.2 Alignment

The event payload is aligned on the largest alignment required by types
contained within the payload. (This follows the ISO/C standard for structures)


7. Trace Stream Description Language (TSDL)

The Trace Stream Description Language (TSDL) allows expression of the
binary trace streams layout in a C99-like Domain Specific Language
(DSL).


7.1 Meta-data

The trace stream layout description is located in the trace meta-data.
The meta-data is itself located in a stream identified by its name:
"metadata".

The meta-data description can be expressed in two different formats:
text-only and packet-based. The text-only description facilitates
generation of meta-data and provides a convenient way to enter the
meta-data information by hand. The packet-based meta-data provides the
CTF stream packet facilities (checksumming, compression, encryption,
network-readiness) for meta-data streams generated and transported by a
tracer.

The text-only meta-data file is a plain text TSDL description.

The packet-based meta-data is made of "meta-data packets", which each
start with a meta-data packet header. The packet-based meta-data
description is detected by reading the magic number "0x75D11D57" at the
beginning of the file. This magic number is also used to detect the
endianness of the architecture by trying to read the magic number
and its counterpart in reversed endianness. The events within the
meta-data stream have neither event header nor event context. Each event
only contains a "string" payload. Each meta-data packet starts with a
special packet header, specific to the meta-data stream, which contains,
exactly:

struct metadata_packet_header {
  uint32_t magic;                 /* 0x75D11D57 */
  uint8_t  trace_uuid[16];        /* Universally Unique Identifier */
  uint32_t checksum;              /* 0 if unused */
  uint32_t content_size;          /* in bits */
  uint32_t packet_size;           /* in bits */
  uint8_t  compression_scheme;    /* 0 if unused */
  uint8_t  encryption_scheme;     /* 0 if unused */
  uint8_t  checksum_scheme;       /* 0 if unused */
};

The packet-based meta-data can be converted to a text-only meta-data by
concatenating all the strings it contains.

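The detection step described above (packet-based vs text-only meta-data, and
byte order of the packet-based form) can be sketched in Python as follows. The
function name detect_metadata_byte_order and its return values are
assumptions made for this example.

```python
import struct

TSDL_MAGIC = 0x75D11D57


def detect_metadata_byte_order(buf: bytes):
    """Return "le" or "be" for packet-based meta-data, None for text-only."""
    if len(buf) < 4:
        return None
    # Try the magic number in both byte orders.
    if struct.unpack_from("<I", buf, 0)[0] == TSDL_MAGIC:
        return "le"
    if struct.unpack_from(">I", buf, 0)[0] == TSDL_MAGIC:
        return "be"
    return None  # no magic number: treat as a plain text TSDL description


print(detect_metadata_byte_order(struct.pack("<I", TSDL_MAGIC) + bytes(33)))
print(detect_metadata_byte_order(struct.pack(">I", TSDL_MAGIC) + bytes(33)))
print(detect_metadata_byte_order(b"/* CTF */ trace { ... };"))
```
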
In the textual representation of the meta-data, the text contained
within "/*" and "*/", as well as within "//" and end of line, is
treated as a comment. Boolean values can be represented as true, TRUE,
or 1 for true, and false, FALSE, or 0 for false. Within the string-based
meta-data description, the trace UUID is represented as a string of
hexadecimal digits and dashes "-". In the event packet header, the trace
UUID is represented as an array of bytes.


7.2 Declaration vs Definition

A declaration associates a layout to a type, without specifying where
this type is located in the event structure hierarchy (see Section 6).
This therefore includes typedef, typealias, as well as all type
specifiers. In certain circumstances (typedef, structure field and
variant field), a declaration is followed by a declarator, which specifies
the newly defined type name (for typedef), or the field name (for
declarations located within structures and variants). Arrays and sequences,
declared with square brackets ("[" "]"), are part of the declarator,
similarly to C99. The enumeration base type is specified by
": enum_base", which is part of the type specifier. The variant tag
name, specified between "<" ">", is also part of the type specifier.

A definition associates a type to a location in the event structure
hierarchy (see Section 6). This association is denoted by ":=", as shown
in Section 7.3.


8797.3 TSDL Scopes
880
TSDL uses two different types of scoping: a lexical scope is used for
declarations and type definitions, and a dynamic scope is used for
variant references to tag fields.
884
8857.3.1 Lexical Scope
886
Each of "trace", "stream", "event", "struct" and "variant" has its own
nestable declaration scope, within which types can be declared using "typedef"
and "typealias". A root declaration scope also contains all declarations
located outside of any of the aforementioned declarations. An inner
declaration scope can refer to types declared within its container
lexical scope prior to the inner declaration scope. Redefinition of a
typedef or typealias is not valid, although hiding an upper scope
typedef or typealias is allowed within a sub-scope.
895
8967.3.2 Dynamic Scope
897
A dynamic scope consists of the lexical scope augmented with the
implicit event structure definition hierarchy presented in Section 6.
The dynamic scope is only used for variant tag definitions. It is used
at definition time to look up the location of the tag field associated
with a variant.
903
904Therefore, variants in lower levels in the dynamic scope (e.g. event
905context) can refer to a tag field located in upper levels (e.g. in the
906event header) by specifying, in this case, the associated tag with
907<header.field_name>. This allows, for instance, the event context to
908define a variant referring to the "id" field of the event header as
909selector.
910
The target dynamic scope must be specified explicitly when referring to
a field outside of the local lexical scope. The dynamic scope prefixes
are thus:
914
915 - Trace Packet Header: <trace.packet.header. >,
916 - Stream Packet Context: <stream.packet.context. >,
917 - Event Header: <stream.event.header. >,
918 - Stream Event Context: <stream.event.context. >,
919 - Event Context: <event.context. >,
920 - Event Payload: <event.fields. >.
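
A reader could map a variant tag reference onto one of these scopes with
a simple prefix match, as in the sketch below. The enum, the table and
resolve_tag_scope() are hypothetical names introduced for illustration;
only the prefix strings come from the list above. A reference carrying
none of the prefixes is assumed to designate a field of the local scope.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for the dynamic scopes listed above. */
enum dyn_scope {
	SCOPE_TRACE_PACKET_HEADER,
	SCOPE_STREAM_PACKET_CONTEXT,
	SCOPE_STREAM_EVENT_HEADER,
	SCOPE_STREAM_EVENT_CONTEXT,
	SCOPE_EVENT_CONTEXT,
	SCOPE_EVENT_FIELDS,
	SCOPE_LOCAL		/* no prefix: look up in the local scope */
};

struct scope_prefix {
	const char *prefix;
	enum dyn_scope scope;
};

static const struct scope_prefix scope_prefixes[] = {
	{ "trace.packet.header.",   SCOPE_TRACE_PACKET_HEADER },
	{ "stream.packet.context.", SCOPE_STREAM_PACKET_CONTEXT },
	{ "stream.event.header.",   SCOPE_STREAM_EVENT_HEADER },
	{ "stream.event.context.",  SCOPE_STREAM_EVENT_CONTEXT },
	{ "event.context.",         SCOPE_EVENT_CONTEXT },
	{ "event.fields.",          SCOPE_EVENT_FIELDS },
};

/*
 * Split a tag reference into its dynamic scope and the remaining field
 * path. *field points into tag, just past the matched prefix.
 */
enum dyn_scope resolve_tag_scope(const char *tag, const char **field)
{
	size_t i;

	assert(tag != NULL && field != NULL);
	for (i = 0; i < sizeof(scope_prefixes) / sizeof(scope_prefixes[0]); i++) {
		size_t len = strlen(scope_prefixes[i].prefix);

		if (strncmp(tag, scope_prefixes[i].prefix, len) == 0) {
			*field = tag + len;
			return scope_prefixes[i].scope;
		}
	}
	*field = tag;
	return SCOPE_LOCAL;
}
```

For example, the tag "stream.event.header.id" resolves to the event
header scope with the field path "id".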
921
Multiple declarations of the same field name within a single scope are
not valid. It is however valid to re-use the same field name in
different scopes. There is no possible conflict, because the dynamic
scope must be specified when a variant refers to a tag field located in
a different dynamic scope.
927
928The information available in the dynamic scopes can be thought of as the
929current tracing context. At trace production, information about the
930current context is saved into the specified scope field levels. At trace
931consumption, for each event, the current trace context is therefore
932readable by accessing the upper dynamic scopes.
933
934
9357.4 TSDL Examples
936
The grammar representing the TSDL meta-data is presented in Appendix C
(TSDL Grammar). This section offers lighter reading, consisting of
examples of TSDL meta-data with template values.
940
941The stream "id" can be left out if there is only one stream in the
942trace. The event "id" field can be left out if there is only one event
943in a stream.
944
trace {
	major = value;			/* Trace format version */
	minor = value;
	uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa";	/* Trace UUID */
	byte_order = be OR le;		/* Endianness (required) */
	packet.header := struct {
		uint32_t magic;
		uint8_t  trace_uuid[16];
		uint32_t stream_id;
	};
};

stream {
	id = stream_id;
	/* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
	event.header := event_header_1 OR event_header_2;
	event.context := struct {
		...
	};
	packet.context := struct {
		...
	};
};

event {
	name = event_name;
	id = value;			/* Numeric identifier within the stream */
	stream = stream_id;
	context := struct {
		...
	};
	fields := struct {
		...
	};
};
980
981/* More detail on types in section 4. Types */
982
983/*
984 * Named types:
985 *
986 * Type declarations behave similarly to the C standard.
987 */
988
989typedef aliased_type_specifiers new_type_declarators;
990
991/* e.g.: typedef struct example new_type_name[10]; */
992
993/*
994 * typealias
995 *
996 * The "typealias" declaration can be used to give a name (including
997 * pointer declarator specifier) to a type. It should also be used to
998 * map basic C types (float, int, unsigned long, ...) to a CTF type.
999 * Typealias is a superset of "typedef": it also allows assignment of a
1000 * simple variable identifier to a type.
1001 */
1002
1003typealias type_class {
1004 ...
1005} := type_specifiers type_declarator;
1006
1007/*
1008 * e.g.:
1009 * typealias integer {
1010 * size = 32;
1011 * align = 32;
1012 * signed = false;
1013 * } := struct page *;
1014 *
1015 * typealias integer {
1016 * size = 32;
1017 * align = 32;
1018 * signed = true;
1019 * } := int;
1020 */
1021
1022struct name {
1023 ...
1024};
1025
1026variant name {
1027 ...
1028};
1029
1030enum name : integer_type {
1031 ...
1032};
1033
1034
1035/*
1036 * Unnamed types, contained within compound type fields, typedef or typealias.
1037 */
1038
1039struct {
1040 ...
1041}
1042
1043variant {
1044 ...
1045}
1046
1047enum : integer_type {
1048 ...
1049}
1050
1051typedef type new_type[length];
1052
1053struct {
1054 type field_name[length];
1055}
1056
1057typedef type new_type[length_type];
1058
1059struct {
1060 type field_name[length_type];
1061}
1062
1063integer {
1064 ...
1065}
1066
1067floating_point {
1068 ...
1069}
1070
1071struct {
1072 integer_type field_name:size; /* GNU/C bitfield */
1073}
1074
1075struct {
1076 string field_name;
1077}
1078
1079
1080A. Helper macros
1081
1082The two following macros keep track of the size of a GNU/C structure without
1083padding at the end by placing HEADER_END as the last field. A one byte end field
1084is used for C90 compatibility (C99 flexible arrays could be used here). Note
1085that this does not affect the effective structure size, which should always be
1086calculated with the header_sizeof() helper.
1087
#define HEADER_END		char end_field
#define header_sizeof(type)	offsetof(typeof(type), end_field)
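
A minimal usage sketch of these macros follows. The structure
example_header and the function example_header_size() are hypothetical
and exist only for illustration. The sketch spells typeof as __typeof__,
the GNU C form that also compiles in strict ISO modes; this spelling is
a choice made here, not part of the macros above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_END		char end_field
#define header_sizeof(type)	offsetof(__typeof__(type), end_field)

/* Hypothetical header: 4-byte magic, 1-byte id, then the end marker. */
struct example_header {
	uint32_t magic;
	uint8_t id;
	HEADER_END;
};

/* Size of the useful part of the header, excluding trailing padding. */
size_t example_header_size(void)
{
	/* The unpadded size can never exceed the padded structure size. */
	assert(header_sizeof(struct example_header)
	       <= sizeof(struct example_header));
	return header_sizeof(struct example_header);
}
```

Here example_header_size() returns 5, whereas sizeof(struct
example_header) is typically 8 once the end field and trailing alignment
padding are counted.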
1090
1091
1092B. Stream Header Rationale
1093
An event stream is divided into contiguous event packets of variable size. These
subdivisions allow the trace analyzer to perform a fast binary search by time
within the stream (typically requiring only the event packet headers to be
indexed) without reading the whole stream. These subdivisions have a variable size to
1098eliminate the need to transfer the event packet padding when partially filled
1099event packets must be sent when streaming a trace for live viewing/analysis.
1100An event packet can contain a certain amount of padding at the end. Dividing
1101streams into event packets is also useful for network streaming over UDP and
1102flight recorder mode tracing (a whole event packet can be swapped out of the
1103buffer atomically for reading).
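
The binary search by time mentioned above can be sketched as follows,
assuming the analyzer has already built an in-memory index with one
entry per event packet, sorted by start time. The packet_index structure
and its field names are assumptions made for this example; they are not
defined by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry built from one event packet header. */
struct packet_index {
	uint64_t timestamp_begin;	/* time of the first event in the packet */
	uint64_t offset;		/* packet position in the stream, in bytes */
};

/*
 * Return the position of the last packet whose timestamp_begin is <= ts,
 * or -1 if ts precedes the first packet. The entries must be sorted by
 * timestamp_begin.
 */
long find_packet(const struct packet_index *idx, size_t count, uint64_t ts)
{
	size_t lo = 0, hi = count;	/* search window is [lo, hi) */

	assert(idx != NULL || count == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (idx[mid].timestamp_begin <= ts)
			lo = mid + 1;	/* answer is at mid or after */
		else
			hi = mid;	/* answer is before mid */
	}
	return (long)lo - 1;
}
```

Only the packet found this way needs to be read in order to locate the
events around the requested time.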
1104
1105The stream header is repeated at the beginning of each event packet to allow
1106flexibility in terms of:
1107
  - streaming support,
  - allowing arbitrary buffers to be discarded without making the trace
    unreadable,
  - allowing UDP packet loss handling by either dealing with missing event
    packets or asking for re-transmission,
  - transparently supporting flight recorder mode,
  - transparently supporting crash dump.
1115
1116
1117C. TSDL Grammar
1118
1119/*
1120 * Common Trace Format (CTF) Trace Stream Description Language (TSDL) Grammar.
1121 *
 * Inspired by the C99 grammar:
 * http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (Annex A)
 * and the c++1x grammar (draft):
 * http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3291.pdf (Annex A)
 *
 * Specialized for CTF needs by including only constants and declarations
 * from C99 (excluding function declarations), and by adding support for
 * variants, sequences and CTF-specific specifiers. The semantics of
 * enumeration container types are inspired by the c++1x enum-base.
1131 */
1132
11331) Lexical grammar
1134
11351.1) Lexical elements
1136
1137token:
1138 keyword
1139 identifier
1140 constant
1141 string-literal
1142 punctuator
1143
11441.2) Keywords
1145
1146keyword: is one of
1147
1148const
1149char
1150double
1151enum
1152event
1153floating_point
1154float
1155integer
1156int
1157long
1158short
1159signed
1160stream
1161string
1162struct
1163trace
1164typealias
1165typedef
1166unsigned
1167variant
1168void
1169_Bool
1170_Complex
1171_Imaginary
1172
1173
11741.3) Identifiers
1175
1176identifier:
1177 identifier-nondigit
1178 identifier identifier-nondigit
1179 identifier digit
1180
1181identifier-nondigit:
1182 nondigit
1183 universal-character-name
1184 any other implementation-defined characters
1185
1186nondigit:
1187 _
1188 [a-zA-Z] /* regular expression */
1189
1190digit:
1191 [0-9] /* regular expression */
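
The identifier rules above can be checked with a few lines of C. This
sketch covers only the nondigit and digit productions; it ignores
universal character names and implementation-defined characters, does
not reject keywords, and assumes an ASCII-compatible character set. The
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* nondigit: '_' or [a-zA-Z] */
int is_nondigit(char c)
{
	return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* digit: [0-9] */
int is_digit(char c)
{
	return c >= '0' && c <= '9';
}

/*
 * Return 1 if s is an identifier made of a leading nondigit followed by
 * any mix of nondigits and digits, 0 otherwise.
 */
int is_basic_identifier(const char *s)
{
	assert(s != NULL);
	if (!is_nondigit(*s))
		return 0;	/* also rejects the empty string */
	for (s++; *s != '\0'; s++) {
		if (!is_nondigit(*s) && !is_digit(*s))
			return 0;
	}
	return 1;
}
```

With these rules "event_header_1" and "_x" are accepted, while "1abc"
and "a-b" are rejected.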
1192
11931.4) Universal character names
1194
1195universal-character-name:
1196 \u hex-quad
1197 \U hex-quad hex-quad
1198
1199hex-quad:
1200 hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
1201
12021.5) Constants
1203
1204constant:
1205 integer-constant
1206 enumeration-constant
1207 character-constant
1208
1209integer-constant:
1210 decimal-constant integer-suffix-opt
1211 octal-constant integer-suffix-opt
1212 hexadecimal-constant integer-suffix-opt
1213
1214decimal-constant:
1215 nonzero-digit
1216 decimal-constant digit
1217
1218octal-constant:
1219 0
1220 octal-constant octal-digit
1221
1222hexadecimal-constant:
1223 hexadecimal-prefix hexadecimal-digit
1224 hexadecimal-constant hexadecimal-digit
1225
1226hexadecimal-prefix:
1227 0x
1228 0X
1229
1230nonzero-digit:
1231 [1-9]
1232
1233integer-suffix:
1234 unsigned-suffix long-suffix-opt
1235 unsigned-suffix long-long-suffix
1236 long-suffix unsigned-suffix-opt
1237 long-long-suffix unsigned-suffix-opt
1238
1239unsigned-suffix:
1240 u
1241 U
1242
1243long-suffix:
1244 l
1245 L
1246
1247long-long-suffix:
1248 ll
1249 LL
1250
1251hexadecimal-digit-sequence:
1252 hexadecimal-digit
1253 hexadecimal-digit-sequence hexadecimal-digit
1254
1255enumeration-constant:
1256 identifier
1257 string-literal
1258
1259character-constant:
1260 ' c-char-sequence '
1261 L' c-char-sequence '
1262
1263c-char-sequence:
1264 c-char
1265 c-char-sequence c-char
1266
1267c-char:
1268 any member of source charset except single-quote ('), backslash
1269 (\), or new-line character.
1270 escape-sequence
1271
1272escape-sequence:
1273 simple-escape-sequence
1274 octal-escape-sequence
1275 hexadecimal-escape-sequence
1276 universal-character-name
1277
1278simple-escape-sequence: one of
1279 \' \" \? \\ \a \b \f \n \r \t \v
1280
1281octal-escape-sequence:
1282 \ octal-digit
1283 \ octal-digit octal-digit
1284 \ octal-digit octal-digit octal-digit
1285
1286hexadecimal-escape-sequence:
1287 \x hexadecimal-digit
1288 hexadecimal-escape-sequence hexadecimal-digit
1289
12901.6) String literals
1291
1292string-literal:
1293 " s-char-sequence-opt "
1294 L" s-char-sequence-opt "
1295
1296s-char-sequence:
1297 s-char
1298 s-char-sequence s-char
1299
1300s-char:
1301 any member of source charset except double-quote ("), backslash
1302 (\), or new-line character.
1303 escape-sequence
1304
13051.7) Punctuators
1306
1307punctuator: one of
1308 [ ] ( ) { } . -> * + - < > : ; ... = ,
1309
1310
13112) Phrase structure grammar
1312
1313primary-expression:
1314 identifier
1315 constant
1316 string-literal
1317 ( unary-expression )
1318
1319postfix-expression:
1320 primary-expression
1321 postfix-expression [ unary-expression ]
1322 postfix-expression . identifier
	postfix-expression -> identifier
1324
1325unary-expression:
1326 postfix-expression
1327 unary-operator postfix-expression
1328
1329unary-operator: one of
1330 + -
1331
1332assignment-operator:
1333 =
1334
1335type-assignment-operator:
1336 :=
1337
1338constant-expression:
1339 unary-expression
1340
1341constant-expression-range:
1342 constant-expression ... constant-expression
1343
13442.2) Declarations:
1345
1346declaration:
1347 declaration-specifiers declarator-list-opt ;
1348 ctf-specifier ;
1349
1350declaration-specifiers:
1351 storage-class-specifier declaration-specifiers-opt
1352 type-specifier declaration-specifiers-opt
1353 type-qualifier declaration-specifiers-opt
1354
1355declarator-list:
1356 declarator
1357 declarator-list , declarator
1358
1359abstract-declarator-list:
1360 abstract-declarator
1361 abstract-declarator-list , abstract-declarator
1362
1363storage-class-specifier:
1364 typedef
1365
1366type-specifier:
1367 void
1368 char
1369 short
1370 int
1371 long
1372 float
1373 double
1374 signed
1375 unsigned
1376 _Bool
1377 _Complex
1378 _Imaginary
1379 struct-specifier
1380 variant-specifier
1381 enum-specifier
1382 typedef-name
1383 ctf-type-specifier
1384
1385struct-specifier:
1386 struct identifier-opt { struct-or-variant-declaration-list-opt }
1387 struct identifier
1388
1389struct-or-variant-declaration-list:
1390 struct-or-variant-declaration
1391 struct-or-variant-declaration-list struct-or-variant-declaration
1392
1393struct-or-variant-declaration:
1394 specifier-qualifier-list struct-or-variant-declarator-list ;
1395 declaration-specifiers storage-class-specifier declaration-specifiers declarator-list ;
1396 typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list ;
1397 typealias declaration-specifiers abstract-declarator-list := declarator-list ;
1398
1399specifier-qualifier-list:
1400 type-specifier specifier-qualifier-list-opt
1401 type-qualifier specifier-qualifier-list-opt
1402
1403struct-or-variant-declarator-list:
1404 struct-or-variant-declarator
1405 struct-or-variant-declarator-list , struct-or-variant-declarator
1406
1407struct-or-variant-declarator:
1408 declarator
1409 declarator-opt : constant-expression
1410
1411variant-specifier:
1412 variant identifier-opt variant-tag-opt { struct-or-variant-declaration-list }
1413 variant identifier variant-tag
1414
1415variant-tag:
1416 < identifier >
1417
1418enum-specifier:
1419 enum identifier-opt { enumerator-list }
1420 enum identifier-opt { enumerator-list , }
1421 enum identifier
1422 enum identifier-opt : declaration-specifiers { enumerator-list }
1423 enum identifier-opt : declaration-specifiers { enumerator-list , }
1424
1425enumerator-list:
1426 enumerator
1427 enumerator-list , enumerator
1428
1429enumerator:
1430 enumeration-constant
1431 enumeration-constant = constant-expression
1432 enumeration-constant = constant-expression-range
1433
1434type-qualifier:
1435 const
1436
1437declarator:
1438 pointer-opt direct-declarator
1439
1440direct-declarator:
1441 identifier
1442 ( declarator )
1443 direct-declarator [ type-specifier ]
1444 direct-declarator [ constant-expression ]
1445
1446abstract-declarator:
1447 pointer-opt direct-abstract-declarator
1448
1449direct-abstract-declarator:
1450 identifier-opt
1451 ( abstract-declarator )
1452 direct-abstract-declarator [ type-specifier ]
1453 direct-abstract-declarator [ constant-expression ]
1454 direct-abstract-declarator [ ]
1455
1456pointer:
1457 * type-qualifier-list-opt
1458 * type-qualifier-list-opt pointer
1459
1460type-qualifier-list:
1461 type-qualifier
1462 type-qualifier-list type-qualifier
1463
1464typedef-name:
1465 identifier
1466
14672.3) CTF-specific declarations
1468
1469ctf-specifier:
1470 event { ctf-assignment-expression-list-opt }
1471 stream { ctf-assignment-expression-list-opt }
1472 trace { ctf-assignment-expression-list-opt }
1473 typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list ;
1474 typealias declaration-specifiers abstract-declarator-list := declarator-list ;
1475
1476ctf-type-specifier:
1477 floating_point { ctf-assignment-expression-list-opt }
1478 integer { ctf-assignment-expression-list-opt }
1479 string { ctf-assignment-expression-list-opt }
1480
1481ctf-assignment-expression-list:
1482 ctf-assignment-expression
1483 ctf-assignment-expression-list ; ctf-assignment-expression
1484
1485ctf-assignment-expression:
1486 unary-expression assignment-operator unary-expression
1487 unary-expression type-assignment-operator type-specifier
1488 declaration-specifiers storage-class-specifier declaration-specifiers declarator-list
1489 typealias declaration-specifiers abstract-declarator-list := declaration-specifiers abstract-declarator-list
1490 typealias declaration-specifiers abstract-declarator-list := declarator-list