Hint that content/packet size fields could be uint64_t

diff --git a/common-trace-format-specification.txt b/common-trace-format-specification.txt
index 704968d1054268990b4ed83d3095581d8f6156a1..1e51849d7d95ded149bc17e2ccea5b527401f431 100644
--- a/common-trace-format-specification.txt
+++ b/common-trace-format-specification.txt
@@ -1,4 +1,4 @@
-Common Trace Format (CTF) Specification (pre-v1.8)
+Common Trace Format (CTF) Specification (v1.8.1)
  
  Mathieu Desnoyers, EfficiOS Inc.
  
@@ -65,6 +65,7 @@ Table of Contents
         7.3.1 Lexical Scope
         7.3.2 Static and Dynamic Scopes
     7.4 TSDL Examples
+8. Clocks
  
  
  1. Preliminary definitions
@@ -117,11 +118,10 @@ trace event types expressed in the Trace Stream Description Language
  3. Event stream
  
  An event stream can be divided into contiguous event packets of variable
-size. These subdivisions have a variable size. An event packet can
-contain a certain amount of padding at the end. The stream header is
-repeated at the beginning of each event packet. The rationale for the
-event stream design choices is explained in Appendix B. Stream Header
-Rationale.
+size. An event packet can contain a certain amount of padding at the
+end. The stream header is repeated at the beginning of each event
+packet. The rationale for the event stream design choices is explained
+in Appendix B. Stream Header Rationale.
  
  The event stream header will therefore be referred to as the "event packet
  header" throughout the rest of this document.
@@ -171,7 +171,7 @@ TSDL meta-data attribute representation of a specific alignment:
  
  4.1.3 Byte order
  
-By default, the native endianness of the source architecture the trace is used.
+By default, the native endianness of the source architecture is used.
  Byte order can be overridden for a basic type by specifying a "byte_order"
  attribute. Typical use-case is to specify the network byte order (big endian:
  "be") to save data captured from the network into the trace without conversion.
@@ -357,7 +357,8 @@ enum name : integer_type {
  };
  
  If the values are omitted, the enumeration starts at 0 and increment of 1 for
-each entry:
+each entry. An entry with an omitted value that follows a range entry
+takes the end_value of the previous range plus 1 as its value:
  
  enum name : unsigned int {
    ZERO,
@@ -415,8 +416,13 @@ struct example {
    uint64_t second_field_name;  /* Named type declared in the meta-data */
  };
  
-The fields are placed in a sequence next to each other. They each possess a
-field name, which is a unique identifier within the structure.
+The fields are placed in a sequence next to each other. They each
+possess a field name, which is a unique identifier within the structure.
+The identifier is not allowed to use any reserved keyword
+(see Section C.1.2). Replacing reserved keywords with
+underscore-prefixed field names is recommended. Fields starting with an
+underscore should have their leading underscore removed by the CTF trace
+readers.
  
  A nameless structure can be declared as a field type or as part of a typedef:
  
@@ -448,11 +454,36 @@ type selector. The field to use as tag is specified by the "tag_field",
  specified between "< >" after the "variant" keyword for unnamed
  variants, and after "variant name" for named variants.
  
-The alignment of the variant is the alignment of the type as selected by the tag
-value for the specific instance of the variant. The alignment of the type
-containing the variant is independent of the variant alignment.  The size of the
-variant is the size as selected by the tag value for the specific instance of
-the variant.
+The alignment of the variant is the alignment of the type as selected by
+the tag value for the specific instance of the variant.  The size of the
+variant is the size as selected by the tag value for the specific
+instance of the variant.
+
+The alignment of the type containing the variant is independent of the
+variant alignment.  For instance, if a structure contains two fields, a
+32-bit integer, aligned on 32 bits, and a variant, which contains two
+choices: either a 32-bit field, aligned on 32 bits, or a 64-bit field,
+aligned on 64 bits, the alignment of the outermost structure will be
+32-bit (the alignment of its largest field, disregarding the alignment
+of the variant). The alignment of the variant will depend on the
+selector: if the variant's 32-bit field is selected, its alignment will
+be 32-bit, or 64-bit otherwise. It is important to note that variants
+are specifically tailored for compactness in a stream. Therefore, the
+relative offsets of compound type fields can vary depending on
+the offset at which the compound type starts if it contains a variant
+that itself contains a type with alignment larger than the largest field
+contained within the compound type. This is caused by the fact that the
+compound type may contain the enumeration that selects the variant's
+choice, and therefore the alignment to be applied to the compound type
+cannot be determined before encountering the enumeration.
+
+Each variant type selector possesses a field name, which is a unique
+identifier within the variant. The identifier is not allowed to use any
+reserved keyword (see Section C.1.2). Replacing reserved keywords with
+underscore-prefixed field names is recommended. Fields starting with an
+underscore should have their leading underscore removed by the CTF trace
+readers.
+
  
  A named variant declaration followed by its definition within a structure
  declaration:
@@ -676,11 +707,15 @@ Event packet context (all fields are optional, specified by TSDL meta-data):
    include all event timestamps assigned to events contained within the packet.
  - Events discarded count
    - Snapshot of a per-stream free-running counter, counting the number of
-    events discarded that were supposed to be written in the stream prior to
-    the first event in the event packet.
-    * Note: producer-consumer buffer full condition should fill the current
+    events discarded that were supposed to be written in the stream after
+    the last event in the event packet.
+    * Note: producer-consumer buffer full condition can fill the current
              event packet with padding so we know exactly where events have been
-            discarded.
+           discarded. However, if the buffer full condition chooses not
+           to fill the current event packet with padding, all we know
+           about the timestamp range in which the events have been
+           discarded is that it is somewhere between the beginning and
+            the end of the packet.
  - Lossless compression scheme used for the event packet content. Applied
    directly to raw data. New types of compression can be added in following
    versions of the format.
@@ -742,9 +777,8 @@ struct event_packet_context {
    uint32_t stream_packet_count;
    uint32_t events_discarded;
    uint32_t cpu_id;
-  uint32_t/uint16_t content_size;
-  uint32_t/uint16_t packet_size;
-  uint8_t  stream_packet_count_bits;   /* Significant counter bits */
+  uint64_t/uint32_t/uint16_t content_size;
+  uint64_t/uint32_t/uint16_t packet_size;
    uint8_t  compression_scheme;
    uint8_t  encryption_scheme;
    uint8_t  checksum_scheme;
@@ -956,12 +990,13 @@ beginning of the file. This magic number is also used to detect the
  endianness of the architecture by trying to read the CTF magic number
  and its counterpart in reversed endianness. The events within the
  meta-data stream have no event header nor event context. Each event only
-contains a "sequence" payload, which is a sequence of bits using the
-"trace.packet.header.content_size" field as a placeholder for its length
-(the packet header size should be substracted). The formatting of this
-sequence of bits is a plain-text representation of the TSDL description.
-Each meta-data packet start with a special packet header, specific to
-the meta-data stream, which contains, exactly:
+contains a special "sequence" payload, which is a sequence of bits whose
+length is implicitly calculated by using the
+"trace.packet.header.content_size" field, minus the packet header size.
+The formatting of this sequence of bits is a plain-text representation
+of the TSDL description.  Each meta-data packet starts with a special
+packet header, specific to the meta-data stream, which contains,
+exactly:
  
  struct metadata_packet_header {
    uint32_t magic;                      /* 0x75D11D57 */
@@ -977,7 +1012,7 @@ struct metadata_packet_header {
  };
  
  The packet-based meta-data can be converted to a text-only meta-data by
-concatenating all the strings in contains.
+concatenating all the strings it contains.
  
  In the textual representation of the meta-data, the text contained
  within "/*" and "*/", as well as within "//" and end of line, are
@@ -1016,14 +1051,14 @@ path lookups) and for sequence references to length fields.
  
  7.3.1 Lexical Scope
  
-Each of "trace", "stream", "event", "struct" and "variant" have their own
-nestable declaration scope, within which types can be declared using "typedef"
-and "typealias". A root declaration scope also contains all declarations
-located outside of any of the aforementioned declarations. An inner
-declaration scope can refer to type declared within its container
-lexical scope prior to the inner declaration scope. Redefinition of a
-typedef or typealias is not valid, although hiding an upper scope
-typedef or typealias is allowed within a sub-scope.
+Each of "trace", "env", "stream", "event", "struct" and "variant" have
+their own nestable declaration scope, within which types can be declared
+using "typedef" and "typealias". A root declaration scope also contains
+all declarations located outside of any of the aforementioned
+declarations. An inner declaration scope can refer to types declared
+within its container lexical scope prior to the inner declaration scope.
+Redefinition of a typedef or typealias is not valid, although hiding an
+upper scope typedef or typealias is allowed within a sub-scope.
  
  7.3.2 Static and Dynamic Scopes
  
@@ -1045,9 +1080,8 @@ associated with a variant, and to lookup up the location of the length
  field associated with a sequence.
  
  Variants and sequences can refer to a tag field either using a relative
-path or an absolute path. The relative path starts with "." to ensure
-there are no conflicts with dynamic scope names. It is relative to the
-scope in which the variant or sequence performing the lookup is located.
+path or an absolute path. The relative path is relative to the scope in
+which the variant or sequence performing the lookup is located.
  Relative paths are only allowed to lookup within the same static scope,
  which includes its nested static scopes. Lookups targeting parent static
  scopes need to be performed with an absolute path.
@@ -1063,6 +1097,7 @@ header as selector.
  
  The dynamic scope prefixes are thus:
  
+ - Trace Environment: <env. >,
   - Trace Packet Header: <trace.packet.header. >,
   - Stream Packet Context: <stream.packet.context. >,
   - Event Header: <stream.event.header. >,
@@ -1072,15 +1107,15 @@ The dynamic scope prefixes are thus:
  
  
  The target dynamic scope must be specified explicitly when referring to
-a field outside of the static scope (absolute scope reference).
-References to fields within the static scope (including local static
-scopes and nested static scopes) can be referenced by using a relative
-reference (starting with ".").
+a field outside of the static scope (absolute scope reference). No
+conflict can occur between relative and dynamic paths, because the
+keywords "trace", "stream", and "event" are reserved, and thus
+not permitted as field names. It is recommended that field names
+clashing with CTF and C99 reserved keywords use an underscore prefix to
+eliminate the risk of generating a description containing an invalid
+field name. Consequently, fields starting with an underscore should have
+their leading underscore removed by the CTF trace readers.
  
-As a matter of convenience, the leading "." in relative paths can be
-omitted. In case of conflict between relative and dynamic paths, the
-relative path is preferred. It is recommended to use the "." prefix for
-relative paths to ensure no path name conflict can occur.
  
  The information available in the dynamic scopes can be thought of as the
  current tracing context. At trace production, information about the
@@ -1100,8 +1135,8 @@ trace. The event "id" field can be left out if there is only one event
  in a stream.
  
  trace {
-  major = value;                               /* Trace format version */
-  minor = value;
+  major = value;                       /* CTF spec version major number */
+  minor = value;                       /* CTF spec version minor number */
    uuid = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa";       /* Trace UUID */
    byte_order = be OR le;                       /* Endianness (required) */
    packet.header := struct {
@@ -1111,6 +1146,16 @@ trace {
    };
  };
  
+/*
+ * The "env" (environment) scope contains assignment expressions. The
+ * field names and content are implementation-defined.
+ */
+env {
+  pid = value;                 /* example */
+  proc_name = "name";          /* example */
+  ...
+};
+
  stream {
    id = stream_id;
    /* Type 1 - Few event IDs; Type 2 - Many event IDs. See section 6.1. */
@@ -1124,9 +1169,11 @@ stream {
  };
  
  event {
-  name = event_name;
+  name = "event_name";
    id = value;                  /* Numeric identifier within the stream */
    stream_id = stream_id;
+  loglevel = value;
+  model.emf.uri = "string";
    context := struct {
      ...
    };
@@ -1238,6 +1285,89 @@ struct {
  }
  
  
+8. Clocks
+
+Clock metadata describes the clock topology of the system and details
+each clock's parameters. In the absence of a clock description,
+it is assumed that all fields named "timestamp" use the same clock
+source, which increments once per nanosecond.
+
+Describing a clock and how it is used by streams is threefold: first,
+the clock and clock topology should be described in a "clock"
+description block, e.g.:
+
+clock {
+       name = cycle_counter_sync;
+       uuid = "62189bee-96dc-11e0-91a8-cfa3d89f3923";
+       description = "Cycle counter synchronized across CPUs";
+       freq = 1000000000;             /* frequency, in Hz */
+       /* precision in seconds is: 1000 * (1/freq) */
+       precision = 1000;
+       /*
+        * clock value offset from Epoch is:
+        * offset_s + (offset * (1/freq))
+        */
+       offset_s = 1326476837;
+       offset = 897235420;
+       absolute = FALSE;
+};
+
+The mandatory "name" field specifies the name of the clock identifier,
+which can later be used as a reference. The optional field "uuid" is the
+unique identifier of the clock. It can be used to correlate different
+traces that use the same clock. An optional textual description string
+can be added with the "description" field. The "freq" field is the
+initial frequency of the clock, in Hz. If the "freq" field is not
+present, the frequency is assumed to be 1000000000 (providing clock
+increment of 1 ns). The optional "precision" field details the
+uncertainty on the clock measurements, in (1/freq) units. The "offset_s"
+and "offset" fields indicate the offset from POSIX.1 Epoch, 1970-01-01
+00:00:00 +0000 (UTC), to the zero value of the clock. The "offset_s"
+field is in seconds. The "offset" field is in (1/freq) units. If either
+the "offset_s" or "offset" field is not present, it is assigned the
+value 0. The field "absolute" is TRUE if the clock is a global reference
+across different clock UUIDs (e.g. NTP time). Otherwise, "absolute" is
+FALSE, and the clock can be considered as synchronized only with other
+clocks that have the same uuid.
+
+
+Secondly, a reference to this clock should be added within an integer
+type:
+
+typealias integer {
+       size = 64; align = 1; signed = false;
+       map = clock.cycle_counter_sync.value;
+} := uint64_ccnt_t;
+
+Thirdly, stream declarations can reference the clock they use as a
+time-stamp source:
+
+struct packet_context {
+       uint64_ccnt_t ccnt_begin;
+       uint64_ccnt_t ccnt_end;
+       /* ... */
+};
+
+stream {
+       /* ... */
+       event.header := struct {
+               uint64_ccnt_t timestamp;
+               /* ... */
+       };
+       packet.context := struct packet_context;
+};
+
+For an N-bit integer type referring to a clock, if the integer overflows
+compared to the N low order bits of the clock prior value, then it is
+assumed that one, and only one, overflow occurred. It is therefore
+important that events encoding time on a small number of bits happen
+frequently enough to detect when more than one N-bit overflow occurs.
+
+In a packet context, clock field names ending with "_begin" and "_end"
+have a special meaning: they refer to the time-stamps at, respectively,
+the beginning and the end of each packet.
+
+
  A. Helper macros
  
  The two following macros keep track of the size of a GNU/C structure without
@@ -1309,8 +1439,10 @@ keyword: is one of
  align
  const
  char
+clock
  double
  enum
+env
  event
  floating_point
  float
@@ -1623,8 +1755,10 @@ typedef-name:
  2.3) CTF-specific declarations
  
  ctf-specifier:
+       clock { ctf-assignment-expression-list-opt }
         event { ctf-assignment-expression-list-opt }
         stream { ctf-assignment-expression-list-opt }
+       env { ctf-assignment-expression-list-opt }
         trace { ctf-assignment-expression-list-opt }
         typealias declaration-specifiers abstract-declarator-list type-assignment-operator declaration-specifiers abstract-declarator-list
         typealias declaration-specifiers abstract-declarator-list type-assignment-operator declarator-list
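
Note on the "8. Clocks" text added by this patch: the clock description
gives the offset from Epoch as offset_s + (offset * (1/freq)), and states
that an N-bit clock field that compares lower than the N low-order bits
of the prior clock value is assumed to have overflowed exactly once. The
following is a minimal Python sketch of how a trace reader might apply
both rules. The function names and argument layout are hypothetical and
are not part of the specification, and adding the raw clock value to
"offset" before scaling is an assumption about how the offset is meant
to be applied.

```python
NS_PER_S = 1_000_000_000


def clock_to_ns_from_epoch(value, freq=NS_PER_S, offset_s=0, offset=0):
    """Convert a raw clock value (in 1/freq units) to ns since Epoch.

    Assumption: absolute time is offset_s + (offset + value) / freq.
    Missing "freq", "offset_s" and "offset" fields default to
    1000000000, 0 and 0, as the clock description states.
    """
    # Integer arithmetic avoids float rounding on 64-bit cycle counts.
    return offset_s * NS_PER_S + ((offset + value) * NS_PER_S) // freq


def extend_timestamp(prev, low, nbits):
    """Rebuild a full clock value from an N-bit field and the prior value.

    prev  -- last fully reconstructed clock value
    low   -- the N-bit value read from the current event
    nbits -- N, the size of the integer type mapped to the clock
    """
    mask = (1 << nbits) - 1
    high = prev & ~mask          # bits above the N low-order bits
    if low < (prev & mask):      # went backwards: assume exactly one overflow
        high += 1 << nbits
    return high + low
```

With the values from the clock example in the patch (freq = 1000000000,
offset_s = 1326476837, offset = 897235420), a raw clock value of 0 maps
to 1326476837897235420 ns since Epoch. For an 8-bit field, a prior value
of 0x1FF followed by a field value of 0x05 is extended to 0x205.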