Make API CTF-agnostic
The main purpose of this patch is to decouple the Babeltrace 2 API from
the Common Trace Format. This makes the Babeltrace 2 API easier to use,
because you don't need to know the implicit and explicit CTF rules to
use it, and it makes data structures more compact. It also prepares the
Babeltrace 2 API, as much as possible, for the upcoming CTF 2.
General API changes
-------------------
* To make the API simpler, most object properties are optional. This
includes the names of event classes, stream classes, traces, and
clock classes, for example.
Event classes and stream classes still have unique IDs. This makes it
easier to deterministically identify metadata objects. You can call
one of:
bt_event_class_create():
    Create an event class with an automatic ID. I believe most sources
    will use this version.
bt_event_class_create_with_id():
    Create an event class with an explicit unique ID (within its
    parent stream class).
You cannot call bt_event_class_create() and
bt_event_class_create_with_id() to create event classes which belong
to the same stream class: the stream class has a manual or automatic
event class ID assignment mode (automatic by default) which you can
change with bt_stream_class_set_assigns_automatic_event_class_id()
before adding any event class. The same strategy applies to stream
classes and streams (their parent being a trace).
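As an illustration of the assignment mode rule above, here is a minimal, self-contained sketch. Every name ending in `_sketch` is hypothetical and only models the behaviour described in this message; it is not the library's code. The sketch reports a misuse with a -1 return value, whereas the library presumably treats it as a precondition violation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a stream class (not the library's structure). */
struct stream_class_sketch {
    bool assigns_automatic_event_class_id; /* automatic by default */
    uint64_t next_automatic_id;
    uint64_t event_class_count;
};

void stream_class_sketch_init(struct stream_class_sketch *sc)
{
    assert(sc != NULL);
    sc->assigns_automatic_event_class_id = true;
    sc->next_automatic_id = 0;
    sc->event_class_count = 0;
}

/* The mode can only change before any event class exists. */
int stream_class_sketch_set_assigns_automatic_id(
        struct stream_class_sketch *sc, bool automatic)
{
    assert(sc != NULL);
    if (sc->event_class_count > 0) {
        return -1;
    }
    sc->assigns_automatic_event_class_id = automatic;
    return 0;
}

/* Models bt_event_class_create(): valid in automatic mode only. */
int event_class_sketch_create(struct stream_class_sketch *sc, uint64_t *id)
{
    assert(sc != NULL && id != NULL);
    if (!sc->assigns_automatic_event_class_id) {
        return -1;
    }
    *id = sc->next_automatic_id++;
    sc->event_class_count++;
    return 0;
}

/* Models bt_event_class_create_with_id(): valid in manual mode only. */
int event_class_sketch_create_with_id(struct stream_class_sketch *sc,
        uint64_t id)
{
    assert(sc != NULL);
    if (sc->assigns_automatic_event_class_id) {
        return -1;
    }
    (void) id; /* A uniqueness check would go here; omitted for brevity. */
    sc->event_class_count++;
    return 0;
}
```

With a freshly initialized stream class, event_class_sketch_create() succeeds and yields ID 0, while event_class_sketch_create_with_id() is rejected until the mode is switched, which is only possible before the first event class is created.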
* All API functions return some status code (sometimes a simple `int`
which can be 0 (success) or negative (error)) if and only if they can
fail. Simple property getters never fail, so they return the
property's value directly. This means that many functions now return
the `uint64_t` type: the caller does not need to check for an error
(for example, bt_stream_class_get_id(), because a stream class always
has an ID).
Just in case, all property setters can still fail (although most won't
currently; they always return 0), so they return a status code. This
could help us add caches and create lazy setters eventually.
When a function returns a pointer, for example, `const char *` or
`struct bt_field_type *`, `NULL` does NOT indicate an error: it means
the property is absent. A function which could fail and which needs to
return a pointer sets an output parameter instead and returns a status
code.
A function can return `enum bt_property_availability` and set (or not)
an output parameter for non-pointer optional properties. An example is
bt_event_class_get_log_level(): it is possible that an event class has
no log level.
This also means that all `*_UNKNOWN` enumeration labels set to -1 are
gone.
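The return conventions described above can be sketched as follows. This is a self-contained model under stated assumptions: all the types and functions ending in `_sketch` are hypothetical stand-ins, and the real `enum bt_property_availability` labels may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `enum bt_property_availability`. */
enum property_availability_sketch {
    PROPERTY_NOT_AVAILABLE_SKETCH = 0,
    PROPERTY_AVAILABLE_SKETCH = 1
};

enum log_level_sketch {
    LOG_LEVEL_INFO_SKETCH,
    LOG_LEVEL_WARNING_SKETCH
};

/* Hypothetical event class with two optional properties. */
struct event_class_sketch {
    const char *name; /* NULL means "no name", not an error */
    int has_log_level;
    enum log_level_sketch log_level;
};

/* Pointer getter: NULL indicates an absent property. */
const char *event_class_sketch_get_name(const struct event_class_sketch *ec)
{
    assert(ec != NULL);
    return ec->name;
}

/*
 * Non-pointer optional property: the availability is the return value
 * and the output parameter is only set when the property exists.
 */
enum property_availability_sketch event_class_sketch_get_log_level(
        const struct event_class_sketch *ec,
        enum log_level_sketch *log_level)
{
    assert(ec != NULL);
    assert(log_level != NULL);
    if (!ec->has_log_level) {
        return PROPERTY_NOT_AVAILABLE_SKETCH;
    }
    *log_level = ec->log_level;
    return PROPERTY_AVAILABLE_SKETCH;
}
```

A caller checks the returned availability before reading the output parameter, so no special "unknown" enumeration label is needed.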
* The public metadata visitor API is removed as it is not used anywhere.
We could reintroduce it later, with care, if need be.
* The `babeltrace/ctf-ir/utils.h` file is removed completely. It only
contained a function to check if a given string is a valid CTF
identifier.
* Terminology: "nanoseconds from Epoch" becomes "nanoseconds from
origin". When a clock class is not absolute, it has an offset from a
given origin, not from the Unix Epoch.
* New `bt_uuid` type alias for `const uint8_t *`.
* All API functions which increment an object's reference count
("getting" functions) are removed, as it is preferred and encouraged
to use borrowing functions. You can still do, for example:
    struct bt_stream_class *sc =
        bt_get(bt_event_class_borrow_stream_class(ec));
* Conditional precondition checking (BT_ASSERT_PRE()) is used at many
more places now, even in the metadata API, as you could call this API
on the fast path anyway.
* Various functions are renamed for consistency and terminology
accuracy.
* Most of the API documentation is removed because it is not accurate
anymore.
Clock class API changes
-----------------------
* bt_clock_class_create() has no parameters: the name property is
optional (not set by default), and the default frequency is 1 GHz.
* You set and get both offsets at the same time because you often need
both of them anyway: bt_clock_class_set_offset() and
bt_clock_class_get_offset().
* The name property can have any value: it is not limited to CTF
identifiers anymore.
* The absolute property is true by default: a default clock class's
origin is the Unix Epoch.
* The cycle part of the offset MUST be less than the frequency. The
library validates this in developer mode when calling both
bt_clock_class_set_offset() and bt_clock_class_set_frequency().
This makes some time conversion easier to compute and more precise.
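To illustrate the offset constraint above, here is a minimal sketch of one way to compute nanoseconds from origin when the cycle part of the offset is known to be less than the frequency. The structure, the function names, and the exact formula are assumptions made for illustration; they are not taken from the library. Overflow for very large cycle counts or frequencies is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NS_PER_S_SKETCH UINT64_C(1000000000)

/* Hypothetical clock class model. */
struct clock_class_sketch {
    uint64_t frequency;     /* Hz */
    int64_t offset_seconds; /* seconds from origin */
    uint64_t offset_cycles; /* MUST be less than `frequency` */
};

/*
 * Converts cycles to nanoseconds. Whole seconds are split off first so
 * that only a remainder smaller than the frequency gets multiplied.
 */
uint64_t cycles_to_ns_sketch(uint64_t cycles, uint64_t frequency)
{
    uint64_t seconds = cycles / frequency;
    uint64_t remainder = cycles % frequency;

    return seconds * NS_PER_S_SKETCH +
        remainder * NS_PER_S_SKETCH / frequency;
}

/* Nanoseconds from origin for a clock value expressed in cycles. */
int64_t ns_from_origin_sketch(const struct clock_class_sketch *cc,
        uint64_t value_cycles)
{
    assert(cc != NULL);
    assert(cc->frequency > 0);
    /* The validated constraint: the cycle offset is below one second. */
    assert(cc->offset_cycles < cc->frequency);
    return cc->offset_seconds * (int64_t) NS_PER_S_SKETCH +
        (int64_t) cycles_to_ns_sketch(cc->offset_cycles + value_cycles,
            cc->frequency);
}
```

For example, with a 1 kHz clock class, an offset of -1 second and 500 cycles, a clock value of 250 cycles lands 250,000,000 ns before the origin.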
Clock value API changes
-----------------------
* A clock value can be known or unknown. As of Babeltrace 2.0, there's
no function to make a clock value unknown, but the API to get this
state exists (returned by bt_event_borrow_default_clock_value(), for
example).
The bt_stream_class_default_clock_is_always_known() function indicates
if, for all the streams created from this stream class, their default
clocks are always known (the default clock value accessors never
return `BT_CLOCK_VALUE_STATUS_UNKNOWN`). As of this patch, this
function always returns `BT_TRUE`.
Event class API changes
-----------------------
* Terminology: "context field type" becomes "specific context field
type" to differentiate this scope from the "common context field type"
defined at the stream class level.
Event class, stream class, and trace API changes
------------------------------------------------
* You need to pass a trace object to bt_stream_class_create() and a
stream class object to bt_event_class_create(). In other words, you
cannot create a "free" stream class, add event classes to it, and then
add the stream class object to a trace object. This makes validation a
lot easier and a great quantity of code was sent to the Recycle Bin
thanks to this constraint.
bt_trace_add_stream_class() and bt_stream_class_add_event_class() are
removed because creating an event class or a stream class
automatically adds it to its parent.
Event header field API changes
------------------------------
* Replace bt_stream_class_create_event_header_field() with
bt_event_header_field_create(). This is more consistent with this API
where you typically create an object with its own API and pass whatever
is needed (e.g., bt_event_class_create() instead of
bt_stream_class_create_event_class()).
Event API changes
-----------------
* Terminology: "context field" becomes "specific context field" to
differentiate this scope from the "common context field".
* New bt_event_set_default_clock_value() and
bt_event_borrow_default_clock_value() to set and get the event's
default clock value.
You MUST set an event's default clock value if its stream class has a
default clock class (see bt_stream_class_set_default_clock_class()).
With bt_event_set_default_clock_value(), you don't need to specify the
clock class as this function uses the stream class's default clock
class. When multiple clock classes per stream class become supported
eventually, we'll introduce bt_event_set_clock_value() where you
specify a clock class for which to set a clock value.
Field type API and concept changes
----------------------------------
* Field types do not have any attached semantics anymore. This includes:
* Special fields identified by name. Examples are `uuid`, `id`,
`packet_size`, and `events_discarded`.
This removes some redundancy between fields and metadata objects,
for example a packet's header field's `uuid` field always contains
the same value as the trace object's UUID property (if any). The
same goes for the stream class ID, the stream ID, the event class
ID, etc.
* Mapped clock classes. Mapping an integer field type to a clock class
makes things complicated because of the CTF clock updating mechanism
(with automatic wrapping for compression). Instead, packet and event
objects have a "default clock value" property which the source or
filter explicitly sets. Therefore, the CTF clock updating mechanism
is part of the CTF plugin now (in `notif-iter.c`).
* All field type objects, within a given trace object (recursively),
MUST be unique. The library validates this in developer mode. This
means you cannot create a single integer field type, for example, and
add it more than one time to a given structure field type.
This makes it possible to uniquely identify a field type by its
address, and it makes validation easier. Also, weird side effects like
variant or sequence field types being copied during validation are
gone with this constraint.
Field type objects are still shared, although most components should
only need to borrow them.
* The byte order, alignment, and integer/string field type encoding
properties are removed. They are not needed by the IR API: they are
CTF concepts.
* bt_field_type_integer_create(), bt_field_type_integer_set_is_signed(),
and bt_field_type_integer_is_signed() are removed. Now there's an
unsigned and a signed integer type:
`BT_FIELD_TYPE_ID_UNSIGNED_INTEGER` and
`BT_FIELD_TYPE_ID_SIGNED_INTEGER`. This is because integer field type
and field APIs can be different depending on the signedness (return
types, for example).
bt_field_type_unsigned_integer_create() and
bt_field_type_signed_integer_create() have no parameters: they create
default integer field types with a 64-bit equivalent value range.
* An enumeration field type is now conceptually an integer field type.
* bt_field_type_enumeration_create() is removed. Now there's an unsigned
and a signed enumeration type: `BT_FIELD_TYPE_ID_UNSIGNED_ENUMERATION`
and `BT_FIELD_TYPE_ID_SIGNED_ENUMERATION`. This is because enumeration
field type and field APIs can be different depending on the signedness
(return types, for example).
bt_field_type_unsigned_enumeration_create() and
bt_field_type_signed_enumeration_create() have no parameters: they
don't need an "underlying" integer field type because they ARE integer
field types.
* Functions named bt_field_type_integer_*() apply to any integer
(including enumeration) field type (unsigned or signed).
* Functions named bt_field_type_enumeration_*() apply to any enumeration
field type (unsigned or signed).
* Terminology: the "size" integer field type property becomes "field
value range". This property indicates the expected minimum and maximum
values of fields created from a given integer field type. The
bt_field_type_integer_set_field_value_range() function accepts a
parameter which is N in the following formulas:
* Unsigned integer range: [0, 2^N - 1]
* Signed integer range: [-2^(N - 1), 2^(N - 1) - 1]
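The two range formulas above can be checked with a short sketch. The function names are hypothetical; N = 64 is handled separately because shifting a 64-bit value by 64 bits is undefined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper bound of the unsigned range [0, 2^N - 1], for N in [1, 64]. */
uint64_t unsigned_range_max_sketch(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return n == 64 ? UINT64_MAX : (UINT64_C(1) << n) - 1;
}

/* Lower bound of the signed range, -2^(N - 1), for N in [1, 64]. */
int64_t signed_range_min_sketch(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return n == 64 ? INT64_MIN : -(INT64_C(1) << (n - 1));
}

/* Upper bound of the signed range, 2^(N - 1) - 1, for N in [1, 64]. */
int64_t signed_range_max_sketch(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return n == 64 ? INT64_MAX : (INT64_C(1) << (n - 1)) - 1;
}
```

With N = 8, this gives [0, 255] for an unsigned integer field type and [-128, 127] for a signed one.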
* Terminology: "base" becomes "preferred display base". The default
preferred display base is 10. It is kept as a property to satisfy the
CTF 1.8 use case, although it should eventually be part of a custom
user attribute when we change the API to support CTF 2 features.
* Terminology: "floating point number field type" becomes "real field
type" (as in _real number_).
* The only property of a real field type is whether it's single precision or
not (double precision). The accessors are
bt_field_type_real_is_single_precision() and
bt_field_type_real_set_is_single_precision().
We don't need explicit exponent and mantissa sizes as those are
CTF/encoding concepts.
* The concept of an enumeration field type mapping iterator is removed.
Instead:
* We change the "mapping" concept: an enumeration field type mapping
is a label and a set of ranges. Mapping labels are unique within
an enumeration field type, but the same ranges can exist in
different mappings (overlaps).
The functions to get mappings are
bt_field_type_enumeration_get_mapping_count(),
bt_field_type_unsigned_enumeration_borrow_mapping_by_index(), and
bt_field_type_signed_enumeration_borrow_mapping_by_index(),
depending on the enumeration field type's signedness.
The functions to add mappings look similar to what they used to:
bt_field_type_unsigned_enumeration_map_range() and
bt_field_type_signed_enumeration_map_range(). Those functions find
any existing mapping sharing the label and add the given range to it
(or create a new mapping).
Then it becomes trivial to get all the ranges of a given mapping
by label: they are already stored as such in the object.
* To get all the mapping labels which contain a given value within
their ranges, call
bt_field_type_unsigned_enumeration_get_mapping_labels_by_value() or
bt_field_type_signed_enumeration_get_mapping_labels_by_value().
Those functions accept a
`bt_field_type_enumeration_mapping_label_array` parameter which is
`const char * const *`. There's no copy to the user here: the
functions fill an array internal to the enumeration field type
object and return its address. Then the array is valid as long as
you don't call those functions again for the same object.
This should be enough as what you want is often all the labels,
not just the first one, and use cases where there are thousands of
labels matching a given value are nonexistent AFAIK.
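The new mapping concept (a label plus a set of ranges, with lookups filling an array internal to the object) can be modelled as below. This is a simplified sketch for the unsigned case with fixed-size storage; every name ending in `_sketch` is hypothetical, and labels are stored by pointer rather than copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RANGES_SKETCH 4
#define MAX_MAPPINGS_SKETCH 8

struct range_sketch {
    uint64_t lower;
    uint64_t upper;
};

/* A mapping is a label and a set of ranges. */
struct mapping_sketch {
    const char *label;
    struct range_sketch ranges[MAX_RANGES_SKETCH];
    size_t range_count;
};

struct unsigned_enum_ft_sketch {
    struct mapping_sketch mappings[MAX_MAPPINGS_SKETCH];
    size_t mapping_count;
    /* Internal array reused by each lookup (no copy to the user). */
    const char *label_buf[MAX_MAPPINGS_SKETCH];
};

/* Adds a range to the mapping named `label`, creating it if needed. */
int enum_ft_sketch_map_range(struct unsigned_enum_ft_sketch *ft,
        const char *label, uint64_t lower, uint64_t upper)
{
    struct mapping_sketch *mapping = NULL;
    size_t i;

    assert(ft != NULL && label != NULL);
    assert(lower <= upper);
    for (i = 0; i < ft->mapping_count; i++) {
        if (strcmp(ft->mappings[i].label, label) == 0) {
            mapping = &ft->mappings[i];
            break;
        }
    }
    if (!mapping) {
        if (ft->mapping_count == MAX_MAPPINGS_SKETCH) {
            return -1;
        }
        mapping = &ft->mappings[ft->mapping_count++];
        mapping->label = label;
        mapping->range_count = 0;
    }
    if (mapping->range_count == MAX_RANGES_SKETCH) {
        return -1;
    }
    mapping->ranges[mapping->range_count].lower = lower;
    mapping->ranges[mapping->range_count].upper = upper;
    mapping->range_count++;
    return 0;
}

/*
 * Fills the internal array with the labels of all the mappings which
 * contain `value`. The result is valid until the next call on `ft`.
 */
const char * const *enum_ft_sketch_get_labels_by_value(
        struct unsigned_enum_ft_sketch *ft, uint64_t value, size_t *count)
{
    size_t i, j;

    assert(ft != NULL && count != NULL);
    *count = 0;
    for (i = 0; i < ft->mapping_count; i++) {
        const struct mapping_sketch *m = &ft->mappings[i];

        for (j = 0; j < m->range_count; j++) {
            if (value >= m->ranges[j].lower &&
                    value <= m->ranges[j].upper) {
                ft->label_buf[(*count)++] = m->label;
                break;
            }
        }
    }
    return ft->label_buf;
}
```

Mapping "A" to [0, 5] and [10, 12] and "B" to [3, 8] yields two mappings; looking up the value 4 then returns both labels, because overlapping ranges are allowed across mappings.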
* Terminology: "structure field type field" becomes "structure field
type member".
* Terminology: "variant field type field" becomes "variant field type
option".
* Structure field type member and variant field type option names can
have any value (as long as they are unique within their parent): they
are not limited by CTF identifiers anymore.
* Terminology: "adding a structure field type member" becomes "appending
a structure field type member". Members are ordered, so "append" makes
sense here (like Python's list's append() method). The same goes for
variant field types.
* bt_field_type_array_create() and bt_field_type_sequence_create() are
removed.
Conceptually, with this patch, there are static and dynamic array
field types. Both are array field types, although you cannot create
an (abstract) array field type.
Use bt_field_type_static_array_create() and
bt_field_type_dynamic_array_create() to create static and dynamic array
field types. Both require an element field type, and
bt_field_type_static_array_create() also requires the static length.
The common API is bt_field_type_array_borrow_element_field_type().
* Terminology: "variant field type tag" becomes "variant field type
selector". This is more in line with the concept of a "selected
field", for example.
* For both the dynamic array and variant field types, the
length/selector field type is now optional: requiring one is a CTF concept. A
source can create dynamic array fields without having another field
which contains its length. You set a dynamic array field's length with
bt_field_dynamic_array_set_length(). Having another field contain its
length is just an encoding concept. The same is true for a variant
field and its selector field.
You can still link a dynamic array or variant field type to its length
or selector field type with
bt_field_type_dynamic_array_set_length_field_type() or
bt_field_type_variant_set_selector_field_type(). Those functions take
the linked field type and set the field paths automatically. This is
possible now because an event class is always within a stream class
which is always within a trace, so the linked field type should be
visible (validated in developer mode).
Field API changes
-----------------
* In general, field type API changes are reflected on the field API. For
example, since an enumeration field type is conceptually an integer
field type, an enumeration field is conceptually an integer field.
This means that you can call bt_field_signed_integer_get_value() to
get the integer value of a signed enumeration field.
* bt_field_array_get_length() is a common array field API which applies
to both static and dynamic array fields. When you call it with a
static array field, it returns its field type's static length.
* bt_field_array_borrow_element_field_by_index() is a common array
field API which applies to both static and dynamic array fields.
* Because a variant field has no link to its selector field now (it is
optional), you need to set the selected option by index with
bt_field_variant_select_option_field(). Then you can get the selected
option with bt_field_variant_borrow_selected_option_field() (and its
index with bt_field_variant_get_selected_option_field_index()).
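The select-then-borrow pattern for variant fields described above can be modelled with a small sketch. The structure and function names are hypothetical, and the option fields are plain integers to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VARIANT_OPTION_COUNT_SKETCH 3

/* Hypothetical variant field: no link to any selector field. */
struct variant_field_sketch {
    int64_t option_fields[VARIANT_OPTION_COUNT_SKETCH];
    bool has_selected_option;
    size_t selected_index;
};

/* Selects the current option by index; returns -1 if out of bounds. */
int variant_field_sketch_select_option(struct variant_field_sketch *field,
        size_t index)
{
    assert(field != NULL);
    if (index >= VARIANT_OPTION_COUNT_SKETCH) {
        return -1;
    }
    field->selected_index = index;
    field->has_selected_option = true;
    return 0;
}

/* Borrows the selected option; selecting first is a precondition. */
int64_t *variant_field_sketch_borrow_selected_option(
        struct variant_field_sketch *field)
{
    assert(field != NULL);
    assert(field->has_selected_option);
    return &field->option_fields[field->selected_index];
}
```

The source decides which option is current and sets it explicitly, instead of the library deducing it from another field's value.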
Packet context field API changes
--------------------------------
* Replace bt_stream_class_create_packet_context_field() with
bt_packet_context_field_create(). This is more consistent with this
API where you typically create an object with its own API and pass
whatever is needed (e.g., bt_event_class_create() instead of
bt_stream_class_create_event_class()).
Packet header field API changes
-------------------------------
* Replace bt_trace_create_packet_header_field() with
bt_packet_header_field_create(). This is more consistent with this API
where you typically create an object with its own API and pass whatever
is needed (e.g., bt_event_class_create() instead of
bt_stream_class_create_event_class()).
Packet API changes
------------------
* A packet does not contain its previous packet's properties anymore.
Any component which needs to compute the difference between the
properties of two consecutive packets needs to keep the previous one
manually.
Therefore, everything related to the previous packet in the API is
removed.
* A packet contains _snapshots_ of stream properties:
* Default clock value at beginning of packet.
* Default clock value at end of packet.
* Discarded event counter at end of packet.
* Packet counter (sequence number) at end of packet.
You MUST set those snapshot properties if the packet's stream class
has them enabled: see
bt_stream_class_packets_have_discarded_event_counter_snapshot(),
bt_stream_class_packets_have_packet_counter_snapshot(),
bt_stream_class_packets_have_default_beginning_clock_value(), and
bt_stream_class_packets_have_default_end_clock_value(). All those
functions return `BT_FALSE` by default.
Stream class API changes
------------------------
* Terminology: "event context field type" becomes "event common context
field type".
* Use bt_stream_class_set_default_clock_class() to set a stream class's
default clock class. When a stream class has a default clock class,
all the events which belong to a stream created from this stream class
MUST have default clock values (bt_event_set_default_clock_value()).
* There are new properties which indicate if packets which belong to a
stream created from a given stream class have specific stream property
snapshots:
bt_stream_class_packets_have_discarded_event_counter_snapshot(),
bt_stream_class_packets_have_packet_counter_snapshot(),
bt_stream_class_packets_have_default_beginning_clock_value(), and
bt_stream_class_packets_have_default_end_clock_value(). All those
functions return `BT_FALSE` by default.
Trace API changes
-----------------
* The native byte order property is removed. It is not needed by the IR
API: it is a CTF concept.
* Terminology: "environment field" becomes "environment entry".
Inactivity notification API changes
-----------------------------------
* bt_notification_inactivity_create() accepts a default clock class
parameter so that you can call
bt_notification_inactivity_set_default_clock_value() without
specifying a clock class.
Stream notification API changes
-------------------------------
* bt_notification_stream_begin_set_default_clock_value() and
bt_notification_stream_end_set_default_clock_value() do not accept a
clock class parameter anymore: you set the default clock class at the
stream class level.
Internal API changes
--------------------
* BT_LIB_LOG*(): the `%!u` conversion specifier formats a UUID
(`bt_uuid`). `%!l` is now used to format a plugin object. `%!r` is
removed (reference count) in favor of `%!O` which now formats any
Babeltrace object (`struct bt_object *`).
* All freezing functions are only enabled in developer mode.
* New `include/babeltrace/property-internal.h` file with data structures
and functions to deal with object properties (used for optional
properties, like an event class's log level).
* `clock-class-internal.h`: new base offset value (ns) to compute
nanoseconds from origin in clock values more efficiently.
* `field-types-internal.h`: new BT_ASSERT_PRE_FT_IS_*() macros to deal
with integer, enumeration, and array field types which now have more
than one field type ID.
* `fields-internal.h`: new BT_ASSERT_PRE_FIELD_IS_*() macros to deal
with integer, enumeration, and array fields which now have more than
one field type ID.
* `utils-internal.h`: prefix function names with `bt_util_`.
* `validation-internal.h`: removed because all field types are valid
since they have no attached semantics. Scoped validations are
performed in property setters instead (developer mode).
* `visitor-internal.h`: removed because the visitor API is removed.
`ctf` plugin update
-------------------
Because the library's IR now lacks important properties for decoding
purposes (alignment, byte order, linked field), the `ctf` plugin has its
own CTF metadata IR in `ctf-meta.h`. This file contains raw data
structures for:
* Field types
* Field path
* Event class
* Stream class
* Trace class
The IR generator AST visitor `visitor-generate-ir.c` converts the AST to
those data structures. All objects are uniquely allocated (no reference
count; they are not Babeltrace objects). Once the visitor has built a
basic CTF IR trace class, a sequence of filters is applied over it to
make it work for decoding purposes:
1. ctf_trace_class_update_default_clock_classes(): Sets any stream
class's default clock class based on integer field types mapped to
clock classes within it.
2. ctf_trace_class_update_meanings(): Attaches meanings to specific
integer field types.
A meaning is a special quality which is needed to properly decode a
data stream, for example, `CTF_FIELD_TYPE_MEANING_EVENT_CLASS_ID`,
`CTF_FIELD_TYPE_MEANING_MAGIC`, and
`CTF_FIELD_TYPE_MEANING_EXP_PACKET_TOTAL_SIZE`.
For CTF 1.8, field types with meanings are found by name in specific
scopes.
When a field type has a meaning, it is not considered as an important
value for subsequent filters and sinks, so the field type is marked
as not having its equivalent library IR object. For example, the
`packet_size` field type does not need to exist for connected filters
and sinks because it's not holding trace information. If all the
members of a structure field type, or all the options of a variant
field type, recursively have no equivalent library IR objects, then
this structure/variant field type has no equivalent library IR object
either. Therefore, because the typical CTF packet header field type
contains only the `magic` (magic number), `uuid` (UUID), `stream_id`
(stream class ID), and `stream_instance_id` (stream ID) members, they
all have meanings, thus the corresponding `bt_trace` object has no
packet header field type.
3. ctf_trace_class_update_text_array_sequence(): Marks array and
sequence field types containing only 8-bit aligned 8-bit integer
field types with an encoding as being _text_ arrays and sequences.
The equivalent library IR field types will be string field types.
4. ctf_trace_class_resolve_field_types(): Does what `resolve.c` used
to do, but at the CTF IR level.
5. ctf_trace_class_update_in_ir(): Sets whether or not, depending on
some conditions (meanings, mapped clock classes, etc.), CTF IR field
types have equivalent library IR field types.
6. ctf_trace_class_update_value_storing_indexes(): Sets the indexes
where to store decoded integer values, and from where to read those
values for sequence lengths and variant tags.
During decoding, when we need the length of a sequence or the tag of
a variant, we don't look into existing `bt_field` objects. Instead,
the length or tag was already stored at a specific index within an
array of `uint64_t`/`int64_t` values, and we get this value back by
index.
7. ctf_trace_class_validate(): Validates the whole trace class.
8. ctf_trace_class_translate(): Translates the CTF IR trace class into
a `bt_trace` and marks translated CTF IR objects.
It is possible to add event classes or stream classes to an existing
trace class, and call all those functions again: they only update what's
not translated yet, and ctf_trace_class_translate() only translates
what's not translated yet.
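The "only translate what's not translated yet" behaviour can be sketched with a per-object flag. This is a hypothetical model: the structures below are not those of `ctf-meta.h`, and the actual creation of library objects is replaced by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENT_CLASSES_SKETCH 16

/* Hypothetical CTF IR event class with a translation marker. */
struct ctf_event_class_sketch {
    uint64_t id;
    bool is_translated;
};

struct ctf_trace_class_sketch {
    struct ctf_event_class_sketch event_classes[MAX_EVENT_CLASSES_SKETCH];
    size_t event_class_count;
};

/* Adds a new, not yet translated event class. */
int ctf_trace_class_sketch_add_event_class(
        struct ctf_trace_class_sketch *tc, uint64_t id)
{
    assert(tc != NULL);
    if (tc->event_class_count == MAX_EVENT_CLASSES_SKETCH) {
        return -1;
    }
    tc->event_classes[tc->event_class_count].id = id;
    tc->event_classes[tc->event_class_count].is_translated = false;
    tc->event_class_count++;
    return 0;
}

/*
 * Translates only what is not translated yet and marks it. Returns the
 * number of newly translated event classes.
 */
size_t ctf_trace_class_sketch_translate(struct ctf_trace_class_sketch *tc)
{
    size_t i, translated = 0;

    assert(tc != NULL);
    for (i = 0; i < tc->event_class_count; i++) {
        if (tc->event_classes[i].is_translated) {
            continue;
        }
        /* The equivalent library object would be created here. */
        tc->event_classes[i].is_translated = true;
        translated++;
    }
    return translated;
}
```

Calling the translation function twice in a row does no work the second time; adding one event class afterwards makes the next call translate exactly that one.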
When decoding (`notif-iter.c`), the bt_btr_start() function reads a CTF
IR field type to decode a data stream. When it calls back
bt_notif_iter_*() functions, they only set `bt_field` objects if they
need to. Of particular interest is the btr_unsigned_int_cb() callback,
which:
1. Applies the meaning action if its CTF IR field type has a meaning.
For example, if its meaning is `CTF_FIELD_TYPE_MEANING_DATA_STREAM_ID`,
it sets the current data stream ID:
    case CTF_FIELD_TYPE_MEANING_DATA_STREAM_ID:
        notit->cur_data_stream_id = value;
        break;
2. If the integer field type has a mapped clock class, it updates the
stream's default clock value using the CTF clock update mechanism.
3. If the integer field type has a storing index, it stores the decoded
integer value to a specific location within the stored values:
    g_array_index(notit->stored_values, uint64_t,
        (uint64_t) int_ft->storing_index) = value;
4. If the integer field type has an equivalent library IR field type, it
sets the appropriate `bt_field` object with the decoded value.
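The four steps above can be put together in a simplified, self-contained sketch. Everything ending in `_sketch` is hypothetical: a plain C array replaces the GLib array of stored values, the CTF clock update mechanism (with wrapping) is reduced to a plain assignment, and setting a `bt_field` object is reduced to recording the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STORED_VALUE_COUNT_SKETCH 8

enum meaning_sketch {
    MEANING_NONE_SKETCH,
    MEANING_DATA_STREAM_ID_SKETCH,
    MEANING_EVENT_CLASS_ID_SKETCH
};

/* Hypothetical CTF IR integer field type. */
struct int_ft_sketch {
    enum meaning_sketch meaning;
    bool has_mapped_clock_class;
    int64_t storing_index; /* negative: nothing to store */
    bool in_ir;            /* has an equivalent library IR field type */
};

/* Hypothetical notification iterator state. */
struct notit_sketch {
    uint64_t cur_data_stream_id;
    uint64_t cur_event_class_id;
    uint64_t default_clock_value;
    uint64_t stored_values[STORED_VALUE_COUNT_SKETCH];
    bool field_was_set;
    uint64_t field_value;
};

void unsigned_int_cb_sketch(struct notit_sketch *notit,
        const struct int_ft_sketch *int_ft, uint64_t value)
{
    assert(notit != NULL && int_ft != NULL);

    /* 1. Apply the meaning action, if any. */
    switch (int_ft->meaning) {
    case MEANING_DATA_STREAM_ID_SKETCH:
        notit->cur_data_stream_id = value;
        break;
    case MEANING_EVENT_CLASS_ID_SKETCH:
        notit->cur_event_class_id = value;
        break;
    default:
        break;
    }

    /* 2. Update the default clock value (simplified: no wrapping). */
    if (int_ft->has_mapped_clock_class) {
        notit->default_clock_value = value;
    }

    /* 3. Store the value for later length/tag lookups by index. */
    if (int_ft->storing_index >= 0) {
        assert(int_ft->storing_index < STORED_VALUE_COUNT_SKETCH);
        notit->stored_values[int_ft->storing_index] = value;
    }

    /* 4. Set the field only if it exists in the library IR. */
    if (int_ft->in_ir) {
        notit->field_value = value;
        notit->field_was_set = true;
    }
}
```

A field type with a meaning and a storing index, but no library IR equivalent, updates the iterator state and the stored values without touching any field object.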
Other plugin updates
--------------------
`src.text.dmesg`, `sink.text.pretty`, and `flt.utils.muxer` are adapted
to the new API.
Test updates
------------
* `tests/lib/test_bt_ctf_field_type_validation.c`: removed because it is
now a precondition that the metadata be valid. What's left to
validate is still done in developer mode.
* `tests/plugins/test-utils-muxer.c`: removed because it will be easier
to test with the Python bindings once they are updated instead of
wasting time adapting this one.
* Other tests are adapted to the new API.
Performance update
------------------
This patch makes the performance of Babeltrace 2 go from 40 % to 82 % of
Babeltrace 1's performance with:
    babeltrace /path/to/trace -o dummy
with a 1.4 GiB LTTng kernel trace (four streams) and configured as such:
    BABELTRACE_DEV_MODE=0 BABELTRACE_DEBUG_MODE=0 \
    BABELTRACE_MINIMAL_LOG_LEVEL=INFO CFLAGS='-O3 -DNDEBUG' ./configure
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
114 files changed: