From: Jérémie Galarneau Date: Wed, 27 May 2020 16:15:54 +0000 (-0400) Subject: Fix: src.ctf.lttng-live: incomplete metadata packet is an error X-Git-Tag: v2.0.4~17 X-Git-Url: http://git.efficios.com/?p=babeltrace.git;a=commitdiff_plain;h=969f43d2990f7d176f6a23e412ef191138c066da Fix: src.ctf.lttng-live: incomplete metadata packet is an error Observed issue ============== While investigating the issue described in [1], I noticed that babeltrace2 falls into a retry loop when the src.ctf.lttng-live component encounters unparseable metadata. The src.ctf.lttng-live reports parsing errors on every subsequent reception of a metadata packet. The relay daemon eventually sends binary data which fails to be decoded, ending the graph's execution with a binary decoding error, which is not the "real" issue. Cause ===== Due to a (now fixed) bug in LTTng [1], unparseable (incomplete) metadata can be made visible to live clients. This bug fix doesn't involve the clients; it resulted in an illegal state in the lttng-live protocol. When the relay daemon notifies the live client that new metadata is available, lttng_live_metadata_update() receives all the metadata made available by the relay daemon in a memory stream and then uses the ctf_metadata_decoder to append the new content to the existing trace class. However, if the decoder returns `CTF_METADATA_DECODER_STATUS_INCOMPLETE`, `LTTNG_LIVE_ITERATOR_STATUS_AGAIN` is returned to the caller, resulting in the graph retrying to invoke the live iterator's `next` method until another error eventually prevents the successful completion of the graph. I am assuming that the use of the `AGAIN` status code may have been a failed attempt at fixing [1] in the live component rather than adressing the underlying problem. Solution ======== To my knowledge there are no provisions made for incomplete metadata in the lttng-live protocol as of the current version. This may have been done in anticipation of a future change (?), but it currently obscures the error reported when a corrupted/incomplete/unparseable metadata packet is received by the src.ctf.lttng-live component. The current approach doesn't work anyhow as lttng_live_metadata_update() creates a "fresh" memory stream on each invocation. If the intention was to accumulate partial metadata until it can be successfully parsed, the accumulated metadata would have to be preserved from one invocation to the next. The conversion of the status code from `INCOMPLETE` to `AGAIN` is removed to fail immediately during the current invocation of the iterator's `next` method. Drawbacks ========= None. References ========== [1] https://github.com/lttng/lttng-tools/commit/f5ba75b4f0c0b44092c76bc931b25b24a2e62718 Signed-off-by: Jérémie Galarneau Change-Id: I8a379ea5d838786a6731199dd5f03bbf70ec13f5 Reviewed-on: https://review.lttng.org/c/babeltrace/+/3586 Tested-by: jenkins Reviewed-by: Philippe Proulx --- diff --git a/src/plugins/ctf/lttng-live/metadata.c b/src/plugins/ctf/lttng-live/metadata.c index d1e3ae02..582c0662 100644 --- a/src/plugins/ctf/lttng-live/metadata.c +++ b/src/plugins/ctf/lttng-live/metadata.c @@ -294,9 +294,6 @@ enum lttng_live_iterator_status lttng_live_metadata_update( /* The metadata was updated succesfully. */ trace->metadata_stream_state = LTTNG_LIVE_METADATA_STREAM_STATE_NOT_NEEDED; - break; - case CTF_METADATA_DECODER_STATUS_INCOMPLETE: - status = LTTNG_LIVE_ITERATOR_STATUS_AGAIN; break; default: goto error;