From b27be37e634cbd569053410be6ef7cdd82537cc5 Mon Sep 17 00:00:00 2001 From: =?utf8?q?J=C3=A9r=C3=A9mie=20Galarneau?= Date: Wed, 27 May 2020 12:15:54 -0400 Subject: [PATCH] Fix: src.ctf.lttng-live: incomplete metadata packet is an error MIME-Version: 1.0 Content-Type: text/plain; charset=utf8 Content-Transfer-Encoding: 8bit Observed issue ============== While investigating the issue described in [1], I noticed that babeltrace2 falls into a retry loop when the src.ctf.lttng-live component encounters unparseable metadata. The src.ctf.lttng-live reports parsing errors on every subsequent reception of a metadata packet. The relay daemon eventually sends binary data which fails to be decoded, ending the graph's execution with a binary decoding error, which is not the "real" issue. Cause ===== Due to a (now fixed) bug in LTTng [1], unparseable (incomplete) metadata can be made visible to live clients. This bug fix doesn't involve the clients; it resulted in an illegal state in the lttng-live protocol. When the relay daemon notifies the live client that new metadata is available, lttng_live_metadata_update() receives all the metadata made available by the relay daemon in a memory stream and then uses the ctf_metadata_decoder to append the new content to the existing trace class. However, if the decoder returns `CTF_METADATA_DECODER_STATUS_INCOMPLETE`, `LTTNG_LIVE_ITERATOR_STATUS_AGAIN` is returned to the caller, resulting in the graph retrying to invoke the live iterator's `next` method until another error eventually prevents the successful completion of the graph. I am assuming that the use of the `AGAIN` status code may have been a failed attempt at fixing [1] in the live component rather than adressing the underlying problem. Solution ======== To my knowledge there are no provisions made for incomplete metadata in the lttng-live protocol as of the current version. This may have been done in anticipation of a future change (?), but it currently obscures the error reported when a corrupted/incomplete/unparseable metadata packet is received by the src.ctf.lttng-live component. The current approach doesn't work anyhow as lttng_live_metadata_update() creates a "fresh" memory stream on each invocation. If the intention was to accumulate partial metadata until it can be successfully parsed, the accumulated metadata would have to be preserved from one invocation to the next. The conversion of the status code from `INCOMPLETE` to `AGAIN` is removed to fail immediately during the current invocation of the iterator's `next` method. Drawbacks ========= None. References ========== [1] https://github.com/lttng/lttng-tools/commit/f5ba75b4f0c0b44092c76bc931b25b24a2e62718 Signed-off-by: Jérémie Galarneau Change-Id: I8a379ea5d838786a6731199dd5f03bbf70ec13f5 Reviewed-on: https://review.lttng.org/c/babeltrace/+/3586 Tested-by: jenkins Reviewed-by: Philippe Proulx --- src/plugins/ctf/lttng-live/metadata.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/plugins/ctf/lttng-live/metadata.c b/src/plugins/ctf/lttng-live/metadata.c index 859a36de..3edeb76d 100644 --- a/src/plugins/ctf/lttng-live/metadata.c +++ b/src/plugins/ctf/lttng-live/metadata.c @@ -276,9 +276,6 @@ enum lttng_live_iterator_status lttng_live_metadata_update( /* The metadata was updated succesfully. */ trace->metadata_stream_state = LTTNG_LIVE_METADATA_STREAM_STATE_NOT_NEEDED; - break; - case CTF_METADATA_DECODER_STATUS_INCOMPLETE: - status = LTTNG_LIVE_ITERATOR_STATUS_AGAIN; break; default: goto error; -- 2.34.1