From: Jonathan Rajotte
Date: Mon, 18 Jan 2021 19:44:34 +0000 (-0500)
Subject: CUSTOM: relayd protocol: ignore reply on relayd_send_index and relayd_send_close_stream
X-Git-Url: http://git.efficios.com/?a=commitdiff_plain;h=9c15ec61708bbff73f6e2369c04ae65492990d0b;hp=9c15ec61708bbff73f6e2369c04ae65492990d0b;p=lttng-tools.git

Note: this patch is not a bugfix; it is a targeted modification to
improve timing and predictability for particular scenarios.

Note: this patch only applies to userspace tracing.

Parts of this work might find their way upstream, but most probably not
in this form. Deeper work should be done upstream to address the real
root of the problem: how the network protocol for live works, and the
lifetime management of the different components required for the live
feature.

Scenario:
=======

  System with high CPU resource restriction (base high workload, 99% CPU load),
  High CPU count (32),
  Slowish network (ping ~60-100ms),
  Timing constraint for create, configure, start, stop, destroy, create ... cycles.

A lot of time is wasted waiting for responses that provide no additional
information and are essentially not pertinent. Here the focus is on the
replies to relayd_send_index and relayd_send_close_stream.

relayd_send_index is used at the end of a buffer consumption and does
not require further protocol synchronization on the lttng-sessiond side.
This allows us to simply mark the bytes of the response as `to ignore`
on the next real receive operation.

relayd_send_index is also used to send an empty index during the live
timer. Again, there is no direct value in waiting for the reply here,
since TCP guarantees send/receive ordering.

relayd_send_close_stream likewise does not benefit from waiting on the
reply: even if an error is reported, there is not much we can do, since
we are closing the stream on the lttng-sessiond side.
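
As an illustration only, the `to ignore` bookkeeping described above could look roughly like the sketch below. It is not the code of this patch: struct sketch_sock, sketch_send_no_wait and sketch_recv_reply are hypothetical names, and the reply size is passed in by the caller as an assumption. The sketch relies only on the stream ordering guarantee mentioned above: replies that were skipped always arrive before the reply we actually want, so they can be discarded first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical stand-in for the relayd socket wrapper (not the lttng-tools type). */
struct sketch_sock {
	int fd;
	size_t bytes_to_ignore; /* Reply bytes owed by the peer that nobody will read. */
};

/* Receive exactly len bytes. Returns 0 on success, -1 on error or EOF. */
static int sketch_recv_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t ret = recv(fd, p, len, 0);

		if (ret <= 0) {
			return -1;
		}
		p += ret;
		len -= (size_t) ret;
	}
	return 0;
}

/*
 * Send a command and, instead of waiting for its reply, record how many
 * reply bytes will have to be skipped later. The sketch assumes small
 * commands, so a short send is simply treated as an error.
 */
static int sketch_send_no_wait(struct sketch_sock *sock, const void *cmd,
		size_t cmd_len, size_t reply_len)
{
	if (send(sock->fd, cmd, cmd_len, 0) != (ssize_t) cmd_len) {
		return -1;
	}
	sock->bytes_to_ignore += reply_len;
	return 0;
}

/* Real receive: discard every skipped reply first, then read the wanted one. */
static int sketch_recv_reply(struct sketch_sock *sock, void *reply, size_t len)
{
	char scratch[64];

	while (sock->bytes_to_ignore > 0) {
		size_t chunk = sock->bytes_to_ignore < sizeof(scratch) ?
				sock->bytes_to_ignore : sizeof(scratch);

		if (sketch_recv_all(sock->fd, scratch, chunk) < 0) {
			return -1;
		}
		assert(sock->bytes_to_ignore >= chunk);
		sock->bytes_to_ignore -= chunk;
	}
	return sketch_recv_all(sock->fd, reply, len);
}
```

With this shape, the sender never blocks on a round trip for the skipped commands; the cost of reading their replies is paid on the next command that genuinely needs an answer.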

NOTE: Three call sites are responsible for "purging" the unwanted
replies: recv_reply, recv_reply_ignore and relayd_close.

recv_reply simply tries to receive all the bytes to ignore at the time
it is called. This preserves protocol integrity.

recv_reply_ignore issues a recv call once a cutoff for `bytes to ignore`
(128 currently) is met. That recv call receives and discards half the
quantity of the cutoff (64 in this patch). This ensures that long
periods with no real receive operation do not lead to buffer bloat: we
slowly consume data that is "useless". The amount we receive is totally
arbitrary and is only set to half of the cutoff to `give a chance to the
runner`, since there is a high probability that those bytes are
available. In any case, we block if they are not.

relayd_close must wait for all `ignored bytes` to ensure that we do not
close the socket before the relayd has sent all its replies.

Signed-off-by: Jonathan Rajotte
Change-Id: I5da9b364c73498e8c0156c71960432229932ecb5
---
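
The purge policy described in the NOTE above can be sketched as pure accounting, without any socket. This is a hedged illustration, not the patch itself: the SKETCH_* macros and both function names are hypothetical, and treating "cutoff is met" as a greater-or-equal comparison is an assumption. Each function returns the number of bytes the caller would then recv and discard.

```c
#include <assert.h>
#include <stddef.h>

/* Values cited in the commit message; the macro names are hypothetical. */
#define SKETCH_IGNORE_CUTOFF 128
#define SKETCH_IGNORE_DRAIN (SKETCH_IGNORE_CUTOFF / 2)

/*
 * recv_reply_ignore-like policy: account for one more skipped reply and,
 * once the cutoff is reached (assumed to mean >=), ask the caller to
 * discard half the cutoff so the backlog cannot grow without bound.
 */
static size_t sketch_defer_reply(size_t *bytes_to_ignore, size_t reply_len)
{
	assert(bytes_to_ignore);

	*bytes_to_ignore += reply_len;
	if (*bytes_to_ignore < SKETCH_IGNORE_CUTOFF) {
		return 0; /* Below the cutoff: nothing to receive yet. */
	}
	*bytes_to_ignore -= SKETCH_IGNORE_DRAIN;
	return SKETCH_IGNORE_DRAIN;
}

/*
 * recv_reply-like and relayd_close-like policy: everything still pending
 * must be received, either to keep the stream aligned before a real reply
 * or to let the peer finish sending before the socket is closed.
 */
static size_t sketch_drain_all(size_t *bytes_to_ignore)
{
	size_t pending;

	assert(bytes_to_ignore);

	pending = *bytes_to_ignore;
	*bytes_to_ignore = 0;
	return pending;
}
```

For example, deferring a 100-byte reply and then a 28-byte reply brings the backlog to 128; the second call asks for 64 bytes to be discarded and leaves 64 pending until the next full drain.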