From: Jonathan Rajotte
Date: Mon, 22 Jan 2018 20:43:34 +0000 (-0500)
Subject: lttng-relayd: use TCP keep-alive mechanism to detect dead-peer
X-Git-Url: http://git.efficios.com/?a=commitdiff_plain;h=2fc6b1ab74fcff341b8c55260b4b0bea1efd1d7a;hp=2fc6b1ab74fcff341b8c55260b4b0bea1efd1d7a;p=lttng-tools.git

Allow relayd to clean up objects related to a dead connection for which
the FIN packet was not emitted (unexpected shutdown, Ethernet blocking).

Note that an idle peer is not considered dead as long as it responds to
the keep-alive query after the idle time has elapsed.

Per RFC 1122, section 4.2.3.6, implementations must default to no less
than two hours for the idle period. On Linux, the default value is
indeed two hours. This could be problematic if relayd should be
aggressive regarding dead peers. Hence it is important to provide
tuning knobs for the TCP keep-alive mechanism.

The following environment variables can be used to enable and fine-tune
it:

  LTTNG_RELAYD_TCP_KEEP_ALIVE_ENABLE
      Set to 1 to enable the use of TCP keep-alive, allowing the
      detection of dead peers.

  LTTNG_RELAYD_TCP_KEEP_ALIVE_TIME
      See tcp(7) tcp_keepalive_time, or tcp_keepalive_interval on
      Solaris 11. A value of -1 lets the operating system manage this
      parameter (default).

  LTTNG_RELAYD_TCP_KEEP_ALIVE_PROBES
      See tcp(7) tcp_keepalive_probes. A value of -1 lets the operating
      system manage this parameter (default). No effect on Solaris.

  LTTNG_RELAYD_TCP_KEEP_ALIVE_INTVL
      See tcp(7) tcp_keepalive_intvl. A value of -1 lets the operating
      system manage this parameter (default).

Signed-off-by: Jonathan Rajotte
Signed-off-by: Jérémie Galarneau
---
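For illustration only, the way such environment variables could map onto socket options on Linux can be sketched as follows. This is a minimal sketch under stated assumptions, not the code added by this commit: the struct and the function names (keep_alive_config, parse_keep_alive_value, keep_alive_config_from_env, socket_apply_keep_alive) are hypothetical, and only the Linux option names from tcp(7) (TCP_KEEPIDLE, TCP_KEEPCNT, TCP_KEEPINTVL) are used. Solaris would need different options.

```c
#include <errno.h>
#include <limits.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical container for the four knobs; -1 means "let the OS decide". */
struct keep_alive_config {
	int enabled;
	int idle_time;
	int probe_count;
	int interval;
};

/* Parse a decimal integer; return -1 when unset, malformed or out of range. */
int parse_keep_alive_value(const char *value)
{
	char *end;
	long parsed;

	if (!value || *value == '\0')
		return -1;
	errno = 0;
	parsed = strtol(value, &end, 10);
	if (errno != 0 || *end != '\0' || parsed < -1 || parsed > INT_MAX)
		return -1;
	return (int) parsed;
}

/* Build a configuration from the environment variables listed above. */
struct keep_alive_config keep_alive_config_from_env(void)
{
	struct keep_alive_config config;

	config.enabled = parse_keep_alive_value(
			getenv("LTTNG_RELAYD_TCP_KEEP_ALIVE_ENABLE")) == 1;
	config.idle_time = parse_keep_alive_value(
			getenv("LTTNG_RELAYD_TCP_KEEP_ALIVE_TIME"));
	config.probe_count = parse_keep_alive_value(
			getenv("LTTNG_RELAYD_TCP_KEEP_ALIVE_PROBES"));
	config.interval = parse_keep_alive_value(
			getenv("LTTNG_RELAYD_TCP_KEEP_ALIVE_INTVL"));
	return config;
}

/* Set one TCP-level option, skipping it when the value is -1 (OS default). */
static int set_tcp_option(int fd, int option, int value)
{
	if (value == -1)
		return 0;
	return setsockopt(fd, IPPROTO_TCP, option, &value, sizeof(value));
}

/* Apply the configuration to a TCP socket; return 0 on success, -1 on error. */
int socket_apply_keep_alive(int fd, const struct keep_alive_config *config)
{
	int on = 1;

	if (!config->enabled)
		return 0; /* Keep-alive stays off unless explicitly enabled. */
	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	if (set_tcp_option(fd, TCP_KEEPIDLE, config->idle_time) < 0)
		return -1;
	if (set_tcp_option(fd, TCP_KEEPCNT, config->probe_count) < 0)
		return -1;
	return set_tcp_option(fd, TCP_KEEPINTVL, config->interval);
}
```

In this sketch a caller would build the configuration once with keep_alive_config_from_env() and call socket_apply_keep_alive() on each accepted connection. Treating malformed values like -1 is a design choice of the sketch, made so that a typo falls back to the operating system defaults instead of failing the connection.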