Fix: consumer data lock deadlock caused by monitor timer
author Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mon, 8 May 2017 19:06:25 +0000 (15:06 -0400)
committer Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mon, 8 May 2017 19:06:25 +0000 (15:06 -0400)
The monitor timer takes the consumer data lock while it executes,
which can cause three threads to deadlock.

The consumer_thread_data_poll_thread takes the lock during
the teardown of a channel. This teardown stops the channel's
timers and, to ensure that the timers are not fired on a freed
channel, uses a custom SIG_TEARDOWN signal as a "bubble" inserted
in the signal processing "queue". It then waits until this signal
has been processed before releasing the consumer data lock.

The sessiond_poll_thread is creating a channel and waits on
the consumer data lock.

Meanwhile, the timer thread is blocked on this same lock
during the processing of the monitor timer signal which
prevents the queue from being flushed, causing the destruction
of the channel to never reach completion.

There is no need to take the consumer data lock in the monitor
timer code since the channel's existence is guaranteed by
the SIG_TEARDOWN mechanism.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
src/common/consumer/consumer-timer.c

index 55129914ecb15882e3f15a8fb479f701715fb2ca..60ed94083e543a4d082e6f4e700f1ca57662120a 100644
@@ -709,10 +709,9 @@ void monitor_timer(struct lttng_consumer_local_data *ctx,
        get_produced_cb get_produced;
 
        assert(channel);
-       pthread_mutex_lock(&consumer_data.lock);
 
        if (channel_monitor_pipe < 0) {
-               goto end;
+               return;
        }
 
        switch (consumer_data.type) {
@@ -734,7 +733,7 @@ void monitor_timer(struct lttng_consumer_local_data *ctx,
        ret = sample_channel_positions(channel, &msg.highest, &msg.lowest,
                        sample, get_consumed, get_produced);
        if (ret) {
-               goto end;
+               return;
        }
 
        /*
@@ -759,8 +758,6 @@ void monitor_timer(struct lttng_consumer_local_data *ctx,
                                ", (highest = %" PRIu64 ", lowest = %"PRIu64")",
                                channel->key, msg.highest, msg.lowest);
        }
-end:
-       pthread_mutex_unlock(&consumer_data.lock);
 }
 
 int consumer_timer_thread_get_channel_monitor_pipe(void)