lib: run most of bt_self_component_port_input_message_iterator_try_finalize when...
authorSimon Marchi <simon.marchi@efficios.com>
Wed, 11 Dec 2019 18:43:34 +0000 (13:43 -0500)
committerJérémie Galarneau <jeremie.galarneau@efficios.com>
Mon, 20 Jan 2020 20:15:24 +0000 (15:15 -0500)
We hit a failed BT_ASSERT in the specific scenario explained below (and
illustrated by the new test).

The problem
-----------

Let's say we have a graph with this topology, where everything is
implemented in Python:

  MySink [in] <-> [out] MyFilter (MyFilterIter) [in] <-> [out] MySource (MySourceIter)

And the following sequence of events:

1. MySink's _user_graph_is_configured method creates an iterator on its
   "in" input port.  This runs MyFilterIter's __init__ method.
2. MyFilterIter's __init__ method starts by creating an upstream
   iterator on its "in" input port.  This iterator, of type
   MySourceIter, is initialized successfully.
3. MyFilterIter's __init__ method creates some other data structure which
   happens to form a Python reference cycle (the MyFilterIter instance has a
   reference on an object, which has a reference on the MyFilterIter
   instance).
4. MyFilterIter's __init__ method encounters an error, so an exception
   is raised.

When the MySourceIter is created, an entry is added to MyFilterIter's
upstream_msg_iters array (in the underlying
bt_self_component_port_input_message_iterator object).

When the exception is raised, because of the reference cycle, the
MyFilterIter Python object stays alive.  It has a reference on the
MySourceIter Python object, which keeps the underlying
bt_self_component_port_input_message_iterator object alive as well.

When the create_self_component_input_port_message_iterator call realizes
that the initialization of the filter iterator failed, it does a put_ref
on the iterator object, which ends up calling
bt_self_component_port_input_message_iterator_destroy.  In there, we
assert that:

  BT_ASSERT(iterator->upstream_msg_iters->len == 0);

This assertion does not hold, because the array still contains the entry
for the upstream iterator (on MySource).

Normally, the call to
bt_self_component_port_input_message_iterator_try_finalize, earlier in
bt_self_component_port_input_message_iterator_destroy, should take care
of unlinking any upstream or downstream iterator.  However, that step is
completely skipped because the iterator is still in the NON_INITIALIZED
state.

To reproduce, it is important to have the two conditions before the
exception is raised:

  - an upstream iterator is created: otherwise, the upstream_msg_iters
    array is empty so the assertion is true
  - a Python reference cycle exists, keeping the MyFilterIter object
    alive: otherwise, it is destroyed when the exception is raised (or
    caught?), which destroys the MySourceIter object, which destroys the
    underlying bt_self_component_port_input_message_iterator object (or
    the source iterator), which removes itself from the filter
    iterator's upstream_msg_iters array.  The array now being empty, the
    assertion would be true.

Note that the Python reference cycle is a user error that is not
desirable, but likely to happen.  To avoid any memory leak due to such a
reference cycle, I think the filter iterator here should normally make
sure to break the reference cycle.  In a successful execution, it would
be in _user_finalize.  If __init__ fails, it should make sure to not
leave a reference cycle in place, perhaps by using a try-except to clear
it before exiting the function.

The fix
-------

I believe it's wrong to skip the _try_finalize function even when we are
in this state, because it's possible (as shown above) for an iterator
initialize method to create some upstream iterators but then fail.  The
iterator remains in the NON_INITIALIZED state, but has some upstream
iterators.

I have changed
bt_self_component_port_input_message_iterator_try_finalize so that when
the iterator is in the NON_INITIALIZED, we do unlink the upstream and
downstream iterators.  However, I don't think we want to call the
iterator's finalize method in that case, since it hasn't been properly
initialized (a bit like a C++ object destructor is not called if the
constructor throws).  So I made it so the finalize method is only called
if the state is not NON_INITIALIZED.

Change-Id: Ibe2cbc729fc81ff0d68219400d925110242b7ae1
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/2633
Tested-by: jenkins <jenkins@lttng.org>
Reviewed-by: Francis Deslauriers <francis.deslauriers@efficios.com>
src/lib/graph/iterator.c
tests/bindings/python/bt2/test_message_iterator.py

index a0a88199a0627c8a1eb6cf335f242c4700b9bf4d..3ef5fd57d226a6e17d7a69157dedf79a1831fec7 100644 (file)
@@ -170,18 +170,24 @@ void bt_self_component_port_input_message_iterator_try_finalize(
 {
        uint64_t i;
        typedef void (*method_t)(void *);
-
-       struct bt_component_class *comp_class = NULL;
-       method_t method = NULL;
+       bool call_user_finalize = true;
 
        BT_ASSERT(iterator);
 
        switch (iterator->state) {
        case BT_SELF_COMPONENT_PORT_INPUT_MESSAGE_ITERATOR_STATE_NON_INITIALIZED:
-               /* Skip user finalization if user initialization failed */
-               BT_LIB_LOGD("Not finalizing non-initialized message iterator: "
-                       "%!+i", iterator);
-               goto end;
+               /*
+                * If this function is called while the iterator is in the
+                * NON_INITIALIZED state, it means the user initialization
+                * method has either not been called, or has failed.  We
+                * therefore don't want to call the user finalization method.
+                * However, the initialization method might have created some
+                * upstream message iterators before failing, so we want to
+                * execute the rest of this function, which unlinks the related
+                * iterators.
+                */
+               call_user_finalize = false;
+               break;
        case BT_SELF_COMPONENT_PORT_INPUT_MESSAGE_ITERATOR_STATE_FINALIZED:
                /* Already finalized */
                BT_LIB_LOGD("Not finalizing message iterator: already finalized: "
@@ -200,42 +206,47 @@ void bt_self_component_port_input_message_iterator_try_finalize(
        set_self_comp_port_input_msg_iterator_state(iterator,
                BT_SELF_COMPONENT_PORT_INPUT_MESSAGE_ITERATOR_STATE_FINALIZING);
        BT_ASSERT(iterator->upstream_component);
-       comp_class = iterator->upstream_component->class;
 
        /* Call user-defined destroy method */
-       switch (comp_class->type) {
-       case BT_COMPONENT_CLASS_TYPE_SOURCE:
-       {
-               struct bt_component_class_source *src_comp_cls =
-                       (void *) comp_class;
+       if (call_user_finalize) {
+               method_t method = NULL;
+               struct bt_component_class *comp_class =
+                       iterator->upstream_component->class;
 
-               method = (method_t) src_comp_cls->methods.msg_iter_finalize;
-               break;
-       }
-       case BT_COMPONENT_CLASS_TYPE_FILTER:
-       {
-               struct bt_component_class_filter *flt_comp_cls =
-                       (void *) comp_class;
+               switch (comp_class->type) {
+               case BT_COMPONENT_CLASS_TYPE_SOURCE:
+               {
+                       struct bt_component_class_source *src_comp_cls =
+                               (void *) comp_class;
 
-               method = (method_t) flt_comp_cls->methods.msg_iter_finalize;
-               break;
-       }
-       default:
-               /* Unreachable */
-               bt_common_abort();
-       }
+                       method = (method_t) src_comp_cls->methods.msg_iter_finalize;
+                       break;
+               }
+               case BT_COMPONENT_CLASS_TYPE_FILTER:
+               {
+                       struct bt_component_class_filter *flt_comp_cls =
+                               (void *) comp_class;
 
-       if (method) {
-               const bt_error *saved_error;
+                       method = (method_t) flt_comp_cls->methods.msg_iter_finalize;
+                       break;
+               }
+               default:
+                       /* Unreachable */
+                       bt_common_abort();
+               }
 
-               saved_error = bt_current_thread_take_error();
+               if (method) {
+                       const bt_error *saved_error;
 
-               BT_LIB_LOGD("Calling user's finalization method: %!+i",
-                       iterator);
-               method(iterator);
+                       saved_error = bt_current_thread_take_error();
 
-               if (saved_error) {
-                       BT_CURRENT_THREAD_MOVE_ERROR_AND_RESET(saved_error);
+                       BT_LIB_LOGD("Calling user's finalization method: %!+i",
+                               iterator);
+                       method(iterator);
+
+                       if (saved_error) {
+                               BT_CURRENT_THREAD_MOVE_ERROR_AND_RESET(saved_error);
+                       }
                }
        }
 
index f2b9e2d3d51f08e9352260fe0fb38a1f1ca6ce07..bc8078a6922242798ac770ebc9b30b29dc36ef3a 100644 (file)
@@ -379,6 +379,63 @@ class UserMessageIteratorTestCase(unittest.TestCase):
             with self.assertRaises(bt2.TryAgain):
                 next(it)
 
+    def test_error_in_iterator_with_cycle_after_having_created_upstream_iterator(self):
+        # Test a failure that triggered an abort in libbabeltrace2, in this situation:
+        #
+        #   - The filter iterator creates an upstream iterator.
+        #   - The filter iterator creates a reference cycle, including itself.
+        #   - An exception is raised, causing the filter iterator's
+        #     initialization method to fail.
+        class MySourceIter(bt2._UserMessageIterator):
+            pass
+
+        class MySource(bt2._UserSourceComponent, message_iterator_class=MySourceIter):
+            def __init__(self, config, params, obj):
+                self._add_output_port('out')
+
+        class MyFilterIter(bt2._UserMessageIterator):
+            def __init__(self, config, port):
+                # First, create an upstream iterator.
+                self._upstream_iter = self._create_input_port_message_iterator(
+                    self._component._input_ports['in']
+                )
+
+                # Then, voluntarily make a reference cycle that will keep this
+                # Python object alive, which will keep the upstream iterator
+                # Babeltrace object alive.
+                self._self = self
+
+                # Finally, raise an exception to make __init__ fail.
+                raise ValueError('woops')
+
+        class MyFilter(bt2._UserFilterComponent, message_iterator_class=MyFilterIter):
+            def __init__(self, config, params, obj):
+                self._in = self._add_input_port('in')
+                self._out = self._add_output_port('out')
+
+        class MySink(bt2._UserSinkComponent):
+            def __init__(self, config, params, obj):
+                self._input_port = self._add_input_port('in')
+
+            def _user_graph_is_configured(self):
+                self._upstream_iter = self._create_input_port_message_iterator(
+                    self._input_port
+                )
+
+            def _user_consume(self):
+                # We should not reach this.
+                assert False
+
+        g = bt2.Graph()
+        src = g.add_component(MySource, 'src')
+        flt = g.add_component(MyFilter, 'flt')
+        snk = g.add_component(MySink, 'snk')
+        g.connect_ports(src.output_ports['out'], flt.input_ports['in'])
+        g.connect_ports(flt.output_ports['out'], snk.input_ports['in'])
+
+        with self.assertRaisesRegex(bt2._Error, 'ValueError: woops'):
+            g.run()
+
 
 def _setup_seek_test(
     sink_cls,
This page took 0.02837 seconds and 4 git commands to generate.