Jérémie Galarneau [Tue, 17 Oct 2017 21:18:14 +0000 (17:18 -0400)]
Add a util to create a buffer view from a raw buffer
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 29 Apr 2020 04:03:43 +0000 (00:03 -0400)]
consumerd: refactor: combine duplicated check_*_functions
The check_ust_stream and check_kernel_stream functions are identical
except for the domain-specific call to
consumer_flush_*_index.
A "flush_index" callback is passed to check_stream in order to share
the rest of that code.
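The callback approach described above can be sketched as follows. This is a minimal illustration, not the actual consumerd code: the struct stream layout, the flush_kernel_index and flush_ust_index helpers, and the check_stream signature are all assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the consumer's stream object. */
struct stream {
	int id;
	int flushed_by; /* 0: none, 1: kernel, 2: UST (illustration only) */
};

/* The domain-specific step, handed to the shared code as a callback. */
typedef int (*flush_index_cb)(struct stream *stream);

/* Hypothetical stand-ins for the consumer_flush_*_index functions. */
static int flush_kernel_index(struct stream *stream)
{
	stream->flushed_by = 1;
	return 0;
}

static int flush_ust_index(struct stream *stream)
{
	stream->flushed_by = 2;
	return 0;
}

/* Shared logic: everything except the flush itself is domain-agnostic. */
static int check_stream(struct stream *stream, flush_index_cb flush_index)
{
	if (!stream || !flush_index) {
		return -1;
	}
	/* ... checks common to both domains would go here ... */
	return flush_index(stream);
}
```

Each domain then calls check_stream with its own flush function, so the common checks exist in a single place.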
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iafdb64192322c0106a555b67f54290dadc4f0579
Jérémie Galarneau [Wed, 29 Apr 2020 01:40:12 +0000 (21:40 -0400)]
kernel-ctl: add RING_RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK
Add a wrapper for RING_RING_BUFFER_GET_NEXT_SUBBUF_METADATA_CHECK
which gets the next metadata subbuffer and returns a boolean flag
indicating whether the metadata is guaranteed to be in a consistent
state at the end of this sub-buffer (can be parsed).
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I13fbdfe51c3c4ef04581409e0fbc9837ed6d555d
Jonathan Rajotte [Tue, 27 Aug 2019 18:02:02 +0000 (14:02 -0400)]
Fix: check validity of a stream before invoking ust flush command
[
BACKPORT INFO: already present in stable 2.9 but not in rev7
https://github.com/lttng/lttng-tools/commit/
610959b3677f92eb93e013ee87af49df816c2f48
]
At the time ustctl_flush_buffer is called the ustream object might have
already been freed on lttng-ust side.
This can happen following a lttng_consumer_cleanup_relayd and concurrent
consumer flush command (lttng stop).
The train of events goes as follows.
An error on communication with lttng-relayd occurs.
lttng_consumer_cleanup_relayd flags the streams for deletion
(CONSUMER_ENDPOINT_INACTIVE). validate_endpoint_status_data_stream calls
consumer_del_stream.
At the same time the hash table of streams is iterated over in the
flush_channel function following a stop command. The loop is iterating on
a given stream. The current thread is unscheduled before taking the stream
lock.
In the initial thread, the same stream is the current iteration of
cds_lfht_for_each_entry in validate_endpoint_status_data_stream.
consumer_del_stream is called on it. The stream lock is acquired, and
destroy_close_stream is called. lttng_ustconsumer_del_stream is eventually
called and at this point the ustream is freed.
Going back to the iteration in flush_channel. The current stream is still
valid from the point of view of the iteration, ustctl_flush_buffer is then
called on a freed ustream object.
This can lead to undefined behaviour since there is no validation on
lttng-ust side. The underlying memory of the ustream object is garbage at
this point.
To prevent such a scenario, we check for the presence of the node in the
hash table via cds_lfht_is_node_deleted. This is valid because the node is
removed from the hash table before deleting the ustream object on
lttng-ust side. The removal from the hash table also requires the stream
lock ensuring the validity of cds_lfht_is_node_deleted return value.
This duplicates similar "validation" checks of the stream object. [1][2]
[1] src/common/consumer/consumer.c:consumer_close_channel_streams
[2] src/common/ust-consumer/ust-consumer.c:close_metadata
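The locking argument above can be sketched as follows. This is a simplified model under stated assumptions, not the lttng-tools code: the node_deleted flag stands in for the result of cds_lfht_is_node_deleted on the stream's hash table node, flush_count stands in for the call to ustctl_flush_buffer, and both function names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stream object. */
struct stream {
	pthread_mutex_t lock;
	bool node_deleted; /* stand-in for cds_lfht_is_node_deleted() */
	int flush_count;   /* stand-in for calls to ustctl_flush_buffer() */
};

/* Deletion path: unpublish under the stream lock, then release resources. */
static void del_stream(struct stream *stream)
{
	pthread_mutex_lock(&stream->lock);
	stream->node_deleted = true;
	/* The real code would free the ustream object after this point. */
	pthread_mutex_unlock(&stream->lock);
}

/* Flush path: returns true if the stream was flushed, false if skipped. */
static bool flush_stream(struct stream *stream)
{
	bool flushed = false;

	pthread_mutex_lock(&stream->lock);
	/* Checked under the lock, so the answer cannot change before the flush. */
	if (!stream->node_deleted) {
		stream->flush_count++;
		flushed = true;
	}
	pthread_mutex_unlock(&stream->lock);
	return flushed;
}
```

Because removal and the check both happen under the same lock, a flush that observes a non-deleted node is guaranteed to complete before the deletion path can free anything.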
This issue can be reproduced by the following scenario:
Modify flush_channel to sleep (e.g. 10s) before acquiring the lock on
a stream.
Modify lttng-ust ustctl_destroy_stream to set the
ring_buffer_clock_read callback to NULL.
Note: An assert on !cds_lfht_is_node_deleted in flush_channel
after acquiring the lock can provide the same information. We are
modifying the callback to simulate the original backtrace from our
customer.
lttng-relayd
lttng-sessiond
lttng create --live
lttng enable-event -u -a
lttng start
Start some applications to generate data.
lttng stop
The stop command forces a flush of the channel/streams.
pkill -9 lttng-relayd
Expect assert or segfault
The original customer backtrace:
0 lib_ring_buffer_try_switch_slow (handle=<optimized out>, tsc=<synthetic pointer>, offsets=0x3fffa9b76c80, chan=0x3fff98006e90, buf=<optimized out>,
mode=<optimized out>) at /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/ring_buffer_frontend.c:1834
1 lib_ring_buffer_switch_slow (buf=0x3fff98016b40, mode=<optimized out>, handle=0x3fff98017670)
at /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/ring_buffer_frontend.c:1952
2 0x00003fffac680940 in ustctl_flush_buffer (stream=<optimized out>, producer_active=<optimized out>)
at /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1568
3 0x0000000010031bc8 in flush_channel (chan_key=<optimized out>) at ust-consumer.c:772
4 lttng_ustconsumer_recv_cmd (ctx=<optimized out>, sock=<optimized out>, consumer_sockpoll=<optimized out>) at ust-consumer.c:1651
5 0x000000001000de50 in lttng_consumer_recv_cmd (ctx=<optimized out>, sock=<optimized out>, consumer_sockpoll=<optimized out>) at consumer.c:2011
6 0x0000000010014208 in consumer_thread_sessiond_poll (data=0x10079430) at consumer.c:3192
7 0x00003fffac608b30 in start_thread (arg=0x3fffa9b7bdb0) at pthread_create.c:462
8 0x00003fffac530d0c in .__clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:96
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I5ad8c2d0c37675d3d0fc4b4669db7c221bef78e3
Jérémie Galarneau [Tue, 4 Feb 2020 00:51:08 +0000 (19:51 -0500)]
BACKPORT: Tests: fix: test_relayd_working_directory fails as user
A formatting issue introduced by
15da468cd causes the temporary
directory of the test to be initialized incorrectly, causing it to
fail when it is not skipped (executed as a non-root user).
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idd09f27fa2ce0f5991056ab52bc1718080122151
Jérémie Galarneau [Mon, 3 Feb 2020 20:56:43 +0000 (15:56 -0500)]
BACKPORT: Tests: fix: test_relayd_working_directory fails as root
From the original bug report:
This test succeeds as user, but fails as root:
not ok 23 - Warning about missing write permission is present
Failed test 'Warning about missing write permission is present'
in tools/working-directory/test_relayd_working_directory:test_relayd_debug_permission() at line 182.
The warning does not trigger because root always has access.
Skip this test since the permission check will succeed and the relay
daemon won't produce the expected error message.
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4eb29958aaca78405e1fdd2392d73472af0d5912
Jonathan Rajotte [Tue, 22 Oct 2019 16:05:28 +0000 (12:05 -0400)]
BACKPORT: Tests: fix: tmp dir can be a symlink
Get the real path to perform valid comparison.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1b5a5ccb6787681a5dc84bdb1708811008c9c525
Jonathan Rajotte [Tue, 26 May 2020 18:16:42 +0000 (14:16 -0400)]
BACKPORT FIX: update test file against upstream
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I8fa36d6bd4cbf1196bdf031360f118a8c3aa3cc0
Jonathan Rajotte [Tue, 26 May 2020 18:01:43 +0000 (14:01 -0400)]
BACKPORT FIX: wrong file name
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I268d34671679b9c38b3571406d9da0a4766ed615
Jonathan Rajotte [Tue, 25 Jun 2019 15:18:10 +0000 (11:18 -0400)]
EfficiOS backport 2.9 revision 7
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Mathieu Desnoyers [Tue, 13 Nov 2018 17:12:21 +0000 (12:12 -0500)]
Fix: max_t/min_t macros are missing cast on input
The expected semantics of max_t and min_t are to perform the max/min
comparison in the type provided as first parameter.
Cast the input parameters to the proper type before comparing them,
rather than after. There is no more need to cast the result of the
expression now that both inputs are cast to the right type.
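The difference between casting before and after the comparison can be illustrated with the sketch below. The macro bodies are an assumption written to match the description above, and max_t_cast_after is a hypothetical name used only for contrast. Note that these simple macros evaluate their arguments more than once.

```c
/* Cast both inputs to 'type' before comparing, as the message describes. */
#define max_t(type, a, b) \
	((type) (a) > (type) (b) ? (type) (a) : (type) (b))
#define min_t(type, a, b) \
	((type) (a) < (type) (b) ? (type) (a) : (type) (b))

/*
 * Hypothetical "cast after" variant, for contrast only: the comparison
 * happens in the inputs' own types and only the result is converted.
 */
#define max_t_cast_after(type, a, b) \
	((type) ((a) > (b) ? (a) : (b)))
```

With inputs -1 and 1 and type unsigned int, max_t compares the converted values, so the converted -1 (the largest unsigned int) wins, whereas the cast-after variant compares as signed int and yields 1.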
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Mathieu Desnoyers [Tue, 13 Nov 2018 17:12:20 +0000 (12:12 -0500)]
Fix: Connect timeout arithmetic in inet/inet6 (v4)
The nanoseconds part of the timespec struct time_a is not always
bigger than time_b since it wraps around each second.
Use 64-bit arithmetic to compute the difference.
Merge/move duplicated code into utils.c.
This function is really doing two things. Split it into
timespec_to_ms() and timespec_abs_diff().
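A minimal sketch of the two helpers follows. The function names come from the message above; the signatures, the constants, and the bodies are assumptions for illustration and may differ from what utils.c actually contains.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC  1000000000LL
#define NSEC_PER_MSEC 1000000LL
#define MSEC_PER_SEC  1000LL

/* Convert a timespec to milliseconds using 64-bit arithmetic. */
static int64_t timespec_to_ms(struct timespec ts)
{
	return (int64_t) ts.tv_sec * MSEC_PER_SEC +
			(int64_t) ts.tv_nsec / NSEC_PER_MSEC;
}

/*
 * Absolute difference between two timespecs. Working on the total
 * nanosecond count avoids assuming anything about which tv_nsec is larger.
 */
static struct timespec timespec_abs_diff(struct timespec a, struct timespec b)
{
	int64_t ns_a = (int64_t) a.tv_sec * NSEC_PER_SEC + a.tv_nsec;
	int64_t ns_b = (int64_t) b.tv_sec * NSEC_PER_SEC + b.tv_nsec;
	int64_t diff = ns_a > ns_b ? ns_a - ns_b : ns_b - ns_a;
	struct timespec res;

	res.tv_sec = (time_t) (diff / NSEC_PER_SEC);
	res.tv_nsec = (long) (diff % NSEC_PER_SEC);
	return res;
}
```

For example, with time_a = 2.1 s and time_b = 1.9 s, the nanoseconds part of time_a (100000000) is smaller than that of time_b (900000000), yet the difference is correctly computed as 200 ms.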
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 31 Jan 2019 21:54:32 +0000 (16:54 -0500)]
Bound maximum data read to RECV_DATA_BUFFER_SIZE per iteration
Do not consume everything all at once even if there is data left on the
socket. This is to provide fairness to the overall data handling of all
connections.
It also provides a bounded processing execution for a data processing
iteration.
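The bounding described above can be sketched as follows. The value given to RECV_DATA_BUFFER_SIZE here and the two helper names are assumptions for illustration; only the constant's name comes from the commit subject.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define RECV_DATA_BUFFER_SIZE 65536 /* value assumed for this sketch */

/* Size of the next read: never more than one buffer's worth. */
static size_t bounded_chunk_size(size_t remaining)
{
	return remaining < RECV_DATA_BUFFER_SIZE ?
			remaining : RECV_DATA_BUFFER_SIZE;
}

/*
 * Perform a single bounded read, then return to the caller so that other
 * connections get a turn even if more data is left on this socket.
 * 'buf' must hold at least RECV_DATA_BUFFER_SIZE bytes.
 */
static ssize_t recv_one_iteration(int fd, char *buf, size_t remaining)
{
	return read(fd, buf, bounded_chunk_size(remaining));
}
```

The caller keeps track of how much data is still expected and resumes on the next poll event instead of looping until the socket is drained.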
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Mon, 11 Feb 2019 19:43:54 +0000 (14:43 -0500)]
relayd: do not prioritize control events over data.
Simplify the algorithm used by relayd for control and data connections
handling.
Use the notion of activity phase. An activity phase represents a phase
for which all connections with activity (poll/epoll) are not yet processed.
When an active connection is processed, its activity phase is set to the
current activity phase to prevent further progress during the same
activity phase.
Once all active connections (poll events) have been processed during
the current activity phase, the current activity phase is incremented.
This gives fairness across all connections during a given activity phase.
This can also serve as a base for future work toward resources based
prioritizing.
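The activity phase bookkeeping can be sketched as follows. This is a simplified model, not the relayd implementation: struct connection, its fields, and both function names are hypothetical, and poll/epoll activity is reduced to a boolean flag. The current phase is assumed to start at 1 so that a new connection (phase 0) is eligible right away.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified connection state. */
struct connection {
	bool has_activity;       /* would be reported by poll/epoll */
	uint64_t activity_phase; /* last phase in which it was processed */
	unsigned int processed;  /* number of turns taken (illustration) */
};

/* Returns true if the connection took its turn, false if it must wait. */
static bool process_connection(struct connection *conn, uint64_t current_phase)
{
	if (!conn->has_activity || conn->activity_phase == current_phase) {
		return false;
	}
	conn->processed++; /* one bounded unit of work */
	conn->activity_phase = current_phase;
	return true;
}

/* Advance the phase once every active connection has had its turn. */
static bool try_advance_phase(const struct connection *conns, size_t count,
		uint64_t *current_phase)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (conns[i].has_activity &&
				conns[i].activity_phase != *current_phase) {
			return false;
		}
	}
	(*current_phase)++;
	return true;
}
```

A connection that still has data after its turn is refused until the phase advances, which is what keeps one busy connection from starving the others.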
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Tue, 8 Jan 2019 21:25:15 +0000 (16:25 -0500)]
Remove unnecessary mutex unlock
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Mon, 27 Aug 2018 18:29:10 +0000 (14:29 -0400)]
Explicit data pending reason consumer side
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jérémie Galarneau [Wed, 25 Jul 2018 19:32:10 +0000 (15:32 -0400)]
Cleanup: missing line in consumer-stream.c
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jérémie Galarneau [Wed, 25 Jul 2018 19:26:37 +0000 (15:26 -0400)]
consumer: Rename net_seq_idx to relayd_id
The consumer's streams refer to a 'net_seq_idx' of which the
meaning must have been lost in the sands of time. It is a
unique identifier of a given relay daemon. Hence, renaming it to
'relayd_id' appears sensible.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Tue, 25 Jul 2017 20:26:25 +0000 (16:26 -0400)]
Cleanup: remove dead assignment
Both calling sites do not use the return value and errors are already
managed inside the called function.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 21 Sep 2018 08:57:16 +0000 (04:57 -0400)]
EfficiOS backport 2.9 revision 6
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Wed, 29 Aug 2018 01:19:53 +0000 (21:19 -0400)]
Teardown relayd on communication error during data pending
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jérémie Galarneau [Fri, 20 Jul 2018 22:41:49 +0000 (18:41 -0400)]
Set consumer's verbosity to the max level on --verbose-consumer
The consumer's verbosity is set to '1' when --verbose-consumer
is used when launching the session daemon. This means that all
DBG2/3() statements are ignored.
This commit always sets the consumer's verbosity to the maximal
level.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 23 Jul 2018 03:38:34 +0000 (23:38 -0400)]
Perform local data pending check then relayd
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Mon, 6 Aug 2018 01:38:10 +0000 (21:38 -0400)]
fd-tracker Fix: do not warn on index file not found
Upstream status pending on fd-tracker merge
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Wed, 12 Sep 2018 15:55:57 +0000 (11:55 -0400)]
fd-tracker Fix: error path lead to null pointer dereference of handle
Upstream status: pending review and upstream merge of the fd-tracker
feature.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Tue, 11 Sep 2018 00:09:11 +0000 (20:09 -0400)]
Fix: double put on error path
Let relay_index_try_flush be responsible for the self-reference put on
error path.
Code flow of relay_index_try_flush is a bit tricky but the only error
flow (via relay_index_file_write) will always mark the index as flushed
and perform the self-reference put.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 11 Sep 2018 00:09:14 +0000 (20:09 -0400)]
Fix: holding the stream lock does not equate to having data pending
The live timer can hold the stream lock while sending an empty beacon. An
empty beacon does not mean that data is still pending for the stream.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Tue, 11 Sep 2018 00:09:13 +0000 (20:09 -0400)]
Fix: skip uid registry when metadata key is 0
A value of zero for the metadata key indicates that metadata was never
created/pushed to the consumer.
This can occur in scenarios where a tracker is present since metadata
might never be created/pushed.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 11 Sep 2018 00:09:12 +0000 (20:09 -0400)]
Fix: acquire stream lock during kernel metadata snapshot
The stream lock is not taken when interacting with the kernel
metadata stream that is created at the time a snapshot is taken.
This was noticed while reviewing the code for an unrelated reason,
so there is no known problem caused by this. Nevertheless, this
is incorrect as the stream is globally visible in the consumer.
Moreover, the stream was not cleaned-up which can cause a leak
whenever a metadata snapshot fails.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Jonathan Rajotte [Fri, 7 Sep 2018 19:18:38 +0000 (15:18 -0400)]
Fix: skip closed session on viewer listing
There is no value in listing a closed session. A viewer cannot hook
itself to a closed session in live mode and the session is about to be
removed from the sessions hash table.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Fri, 7 Sep 2018 19:18:37 +0000 (15:18 -0400)]
Fix: use LTTNG_VIEWER_ATTACH_UNK to report a closed session
LTTNG_VIEWER_NEW_STREAMS_HUP is not a valid error number for the
LTTNG_VIEWER_ATTACH_SESSION command. This results in erroneous error
reporting on the client side.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 6 Jun 2018 01:00:28 +0000 (21:00 -0400)]
Fix: perform relayd socket pair cleanup on control socket error
A reference to the local context for the socket pair is used to "force" an
evaluation of the data and metadata streams since we changed the endpoint
status. This imitates what is currently done for the data socket.
This prevents hitting network timeouts multiple times in a row when an
error occurs. For now, there is no retry mechanism, hence
"terminating" all communication makes sense and prevents unwanted delays
on operation.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 13 Sep 2018 21:04:45 +0000 (17:04 -0400)]
Fix: relayd control socket mutex is not destroyed
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 9 Jul 2018 15:32:37 +0000 (11:32 -0400)]
Rev5: Update extra_version information
Jérémie Galarneau [Fri, 6 Jul 2018 23:14:43 +0000 (19:14 -0400)]
Fix: remove inode from inode registry ht
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 22:44:41 +0000 (18:44 -0400)]
Fix: unbalanced fd references
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 22:43:37 +0000 (18:43 -0400)]
relayd: unlink stream files through the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 22:11:27 +0000 (18:11 -0400)]
Fix: crash on close of partially initialized handle
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 21:40:30 +0000 (17:40 -0400)]
relayd: unlink index files through the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 20:07:12 +0000 (16:07 -0400)]
Fix: rlim_cur and rlim_max printing causes a warning on some archs
rlim_cur and rlim_max were assumed to be unsigned longs, but they
are explicitly 64 bits long on 32-bit archs (warning seen for
powerpc 32 and arm 32 builds).
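One common way to print these fields without a format warning on any architecture is to cast to uintmax_t and use PRIuMAX. The sketch below shows that approach; it is not necessarily the change made by this commit, and format_rlimit is a hypothetical helper name.

```c
#include <inttypes.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * rlim_t has no dedicated printf conversion and its width differs across
 * architectures, so widen explicitly instead of assuming unsigned long.
 */
static int format_rlimit(char *buf, size_t len, const struct rlimit *lim)
{
	return snprintf(buf, len, "soft=%" PRIuMAX " hard=%" PRIuMAX,
			(uintmax_t) lim->rlim_cur, (uintmax_t) lim->rlim_max);
}
```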
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 19:50:25 +0000 (15:50 -0400)]
Tests: add fd-tracker tests for the unlink operation
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 16:04:13 +0000 (12:04 -0400)]
fd-tracker test: register rcu thread of test application
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 16:03:13 +0000 (12:03 -0400)]
fd-tracker: use lttng_inode to store fs_handle's path
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 15:40:20 +0000 (11:40 -0400)]
fd-tracker: remove duplicate clear of O_CREAT flag
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 15:35:29 +0000 (11:35 -0400)]
fd-tracker build fix: missing parameter in poll compat function signature
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 05:01:48 +0000 (01:01 -0400)]
fd-tracker: perform unsuspendable_fd release through call_rcu
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 02:29:12 +0000 (22:29 -0400)]
fd-tracker: add the lttng-inode interface
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 02:29:59 +0000 (22:29 -0400)]
fd-tracker: add the unlink operation to fs handles
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 02:45:11 +0000 (22:45 -0400)]
fd-tracker: remove unneeded header inclusion
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 6 Jul 2018 02:43:33 +0000 (22:43 -0400)]
fd-tracker: add an optimization note to the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Thu, 5 Jul 2018 16:01:07 +0000 (12:01 -0400)]
Backport: lttng-track(1), lttng-untrack(1): document new properties/options
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Mathieu Desnoyers [Thu, 5 Jul 2018 15:08:40 +0000 (11:08 -0400)]
Backport: trackers: bump MI version to 4.0
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 5 Jul 2018 14:50:18 +0000 (10:50 -0400)]
Backport: trackers: commands: validate duplicate options
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 5 Jul 2018 14:40:08 +0000 (10:40 -0400)]
Backport: Fix: tracker: no command shortcut for new trackers
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 5 Jul 2018 14:31:01 +0000 (10:31 -0400)]
Backport: trackers: add sessiond tracker list implementation
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 22:12:14 +0000 (18:12 -0400)]
Backport: Fix: tracker: list/track/untrack commands leak strings
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 21:51:47 +0000 (17:51 -0400)]
Backport: Fix: tracker: ensure consistency of tracker states
On error when adding/removing from either UST or kernel trackers,
we need to roll back the state of our internal lists.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:12:54 +0000 (16:12 -0400)]
Backport: trackers: tests: adapt tests to new xsd schemas
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:13:43 +0000 (16:13 -0400)]
Backport: trackers: update MI to new xsd schema
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:14:20 +0000 (16:14 -0400)]
Backport: trackers: update config xsd schema
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:46:25 +0000 (16:46 -0400)]
Backport: trackers: update lttng-sessiond
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:17:48 +0000 (16:17 -0400)]
Backport: trackers: update list/track/untrack commands
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:17:26 +0000 (16:17 -0400)]
Backport: trackers: update liblttng-ctl
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:16:47 +0000 (16:16 -0400)]
Backport: trackers: update sessiond communication protocol
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:15:07 +0000 (16:15 -0400)]
Backport: trackers: update lttng-modules tracer ABI
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 4 Jul 2018 20:11:17 +0000 (16:11 -0400)]
Backport: trackers: change error code from "pid" to "id"
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Jérémie Galarneau [Thu, 5 Jul 2018 01:28:18 +0000 (21:28 -0400)]
Backport: LTTNG-RELAYD(8): document the --fd-pool-size option
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 3 Jul 2018 17:49:43 +0000 (13:49 -0400)]
Backport: relayd: rename fd-cap parameter to fd-pool-size
Rename the fd-cap parameter and change its default behaviour.
The minimum number of file descriptors is raised to 100 and a
"reserve" amount of 10 fds is allowed to accommodate transient
fd uses that can't be tracked by the relay daemon.
The --fd-pool-size will accept parameters in the
[100, fileno soft limit] interval.
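The accepted interval can be sketched as a range check against the RLIMIT_NOFILE soft limit. The helper names and the FD_POOL_SIZE_MIN macro are hypothetical; the sketch does not model the reserve of 10 fds mentioned above, nor how relayd parses the option.

```c
#include <stdbool.h>
#include <sys/resource.h>

#define FD_POOL_SIZE_MIN 100 /* minimum stated in the message above */

/* Pure range check: true if 'value' lies in [100, soft_limit]. */
static bool fd_pool_size_is_valid(unsigned long value, rlim_t soft_limit)
{
	return value >= FD_POOL_SIZE_MIN && (rlim_t) value <= soft_limit;
}

/* Query the process's fileno soft limit and validate against it. */
static bool validate_fd_pool_size(unsigned long value)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
		return false;
	}
	return fd_pool_size_is_valid(value, lim.rlim_cur);
}
```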
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 3 Jul 2018 17:48:29 +0000 (13:48 -0400)]
Backport: fd-tracker: log tracker capacity on creation
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sun, 1 Jul 2018 03:20:43 +0000 (23:20 -0400)]
Backport: relayd: replace lttng_index_file with relay_index_file
lttng_index_file is shared between the consumer and relay daemon.
However, the introduction of the fd-tracker in the relay daemon
makes it hard to cleanly share this piece of code between both
daemons.
The ctf-index.h header is still shared by both daemons which
is the most important part. The lttng/relay_index_file class
is a fairly thin wrapper around file system operations (unlink,
read, and write an index) so there is little value gained in
sharing the code vs heavily modifying it to handle the presence
of an fd-tracker in the process.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Sat, 30 Jun 2018 18:51:55 +0000 (14:51 -0400)]
Backport: Move index initialization to ctf-index.h
This initialization code is moved to a common header to re-use
it in a follow-up patch.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 29 Jun 2018 22:05:47 +0000 (18:05 -0400)]
Backport: Fix: fully initialize viewer stream before publishing it
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 29 Jun 2018 21:48:58 +0000 (17:48 -0400)]
Backport: relayd: use the fd-tracker to track stream_fd fds
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 05:22:06 +0000 (01:22 -0400)]
Backport: relayd: track the live client connections socket
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 05:16:56 +0000 (01:16 -0400)]
Backport: relayd: track relayd control connection sockets
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 05:16:43 +0000 (01:16 -0400)]
Backport: relayd: track relayd data connection sockets
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 04:15:54 +0000 (00:15 -0400)]
Backport: relayd: track the data listener socket
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 04:15:40 +0000 (00:15 -0400)]
Backport: relayd: track the control listener socket
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 04:14:52 +0000 (00:14 -0400)]
Backport: relayd: track the live listener socket
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 03:16:44 +0000 (23:16 -0400)]
Backport: relayd: track stdio output fds
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 02:38:52 +0000 (22:38 -0400)]
Backport: relayd: track the live viewer worker thread's epoll fd
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 02:36:05 +0000 (22:36 -0400)]
Backport: relayd: track the live listener thread's epoll fd
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 28 Jun 2018 02:30:37 +0000 (22:30 -0400)]
Backport: relayd: track the live_conn_pipe with the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 Jun 2018 19:48:53 +0000 (15:48 -0400)]
Backport: relayd: track listener's epoll fd using the fd-tracker
This addresses the bogus fd report mentioned in a previous
patch of this series as the clean-up of the listener thread's
epoll fd now occurs through the fd-tracker.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 Jun 2018 19:42:38 +0000 (15:42 -0400)]
Backport: relayd: track worker thread's epoll fd using the fd-tracker
This commit introduces an fd leak report (bogus) which is caused
by another thread using the same poll initialization functions as
the worker thread.
The fd is cleaned-up by that other thread, but the fd-tracker
is not aware of this, thus causing the report.
This is addressed in a follow-up patch.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 Jun 2018 18:58:48 +0000 (14:58 -0400)]
Backport: relayd: track the health thread's poll fd with fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 26 Jun 2018 21:38:22 +0000 (17:38 -0400)]
Backport: relayd: track clients of the health unix socket with the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 26 Jun 2018 21:20:47 +0000 (17:20 -0400)]
Backport: relayd: track the health unix socket with the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 26 Jun 2018 19:17:16 +0000 (15:17 -0400)]
Backport: relayd: track the health quit pipe with the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 26 Jun 2018 19:07:08 +0000 (15:07 -0400)]
Backport: relayd: track the relay_conn_pipe with the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 26 Jun 2018 18:51:22 +0000 (14:51 -0400)]
Backport: relayd: track the quit pipe with the fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 19 Jun 2018 16:22:31 +0000 (12:22 -0400)]
Backport: relayd: Don't bypass the fd tracker when closing file descriptors
There is no reason to close all file descriptors at this point in
the relay daemon as we know for a fact that the only open fds
are stdin, stdout, and stderr. If the relayd were to depend on a
library that opens other file descriptors, it would be inadvisable
to perform this kind of bulk closing of all possible file descriptors.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 19 Jun 2018 02:45:27 +0000 (22:45 -0400)]
Backport: relayd: close stdin
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 Jun 2018 19:04:49 +0000 (15:04 -0400)]
Backport: relayd: initialize the global fd tracker from fd_cap parameter
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 18 Jun 2018 22:39:07 +0000 (18:39 -0400)]
Backport: Add fd-cap option to the relay daemon
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 19 Jun 2018 18:27:01 +0000 (14:27 -0400)]
Backport: Test: add a unit test for the fd tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 Jun 2018 20:38:17 +0000 (16:38 -0400)]
Backport: fd-tracker: add pipe management wrappers to fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 Jun 2018 19:05:06 +0000 (15:05 -0400)]
Backport: fd-tracker: add epoll/poll management wrappers to fd-tracker
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Thu, 14 Jun 2018 02:55:29 +0000 (22:55 -0400)]
Backport: fd-tracker: add an fd-tracker util to common
This commit adds an fd-tracker utility to the common libs.
This interface allows a process to keep track of its open
file descriptors and enforce a limit to the number of file
descriptors that may be simultaneously opened.
The intent is to use this interface as part of the relay daemon
to mitigate file descriptors exhaustion problems that are
encountered when the relay has to handle a large number of streams.
The fd-tracker defines two classes of file descriptors: suspendable
and unsuspendable file descriptors.
Suspendable file descriptors are handles to filesystem objects
(e.g. regular files) that may be closed and re-opened later without
affecting the application.
A suspendable file descriptor can be opened by creating a filesystem
handle (fs_handle) using the fd-tracker. The raw file descriptor
must then be obtained and released using that handle. Closing the
handle will effectively ensure that the file descriptor is closed.
Unsuspendable file descriptors are file descriptors that cannot
be closed without affecting the application's state. For instance,
it is not possible to close and re-open a pipe, a TCP socket, or
an epoll fd without involving some app-specific logic. Thus, the
fd-tracker considers those file descriptors as unsuspendable.
Opening an unsuspendable file descriptor will return a raw file
descriptor to the application. It is its responsibility to notify the
fd-tracker of the file descriptor's closing to ensure the number
of active file descriptors can be tracked accurately.
If a request to open a new file descriptor is made to the fd-tracker
and the process has already reached its maximal count of
simultaneously opened file descriptors, an attempt will be made to
suspend a suspendable file descriptor to release a slot.
Suspending a file descriptor involves:
- verifying that the file is still available on the FS (restorable),
- sampling its current position,
- closing the file descriptor.
Note that suspending a file descriptor eliminates the POSIX guarantee
that a file may be unlinked at any time without affecting the
application (provided that it holds an open FD to that
file). Applications using the fd-tracker that need to maintain this
guarantee should open those files as unsuspendable file descriptors.
To protect against unlinking and file replacement scenarios, the
fd-tracker samples the files' inode number when a fs_handle is
created. This inode number will then be checked anytime the handle
is suspended or restored to ensure that the application is made
aware of the file's unavailability. This is preferable to
inadvertently opening another file of the same name if the original
file was unlinked and/or replaced between a fs_handle's suspension
and restoration.
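The suspend and restore steps described above can be sketched as follows. This is an illustration under stated assumptions, not the fd-tracker implementation: struct fs_handle_sketch and the three handle_* functions are hypothetical, only the inode number is compared (as in the description), and the bookkeeping of the fd count is left out.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, simplified handle to a suspendable file. */
struct fs_handle_sketch {
	const char *path;
	int fd;         /* -1 while suspended */
	ino_t inode;    /* sampled when the handle is created */
	off_t position; /* sampled when the handle is suspended */
};

static int handle_open(struct fs_handle_sketch *h, const char *path)
{
	struct stat st;

	h->path = path;
	h->position = 0;
	h->fd = open(path, O_RDWR);
	if (h->fd < 0) {
		return -1;
	}
	if (fstat(h->fd, &st) != 0) {
		close(h->fd);
		h->fd = -1;
		return -1;
	}
	h->inode = st.st_ino;
	return 0;
}

static int handle_suspend(struct fs_handle_sketch *h)
{
	struct stat st;
	off_t pos;

	/* 1. The path must still name the sampled inode (restorable). */
	if (stat(h->path, &st) != 0 || st.st_ino != h->inode) {
		return -1;
	}
	/* 2. Sample the current position. */
	pos = lseek(h->fd, 0, SEEK_CUR);
	if (pos < 0) {
		return -1;
	}
	h->position = pos;
	/* 3. Close the file descriptor to release a slot. */
	close(h->fd);
	h->fd = -1;
	return 0;
}

static int handle_restore(struct fs_handle_sketch *h)
{
	struct stat st;
	int fd = open(h->path, O_RDWR);

	if (fd < 0) {
		return -1;
	}
	/* Refuse a different file that merely reuses the same name. */
	if (fstat(fd, &st) != 0 || st.st_ino != h->inode ||
			lseek(fd, h->position, SEEK_SET) < 0) {
		close(fd);
		return -1;
	}
	h->fd = fd;
	return 0;
}
```

If the file is unlinked or replaced while the handle is suspended, handle_restore fails instead of silently handing back a descriptor to another file.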
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Tue, 3 Jul 2018 01:13:23 +0000 (21:13 -0400)]
Backport: add DBG_NO_LOC logging macro
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>