Commit | Line | Data |
---|---|---|
2df048c2 DG |
1 | .TH LTTNG_HEALTH_CHECK 3 2012-09-19 "LTTng" "LTTng Developer Manual" |
2 | .SH NAME | |
3 | lttng_health_check \- Monitor health of the session daemon | |
4 | .SH SYNOPSIS | |
5 | .nf | |
6 | .B #include <lttng/lttng.h> | |
7 | .sp | |
8 | .BI "int lttng_health_check(enum lttng_health_component c);" | |
9 | .fi | |
10 | ||
11 | Link with -llttng-ctl. | |
12 | .SH DESCRIPTION | |
13 | The | |
14 | .BR lttng_health_check () | |
15 | function checks the session daemon health for either a specific component | |
16 | .BR c | |
17 | or for all of them. Each component represents a subsystem of the session daemon. | |
18 | Each component has health counters that are atomically incremented each time | |
19 | execution reaches them. An even value indicates progress in the execution of the | |
20 | component. An odd value means that the code has entered a blocking state which | |
21 | is not a poll(2) wait period. | |
22 | ||
23 | Bad health is defined as either a fatal error code path being reached or any | |
24 | IPC used in the session daemon being blocked for more than 20 seconds (default | |
25 | timeout). Bad health is detected when one or more of the | |
26 | counters are odd. | |
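The parity convention described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the type `struct health_counter` and the helpers `health_block_enter()`, `health_block_exit()`, `health_counter_is_bad()` and `health_any_bad()` are hypothetical names invented for illustration, not the session daemon's actual code, and the 20 second timeout is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-component counter; not the session daemon's real type. */
struct health_counter {
	atomic_ulong value;	/* even: progressing, odd: in a blocking section */
};

/* Mark entry into a blocking (non-poll) section: the counter becomes odd. */
static void health_block_enter(struct health_counter *hc)
{
	atomic_fetch_add(&hc->value, 1);
}

/* Mark exit from the blocking section: the counter becomes even again. */
static void health_block_exit(struct health_counter *hc)
{
	atomic_fetch_add(&hc->value, 1);
}

/* In this model, a counter is considered bad while its value is odd. */
static bool health_counter_is_bad(struct health_counter *hc)
{
	return (atomic_load(&hc->value) & 1UL) != 0;
}

/* Return 1 if at least one counter is odd, 0 otherwise. */
static int health_any_bad(struct health_counter *counters, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (health_counter_is_bad(&counters[i])) {
			return 1;
		}
	}
	return 0;
}
```

In this model, a checker thread would call `health_any_bad()` over all counters of a component, which mirrors the "one or more counters are odd" condition stated above.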
27 | ||
28 | The health check mechanism of the session daemon can only be reached through | |
29 | the health socket, which is distinct from the command and application | |
30 | sockets. An isolated thread serves this socket and only computes the health | |
31 | counters across the code when asked by the lttng control library (using this | |
32 | call). This subsystem is highly unlikely to fail due to its simplicity. | |
33 | ||
34 | The | |
35 | .BR c | |
36 | argument can be one of the following values: | |
37 | .TP | |
38 | .BR LTTNG_HEALTH_CMD | |
39 | Command subsystem which handles user commands coming from liblttng-ctl or | |
40 | the | |
41 | .BR lttng (1) | |
42 | command line interface. | |
43 | .TP | |
44 | .BR LTTNG_HEALTH_APP_MANAGE | |
45 | The session daemon manages application sockets in order to route client commands | |
46 | and to detect when a socket is closed, which indicates that the application has shut down. | |
47 | .TP | |
48 | .BR LTTNG_HEALTH_APP_REG | |
49 | The application registration mechanism is a vital part of | |
50 | user space tracing. Upon startup, applications instrumented with | |
51 | .BR lttng-ust (3) | |
52 | try to register with the session daemon through this subsystem. | |
53 | .TP | |
54 | .BR LTTNG_HEALTH_KERNEL | |
55 | Monitors the kernel tracer streams and the main channel of communication | |
56 | (/proc/lttng). If this component malfunctions, the kernel tracer is no longer | |
57 | usable by lttng-tools. | |
58 | .TP | |
59 | .BR LTTNG_HEALTH_CONSUMER | |
60 | The session daemon can spawn up to | |
61 | .BR three | |
62 | consumer daemons: one for the kernel and one each for 32-bit and 64-bit user | |
63 | space. This subsystem monitors the consumer daemon(s). A bad health state means | |
64 | that the consumer(s) are no longer usable, which likely makes tracing unusable. | |
65 | .TP | |
66 | .BR LTTNG_HEALTH_ALL | |
67 | Check all components. If any one of them is in a bad state, a health check | |
68 | error is returned. | |
69 | ||
70 | .SH "RETURN VALUE" | |
71 | Returns 0 if the health is OK, or 1 if it is in a bad state. A return code of \-1 | |
72 | indicates that the control library was not able to connect to the session | |
73 | daemon health socket. | |
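A caller may want to turn the three documented return codes into a readable message. The following is a minimal sketch; `health_status_str()` is a hypothetical helper introduced here for illustration and is not part of liblttng-ctl.

```c
/*
 * Map a lttng_health_check() return code to a short message.
 * health_status_str() is a hypothetical helper, not a liblttng-ctl function.
 */
static const char *health_status_str(int ret)
{
	switch (ret) {
	case 0:
		return "healthy";
	case 1:
		return "bad health state";
	case -1:
		return "cannot connect to the session daemon health socket";
	default:
		/* Not documented above; treated as unexpected in this sketch. */
		return "unexpected return code";
	}
}

/*
 * Intended use, assuming <lttng/lttng.h> and <stdio.h> are included and
 * the program is linked with -llttng-ctl:
 *
 *	int ret = lttng_health_check(LTTNG_HEALTH_ALL);
 *	fprintf(stderr, "sessiond: %s\n", health_status_str(ret));
 */
```

Keeping the mapping separate from the call makes it possible to exercise the helper without a running session daemon.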
74 | ||
75 | .SH "LIMITATIONS" | |
76 | ||
77 | For LTTNG_HEALTH_CONSUMER, you cannot know which consumer daemon has | |
78 | failed, only that either the consumer subsystem has failed or that an | |
79 | lttng-consumerd died. | |
80 | ||
81 | .SH "AUTHORS" | |
82 | Written and maintained by David Goulet <dgoulet@efficios.com>. |