How to terminate orphan management thread created by fluentbit? #614

Closed
Gabriel1688 opened this issue Jun 1, 2018 · 5 comments

@Gabriel1688

A thread created by fluentbit keeps running even after I invoke flb_stop(ctx) and flb_destroy(ctx).
How do I properly terminate the threads that fluentbit creates? Please help me with this.

Issue: an orphan thread created by fluentbit never exits.
Version: fluent-bit-0.12.16
OS: CentOS 7

-----------following is the source code -------------------
#include <unistd.h>      /* sleep() */
#include <fluent-bit.h>

/* Sample records: [timestamp, {key: value}] */
#define JSON_1 "[1449505010, {\"key1\": \"some value\"}]"
#define JSON_2 "[1449505620, {\"key1\": \"some new value\"}]"

int main()
{
    int ret;
    int in_ffd;
    int out_ffd;
    flb_ctx_t *ctx;

    /* Create library context */
    ctx = flb_create();
    in_ffd = flb_input(ctx, "lib", NULL);
    flb_input_set(ctx, in_ffd, NULL);
    out_ffd = flb_output(ctx, "stdout", NULL);

    /* Start the engine */
    ret = flb_start(ctx);

    /* Ingest data manually */
    flb_lib_push(ctx, in_ffd, JSON_1, sizeof(JSON_1) - 1);
    flb_lib_push(ctx, in_ffd, JSON_2, sizeof(JSON_2) - 1);

    sleep(5);

    /* Stop the engine (5 seconds to flush remaining data) */
    flb_stop(ctx);
    flb_destroy(ctx);
    return 0;
}
-----------Debug output----------------
40 ret = flb_start(ctx);
[New Thread 0x7ffff6c61700 (LWP 30677)]
[New Thread 0x7ffff6460700 (LWP 30678)]
(gdb) i thread
Id Target Id Frame
3 Thread 0x7ffff6460700 (LWP 30678) "a.out" 0x00007ffff7683701 in clone () from /lib64/libc.so.6
2 Thread 0x7ffff6c61700 (LWP 30677) "a.out" 0x00007ffff6c6d6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

* 1 Thread 0x7ffff7fdb740 (LWP 30564) "a.out" main () at fluentbit.c:41
(gdb) n
47 flb_lib_push(ctx, in_ffd, JSON_1, sizeof(JSON_1) - 1);
(gdb) n
[1449505010, {"key1": "some value"}]48 flb_lib_push(ctx, in_ffd, JSON_2, sizeof(JSON_2) - 1);
(gdb) n
[1449505620, {"key1": "some new value"}][New Thread 0x7ffff5c5f700 (LWP 30928)]
[2018/06/01 14:22:17] [ info] [engine] started
(gdb) i thread
Id Target Id Frame
4 Thread 0x7ffff5c5f700 (LWP 30928) "a.out" 0x00007ffff7683d13 in epoll_wait () from /lib64/libc.so.6
3 Thread 0x7ffff6460700 (LWP 30678) "a.out" 0x00007ffff7683d13 in epoll_wait () from /lib64/libc.so.6
2 Thread 0x7ffff6c61700 (LWP 30677) "a.out" 0x00007ffff7683d13 in epoll_wait () from /lib64/libc.so.6
* 1 Thread 0x7ffff7fdb740 (LWP 30564) "a.out" main () at fluentbit.c:50
(gdb) n
53 flb_stop(ctx);
(gdb) n
[2018/06/01 14:22:46] [ info] [input] pausing lib.0
[2018/06/01 14:22:46] [ warn] [engine] service will stop in 5 seconds
[2018/06/01 14:22:50] [ info] [engine] service stopped
[Thread 0x7ffff5c5f700 (LWP 30928) exited]
[Thread 0x7ffff6c61700 (LWP 30677) exited]

(gdb) i thread
Id Target Id Frame
3 Thread 0x7ffff6460700 (LWP 30678) "a.out" 0x00007ffff7683d13 in epoll_wait () from /lib64/libc.so.6

* 1 Thread 0x7ffff7fdb740 (LWP 30564) "a.out" main () at fluentbit.c:57
(gdb) t 3
[Switching to thread 3 (Thread 0x7ffff6460700 (LWP 30678))]
#0 0x00007ffff7683d13 in epoll_wait () from /lib64/libc.so.6

[call stack of orphan fluentbit thread ]
(gdb) bt
#0 0x00007ffff7683d13 in epoll_wait () from /lib64/libc.so.6
#1 0x00007ffff7998ead in _mk_event_wait () from /lib/libfluent-bit.so
#2 0x00007ffff7999197 in mk_event_wait () from /lib/libfluent-bit.so
#3 0x00007ffff797c4f4 in log_worker_collector () from /lib/libfluent-bit.so
#4 0x00007ffff79900b7 in step_callback () from /lib/libfluent-bit.so
#5 0x00007ffff6c69dc5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007ffff768373d in clone () from /lib64/libc.so.6

@edsiper edsiper self-assigned this Jun 1, 2018
@edsiper edsiper added the bug label Jun 1, 2018
@edsiper
Member

edsiper commented Jun 1, 2018

Thanks for reporting this issue. In a library usage context, this is a bug that needs to be fixed.

@Gabriel1688
Author

Hi Eduardo,
Thanks for your response! Could you please tell me when this bug will be fixed?
In our use case, we need to start and stop the remote logger on the fly, so the fluentbit library needs to support:

  1. Startup/shutdown without any thread or resource leak.
  2. Quick shutdown, rather than blocking the calling thread for 5 seconds.

@nokute78
Collaborator

@edsiper Somehow config->ch_manager[0] and [1] become 0 when such a program (e.g. example/hello_world) is run under gdb.
As a result, the engine can't receive the stop message and its threads keep running.

Without gdb, the config->ch_manager values are preserved.

I don't know why...

Diff to print the ch_manager fds:

diff --git a/src/flb_engine.c b/src/flb_engine.c
index 9ea1d8f..1e42d2a 100644
--- a/src/flb_engine.c
+++ b/src/flb_engine.c
@@ -615,6 +615,7 @@ int flb_engine_exit(struct flb_config *config)
     flb_input_pause_all(config);
 
     val = FLB_ENGINE_EV_STOP;
+    printf("config->ch_manager[0]=%d [1]=%d\n",config->ch_manager[0], config->c
     ret = flb_pipe_w(config->ch_manager[1], &val, sizeof(uint64_t));
 
     return ret;

result using gdb

$ gdb bin/hello_world
(snip)
config->ch_manager[0]=0 [1]=0
[2018/06/14 22:10:57] [ info] [engine] started (pid=6817)

result without gdb

$ bin/hello_world
[2018/06/14 22:11:32] [ info] [input] pausing lib.0
config->ch_manager[0]=27 [1]=28
[2018/06/14 22:11:32] [ info] [engine] started (pid=6854)
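
The stop mechanism discussed in this comment, where a thread waits on one end of a channel and the stop call writes an event to the other end, can be sketched in plain C without fluentbit. This is a minimal sketch and not fluentbit's implementation: struct worker, worker_start, worker_stop and EV_STOP are hypothetical names, and only POSIX pipe, poll and pthreads are assumed.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

#define EV_STOP 1ULL   /* hypothetical stop event value */

struct worker {
    int ch[2];         /* ch[0]: read end polled by the worker, ch[1]: write end */
    pthread_t tid;
    int stopped;       /* set by the worker just before it returns */
};

/* Worker thread: blocks in poll() until a stop event arrives on ch[0]. */
static void *worker_loop(void *arg)
{
    struct worker *w = arg;
    struct pollfd pfd = { .fd = w->ch[0], .events = POLLIN };
    uint64_t val = 0;

    for (;;) {
        if (poll(&pfd, 1, -1) < 0) {
            if (errno == EINTR) {
                continue;          /* interrupted by a signal: retry */
            }
            break;                 /* unrecoverable poll error */
        }
        /* Leave on read error, on EOF, or when the stop event is received. */
        if (read(w->ch[0], &val, sizeof(val)) <= 0 || val == EV_STOP) {
            break;
        }
    }

    w->stopped = 1;
    return NULL;
}

/* Create the channel and start the worker. Returns 0 on success, -1 on error. */
int worker_start(struct worker *w)
{
    assert(w != NULL);
    w->stopped = 0;

    if (pipe(w->ch) != 0) {
        return -1;
    }
    if (pthread_create(&w->tid, NULL, worker_loop, w) != 0) {
        close(w->ch[0]);
        close(w->ch[1]);
        return -1;
    }
    return 0;
}

/* Send the stop event, wait for the worker to exit, then release the fds. */
int worker_stop(struct worker *w)
{
    uint64_t val = EV_STOP;

    assert(w != NULL);
    if (write(w->ch[1], &val, sizeof(val)) != (ssize_t) sizeof(val)) {
        return -1;
    }
    if (pthread_join(w->tid, NULL) != 0) {
        return -1;
    }
    close(w->ch[0]);
    close(w->ch[1]);
    return 0;
}
```

In this sketch the worker only exits if the event is written to the write end of its own pipe. If ch[1] held 0, as ch_manager does in the gdb run above, the write would go to descriptor 0 instead of the pipe and the worker would stay blocked in poll(), which is consistent with the lingering thread in the report.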

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Jan 22, 2022
@github-actions
Contributor

This issue was closed because it has been stalled for 5 days with no activity.
