[v1.7][v1.6] storage.total_limit_size is not working properly #2878
Comments
Concerns:

1. gdb keeps hitting the breakpoint on `flb_input_chunk_find_space_new_data()` with the same arguments:
   Thread 2 "flb-pipeline" hit Breakpoint 3, flb_input_chunk_find_space_new_data (ic=0x7ffff00951c0, overlimit_routes_mask=1, chunk_size=555)
   (the same line repeats several more times)
2. When we schedule a retry, if the chunk is up and we are over the configured memory limit, we put the chunk down, which calls munmap and sets the chunk's data size to 0. When we then loop through all the chunks to find space because the disk limit has been reached, these down chunks are counted as 0 bytes and removed as well (see the sketch below).
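To illustrate the second concern, here is a minimal, self-contained sketch. The struct and function names are hypothetical, not the actual Fluent Bit / chunkio API; it only shows why a chunk that has been put down (munmap'ed) looks like 0 bytes when the reclaim logic measures chunks by their mapped in-memory size.

```c
/*
 * Minimal, self-contained sketch of the accounting problem described above.
 * The struct and function names here are hypothetical; they are NOT the
 * actual Fluent Bit / chunkio API.
 */
#include <stddef.h>

struct chunk {
    int    up;        /* 1 = mapped in memory, 0 = down (backing file only) */
    size_t mem_size;  /* bytes currently mapped; becomes 0 after munmap()   */
    size_t fs_size;   /* bytes held in the filesystem backing file          */
};

/* Buggy accounting: a down chunk looks empty, so the space-reclaim loop
 * may select it for removal even though it still holds data on disk. */
static size_t chunk_size_buggy(const struct chunk *c)
{
    return c->mem_size;
}

/* Safer accounting: fall back to the on-disk size when the chunk is down. */
static size_t chunk_size_safe(const struct chunk *c)
{
    return c->up ? c->mem_size : c->fs_size;
}
```

The actual fix in Fluent Bit may look different; the point is only that down chunks must not be measured by their mapped size when deciding what to drop.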
Hi @manojna10 The first one, I think, might be a case I missed when I implemented it; I will double-check it later. For the second concern, could you please elaborate a bit more? It might be the case that the program needs to drop some chunks to place the new chunk.
Issue 2:
Case 1, working fine: if the disk size reaches y before the number of up chunks reaches z, all chunks are up in memory, and the newly added chunks (which replace the older UP chunks) also remain UP. It works as expected, i.e. we use at most y of disk space.
Case 2, having the issue: if the number of up chunks reaches z before the total size reaches y, new chunks are created as DOWN chunks, but chunks keep being added even after the disk size reaches y (I didn't actually debug to see whether any old chunks are being discarded; it would be good to add a metric for that as well). The disk usage never stops at the configured y: it continues to store data beyond the configured size.
I can confirm
If we had multiple such OUTPUT sections, the capacity would add up, right? If so, is there also any way to limit it globally for all outputs (first-come-first-serve)?
Hi @sjentzsch The limits are not combined into one larger capacity: each output plugin has its own limit set by storage.total_limit_size, so storage.total_limit_size is not a global limit. Could you please let me know how you get a…
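For illustration, a hypothetical config sketch of the per-output semantics (plugin names, matches, and hosts below are assumptions, not taken from the reporter's setup): each [OUTPUT] section carries its own storage.total_limit_size, applied independently to the chunks routed to that output, and there is no single global cap shared across outputs.

```
[SERVICE]
    storage.path  /var/log/flb-storage/

# Each output below has its own, independent 30G budget;
# neither output's usage counts against the other's limit.
[OUTPUT]
    Name                      es
    Match                     app.*
    Host                      es-1.internal
    storage.total_limit_size  30G

[OUTPUT]
    Name                      es
    Match                     sys.*
    Host                      es-2.internal
    storage.total_limit_size  30G
```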
This looks similar to the issue 2 I mentioned.
@manojna10 Thank you. I will try to reproduce it.
@JeffLuoo We indeed had multiple output plugins to Elasticsearch (two or three), each with a 30G limit. So I have to agree, it could be that they have added up. However, I suspect that we rather ran into the issue described here, as usually only one of our es outputs accumulates gigabytes of data (the others are rather silent). No proof though, unfortunately.
@sjentzsch Thank you. I will take a look.
@manojna10 Hi, could you please share the file (like https://pastebin.com/yD6kBx6T) that you used to reproduce issue 2, if possible? Appreciate that!
Hi @JeffLuoo I tried reproducing this issue, but regardless of how many times I tried, the max_chunks_up configuration was not honored and there were no DOWN chunks at all, even after the configured number of max_chunks_up was reached. Attaching the screenshot and the client file I used. I even tried using #2804 to keep the number of busy chunks lower than the total number of up chunks, but it didn't help with the above situation either. I also used the latest master code to check the behavior, and it is still the same. Not sure if I am missing anything here, as it used to work before. Because of this, I am unable to reproduce the original issue 2 reported here. Thanks,
@manojna10 Thank you. I just created a PR for issue 1: #3054
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
This issue was closed because it has been stalled for 5 days with no activity. |
Bug Report
There are two issues that show up when the output plugin is unable to forward data and chunks must be buffered to filesystem storage; both effectively nullify the benefit of using the disk for buffering.
Issue 1: When storage.type is filesystem, mem_buf_limit is x, storage.total_limit_size is y, and y > x (the mainstream case): once the buffer reaches y, with some chunks UP (up to x in memory) and the remaining chunks DOWN in the filesystem, over the next few seconds it slowly falls back to a total disk size of only x, with only UP chunks. So it essentially stores on disk the same amount of data that is already in memory.
Issue 2: When storage.type is filesystem, mem_buf_limit is not set, storage.total_limit_size is y, and max_chunks_up is set to z: if y is reached before the number of up chunks reaches z, it works fine, i.e. we store at most y. But if the number of up chunks reaches z before the total size reaches y, new chunks are created as DOWN chunks and the buffer never stops at y; it continues to store data beyond the configured size.
To Reproduce
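The original reproduction config was shared via pastebin and is not preserved in this thread. A minimal sketch of the kind of configuration that should exercise both issues looks roughly like the following; the input plugin, port, paths, and the max_chunks_up value are assumptions, while the 10M / 30M values mirror the limits referenced under Expected behavior.

```
[SERVICE]
    flush                  1
    storage.path           /var/log/flb-storage/
    # Relevant to Issue 2: cap on how many chunks stay up in memory (z)
    storage.max_chunks_up  128

[INPUT]
    Name           tcp
    Listen         0.0.0.0
    Port           5170
    storage.type   filesystem
    # x: in-memory buffer limit for this input (Issue 1)
    mem_buf_limit  10M

[OUTPUT]
    Name                      forward
    Match                     *
    # Point at an unreachable host so chunks pile up in the filesystem buffer
    Host                      unreachable.example
    # y: expected upper bound of on-disk usage for this output
    storage.total_limit_size  30M
```

With a setup like this, Issue 1 corresponds to the on-disk buffer shrinking back toward 10M after having reached 30M, and Issue 2 corresponds to it growing past 30M once storage.max_chunks_up has been hit.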
Expected behavior
It should stay at 30M and not go back to 10M.
Screenshots



issue1_part1: just started FLB
issue1_part2: reached total_limit_size after crossing mem_buf_limit
issue1_part3: It goes back to mem_buf_limit size
Your Environment
Additional context