High volume of drained events -- xtrim can't keep up #69

Open
chrisvlopez opened this issue Mar 18, 2024 · 4 comments

@chrisvlopez commented Mar 18, 2024

Hi,

We've experienced an issue where our Redis instance was backed up with a very large number of drained events (on the order of millions).

The streams.events.maxLen setting was not changed, and I confirmed it was defaulting to 10K on Redis itself. From testing locally, I believe the root cause is the Redis XTRIM command used to trim the events stream -- specifically, that there is a default limit on the number of entries trimmed per call:

> When LIMIT and count aren't specified, the default value of 100 * the number of entries in a macro node will be implicitly used as the count.

Meaning, we were likely generating drained events faster than the trim commands could keep up. A direct fix could involve:

> Specifying the value 0 as count disables the limiting mechanism entirely.

which at least functionally works from my local testing (ignoring perf implications).
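
For reference, here is a rough sketch of the two trim variants as I tested them locally (this assumes ioredis and a hypothetical events key bull:myqueue:events; BullMQ runs the trim from its own scripts, so this only illustrates the command difference, not the actual fix):

```ts
import Redis from 'ioredis';

async function trimEvents() {
  const redis = new Redis();

  // Approximate trim with Redis's implicit LIMIT: each call evicts at most
  // roughly 100 * (entries per macro node), so a multi-million-entry backlog
  // shrinks very slowly.
  await redis.call('XTRIM', 'bull:myqueue:events', 'MAXLEN', '~', 10000);

  // LIMIT 0 disables that cap, so the stream is trimmed all the way down to
  // ~10000 entries in a single (longer-running) call.
  await redis.call('XTRIM', 'bull:myqueue:events', 'MAXLEN', '~', 10000, 'LIMIT', 0);

  await redis.quit();
}

trimEvents().catch(console.error);
```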

But there is an underlying question of whether we should be generating so many drained events in the first place. We have ~100 workers, each of which I assume is emitting its own drained events and causing the build-up. Is there some setting we should be tuning to reduce the amount of event creation? Is increasing drainDelay the only option I have?
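
In case it helps frame the question, this is roughly what I'd try if drainDelay is indeed the knob to turn (sketched against the plain bullmq Worker API; I'm assuming the Pro worker accepts the same option):

```ts
import { Worker } from 'bullmq';

// drainDelay is the number of seconds the worker blocks waiting for new jobs
// once the queue is empty; a larger value means fewer empty fetches and,
// presumably, fewer drained events being written to the events stream.
const worker = new Worker(
  'myqueue',
  async job => {
    // ...process the job
  },
  {
    connection: { host: 'localhost', port: 6379 },
    drainDelay: 30, // seconds; the default is 5
  },
);
```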

Thanks!

@manast (Contributor) commented Mar 19, 2024

On the latest versions at least, the drained event is only emitted when a job is completed and there are no other jobs to fetch in the queue (delayed jobs are not taken into consideration, though). So a scenario where a lot of drained events are generated would be one where jobs are coming in much more slowly than the workers process them, such that every time a job is processed the next job has not been added to the queue yet.
The logic for generating the drained event was different in older versions, so it would be interesting to know which version you are experiencing this issue on.
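
If it helps to see this concretely, here is a minimal sketch (queue name and connection are placeholders) that just logs every drained event as QueueEvents reads it from the stream, so you can correlate the volume with how often your workers find the queue empty:

```ts
import { QueueEvents } from 'bullmq';

// Subscribes to the queue's events stream and logs each drained event,
// i.e. each time a worker completed a job and found no further waiting jobs.
const queueEvents = new QueueEvents('myqueue', {
  connection: { host: 'localhost', port: 6379 },
});

queueEvents.on('drained', id => {
  console.log('drained event', id);
});
```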

@chrisvlopez (Author) commented Mar 19, 2024

> so it would be interesting to know in which version you are experiencing this issue.

We're on 6.4.0.

I did see a similar comment re: changing the drain event logic. Let me bump to 6.6.1+ to see if that helps.

Confirmed: bumping to bullmq-pro @ 6.11.0 resolved it. I no longer get drained events in Redis while idling.

@chrisvlopez (Author) commented

One other symptom I forgot to mention --

We also noticed an elevated number of GET commands to Redis. Would those be related to the drained events and similarly be fixed by the version bump?

@manast (Contributor) commented Mar 19, 2024

@chrisvlopez I have never heard of an elevated number of GET commands before; I would need a bit more context to be able to give an assessment.
