Quorum queues: memory spike when applying a max-length policy retroactively to a long queue #12608

Closed
mkuratczyk opened this issue Oct 29, 2024 · 1 comment · Fixed by #12712

@mkuratczyk
Contributor

Describe the bug

Given a long quorum queue, if I apply a policy to limit the queue's length (in a real-world scenario, likely to prevent further queue growth and running out of memory), a significant memory spike occurs while the messages above the new threshold are dropped. This can easily have the opposite of the intended effect: I run out of memory because I was trying to prevent running out of memory...

In my particular case, it was even "funnier": I had a cluster on Kubernetes and applied a policy. The leader was OOMkilled, a new leader was elected, tried to apply the policy, and was OOMkilled too. The remaining node survived because a leader could not be elected, but as soon as one of the nodes restarted, a leader was elected and OOMkilled. A policy that was meant to limit memory usage caused an OOMkill loop. :)

Reproduction steps

  1. make run-broker (tested on main)
  2. Publish a significant number of messages: perf-test -qq -u qq -x 4 -y 0 -c 100 -s 5000 -ms -C 1250000
  3. Apply a policy that sets the limit to a low value: rabbitmqctl set_policy max qq '{"max-length": 1234}'
  4. Observe memory usage
[Screenshot: set_policy-main]

Expected behavior

Ideally, there should be no significant memory spike when dropping messages.

Additional context

No response

@mkuratczyk mkuratczyk added the bug label Oct 29, 2024
@michaelklishin michaelklishin changed the title from "Quorum queues: memory spike when applying a max-length policy to a long queue" to "Quorum queues: memory spike when applying a max-length policy retroactively to a long queue" Oct 29, 2024
@kjnilsson
Contributor

We create an effect for each dropped message here:

{state(), ra_machine:effects()}.
discard(Msgs, Reason, undefined, State) ->
    {State, [{mod_call, rabbit_global_counters, messages_dead_lettered,
              [Reason, rabbit_quorum_queue, disabled, length(Msgs)]}]};

That, combined with the use of ++ to concatenate the resulting effects, is probably what causes the memory growth.
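
To illustrate the suspected cost pattern from the comment above, here is a minimal standalone sketch. It is not the rabbit_fifo code and not the fix from #12712. The module name effects_sketch, the helper effect/1, and the reason atom maxlen are assumptions made for illustration only. The sketch contrasts building one effect per dropped message with ++, building the same list by prepending and reversing once, and emitting a single effect for the whole batch.

```erlang
-module(effects_sketch).
-export([per_message_append/1, per_message_cons/1, batched/1]).

%% Hypothetical effect, shaped like the mod_call tuple quoted in the comment.
%% The reason atom `maxlen` is an assumption for this sketch.
effect(Count) ->
    {mod_call, rabbit_global_counters, messages_dead_lettered,
     [maxlen, rabbit_quorum_queue, disabled, Count]}.

%% One effect per dropped message, appended with ++.
%% `Acc ++ [E]` copies Acc on every step, so dropping N messages
%% copies O(N^2) list cells overall and produces a lot of garbage.
per_message_append(Msgs) ->
    lists:foldl(fun(_Msg, Acc) -> Acc ++ [effect(1)] end, [], Msgs).

%% Same resulting list, but each effect is prepended and the list is
%% reversed once at the end: O(N) work, still N effects.
per_message_cons(Msgs) ->
    lists:reverse(
      lists:foldl(fun(_Msg, Acc) -> [effect(1) | Acc] end, [], Msgs)).

%% A single effect carrying the count for the whole batch.
batched(Msgs) ->
    [effect(length(Msgs))].
```

After compiling the module with c(effects_sketch). in an Erlang shell, the three functions can be called on a list such as lists:seq(1, 100000) to compare them. The first two return identical lists and differ only in how much copying they do. The third returns a one-element list, which is the cheapest shape if the consumer of the effects only needs the total count.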
