Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

ref(statsd): Use statsdproxy to pre-aggregate metrics in-memory #2425

Merged
merged 6 commits into from
Feb 23, 2024

Conversation

untitaker
Copy link
Member

@untitaker untitaker commented Aug 24, 2023

use experimental statsdproxy hackweek project to aggregate counters and
gauges (i.e. the "easy stuff") in memory before sending it over the UDP
buffer.

We use the same code in rust consumers to pre-aggregate metrics. The
performance improvement is a wash (neither improves nor degrades perf),
but it should load on veneur, so it may still amount to cost savings.

Arpad has a kind of pre-aggregation that results in actual cost savings
within the application itself, in the future we may replace statsdproxy
with that.

use experimental statsdproxy hackweek project to aggregate counters and
gauges (i.e. the "easy stuff") in memory before sending it over the UDP
buffer.

this might not work perfectly and most bizarrely, aggregating only some
metric types probably will mess with timestamp accuracy (even though the
flush interval is at a very low 1s). however, currently it's possible
that we are dropping metrics because the udp send buffer is at its
limits. so who knows really if this makes metrics more or less
reliable...
@untitaker untitaker requested a review from a team August 24, 2023 20:42
@jan-auer jan-auer changed the title use statsdproxy to pre-aggregate metrics in-memory hackweek: Use statsdproxy to pre-aggregate metrics in-memory Aug 25, 2023
@untitaker untitaker marked this pull request as draft August 25, 2023 12:10
@jan-auer jan-auer added the question Further information is requested label Sep 4, 2023
@getsantry
Copy link

getsantry bot commented Oct 18, 2023

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you remove the label Waiting for: Community, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@untitaker untitaker changed the title hackweek: Use statsdproxy to pre-aggregate metrics in-memory ref(statsd): Use statsdproxy to pre-aggregate metrics in-memory Feb 21, 2024
@untitaker untitaker marked this pull request as ready for review February 21, 2024 13:29
@untitaker untitaker requested a review from a team as a code owner February 21, 2024 13:29
relay-config/src/config.rs Outdated Show resolved Hide resolved
relay-statsd/src/lib.rs Show resolved Hide resolved
@untitaker
Copy link
Member Author

@Dav1dde some additional context that i forgot to share: we use statsdproxy in rust consumers to send data to DDM.

the way this works is that we pre-aggregate using statsdproxy, then multiplex to the rust SDK, in order to offset the performance overhead that the rust SDK has.

I think long-term statsdproxy is not the right abstraction for this, and in fact @Swatinem is already working on what I think could be a replacement for all of this. but in the short-term this would allow you to dogfood DDM in relay with minimal overhead (and no code locations). take a look at statsd.rs in snuba if you're interested.

@Dav1dde
Copy link
Member

Dav1dde commented Feb 23, 2024

the way this works is that we pre-aggregate using statsdproxy, then multiplex to the rust SDK, in order to offset the performance overhead that the rust SDK has.

Thanks, that seems like a good approach and something we wanted to do anyways.

Copy link
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try it.

We need to test this properly on Canary and S4S first, please don't merge if you don't have enough time to do that.

If you want I can pick this up and do the rollout sometime beginning of next week.

@Dav1dde Dav1dde removed question Further information is requested Stale labels Feb 23, 2024
@Dav1dde Dav1dde self-assigned this Feb 23, 2024
@@ -69,6 +69,7 @@ pub fn init_metrics(config: &Config) -> Result<()> {
&addrs[..],
default_tags,
config.metrics_buffering(),
config.metrics_aggregation(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: With this number of arguments, it might be nice to pass a StatsdConfig object instead. Not a blocker though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plan is to get rid of the options all together: #2425 (comment)

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK!

Comment on lines +262 to +264
flush_interval: 1,
flush_offset: 0,
max_map_size: None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be configurable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they should probably not have been options to begin with tbh

@untitaker untitaker merged commit 1dbd9bf into master Feb 23, 2024
20 checks passed
@untitaker untitaker deleted the statsdproxy-sink branch February 23, 2024 13:42
Dav1dde added a commit that referenced this pull request Feb 29, 2024
As discussed in #2425, removes the options, there is no reason not to
buffer and not to aggregate/use statsdproxy.

Also cleans up the configuration a bit.
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants