
Datadog Scaler: Support Multi-Query Results #3423

Closed
dalgibbard opened this issue Jul 26, 2022 · 4 comments · Fixed by #3550 or kedacore/keda-docs#885
Labels
feature-request All issues for new features that have not been committed to needs-discussion

Comments

@dalgibbard

Proposal

In the Datadog scaler code currently, we check if the number of Series returned is only 1, and then grab the last (newest) from the Pointlist: https://github.com/kedacore/keda/blob/main/pkg/scalers/datadog_scaler.go#L273

It would be very handy to support multi-query calls (defined as comma-separated queries) in a single call, e.g.: sum:http.requests{service:myservice},avg:http.backlog{service:myservice}
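One wrinkle worth noting: Datadog query functions such as rollup(max, 300) contain commas themselves, so a naive split on "," would break them apart. A depth-aware scan handles this; the sketch below is purely illustrative (splitQueries is a hypothetical helper, not the scaler's actual parser):

```go
package main

import "fmt"

// splitQueries splits a comma-separated multi-query string, ignoring
// commas nested inside parentheses or braces (e.g. rollup(max, 300)).
// Illustrative sketch only; not the actual KEDA implementation.
func splitQueries(q string) []string {
	var out []string
	depth, start := 0, 0
	for i, r := range q {
		switch r {
		case '(', '{':
			depth++
		case ')', '}':
			depth--
		case ',':
			if depth == 0 { // only split on top-level commas
				out = append(out, q[start:i])
				start = i + 1
			}
		}
	}
	return append(out, q[start:])
}

func main() {
	q := "sum:http.requests{service:myservice}.rollup(max, 300),avg:http.backlog{service:myservice}"
	// Splits into two queries; the comma inside rollup(...) is preserved.
	fmt.Println(splitQueries(q))
}
```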

This results in the Datadog API returning multiple Series objects, each with its own Pointlist, so we'd need to iterate through all Series items, grab the latest point from each (as we currently do for one, but for multiple), and then run an aggregation on those, e.g. min/max/median/sum/avg -- probably defaulting to "max"?
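A minimal Go sketch of that flow, assuming a simplified stand-in for the Datadog client's Series/Pointlist types (the real types live in the datadog-api-client library; latestPoints and aggregate are hypothetical names, not the scaler's API):

```go
package main

import "fmt"

// Series is a simplified stand-in for a Datadog API series:
// a Pointlist of [timestamp, value] pairs, newest last.
type Series struct {
	Pointlist [][2]float64
}

// latestPoints collects the newest value from each series' Pointlist.
func latestPoints(series []Series) []float64 {
	vals := make([]float64, 0, len(series))
	for _, s := range series {
		if len(s.Pointlist) == 0 {
			continue // skip empty series
		}
		last := s.Pointlist[len(s.Pointlist)-1]
		vals = append(vals, last[1]) // index 1 is the value
	}
	return vals
}

// aggregate reduces the per-series values with a named operation,
// defaulting to "max" as proposed above.
func aggregate(vals []float64, op string) float64 {
	if len(vals) == 0 {
		return 0
	}
	switch op {
	case "sum", "avg":
		var sum float64
		for _, v := range vals {
			sum += v
		}
		if op == "avg" {
			return sum / float64(len(vals))
		}
		return sum
	default: // "max"
		max := vals[0]
		for _, v := range vals[1:] {
			if v > max {
				max = v
			}
		}
		return max
	}
}

func main() {
	series := []Series{
		{Pointlist: [][2]float64{{1000, 2}, {2000, 5}}},
		{Pointlist: [][2]float64{{1000, 9}, {2000, 3}}},
	}
	vals := latestPoints(series)        // newest value per series: [5 3]
	fmt.Println(aggregate(vals, "max")) // 5
	fmt.Println(aggregate(vals, "avg")) // 4
}
```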

Use-Case

In places where we currently define multiple Datadog queries for autoscaling a single service, this change would massively reduce API calls (since a multi-query call is only a single API request); aggregation in the Datadog scaler would then allow good user customisation of the results (which the Metrics API doesn't actually handle well).

Anything else?

@arapulido 👋

I can probably take a stab at this, but I'm not sure whether you already have plans for it.

@dalgibbard dalgibbard added feature-request All issues for new features that have not been committed to needs-discussion labels Jul 26, 2022
@tomkerkhove tomkerkhove moved this to Proposed in Roadmap - KEDA Core Jul 26, 2022
@dgibbard-cisco
Contributor

dgibbard-cisco commented Jul 26, 2022

I started having a little look, but I'm a Golang noob. I have this so far: https://github.com/dgibbard-cisco/keda/pull/1/files
What do you think @arapulido ?

@arapulido
Contributor

Hey @dgibbard-cisco, thanks for the suggestion and the patch! I will have a look at that PR in the following days.

@arapulido
Contributor

@dalgibbard Before I start reviewing the code, I want to clarify a bit more what you want to do here, from a logic point of view.

Can you give an example of how you would use two queries (and an aggregator) to scale an object, and why?

Thanks!

@dgibbard-cisco
Contributor

dgibbard-cisco commented Jul 28, 2022

@arapulido yeah, sure thing. The short version is that it condenses multiple API requests into a single API query, which eases strain on Datadog rate limiting. This was one of the main drivers, and stemmed from a recent discussion with Datadog support about how to aggregate multiple queries in our own proprietary autoscaling software, which uses Datadog on the backend for some on-prem things.

The longer version is that it also allows you to make multiple queries and then decide how best to use the results; e.g., take the average across multiple queries for scaling. You can add arithmetic operations to support weighting and other weird but powerful things, e.g.: query=query1/20,(query2*10)/6, and then queryAggregator would decide how to use the returned values.

Essentially, I was expecting to be able to do something like max(query1,query2) in a request to Datadog, but this is not supported.
Datadog support advised that you can, however, submit multiple queries and parse the multiple responses (Series), which is what I'm looking to do here.

I explained with some examples in the possible readme change too: https://github.com/dgibbard-cisco/keda-docs/pull/2
Especially the last section, where I propose that each query be able to "answer" how many containers should run for a given metric, which you can then choose from based on an average or max of the results (ideal for dissimilar queries).

A slightly more real-world example of that would be something like:

per_second(sum:http.requests{service:myservice}.rollup(max, 300))*0.013,week_before(per_second(sum:http.requests{service:myservice}.rollup(max, 300)))*0.013,sum:http.backlog{service:myservice}.rollup(max, 300)/30,sum:docker.cpu.usage{service:myservice}.rollup(max, 300)/200

This would return an API response containing 4 Series elements, each with its own Pointlist, which we sift through as before to collect the latest timestamp/value, before passing the values through an avg() or max() operation.

NOTE: I haven't tested the code yet, so probably don't burn too much time on a deep review. I mainly just wanted to know that I was on the right track, and that my code isn't complete garbage :)

Particularly (as I'm new to Golang):

  • Is slices the right datatype to use here?
  • Isn't there a builtin method for avg()/max() against an array/slice?

Repository owner moved this from Proposed to Ready To Ship in Roadmap - KEDA Core Sep 12, 2022
@JorTurFer JorTurFer moved this from Ready To Ship to Done in Roadmap - KEDA Core Mar 13, 2023