[feature request] auto choose retention policies based on timestamp when querying #2625

liyichao · 2015-05-21T07:38:15Z

Currently, SELECT uses the default retention policy when none is specified. It would be better if SELECT automatically chose among the retention policies for the same series based on timestamp when the statement does not specify a retention policy. This would simplify dashboard tools, and it is what Graphite already does.

If we have to change retention policy when we want older data, it is tedious because we have to edit the dashboard definition.

When we select from a series, we do not care what retention policies it has, we just want the datapoints.

beckettsean · 2015-05-21T18:02:51Z

A time range alone is not sufficient to identify which retention policy is desired. There is nothing to prevent two series with identical measurement name and tag sets from existing in separate retention policies with overlapping time ranges. Therefore it is not possible for the system to know which series is intended if the retention policy is not provided.
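
To make the ambiguity concrete, here is a minimal Python sketch. It is not InfluxDB code: the retention policy names, the measurement, the tag set, and the time ranges are all made up for illustration. Two retention policies hold a series with the same measurement and tag set over overlapping time ranges, so a time range alone matches both.

```python
from datetime import datetime

# Hypothetical catalog: (retention policy, measurement, tag set) -> stored time range.
SERIES = {
    ("raw_1s", "cpu", "host=a"): (datetime(2015, 5, 20), datetime(2015, 5, 21)),
    ("rollup_1m", "cpu", "host=a"): (datetime(2015, 5, 1), datetime(2015, 5, 21)),
}

def candidate_policies(measurement, tags, start, end):
    """Return every retention policy whose stored range overlaps [start, end]."""
    matches = []
    for (rp, name, tagset), (lo, hi) in SERIES.items():
        if name == measurement and tagset == tags and lo <= end and start <= hi:
            matches.append(rp)
    return sorted(matches)

# A six-hour window on 20 May falls inside both stored ranges.
matches = candidate_policies(
    "cpu", "host=a", datetime(2015, 5, 20, 6), datetime(2015, 5, 20, 12)
)
print(matches)
```

Both policies come back as candidates, which is the case where the system cannot tell which series is intended without being told the retention policy.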

A workaround for now is to keep all data for a given dashboard in the same retention policy. It does require maintaining multiple dashboards.

daviesalex · 2015-11-24T13:16:57Z

@beckettsean, from our POV this is certainly a fairly important feature, and I don't think your workaround really works. Let me give an example of the problem we have: we capture some data every second, let's say IO blocks out (which is in Telegraf). You need data at 1 second granularity for some types of troubleshooting, but in most cases on a graph it would be crazy to worry about 1 second data.

Let's imagine a common case: a dashboard showing all metrics per server. It might default to showing the last hour (3600 data points/server). Per day that is 86400 points, and per month more than 2.5 million. Per hour and perhaps per day will just work, but nobody in their right mind would attempt to keep metrics at 1 second granularity over a year and then graph them. While InfluxDB can downsample them, it's going to have to pull a crazy number of points from disk for that query (and, in the real world, it would likely be more than one server on a graph; we also have plans to store data in some cases at a delta of a small number of microseconds). We also have a basic disk space problem: we are already capturing many hundreds of GB of 1s and 10s metrics per day.
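
The point counts above follow from simple arithmetic. A small Python sketch, assuming one series sampled once per second, a 30-day month, and a 365-day year (the function name is only illustrative):

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

def points_per_series(window_seconds, interval_seconds=1):
    """Number of raw points one series holds for a window at a fixed sampling interval."""
    return window_seconds // interval_seconds

per_hour = points_per_series(SECONDS_PER_HOUR)
per_day = points_per_series(SECONDS_PER_DAY)
per_month = points_per_series(30 * SECONDS_PER_DAY)   # assuming a 30-day month
per_year = points_per_series(365 * SECONDS_PER_DAY)   # assuming a 365-day year

print(per_hour, per_day, per_month, per_year)
```

A year of 1 second data is roughly 31.5 million points per series, before multiplying by the number of servers on a graph.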

The sane pattern is to keep 1 second data for 24 hours, 1 minute for a week, 5 minute for a month, and once an hour for a year (or something similar). This is how just about every other system (Graphite, Ganglia, etc.) handles it. We can sort of do this with a continuous query in InfluxDB that copies the downsampled data to a new database (although we have to delete the 1 second data manually). The problem is that we now have a Grafana problem: a single graph can query either the downsampled data or the original data, not both. This means that when a user looks at a 1 hour graph (1 second granularity) and then zooms out to see the last month, we have to change the database, which Grafana does not support.
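
The tiering pattern above can be sketched as a lookup from "how far back does the query reach" to "finest resolution still retained". This is only an illustration in Python, not anything InfluxDB or Grafana provides; the table mirrors the numbers in this comment, and the 30-day month and 365-day year are assumptions.

```python
from datetime import timedelta

# (resolution, how long data at that resolution is kept), finest first.
TIERS = [
    (timedelta(seconds=1), timedelta(hours=24)),
    (timedelta(minutes=1), timedelta(weeks=1)),
    (timedelta(minutes=5), timedelta(days=30)),    # assumed 30-day month
    (timedelta(hours=1), timedelta(days=365)),     # assumed 365-day year
]

def finest_tier_covering(age_of_oldest_point):
    """Return the finest resolution still retained for a query reaching back this far."""
    for resolution, kept_for in TIERS:
        if age_of_oldest_point <= kept_for:
            return resolution
    raise ValueError("requested range is older than any retained data")

print(finest_tier_covering(timedelta(hours=1)))
print(finest_tier_covering(timedelta(days=20)))
```

A dashboard or proxy would evaluate something like this each time the time range changes, so that zooming out moves to a coarser tier.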

There are three ways to approach this:

1. Teach Grafana about this concept (preferably somehow auto-learning that downsampled data exists in this other place, although more realistically defining that in Grafana)
2. Provide a single "view" inside InfluxDB that merges the various levels of data
3. Provide a way to down-sample data in InfluxDB after a certain period of time, sort of like a contiguous down-sampling job.

My personal preference would be (3), but I suspect that's not an architectural starter (although if you would be willing to accept that as an option, we might be able to find somebody to work on it and send you a PR). This leaves us with (1) or (2). This ticket strikes me as asking for (1). Do you think it's best to attack this by means of this ticket, or to track an issue more like (2) (for the InfluxDB project)?

cc @sebito91, @wrigtim

PaulKuiper · 2015-11-24T20:59:59Z

A different way of solving this is to build a proxy between Grafana and InfluxDB (we need this anyway to check user access). At the proxy, parse out the GROUP BY, the measurement name, and the aggregate of the incoming request, and apply a rule to change the measurement name (prepend a retention policy or a custom string fitting your data structure). Send this query to InfluxDB instead of the original. I think this is the most practical solution at the moment.
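
The parsing step described above could look roughly like the following Python sketch. It is an assumption about how such a proxy might extract the three pieces, not the code of any existing proxy, and the regular expression only handles simple single-aggregate queries of the shape shown in this thread.

```python
import re

# Matches queries of the form:
#   select <aggregate>(<field>) from "<measurement>" ... group by time(<interval>)
QUERY_RE = re.compile(
    r'select\s+(?P<aggregate>\w+)\(\s*"?(?P<field>\w+)"?\s*\)\s+'
    r'from\s+"(?P<measurement>[^"]+)"'
    r'.*?group\s+by\s+time\(\s*(?P<interval>\d+[smhd])\s*\)',
    re.IGNORECASE | re.DOTALL,
)

def parse_query(query):
    """Return the aggregate, field, measurement, and GROUP BY interval, or None."""
    match = QUERY_RE.search(query)
    return match.groupdict() if match else None

parsed = parse_query(
    'select max(value) from "metric" where time > x1 and time < x2 group by time(12h)'
)
print(parsed)
```

Anything the pattern does not recognise returns `None`, so a proxy built this way could forward such queries unchanged.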

daviesalex · 2015-11-24T22:06:49Z

@PaulKuiper, funnily enough that's exactly how we plan to achieve this (we also have the ACL problem).

Have you already worked on this? We may build this and open source it... or use somebody else's if it's already out there.

PaulKuiper · 2015-12-03T17:00:33Z

Attached is a Python file (in txt format, otherwise I could not upload it), which you can use as a simple proxy between Grafana and InfluxDB.

It can greatly increase zoom speed. It assumes that the following continuous queries are present for the measurement called "metric":

metric.1s.max
metric.1m.max
metric.1h.max
metric.1d.max
metric.1h.mean
......

In Grafana, point your "data source" to port 3004 (or whatever you choose) instead of port 8086.
The proxy will now change your query transparently by choosing a different table when zooming out.
select max(value) from "metric" where time > x1 and time < x2 group by time(12h)
becomes:
select max(value) from "metric.1h.max" where time > x1 and time < x2 group by time(12h)
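
The rewrite shown above can be sketched in a few lines of Python. This is not the attached proxy; the selection rule (use the coarsest downsampled resolution that is not coarser than the GROUP BY interval) is an assumption that happens to reproduce the example, and the list of resolutions is taken from the measurement names listed earlier.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
# Downsampled resolutions assumed to exist as "<measurement>.<resolution>.<aggregate>".
RESOLUTIONS = ["1s", "1m", "1h", "1d"]

def to_seconds(duration):
    """Convert a duration such as '12h' to seconds."""
    return int(duration[:-1]) * UNIT_SECONDS[duration[-1]]

def pick_resolution(group_by_interval):
    """Coarsest available resolution that is not coarser than the GROUP BY interval."""
    limit = to_seconds(group_by_interval)
    usable = [r for r in RESOLUTIONS if to_seconds(r) <= limit]
    return max(usable, key=to_seconds) if usable else None

def rewrite(query):
    """Append '.<resolution>.<aggregate>' to the quoted measurement name."""
    agg = re.search(r'select\s+(\w+)\(', query, re.IGNORECASE)
    interval = re.search(r'group\s+by\s+time\(\s*(\d+[smhd])\s*\)', query, re.IGNORECASE)
    if not agg or not interval:
        return query  # not a query this sketch understands; pass it through
    resolution = pick_resolution(interval.group(1))
    if resolution is None:
        return query
    suffix = ".{}.{}".format(resolution, agg.group(1).lower())
    return re.sub(
        r'(from\s+")([^"]+)(")',
        lambda m: m.group(1) + m.group(2) + suffix + m.group(3),
        query,
        count=1,
        flags=re.IGNORECASE,
    )

original = 'select max(value) from "metric" where time > x1 and time < x2 group by time(12h)'
print(rewrite(original))
```

Queries without a recognisable aggregate or `group by time(...)` clause are returned untouched, which keeps metadata queries from Grafana working.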

poxy.txt

huhongbo · 2015-12-22T03:06:27Z

+1

toni-moreno · 2015-12-22T06:52:25Z

+1

exeral · 2016-02-17T15:49:42Z

+1 !!
Any feedback on PaulKuiper's workaround?

PaulKuiper · 2016-02-17T17:06:23Z

I'll update it for InfluxDB 0.10 sometime this month.

adrianlzt · 2016-02-18T16:20:55Z

+1

thbourlove · 2016-02-26T09:08:53Z

👍

@PaulKuiper

Author : @PaulKuiper From: influxdata/influxdb#2625 (comment)

Lupul · 2016-02-27T18:46:34Z

I've done some work on @PaulKuiper's proxy to make it work with 0.10 and put it here:
https://github.com/Lupul/influxdb-grafana-rp-proxy

adrianlzt · 2016-03-03T15:37:03Z

With version 0.10 it's normal to have several values in each measurement.
In the proxy readme the CQs handle only one value (called value):

CREATE CONTINUOUS QUERY graphite_cq_10sec  ON graphite BEGIN SELECT mean(value) as value INTO graphite."10sec".:MEASUREMENT  FROM graphite."default"./.*/ GROUP BY time(10s), * END

Any ideas on how to handle the case where there are several values?

I was thinking of some batch processing with Kapacitor that obtains all values for each measurement and creates the appropriate CQs.

adrianlzt · 2016-03-03T18:09:40Z

I have made a small script to autogenerate RPs and CQs: https://gist.github.com/f4b6f5c8f6c2a51c3f60

beckettsean · 2016-03-04T01:19:50Z

@adrianlzt in CQs each tag or field must be explicitly named. It is possible to use SELECT * to return all columns from an ad hoc query, but not in a CQ, as there is no aggregation function.

So, to downsample multiple fields, the CQ would look something like this:

CREATE CONTINUOUS QUERY graphite_cq_10sec  
ON graphite BEGIN 
SELECT mean(value) as value, last(value) as last, mean(value_23) as value_23, top(field19) as top
INTO graphite."10sec".:MEASUREMENT  
FROM graphite."default"./.*/ 
GROUP BY time(10s), * 
END

The GROUP BY * clause means that each downsampled value would be stored in a series with the same tag set as the original series. So, while the tags aren't explicitly queried, they will still be part of the downsampled series. Without the GROUP BY * clause above, all tags would be lost during downsampling. It is possible to name explicit tags in the GROUP BY, and then only those tags would be preserved.
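
Since every field has to be named explicitly, generating the statement from a list of fields saves some typing. The following Python sketch builds a statement of the same shape as the one above; the function name and the field list are hypothetical, and the resulting text is not validated against any InfluxDB version.

```python
def build_cq(name, database, source_rp, target_rp, interval, fields):
    """Build a CREATE CONTINUOUS QUERY statement as a string.

    fields is a list of (function, field, alias) tuples, one per downsampled column.
    """
    select = ", ".join(
        "{}({}) AS {}".format(fn, field, alias) for fn, field, alias in fields
    )
    return (
        'CREATE CONTINUOUS QUERY {name} ON {db} BEGIN '
        'SELECT {select} '
        'INTO {db}."{target}".:MEASUREMENT '
        'FROM {db}."{source}"./.*/ '
        'GROUP BY time({interval}), * '   # the trailing * keeps the original tag set
        'END'
    ).format(
        name=name, db=database, select=select,
        target=target_rp, source=source_rp, interval=interval,
    )

# Hypothetical field list for illustration.
statement = build_cq(
    "graphite_cq_10sec", "graphite", "default", "10sec", "10s",
    [("mean", "value", "value"), ("last", "value", "last"), ("mean", "value_23", "value_23")],
)
print(statement)
```

The list of fields still has to come from somewhere, for example from a query against the database's field keys, which is outside the scope of this sketch.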

adrianlzt · 2016-03-04T17:12:28Z

Will this be fixed in upcoming versions?
It is hard to maintain downsampling in multiple databases with lots of series with multiple values.

beckettsean · 2016-03-05T00:43:39Z

Follow #5750, which is the relevant issue.

nelg · 2016-03-20T23:31:04Z

+1 for this issue. Without good data down-sampling / roll up, moving from whisper is likely to present problems for graphite graphs over long time periods. Ideally needs a way to set a default policy around rollup and retention, for all metrics of a specific type.

TomGudman · 2016-04-20T00:41:47Z

+1

Coming from the old RRD world, this is obviously a big shift in mentality. I fully agree with #2625 (comment) - option 3

I have never used Graphite, but I have read about it several times, and I liked how you can set different retention policies per metric if desired; otherwise you get the default downsampling automatically.

The InfluxDB approach is awkward and makes it hard to maintain, in my beginner's opinion.

I believe users want a time series database that is efficient, fast, and requires low maintenance. Managing RPs and CQs with complex InfluxQL queries isn't obvious...

My comments may be irrelevant; I am still learning and reading the mailing list and GitHub issues to figure out how I can set up downsampling. Currently, I am starting to wonder why I waited so long for InfluxDB instead of just using Dixon's Graphite.

Still, InfluxDB is fast to spin up and play with using the tutorial, but it gets more complicated when you really want to do something with it.

jsternberg · 2016-07-30T23:16:06Z

Looking through old issues and I found this one. It seems related to #6910.

daviesalex · 2016-07-30T23:19:30Z

Agreed. I think this one can basically be closed as a dup of #6910

jsternberg · 2016-08-12T15:28:37Z

I'm going to close this in favor of #6910. If a new issue gets created for this, it will be mentioned in that issue.

liyichao changed the title ~~[feature request] auto select retention policies when select~~ [feature request] auto choose retention policies when select May 21, 2015

liyichao changed the title ~~[feature request] auto choose retention policies when select~~ [feature request] auto choose retention policies based on timestamp when select May 21, 2015

beckettsean added the RFC label May 26, 2015

beckettsean added this to the Longer term milestone Sep 17, 2015

beckettsean mentioned this issue Sep 17, 2015

Feature request: support queries that cross retention policies #3003

Closed

beckettsean added the area/queries label Sep 17, 2015

RKelln mentioned this issue Oct 22, 2015

grafana can support influxdb continuous queries some way grafana/grafana#420

Closed

beckettsean mentioned this issue Oct 28, 2015

Influxdb - Grafana stack is not scalable due to unsupported time roll up #4605

Open

beckettsean changed the title ~~[feature request] auto choose retention policies based on timestamp when select~~ [feature request] auto choose retention policies based on timestamp when querying Oct 28, 2015

jackzampolin added the kind/feature-request label Nov 3, 2015

beckettsean mentioned this issue Jan 5, 2016

[new page] open feature requests influxdata/docs.influxdata.com-ARCHIVE#64

Closed

Lupul added a commit to Lupul/influxdb-grafana-rp-proxy that referenced this issue Feb 27, 2016

Initial version for older influxdb ?~9.x

defaed7

Author : @PaulKuiper From: influxdata/influxdb#2625 (comment)

jsternberg closed this as completed Aug 12, 2016
