This repository was archived by the owner on Dec 18, 2019. It is now read-only.

Series with periodic #85

Open
hit9 opened this issue May 8, 2014 · 7 comments

Comments

@hit9

hit9 commented May 8, 2014

For example, take a series with a period of 1 day: the value at 12:00 is a peak (e.g. 1000) and the value at 0:00 is 10. So 1000 at 12:00 should be normal, and 10 at 12:00 should be anomalous.

But Skyline treats 10 as normal.
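
To make the problem concrete, here is a minimal sketch of a global 3-sigma check on a synthetic daily series. This is not Skyline's actual code: the series shape and the names `daily_value` and `is_anomalous_global` are assumptions made up for illustration.

```python
import math
import statistics

def daily_value(hour):
    """Hypothetical smooth daily cycle: 10 at 0:00, 1000 at 12:00."""
    return 505 - 495 * math.cos(2 * math.pi * hour / 24)

# 14 days of hourly datapoints.
history = [daily_value(h % 24) for h in range(14 * 24)]

def is_anomalous_global(history, value, n_sigma=3):
    """Flag value if it lies more than n_sigma deviations from the global mean."""
    mean = statistics.mean(history)
    std = statistics.pstdev(history)
    return abs(value - mean) > n_sigma * std

# A reading of 10 arriving at 12:00, where about 1000 is expected.
flagged = is_anomalous_global(history, 10)
```

Here `flagged` is `False`: the check ignores the time of day, so a value of 10 sits well inside the global spread of the series even though it is far from the usual noon value.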

@astanway
Contributor

astanway commented May 8, 2014

Pull requests accepted...

Seasonal algorithms are hard to automatically fit. Working on it, though...

@hit9
Author

hit9 commented May 8, 2014

One approach is to use a Fast Fourier Transform to detect the series' period, fetch the datapoints at the same phase, and then analyze that new dataset.

I am looking into it now.
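
The idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the period is rounded to a whole number of samples, and a plain 3-sigma test stands in for whatever analysis is run on the same-phase dataset.

```python
import numpy as np

def detect_period(x):
    """Estimate the dominant period, in samples, from the real FFT."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    spectrum[0] = 0.0  # ignore any residual DC component
    k = int(np.argmax(spectrum))
    if k == 0:
        return None  # flat series, no period found
    return int(round(len(x) / k))

def same_phase_points(x, period):
    """Return earlier datapoints at the same phase as the newest one."""
    x = np.asarray(x, dtype=float)
    last = len(x) - 1
    # Step back from the newest point one period at a time (newest excluded).
    idx = np.arange(last - period, -1, -period)
    return x[idx]

def is_anomalous(x, n_sigma=3.0):
    """3-sigma test of the newest point against its same-phase history."""
    x = np.asarray(x, dtype=float)
    period = detect_period(x)
    if period is None:
        return False
    ref = same_phase_points(x, period)
    if len(ref) < 2:
        return False  # not enough history at this phase
    return bool(abs(x[-1] - ref.mean()) > n_sigma * ref.std())
```

With hourly data and a daily cycle, the newest 12:00 reading is compared only against earlier 12:00 readings, so 10 at noon stands out while 1000 at noon does not. Note that the FFT estimate is only exact when the window holds a whole number of periods; otherwise the rounded period can be off by a sample or more.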

@astanway
Contributor

astanway commented May 8, 2014

Yep! That's what I was leaning towards - use FFT to get periodicity, and maybe use that to populate an ARIMA or use a KS test along windowed intervals? cc @toufic

@hit9
Author

hit9 commented May 8, 2014

I'm not sure about the last question, but for detecting periodicity I found some information here: http://stackoverflow.com/questions/15261122/determine-frequency-from-signal-data-in-matlab

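This function may help:

import numpy as np

def guess_period(x):
    x = np.array(x)
    n = np.size(x)
    m = np.mean(x)
    # Magnitude spectrum of the mean-removed series (removes the DC term).
    p = np.abs(np.fft.fft(x - m))
    # Index of the strongest frequency bin.
    i = np.argmax(p)
    if i:
        # Bin i means i cycles over n samples, i.e. a period of n / i.
        return n / float(i)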

This can give a series' period, but it fails in some cases:

>>> x = [1, 20, 2, 20, 1, 21, 2, 22, 1, 19]
>>> guess_period(x)
2.0
>>> import itertools
>>> source = itertools.cycle([1, 10, 20, 10, 1])
>>> x = [source.next() for _ in range(101)]
>>> guess_period(x)
5.05
>>> x = [source.next() for _ in range(103)]
>>> guess_period(x)
4.904761904761905
>>> x = [source.next() for _ in range(105)]
>>> guess_period(x)
1.25  # fails
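
A likely cause of the 1.25 result: for real-valued input, the magnitudes returned by `np.fft.fft` are symmetric, so bin `i` and bin `n - i` carry the same peak, and `argmax` can land in the mirrored upper half (here 105 / 1.25 = 84 = 105 - 21, and 105 / 21 = 5). A minimal sketch of a variant that searches only the non-negative frequencies with `np.fft.rfft` is shown below; the name `guess_period_rfft` is made up for this example.

```python
import itertools
import numpy as np

def guess_period_rfft(x):
    """Like guess_period, but only looks at non-negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # rfft returns bins 0..n//2 only, so the mirrored peak at n - i is never seen.
    p = np.abs(np.fft.rfft(x - x.mean()))
    i = int(np.argmax(p))
    if i:
        return n / float(i)
    return None  # flat series, no period found

source = itertools.cycle([1, 10, 20, 10, 1])
x = [next(source) for _ in range(105)]
period = guess_period_rfft(x)
```

For this 105-point input `period` is 5.0. The estimate is still only exact when the window length is a multiple of the true period, which is why the 101- and 103-point runs above give values near, but not equal to, 5.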

I think we can maintain a dict ({period: hit_times}); the period that is hit most often wins.
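
A rough sketch of that voting idea, under a few assumptions that are not from the thread: estimates are taken over several trailing windows, each estimate is rounded to a whole number of samples before counting, and the names `vote_period` and `window_sizes` are hypothetical.

```python
from collections import Counter
import itertools
import numpy as np

def guess_period(x):
    """Dominant period from the real FFT, or None for a flat series."""
    x = np.asarray(x, dtype=float)
    p = np.abs(np.fft.rfft(x - x.mean()))
    i = int(np.argmax(p))
    return x.size / float(i) if i else None

def vote_period(x, window_sizes):
    """Estimate the period on several trailing windows; the most frequent wins."""
    hits = Counter()  # plays the role of {period: hit_times}
    for size in window_sizes:
        if size > len(x):
            continue  # window longer than the available history
        period = guess_period(x[-size:])
        if period is not None:
            hits[int(round(period))] += 1  # rounding is an assumption
    if not hits:
        return None, hits
    return hits.most_common(1)[0][0], hits

source = itertools.cycle([1, 10, 20, 10, 1])
x = [next(source) for _ in range(120)]
best, hits = vote_period(x, [101, 103, 105, 110, 117])
```

In this example every window votes for a period of 5, even though the individual estimates (such as 5.05 for the 101-point window) differ slightly before rounding.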

@astanway
Contributor

astanway commented May 8, 2014

Awesome. You can use Crucible (github.com/astanway/crucible) to refine the algorithm.

@hit9
Author

hit9 commented Jul 16, 2014

Any progress on this?

@hit9
Author

hit9 commented Sep 1, 2014

Hi @astanway, I have created another monitor similar to Skyline: https://github.com/eleme/node-bell. It only handles periodic metrics, and the only algorithm it uses is 3-sigma. Thanks for this project, it gave me a lot of ideas!
