Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

bug: median absolute deviation is always zero in case of odd number of samples #234

Closed
pallosp opened this issue Jan 29, 2025 · 7 comments · Fixed by #236
Closed

bug: median absolute deviation is always zero in case of odd number of samples #234

pallosp opened this issue Jan 29, 2025 · 7 comments · Fixed by #236

Comments

@pallosp
Copy link
Contributor

pallosp commented Jan 29, 2025

The root cause is that aggFn expects a sorted list, while absoluteDeviations is not sorted.

export const getStatisticsSorted = (samples: number[]): Statistics => {
  ...
  const p50 = medianSorted(samples)
  return {
    mad: absoluteDeviation(samples, medianSorted, p50),
    ...
  }
}

const absoluteDeviation = (
  samples: number[],
  aggFn: (arr: number[]) => number | undefined,
  aggValue = aggFn(samples)
) => {
  const absoluteDeviations: number[] = []

  for (const sample of samples) {
    // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
    absoluteDeviations.push(Math.abs(sample - aggValue!))
  }

  return aggFn(absoluteDeviations)
}
@jerome-benoit
Copy link
Collaborator

getStatisticsSorted() is called only on a sorted sample. I do not understand how the number of samples is related to the MAD.

@pallosp
Copy link
Contributor Author

pallosp commented Jan 29, 2025

I'm talking about absoluteDeviations.

For example, when samples are [1, 2, 3]

  1. p50 will be 2
  2. absoluteDeviation will be called with the ([1, 2, 3], medianSorted, 2) params
  3. absoluteDeviations will be populated with [1, 0, 1]
  4. MAD will be medianSorted([1, 0, 1]), which returns the 0 instead of 1

@jerome-benoit
Copy link
Collaborator

absoluteDeviations() only call site location is in getStatisticsSorted(). And getStatisticsSorted() is always called on a sorted sample.

@pallosp
Copy link
Contributor Author

pallosp commented Jan 29, 2025

getStatisticsSorted() calls absoluteDeviation(), which inside calls medianSorted() with an unsorted list

@jerome-benoit
Copy link
Collaborator

I see it now :)

@pallosp
Copy link
Contributor Author

pallosp commented Jan 29, 2025

I sent a PR with the fix

@jerome-benoit
Copy link
Collaborator

fixed in #236

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants