Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Improve conservative percentile calculation in live combining single significance fits #5005

Conversation

GarethCabournDavies
Copy link
Contributor

We realised that in some cases, the simple percentile of fit coefficient can be less conservative than the maximum likelihood method - particularly for separated fits with a large fit coefficient and a small number of triggers.

This is as the maximum likelihood calculation takes number of triggers into account, but the percentile does not, leading to low-trigger-count fits taking up much more importance than they should.

I have updated the calculation of the conservative fit coefficient to use a calculation more similar to the maximum likelihood method.

The standard calculation is mean(counts) / mean(counts / alpha)

I have updated the 'conservative' fit to use percentile(counts, 100 - p) / percentile(counts / alpha, p)

Using 100 - p for the numerator, but p for the denominator means a smaller fit coefficient, but I'm not sure if this skews the calculation too much.

As a result, I have included an alternative versiion using p for both as "ALT" in the results folder linked below (In fact, I think I prefer this).

Possible alternative methods

  • since numpy v2, the percentile function allows for weights. Weighting by number of triggers, we become conservative again and are able to simply use the old calculation. This also allows the weighting to be a command line option if wanted.
  • Include the errors from the fit coefficient calculation in the separated fit files, then combine the fits using alpha + sigma_alpha and count + sqrt(count) to give the conservative fit. This would remove the option of conservative percentile in the interface

See plots in https://ldas-jobs.ligo.caltech.edu/~gareth.cabourndavies/pycbclive/various_tests/single_significance_fits_weighted_percentile/

Standard information about the request

This is a new feature
This change affects the live search
This change changes, scientific output
This change follows style guidelines (See e.g. PEP8), has been proposed using the contribution guidelines

  • The author of this pull request confirms they will adhere to the code of conduct

@titodalcanton titodalcanton added the v27_release_branch Things to be added to the v2.7 release branch label Jan 15, 2025
@tdent
Copy link
Contributor

tdent commented Jan 16, 2025

Yes, I don't quite see an argument for 1-p in the counts percentile, for several of the cases it seems to be fluctuating too far the other way (ie from being too large due to an outlier the coeff becomes probably too small).

So I vote ALT ... btw, good thing I'm not German given that ALT means 'old' 🇩🇪

Copy link
Contributor

@tdent tdent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, see the suggested change

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
@tdent tdent enabled auto-merge (squash) January 17, 2025 13:47
@tdent tdent merged commit dd3f009 into gwastro:master Jan 17, 2025
30 checks passed
maxtrevor pushed a commit to maxtrevor/pycbc that referenced this pull request Feb 24, 2025
…significance fits (gwastro#5005)

* Just use percentile rather than mean in ML calculation for conservative fits

* Fix

* Use ALT

* Update bin/live/pycbc_live_combine_single_significance_fits

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>
maxtrevor added a commit that referenced this pull request Feb 25, 2025
#5059)

* Improve conservative percentile calculation in live combining single significance fits (#5005)

* Just use percentile rather than mean in ML calculation for conservative fits

* Fix

* Use ALT

* Update bin/live/pycbc_live_combine_single_significance_fits

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

---------

Co-authored-by: Thomas Dent <thomas.dent@usc.es>

* Sphinx version CI fix (#5060)

* Try this

* Pin to before 8.2.0

* add comment

---------

Co-authored-by: Gareth S Cabourn Davies <gareth.cabourndavies@ligo.org>
Co-authored-by: Thomas Dent <thomas.dent@usc.es>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
low latency v27_release_branch Things to be added to the v2.7 release branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants