Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Non-uniqueness in interval list operations #1269

Open
samuelbray32 opened this issue Mar 26, 2025 · 1 comment
Open

Non-uniqueness in interval list operations #1269

samuelbray32 opened this issue Mar 26, 2025 · 1 comment
Labels
bug Something isn't working common

Comments

@samuelbray32
Copy link
Collaborator

Describe the bug

  • Interval lists are not a formal class in spyglass, but are assumed to be a numpy array of start and stop times
  • We have not stated nor enforced that the intervals defined by these start and stop times must be non-overlapping
  • The function interval_list_contains and interval_list_contains_ind implicitly assume non-overlap of intervals.
  • This can lead to returning duplicates of a timestamp index or value if it falls within more than one interval

Solution Options

  1. Accept that intervals can be overlapping. Enforce uniqueness of the returned values using np.unique
  2. Enforce non-overlapping intervals. Make a function to check the validity of an interval list and run it within these types of interval-operation functions.

As a user, my assumptions had been that these functions would operate like solution (1). @edeno @CBroz1 do you have thoughts?

@samuelbray32 samuelbray32 added bug Something isn't working common labels Mar 26, 2025
@CBroz1
Copy link
Member

CBroz1 commented Mar 27, 2025

I would prefer (2). I'd like to set up IntervalList as a dataclass that carries its own operations like a set. We could then enforce norms in the init of the class. I think this would be much cleaner than our current appoach

Proposed:

# Just import data class and table
from spyglass.common.common_interval import IntervalTimes, IntervalList

# Option to manually construct, or build from fetch
foo = IntervalTimes(interval_list_name="foo", valid_times=mytimes1)
bar = IntervalTimes(**(IntervalList & key).fetch1())

# Make use of setwise operations
intersect = foo ^ bar # or foo.intersect(bar, unique=True)
intersect.name = 'new_name'

# Dump right back to the table
IntervalList().insert1(intersect.as_dict)

Status quo equivalent:

# manual management of which operations are needed
from spyglass.common.common_interval import interval_list_intersect, IntervalList

# explicit assignment of times objects w/o enforcement
mytimes1 = (IntervalList & key1).fetch1("valid_times")
mytimes2 = (IntervalList & key2).fetch1("valid_times")

# functional approach
intersect = interval_list_intersect(mytimes1, mytimes2)

# always reconstructing dicts of predictable structure
IntervalList().insert1(dict(interval_list_name='new_name', valid_times=intersect)

The trouble with enforcing something new is backwards compatibility. How does this new class handle violations? Just a warning?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working common
Projects
None yet
Development

No branches or pull requests

2 participants