Start facilitating coefficient splitting #247
Conversation
@@ -659,7 +692,9 @@ def _compute_features(self, *xi: ArrayLike) -> FeatureMatrix:
Raises
------
ValueError:
-    If an invalid mode is specified or necessary parameters for the chosen mode are missing.
+    - If an invalid mode is specified or necessary parameters for the chosen mode are missing.
+    - In mode "conv", if the number of inputs to be convolved doesn't match the number of inputs
The init of mode="conv" and mode="eval" starts to diverge even more, and this is another argument in favor of having distinct classes: a basis that performs convolution and a basis for evaluation.
Separate PR, I would say, since there is a lot to change if we include the test. Do you have a suggestion for an API that may work, and for naming conventions?
Most of the machinery is shared, so probably a base class which they both inherit from, with an abstract method compute_features, and the two variants simply re-implement that.
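A rough sketch of what that split could look like (class names, constructor arguments, and the toy kernels below are purely illustrative, not the actual nemos API):

```python
from abc import ABC, abstractmethod

import numpy as np


class BasisBase(ABC):
    """Shared machinery; only compute_features differs between modes."""

    def __init__(self, n_basis_funcs: int):
        self.n_basis_funcs = n_basis_funcs

    @abstractmethod
    def compute_features(self, *xi):
        """Map input time series to a (n_samples, n_features) matrix."""


class EvalBasis(BasisBase):
    """Evaluate the basis at the sample points (toy polynomial basis)."""

    def compute_features(self, *xi):
        x = np.asarray(xi[0])
        return np.column_stack([x**k for k in range(self.n_basis_funcs)])


class ConvBasis(BasisBase):
    """Convolve the input with the basis kernels (toy delta kernels)."""

    def __init__(self, n_basis_funcs: int, window_size: int):
        super().__init__(n_basis_funcs)
        self.window_size = window_size

    def compute_features(self, *xi):
        x = np.asarray(xi[0])
        # causal convolution with n_basis_funcs dummy kernels, truncated
        # to the input length so both modes return (n_samples, n_features)
        kernels = np.eye(self.window_size)[: self.n_basis_funcs]
        return np.column_stack(
            [np.convolve(x, k, mode="full")[: len(x)] for k in kernels]
        )
```

Both subclasses share the constructor-level bookkeeping and expose the same `compute_features` signature, so downstream code is unaffected by the split.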
Refer to #202 for a discussion on the basis API
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##           development     #247      +/-   ##
===============================================
+ Coverage        96.73%   97.27%   +0.53%
===============================================
  Files               25       25
  Lines             2206     2199       -7
===============================================
+ Hits              2134     2139       +5
+ Misses              72       60      -12
===============================================
Co-authored-by: Sarah Jo Venditto <sjvenditto@gmail.com>
Co-authored-by: William F. Broderick <billbrod@gmail.com>
…nemos into coeff_parsing_support
…generalized-linear-models into coeff_parsing_support
Basis in conv mode allows for multidimensional inputs (intended for counts stored as a TsdFrame in pynapple); the extra dimension gets its own axis when splitting by feature. Example below:

>>> import nemos as nmo
>>> import pynapple as nap
>>> import numpy as np
>>> counts = nap.TsdFrame(t=np.arange(100), d=np.random.poisson(size=(100, 2)))
>>> basis = nmo.basis.BSplineBasis(5, mode="conv", window_size=50)
>>> X = basis.compute_features(counts)
>>> X.shape
(100, 10)
>>> basis.split_by_feature(X, axis=1)["BSplineBasis"].shape
(100, 2, 5)
I prefer that every basis have the method, for consistency. I can see a case in which one creates an additive basis in a script, but the number of components is variable, from 1 to N; if you have the method, you can use the same exact code to process a regular basis (1 component) and the rest. Secondly, reshaping correctly by input is handy. Without the method, in my Python example above, how do you split the feature axis correctly? X.reshape(100, 2, 5) or X.reshape(100, 5, 2)? That depends on the internals of the convolution, but with the method one doesn't need to memorize it.
Yes, totally
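The ambiguity can be seen with plain numpy (shapes borrowed from the example above; this is illustrative, not nemos internals): the two candidate reshapes group the flattened feature axis differently, so picking the wrong one silently mixes basis functions across inputs.

```python
import numpy as np

# fake flattened feature matrix: 100 samples, 2 inputs x 5 basis functions
X = np.arange(100 * 10).reshape(100, 10)

by_input = X.reshape(100, 2, 5)  # assumes layout (input, basis) on axis 1
by_basis = X.reshape(100, 5, 2)  # assumes layout (basis, input) on axis 1

# the two interpretations genuinely disagree: one is NOT just the
# transpose of the other, so guessing wrong scrambles the features
assert not np.array_equal(by_input, by_basis.transpose(0, 2, 1))
```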
Just the small changes to the docstring we discussed, then this looks good to me!
src/nemos/basis.py (Outdated)

(..., n_i, b_i, ...)
\]
$$

Here:
- $n_i$ is the number of inputs processed by the **i-th basis component**.
Suggested change:
- $n_i$ is the number of inputs processed by the **i-th basis component**.

Need an empty line here to render the docs correctly.
Summary

This PR starts the process of facilitating slicing the feature axis of either the model coefficients or of the design matrix.

In particular,

Added to `Basis`:
- `label` parameter: names the variable, write-only. The label of `AdditiveBasis` and `MultiplicativeBasis` combines the labels of the 1d bases.
- `n_basis_input` parameter: write-only; the shape of axis 1 of a 2D input to the basis. This happens in conv mode, for example when we convolve the counts of a neural population.
- `_get_feature_slicing` methods: for internal use, but they will make our life very easy. In `split_feature_axis` we could use `jax.tree_util.tree_map` to apply the slicing to any array and automatically get a dictionary, labeled in a meaningful way, containing the coefficients. If the arrays are numpy, the use of slices is very efficient, since it will create a dict of views of the array. It will also facilitate creating a "FeaturePytree" from an input in array or TsdFrame form.
- Example of `_get_feature_slicing`:
- Example of `split_by_feature`:
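The two examples above are elided in the scraped page; as a stand-in, here is a minimal numpy sketch of the kind of labeled slicing `_get_feature_slicing` could produce and `split_by_feature` could consume (the labels and column counts are made up for illustration, not taken from nemos):

```python
import numpy as np

# hypothetical slices for an additive basis whose two components
# contribute 5 and 3 columns to the feature axis
slices = {"BSplineBasis": slice(0, 5), "RaisedCosineBasis": slice(5, 8)}

X = np.random.randn(100, 8)  # design matrix (n_samples, n_features)

# applying the slices yields a well-labeled dict; with numpy arrays and
# basic slicing, each entry is a view into X, so no data is copied
split = {label: X[:, sl] for label, sl in slices.items()}

assert split["BSplineBasis"].shape == (100, 5)
assert split["RaisedCosineBasis"].shape == (100, 3)
assert split["BSplineBasis"].base is X  # a view, not a copy
```

The same dict-of-slices structure is what would make a `jax.tree_util.tree_map` application straightforward: the tree is just the dict, and the mapped function applies each slice to the array.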
Moved to `nemos.identifiability_constraint`:
- `apply_identifiability_constraints`, applied to each feature. This doesn't guarantee full rank but can be way faster, especially when the number of features is large, and can be applied in sequence: first a pass by component, then a full pass over the design matrix. It is also more numerically stable, since the dimensionality of each component can be much smaller than that of the overall design matrix, making the singular value computation and thresholding easier.