-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Mixed effects model to convert irregular data to basis expansion #618
base: develop
Are you sure you want to change the base?
Mixed effects model to convert irregular data to basis expansion #618
Conversation
…to_basis_mixed_effects_method
…to_basis_mixed_effects_method
…asis_mixed_effects_method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @pcuestas, I hope everything is ok. First, I have to apologize for taking so much time in reviewing your work in depth. In general it is really good, but I have found some minor things that could be improved.
Of course, I would totally understand if you do not have the time or will to review this after so much time. In that case, please just tell me and I will try to do the changes myself and merge it.
) | ||
def test_irregular_sample( | ||
fdata_fixture: str, request: Any | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A docstring explaining the test is missing here.
n_points_range: Tuple[int], | ||
random_state: np.random.RandomState, | ||
) -> FDataIrregular: | ||
"""Generate samples of functions at random points with gaussian noise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Generate samples of functions at random points with gaussian noise. | |
"""Generate samples of functions at random points with Gaussian noise. |
|
||
|
||
def _linalg_solve( | ||
a: NDArrayFloat, b: NDArrayFloat, *, assume_a: str = 'gen' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One parameter per line:
a: NDArrayFloat, b: NDArrayFloat, *, assume_a: str = 'gen' | |
a: NDArrayFloat, | |
b: NDArrayFloat, | |
*, | |
assume_a: str = "gen", |
try: | ||
return scipy.linalg.solve(a=a, b=b, assume_a=assume_a) # type: ignore | ||
except scipy.linalg.LinAlgError: | ||
# TODO: is the best way to handle this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually have something in https://github.com/GAA-UAM/scikit-fda/blob/develop/skfda/misc/lstsq.py that could probably be reused.
) # type: ignore | ||
|
||
|
||
class _MixedEffectsParams(Protocol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a protocol with just one implementer?
def objective_function(params_vec: NDArrayFloat) -> float: | ||
return - model.profile_loglikelihood( | ||
params=MinimizeMixedEffectsConverter.Params.from_vec( | ||
params_vec, dim_effects, model=self, has_mean=has_mean, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One param per line:
params_vec, dim_effects, model=self, has_mean=has_mean, | |
params_vec, | |
dim_effects, | |
model=self, | |
has_mean=has_mean, |
X: FDataIrregular, | ||
) -> FDataBasis: | ||
"""Transform X to FDataBasis using the fitted converter.""" | ||
if self.result is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a scikit-learn method for that: sklearn.utils.validation.check_is_fitted
covariance=params.covariance, | ||
sigmasq=params.sigmasq, | ||
) | ||
self.result = Bunch( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameters set on fit
must end in underscore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also not convinced that we want to expose this as a Bunch
object.
) | ||
) | ||
for basis_eval, Sigma, random_effect in zip( | ||
model.basis_evaluations, values_cov, random_effects, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One parameter per line:
model.basis_evaluations, values_cov, random_effects, | |
model.basis_evaluations, | |
values_cov, | |
random_effects, |
basis = FourierBasis(n_basis=5, domain_range=(0, 10)) | ||
model = _MixedEffectsModel(fdatairregular, basis) | ||
|
||
# These values have been obtained with Statsmodels' MixedLM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would have been good to have in the docstring the lines of code that generate these data, if possible.
Hello @vnmabus, I also hope you are doing great. Thanks for the review. I am very occupied with the master's until March 29th. I can do the changes then if waiting for a couple of weeks is OK. |
I am glad that you are doing well! Of course it is ok, take the time you need. In case you change your mind, again just tell me. |
References to issues or other PRs
This branch includes the changes not yet merged into develop from Implement scores for
FDatairregular
objects as described in #609 #610Problems found when evaluating FDataIrregular objects in some cases are described in Evaluation of FDataIrregular observation with only one point not working #616 and Evaluation of
FDataIrregular
objects with multidimensional domain not working #617Describe the proposed changes
1. Mixed-effects model
Implementation of the mixed effects method for converting irregularly sampled functional data to basis expansion.
This implementation includes the addition of a new module: skfda.representation.conversion, which is meant for including classes (transformers) that convert between different FData sublasses.
For this type of conversion (FDataIrregular to FDataBasis) two classes have been added: EMMixedEffectsConverter and MinimizeMixedEffectsConverter, each of which implements the conversion with a different method to fit the MLE of the mixed-effects model.
The EMMixedEffectsConverter uses the EM algorithm and the MinimizeMixedEffectsConverter uses generic optimizers from scipy.optimize.minimize to find maxima of the loglikelihood function.
2. Create an FDataIrregular object from another FData
A function:
irregular_sample
has been added to skfda.datasets, which creates an irregular sample by randomly selecting points from another FData object.Additional information
All implementation added has been adapted (and tested) for multidimensional domains and codomains.
However, some problems have been found regarding this topic and are described in #616 and #617 .
Checklist before requesting a review