feature request: Handling Inf in data #1694

Open
wpetry opened this issue Oct 16, 2024 · 2 comments

@wpetry

wpetry commented Oct 16, 2024

Description of current behavior

When Inf or -Inf is encountered in the data, brm passes these rows to Stan, which fails because it cannot evaluate the log probability (lp) at the initial values. I think the standard troubleshooting for this error is to specify init = ... and/or to use more informative priors, but Stan fails with the same error regardless of the initial values or priors specified. The failure looks like a fitting issue when in reality the source of the problem is in the data.

reprex:

library(brms)

x <- 0:100
mu <- 10 + 0.3 * x
y <- rnorm(length(x), mean = mu, sd = 2)  # the first argument of rnorm() is n, so mu must be passed as mean
dat <- data.frame(x, y)
dat$y[1] <- Inf

mod <- brm(y ~ 1 + x, data = dat)  # fails with Stan initialization error
mod2 <- lm(y ~ 1 + x, data = dat)  # base R regression gives a (somewhat) informative error in the same circumstance

Desired feature behavior

I think the best approach would be to stop the model fitting with an informative error instead of a warning. Infinite values are likely artifacts of errors during the calculation of variables and warrant re-examination before fitting any model (e.g., dividing by 0, log-transforming 0, etc.).
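
The strict behavior described above could be approximated on the user side with a pre-fit check. This is a minimal sketch in base R, not brms code: `stop_if_infinite()` is a hypothetical helper name, and it assumes every variable of interest appears by name in the formula and as a column in the data.

```r
# Hypothetical helper (not part of brms): stop before fitting if any
# numeric model variable contains Inf or -Inf.
stop_if_infinite <- function(formula, data) {
  # Only check variables that appear in the formula and exist in the data.
  vars <- intersect(all.vars(formula), names(data))
  bad <- vars[vapply(
    vars,
    function(v) is.numeric(data[[v]]) && any(is.infinite(data[[v]])),
    logical(1)
  )]
  if (length(bad) > 0) {
    stop(
      "Infinite values found in variable(s): ", paste(bad, collapse = ", "),
      ". Check how these variables were computed before fitting.",
      call. = FALSE
    )
  }
  invisible(data)
}

dat <- data.frame(x = 0:100, y = c(Inf, rnorm(100)))
# stop_if_infinite(y ~ 1 + x, dat)  # signals an error that names `y`
```

NA values pass this check untouched because `is.infinite(NA)` is `FALSE`, so the existing NA handling would not be affected.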

A softer approach would be to drop rows containing infinite values with a warning on the R side, then pass the cleaned data to Stan for fitting. This mirrors the handling of rows containing NA (absent user-specified imputation with mi()). I don't favor this approach because I'm not able to think of cases when it's still reasonable to fit a model after learning that some of the variable values are infinite.
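
For comparison, the softer behavior might look like the following base R sketch. `drop_infinite_rows()` is a hypothetical name, not a brms function, and the sketch only inspects numeric columns named in the formula.

```r
# Hypothetical helper (not part of brms): drop rows with Inf or -Inf in
# any numeric model variable and warn about how many rows were removed.
drop_infinite_rows <- function(formula, data) {
  vars <- intersect(all.vars(formula), names(data))
  num_vars <- vars[vapply(vars, function(v) is.numeric(data[[v]]), logical(1))]
  bad_row <- rep(FALSE, nrow(data))
  for (v in num_vars) {
    bad_row <- bad_row | is.infinite(data[[v]])
  }
  if (any(bad_row)) {
    warning(
      "Rows containing Inf or -Inf were excluded from the model: ",
      sum(bad_row),
      call. = FALSE
    )
  }
  data[!bad_row, , drop = FALSE]
}

dat <- data.frame(x = 0:100, y = c(Inf, rnorm(100)))
clean <- suppressWarnings(drop_infinite_rows(y ~ 1 + x, dat))
nrow(clean)  # one row fewer than nrow(dat)
```

The cleaned data frame could then be passed to `brm()` as usual; rows with NA are left in place for the existing NA handling.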

@paul-buerkner
Owner

Thank you for opening this issue! I will address it in the next brms update.

@paul-buerkner paul-buerkner added this to the brms 2.23.0 milestone Oct 17, 2024
@wds15
Contributor

wds15 commented Oct 17, 2024

Isn't brms sometimes using Inf to flag special values? That can be useful, and I do it myself at times. Throwing a warning is certainly appropriate, as Inf values can easily make Stan go crazy.
