Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add group-level weights argument pw to gr() and mm(), for #1719 #1727

Merged
merged 12 commits into from
Feb 24, 2025

Conversation

bschneidr
Copy link
Contributor

This is a PR with draft implementation and tests of a weights argument in gr(), for the feature request in #1719. Would you mind reviewing Paul with any suggestions?

In the Stan code and data, the group-level weights are denoted GMW_{id} (for "group model weights")- I wasn't sure what the best way to name them in a way that fits the notation used elsewhere. The weights name in the Stan code is already used by the ad term resp | weights(w), and the mm() weights use the W_ syntax. So I wanted to pick a name for the group-level weights that was clearly different from the other existing names for different types of weights in the Stan code.

@paul-buerkner
Copy link
Owner

Thanks! Looks pretty good already upon first glance. I quickly went over it and added some comments/questions.

@bschneidr
Copy link
Contributor Author

Thanks Paul- can you share a link to the added comments/questions? I don't see them anywhere (for example at the "Changed files" list ) after some searching in different places

@paul-buerkner
Copy link
Owner

Sorry forgot to press "submit review" :-D

@bschneidr
Copy link
Contributor Author

Thank you Paul- I've gone through and addressed each of those comments with some new commits and a statistical explanation in one place. I think that should resolve all the points you raised, but I'm happy to make further changes.

@paul-buerkner
Copy link
Owner

Thank you! Looks good for the most parts. Just two more comments remain.

…else()` function with `if else` control flow.
@bschneidr
Copy link
Contributor Author

Thanks Paul! I've added a new commit to incorporate those changes.

@paul-buerkner
Copy link
Owner

paul-buerkner commented Jan 31, 2025 via email

@bschneidr
Copy link
Contributor Author

I think that's a good idea to let mm() have weights too, and to disambiguate the names in that case: the name pw for prior or weights seems good to me in terms of clarifying the weights' purpose.

Do you think this kind of call makes sense?

y ~  x1 + ( 1 | mm(g1, g2, weights = cbind(w1, w2), pw = cbind(grpw1, grpw2))

I'm not sure if it would be necessary to do pw = cbind(grpw1, grpw2) instead of just pw = grpw; that is, if there are some groups that will only show up in g2 and not g1. Do you know if that's the case?

@paul-buerkner
Copy link
Owner

paul-buerkner commented Jan 31, 2025 via email

@paul-buerkner paul-buerkner added this to the brms 2.23.0 milestone Feb 3, 2025
@bschneidr
Copy link
Contributor Author

Got it, thanks for clarifying Paul. I've added a couple commits that do the following:

469f3eb: Consistently uses the argument name pw in both gr() and mm(), and in the generated Stan code represents these weights as PW (previous commits used GMW).
37763af Adds the pw argument to mm(), along with accompanying unit tests.

Can you take a look and let me know if there's anything else you'd like me to add?

@bschneidr bschneidr changed the title Add weights argument to gr(), for #1719 Add weights argument to gr() and mm(), for #1719 Feb 23, 2025
@bschneidr bschneidr changed the title Add weights argument to gr() and mm(), for #1719 Add group-level weights argument pw to gr() and mm(), for #1719 Feb 23, 2025
@paul-buerkner
Copy link
Owner

Thank you! I will take another look again make few remaining edits before merging. :-)

@paul-buerkner
Copy link
Owner

paul-buerkner commented Feb 24, 2025

I have now cleaned up the PR a bit, especially the mm part. The inconsitency induced by the matrix weights required a bit too much code duplication for my taste, so I changed things back a bit. pw in mm is not yet super safe to use so I labled it as experimental. The chances of people actually using these features together is very small and things will not fail silently so I guess people can complain if they need it and it doesn't run.

Can you double check that the code is still correct even after my edits? Once you give the okay, I will merge this PR.

@bschneidr
Copy link
Contributor Author

Thanks, Paul. I think it's a good idea to clean it up. Unfortunately I think the mm() processing may be too clean right now. It errors out on this small reprex.

# Load the package
devtools::load_all()

# Generate multimembership data with prior weights
dat <- data.frame(
  y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100),
  mw1 = rep(0.7, times = 100), mw2 = rep(0.3, times = 100),
  g1 = sample(1:10, 100, TRUE), g2 = sample(1:10, 100, TRUE)
)
pw_of_groups <- runif(n = 10, min = 0.9, max = 1.1) 
dat[['g1w']] <- pw_of_groups[dat[['g1']]]
dat[['g2w']] <- pw_of_groups[dat[['g2']]]
 
# multi-membership model with two members per group and equal weights
mm_standata <- brms::make_standata(
  y ~ x1 + (1|mm(g1, g2, weights = cbind(mw1, mw2), pw = cbind(g1w, g2w))), 
  data = dat
)
#> Warning: Support for prior weights in multimembership terms is experimental.
#> Error in tapply(X = group_prior_weights, INDEX = J, FUN = function(x) length(unique(x)) == : arguments must have same length

I think instead of trying to depend on the last grouping vector J to work (which I think it won't if it's missing one of the groups), it would be better to just reconstruct J so that it meets the requirements for the subsequent data processing. Here's a proposed update that simply reconstructs J by retrieving the output that was previously stored in out.

Here's a proposed update:

    # prepare data for group prior weights if specified
    if (nzchar(id_reframe$gcall[[1]]$pw)) {      
      if (id_reframe$gtype[1] == "mm") {
        warning2("Support for prior weights in multimembership terms is experimental.")
        J <- numeric()
        for (i in seq_along(gs)) {
          J <- c(J, out[[paste0("J_", idresp, "_", i)]])
        }
      }

Would you be open to me adding a commit with this change, and then updating the unit tests accordingly?

@paul-buerkner
Copy link
Owner

I think that's great! Would a simple J <- unlist(out[paste0("J_", idresp, "_", seq_along(gs))]) also work?

@bschneidr
Copy link
Contributor Author

Thanks! That's cleaner still.

I just put that update into the latest commit and updated the unit test for the data. I tweaked the unit test so that the multimembership group data is a little more complex: some levels only appear in g1 or g2 and not both. So now the unit test confirms that the code is correctly picking up groups and group weights from all the grouping variables.

My preference would be to leave the warning message out since there's a working unit test for it, but I think it's OK to leave it in.

Now that I've seen the unit tests are passing and I've tried this out with example data, I'm comfortable with the PR being merged whenever you choose.

@paul-buerkner
Copy link
Owner

Perfect! I made some final edits and the PR is now ready for merging. Thank you for contributing to brms! I highly appreciate it!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants