[Feature Request] Better documentation of the estimated parameters for Gaussian processes #1725

jr-leary7 · 2025-01-15T20:22:49Z

Hello,

First of all, just want to say I greatly appreciate all the work on brms; it is fantastic software and I cannot imagine how much effort has gone into maintaining and improving it. You're doing a great job !!

My request is that the documentation of how Gaussian processes work in brms be fleshed out a bit. As it currently stands, I've visited pretty much every GP thread on the MC-Stan dev forums (and posted a few questions of my own), read the documentation of the gp() function in brms, and even bought Bayesian Data Analysis 3rd Edition and read the chapter on GPs (Ch. 20 if I remember correctly). Despite all this, I'm still rather confused as to what the syntax brms uses is. Some of the parameters make sense e.g., lscale is the estimated length-scale of the GP. Some are less clear, such as the parameters named following zgp_gpxy[i] (when fitting a multivariate GP with location variables named x and y). I assume the numeric indicator i is a function of the number of basis functions approximating the GP $k$, and the number of dimensions in the multivariate GP, but I'm unsure what z denotes or how to summarize the parameter information.

For context (not expecting you to solve my current problem here, as I'm aware the MC-Stan forums are the more appropriate place to ask for help), I am working with spatial transcriptomics data and I'm trying to model gene expression (a Negative-binomially distributed count) as a function of a multivariate GP (approximated as a Hilbert space by $k = 20$ basis functions) based on location in Euclidean space, with individual GPs fit for each gene (an unordered factor). Thus, the model I'm fitting is as follows below, assuming the data are stored in a variable called expr_df:

# load libraries 
library(brms) 
library(cmdstanr)
# set up formula and fit model 
model_formula <- bf(count ~ 1 + gp(x, y, by = gene, k = 20L, cov = "exp_quad", scale = TRUE), 
                    shape ~ 1)
brms_fit <- brm(model_formula, 
                data = expr_df,
                family = brms::negbinomial(link = "log", link_shape = "log"),
                chains = 1L,
                iter = 1500L,
                warmup = 250L,
                thin = 5L,
                cores = 1L,
                threads = 4L,
                normalize = FALSE, 
                silent = 1,
                backend = "cmdstanr",
                algorithm = "meanfield",
                stan_model_args = list(stanc_options = list("O1")), 
                seed = 312)

I hope this a reasonable request (and please let me know if it isn't !); the GP additions in the latest dev version of brms are awesome and I know I and others find them quite useful. I'm happy to contribute if possible r.e. sketching out other use cases, contributing code for examples, etc. if that would be helpful.

Thanks !

Jack
University of Florida

The text was updated successfully, but these errors were encountered:

paul-buerkner · 2025-01-16T09:41:57Z

I think requests to extend the doc are very reasonable. I am not sure though when I will have time for it myself at the moment. If you have concrete thoughts on what to add based on what you had to figure out yourself from other sources, I would be happy to review a PR adding some more details to the docs.

jr-leary7 · 2025-01-17T03:17:40Z

that makes a lot of sense - i know maintaining & expanding software while keeping up an academic career is a huge task. i think something along the lines of even just explaining what the estimated parameters are and how to interpret them would be very helpful, and the rest could be left to the forums. for example, adding a short table to the gp() docs like so (i'm leaving the parts unclear to me marked with a ?):

parameter	name	notation	interpretation
`lscale`	length-scale	$\ell$	distance over which points in the input space are strongly correlated
`sdgp`	standard deviation	$\sigma$	typical magnitude of variation in the GP
`zgp`	?	$z$	?

some very basic guidance on prior choice for relevant parameters would be useful too -- i could certainly gather some of this info from the forums and the literature and compile a summary.

paul-buerkner · 2025-01-17T07:09:51Z

yeah, makes sense. This may be something I can quickly add before the next release. If you happen to have time for a PR beforehand (even just the parts that you know), feel free and I can then edit in the remaining parts.

jr-leary7 · 2025-01-17T18:13:27Z

thanks so much ! where would be your preferred place for me to put the documentation ? the first thing that comes to mind is in the roxygen doc section of the gp() function, but i can do whatever works best for you.

paul-buerkner · 2025-01-17T18:34:24Z

yes that would be a good place! Jack Leary ***@***.***> schrieb am Fr., 17. Jan. 2025, 20:13:

…

thanks so much ! where would be your preferred place for me to put the documentation ? the first thing that comes to mind is in the roxygen doc section of the gp() function, but i can do whatever works best for you. — Reply to this email directly, view it on GitHub <#1725 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADCW2AEIQXG6B5IX4ZW3W7T2LFB55AVCNFSM6AAAAABVICDHBOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKOJYHEZDSNZXGY> . You are receiving this because you commented.Message ID: ***@***.***>

jr-leary7 · 2025-01-19T19:27:59Z

as shown above i added some details on each hyperparameter to the docs; let me know if anything looks amiss and i can correct it.

jr-leary7 · 2025-01-20T00:10:54Z

just realized i made a slight mistake in the code sent in the PR, i used $\rho$ to refer to the length-scale, which i should have denoted $\ell$, in lines 82-83 of formula-gp.R. This is because I had just a few minutes prior written a line about how the MC-Stan docs denote the length-scale as $\rho$ instead.

paul-buerkner · 2025-01-20T06:07:53Z

you can always push more commits to the PR so you can just fix it Jack Leary ***@***.***> schrieb am Mo., 20. Jan. 2025, 02:11:

…

just realized i made a slight mistake in the code sent in the PR, i used $\rho$ to refer to the length-scale, which i should have denoted $\ell$, in lines 82-83 of formula-gp.R. This is because I had just a few minutes prior written a line about how the MC-Stan docs denote the length-scale as $\rho$ instead. — Reply to this email directly, view it on GitHub <#1725 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADCW2AEKKS5BJE3KX4FBXVD2LQ5KHAVCNFSM6AAAAABVICDHBOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMMBRGA4DMNZRHE> . You are receiving this because you commented.Message ID: ***@***.***>

jr-leary7 · 2025-01-20T22:38:54Z

All done !

paul-buerkner added the documentation label Jan 16, 2025

paul-buerkner added this to the brms 2.23.0 milestone Jan 17, 2025

jr-leary7 mentioned this issue Jan 19, 2025

updated docs of GP formula fn to include more details on hyperparameters #1726

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature Request] Better documentation of the estimated parameters for Gaussian processes #1725

[Feature Request] Better documentation of the estimated parameters for Gaussian processes #1725

jr-leary7 commented Jan 15, 2025

paul-buerkner commented Jan 16, 2025

jr-leary7 commented Jan 17, 2025 •

edited

Loading

paul-buerkner commented Jan 17, 2025

jr-leary7 commented Jan 17, 2025

paul-buerkner commented Jan 17, 2025 via email

jr-leary7 commented Jan 19, 2025

jr-leary7 commented Jan 20, 2025

paul-buerkner commented Jan 20, 2025 via email

jr-leary7 commented Jan 20, 2025

[Feature Request] Better documentation of the estimated parameters for Gaussian processes #1725

[Feature Request] Better documentation of the estimated parameters for Gaussian processes #1725

Comments

jr-leary7 commented Jan 15, 2025

paul-buerkner commented Jan 16, 2025

jr-leary7 commented Jan 17, 2025 • edited Loading

paul-buerkner commented Jan 17, 2025

jr-leary7 commented Jan 17, 2025

paul-buerkner commented Jan 17, 2025 via email

jr-leary7 commented Jan 19, 2025

jr-leary7 commented Jan 20, 2025

paul-buerkner commented Jan 20, 2025 via email

jr-leary7 commented Jan 20, 2025

jr-leary7 commented Jan 17, 2025 •

edited

Loading