Return more things #18
Interesting! This sounds like a good feature to add. Perhaps we could take advantage of the `LAST_MCMC` environment. For the moment, take a look at this example, in which I compute the priors and keep track of the last value in `LAST_MCMC`:

library(fmcmc)
set.seed(1231)
n <- 1e3
pars <- c(mean = 2.6, sd = 3)
# Generating data and writing the log likelihood function
D <- rnorm(n, pars[1], pars[2])
fun <- function(x) {
# Computing the prior, and storing it in LAST_MCMC
priorval <- dnorm(x, c(2, 3), log = TRUE)
LAST_MCMC$priorval <- priorval
sum(dnorm(D, x[1], x[2], log = TRUE)) + sum(priorval)
}
# Calling MCMC, but first, loading the coda R package for
# diagnostics
library(coda)
ans <- MCMC(
fun, initial = c(mu=1, sigma=1), nsteps = 2e3,
kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0)
)
last_("priorval")
#> mu sigma
#> -1.0795954 -0.9221123

Created on 2021-05-20 by the reprex package (v2.0.0)

I will add this example to the manual, thanks!
This looks cool, but right now it can store only single values, right? I was thinking of returning values of several (derived) parameters at every iteration, so that the final object would not only contain parameter draws, but also these computed values for every iteration. I am doing some parameter transforms and would love to report my transformed parameter alongside the pretty boring uniform prior. You could probably report "generated values" as well, if you draw a random value at every iteration using the drawn parameter value. This could be useful for prior predictive checks. BTW, Stan reports lp__, which is a vector of log-likelihoods; did you ever consider recording it in the fit object as well?
That is possible too. For example, if you wanted to keep values for each iteration, you'd need to replace the assignment with a call to `rbind()`:

library(fmcmc)
set.seed(1231)
n <- 1e3
pars <- c(mean = 2.6, sd = 3)
# Generating data and writing the log likelihood function
D <- rnorm(n, pars[1], pars[2])
fun <- function(x) {
# Computing the prior, and storing it in LAST_MCMC
priorval <- dnorm(x, c(2, 3), log = TRUE)
LAST_MCMC$priorval <- rbind(LAST_MCMC$priorval, priorval)
sum(dnorm(D, x[1], x[2], log = TRUE)) + sum(priorval)
}
# Calling MCMC, but first, loading the coda R package for
# diagnostics
library(coda)
ans <- MCMC(
fun, initial = c(mu=1, sigma=1), nsteps = 2e3,
kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0)
)
head(last_("priorval"))
#> mu sigma
#> priorval -1.418939 -2.918939
#> priorval -1.418939 -2.918939
#> priorval -1.429507 -2.657561
#> priorval -1.429036 -2.526366
#> priorval -1.547106 -2.339950
#> priorval -1.688518 -2.572419

Created on 2021-06-07 by the reprex package (v2.0.0)

Regarding returning the vector of computed likelihoods, I can see how beneficial that could be, especially if you are dealing with a function that is particularly slow to compute. I'll add that to the next iteration.
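The "generated values" idea from your comment fits the same pattern. A minimal sketch (hypothetical; `D` is the simulated data from the examples above, and `yrep` is an illustrative name):

fun <- function(x) {
  # Draw one replicated observation per iteration using the current parameters
  LAST_MCMC$yrep <- rbind(LAST_MCMC$yrep, rnorm(1, x[1], x[2]))
  sum(dnorm(D, x[1], x[2], log = TRUE))
}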
Hi and thank you for this. Extremely helpful.
library(fmcmc)
library(coda)
set.seed(1231)
n <- 1e3
pars <- c(mean = 2.6, sd = 3)
# Generating data and writing the log likelihood function
D <- rnorm(n, pars[1], pars[2])
fun <- function(x) {
# Storing the proposed parameter values at every call
LAST_MCMC$prior <- rbind(LAST_MCMC$prior, x)
# Computing the prior, and storing it in LAST_MCMC
priorval <- dnorm(x, c(2, 3), log = TRUE)
LAST_MCMC$priorval <- rbind(LAST_MCMC$priorval, priorval)
loglik <- sum(dnorm(D, x[1], x[2], log = TRUE)) + sum(priorval)
LAST_MCMC$loglik <- rbind(LAST_MCMC$loglik, loglik)
loglik
}
ans <- MCMC(
fun, initial = c(mu=1, sigma=1), nsteps = 2e3,
kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0)
)
dim(last_("prior"))
#> [1] 2002 2
head(last_("prior"))
#> mu sigma
#> x 1.0000000 1.000000
#> x 1.0000000 1.000000
#> x 0.9894871 1.135263
#> x 0.9899534 1.206998
#> x 0.8791363 1.314170
#> x 0.7593717 1.181495
dim(last_("priorval"))
#> [1] 2002 2
head(last_("priorval"))
#> mu sigma
#> priorval -1.418939 -2.918939
#> priorval -1.418939 -2.918939
#> priorval -1.429507 -2.657561
#> priorval -1.429036 -2.526366
#> priorval -1.547106 -2.339950
#> priorval -1.688518 -2.572419
dim(last_("loglik"))
#> [1] 2002 1
head(last_("loglik"))
#> [,1]
#> loglik -6431.756
#> loglik -6431.756
#> loglik -5336.329
#> loglik -4902.609
#> loglik -4495.771
#> loglik -5317.829
head(ans)
#> Markov Chain Monte Carlo (MCMC) output:
#> Start = 1
#> End = 7
#> Thinning interval = 1
#> mu sigma
#> 1 0.9894871 1.135263
#> 2 0.9899534 1.206998
#> 3 0.8791363 1.314170
#> 4 0.8791363 1.314170
#> 5 0.8791363 1.314170
#> 6 0.8791363 1.314170
#> 7 1.1162015 1.397068

Created on 2021-06-23 by the reprex package (v2.0.0)

To reset the stored values between runs:
LAST_MCMC$prior <- NULL
LAST_MCMC$priorval <- NULL
LAST_MCMC$loglik <- NULL

Again, thanks for this tip. This looks really helpful and extendable.
All great points. Let me go one by one:

Sample size not matching

Yes, there is a difference. The current version actually calls the main function two times before entering the MCMC loop; thus, if running without thinning or burn-in, you should have two extra observations. Still, you can know the place of the step by accessing the calling environment with `parent.frame()`:

library(fmcmc)
set.seed(1231)
n <- 1e3
pars <- c(mean = 2.6, sd = 3)
# Generating data and writing the log likelihood function
D <- rnorm(n, pars[1], pars[2])
fun <- function(x) {
# Getting the calling frame. This function is wrapped by `MCMC`, so we need
# to go two levels up to take a look at the environment where `MCMC` is calling
# the function.
mcmc_env <- parent.frame(2) # This should contain -i-, the step number
# Computing the prior, and storing it in LAST_MCMC
priorval <- dnorm(x, c(2, 3), log = TRUE)
LAST_MCMC$priorval <- rbind(
LAST_MCMC$priorval,
c(ifelse(exists("i", envir = mcmc_env), mcmc_env$i, NA), priorval)
)
sum(dnorm(D, x[1], x[2], log = TRUE)) + sum(priorval)
}
# Calling MCMC, but first, loading the coda R package for
# diagnostics
library(coda)
ans <- MCMC(
fun, initial = c(mu=1, sigma=1), nsteps = 2e3,
kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0)
)
head(last_("priorval"))
#> mu sigma
#> [1,] NA -1.418939 -2.918939
#> [2,] 2 -1.429507 -2.657561
#> [3,] 3 -1.429036 -2.526366
#> [4,] 4 -1.547106 -2.339950
#> [5,] 5 -1.688518 -2.572419
#> [6,] 6 -1.786716 -2.669918

Created on 2021-06-24 by the reprex package (v2.0.0)

If you run the chain using burnin = 0 and thin = 1:

ans <- MCMC(
fun, initial = c(mu=1, sigma=1), nsteps = 2e3,
kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0),
burnin = 0,
thin = 1
)
nrow(last_("priorval"))
2002
nrow(ans)
2000

THIS IS SOMETHING I HAVE FIXED ON THE DEVELOPMENT VERSION.

LAST_MCMC only works with multicore = FALSE

Yes, that is true. I need to figure out a way to solve that. Perhaps, just like in the example below, you can create the cluster yourself so that each node keeps its own `LAST_MCMC`:

library(fmcmc)
set.seed(1231)
n <- 1e3
pars <- c(mean = 2.6, sd = 3)
# Generating data and writing the log likelihood function
D <- rnorm(n, pars[1], pars[2])
fun <- function(x) {
# Getting the calling frame. This function is wrapped by `MCMC`, so we need
# to go two levels up to take a look at the environment where `MCMC` is calling
# the function.
mcmc_env <- parent.frame(2) # This should contain -i-, the step number
# Computing the prior, and storing it in LAST_MCMC
priorval <- dnorm(x, c(2, 3), log = TRUE)
LAST_MCMC$priorval <- rbind(
LAST_MCMC$priorval,
c(ifelse(exists("i", envir = mcmc_env), mcmc_env$i, NA), priorval)
)
sum(dnorm(D, x[1], x[2], log = TRUE)) + sum(priorval)
}
# Calling MCMC, but first, loading the coda R package for
# diagnostics
library(coda)
library(parallel)
cl <- makePSOCKcluster(2)
clusterEvalQ(cl, library(fmcmc)) # to create the LAST_MCMC obj on each node
#> [[1]]
#> [1] "fmcmc" "stats" "graphics" "grDevices" "utils" "datasets"
#> [7] "methods" "base"
#>
#> [[2]]
#> [1] "fmcmc" "stats" "graphics" "grDevices" "utils" "datasets"
#> [7] "methods" "base"
clusterExport(cl, c("D"))
ans <- MCMC(
fun, initial = c(mu=1, sigma=1), nsteps = 2e3,
kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0),
burnin = 0,
thin = 1,
# Running multicore with my own cluster
multicore = TRUE,
cl = cl,
nchains = 2
)
#> Warning: While using multiple chains, a single initial point has been passed via
#> `initial`: c(1, 1). The values will be recycled. Ideally you would want to start
#> each chain from different locations.
clusterEvalQ(cl, head(last_("priorval")))
#> [[1]]
#> mu sigma
#> [1,] NA -1.418939 -2.918939
#> [2,] NA -1.418939 -2.918939
#> [3,] 1 -1.373670 -2.662039
#> [4,] 2 -1.426267 -3.076379
#> [5,] 3 -1.304141 -2.593316
#> [6,] 4 -1.275321 -2.866264
#>
#> [[2]]
#> mu sigma
#> [1,] NA -1.418939 -2.918939
#> [2,] NA -1.418939 -2.918939
#> [3,] 1 -1.193565 -2.714025
#> [4,] 2 -1.194003 -2.840829
#> [5,] 3 -1.126817 -2.769653
#> [6,] 4 -1.095577 -2.511297
stopCluster(cl)

Created on 2021-06-24 by the reprex package (v2.0.0)

Things become fuzzy when nchains > 1

This is very related to the previous question. Again, this could also be hacked with a user-created cluster, as in the previous example.

Flush things in LAST_MCMC

That is a great suggestion; I can certainly add a function for that.
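In the meantime, a user-level flush can simply wrap the reset shown earlier (a sketch; `flush_last_mcmc` is a hypothetical name, and the entries are whatever your objective function stored):

flush_last_mcmc <- function() {
  # Reset the entries appended by the objective function
  LAST_MCMC$prior    <- NULL
  LAST_MCMC$priorval <- NULL
  LAST_MCMC$loglik   <- NULL
  invisible(NULL)
}

HIH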
Hi, thank you for the detailed answers.
I think this could be made slightly more accessible, i.e. without requiring the parent-frame hack. Since you are saving things in `LAST_MCMC` anyway, the package itself could keep track of the iteration index.
If those initial two samples are for some reason important, I have no objections. I just want to be able to merge generated variables with posterior draws easily (and indexing via the iteration number would make that straightforward).
Again, the more the package resembles its "senior brothers" Stan and JAGS, the better. Mimicking the structure of the Stan fit object would be a good start.
This whole "make your own cluster" trick is a thing of the past; ideally the package would handle the cluster setup itself.
And my proposal would cover the same: keep the "chain id" alongside the "draw id".
Not sure I like the name, but OK. Why can't you flush automatically every time `MCMC()` is called?
I think we need a non-pure function for this.
Still haven't figured out the right names, but I think this is close to being closed (pun intended). See this new vignette that will be included in version 0.5-0: https://uscbiostats.github.io/fmcmc/articles/advanced-features.html#saving-information-as-it-happens-1
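Based on that vignette, the interface looks roughly like this (a sketch assuming the 0.5-0 `set_userdata()`/`get_userdata()` API; the column names are illustrative):

library(fmcmc)
set.seed(1231)
D <- rnorm(1e3, 2.6, 3)
fun <- function(x) {
  priorval <- dnorm(x, c(2, 3), log = TRUE)
  # Record per-iteration quantities; the package stores one row per step
  set_userdata(prior_mu = priorval[1], prior_sigma = priorval[2])
  sum(dnorm(D, x[1], x[2], log = TRUE)) + sum(priorval)
}
ans <- MCMC(
  fun, initial = c(mu = 1, sigma = 1), nsteps = 2e3,
  kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0)
)
head(get_userdata())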
Ok. I created a user function which I find useful and which I would love to see in fmcmc:

#' Add userdata to the fmcmc output
#'
#' Combines list of dataframes produced by fmcmc::get_userdata()
#' with mcmc.list() produced by fmcmc::MCMC ensuring that the
#' chains and iterations match
#'
#' @param x mcmc.list to add userdata to. Note that userdata
#' is taken from the environment (whatever fmcmc::get_userdata()
#' returns)
#'
#' @return combined mcmc.list
#' @export
#' @importFrom coda mcpar mcmc as.mcmc.list
#' @examples
add_userdata <- function(x){
ud <- fmcmc::get_userdata()
stopifnot("Number of chains do not match"=
(length(x)==length(ud)))
res <- mapply(function(x,y){
iters_y <- as.numeric(rownames(y)) # userdata rows are named by iteration
min_iters_y <- min(iters_y)
max_iters_y <- max(iters_y)
mcpar_x <- coda::mcpar(x)
mcpar_y <- c(min_iters_y, max_iters_y, 1)
stopifnot("Iterations are not matching"=
all.equal(mcpar_x, mcpar_y))
my <- as.matrix(y)
rownames(my) <- iters_y # keep the iteration numbers as row names
coda::mcmc(cbind(x,my), start = min_iters_y, end = max_iters_y)
},x=x, y=ud, SIMPLIFY = FALSE)
coda::as.mcmc.list(res)
}

As seen from the documentation, this function merges the userdata and the results from sampling into a single object.
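A usage sketch (assuming userdata is recorded via `set_userdata()` during sampling, and that the two chains run serially so `get_userdata()` is available in the session):

library(fmcmc)
library(coda)
set.seed(1231)
D <- rnorm(1e3, 2.6, 3)
fun <- function(x) {
  set_userdata(sigma_log = log(x[2])) # a derived (transformed) parameter
  sum(dnorm(D, x[1], x[2], log = TRUE))
}
ans <- MCMC(
  fun,
  initial = rbind(c(mu = 1, sigma = 1), c(mu = 3, sigma = 2)),
  nsteps = 2e3, nchains = 2,
  kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0)
)
ans2 <- add_userdata(ans) # mcmc.list now carrying the sigma_log column
summary(ans2)

What do you think?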
Thanks for the suggestion! I am not entirely sure about the name, but I think we could include it in the next version. Would you be able to make a pull request with this, including a test? LMK if you need help with it.
Awesome package. I switched to using fmcmc and love it! Really streamlined interface, burgeoning functionality, stellar performance.

There's one feature I miss from adaptMCMC, and that is the ability to return multiple things from the log-density function (see the description of the function argument in adaptMCMC::MCMC()).

I understand that parsing the returned list will cost performance, but it could enable more interesting output from MCMC and, in particular, allow users to transform parameters (similar to the transformed parameters block in Stan).
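To make the request concrete, the objective function under such an interface might look like this (purely illustrative; this is neither fmcmc's nor adaptMCMC's actual API, and `D` stands for the observed data):

fun <- function(x) {
  ll <- sum(dnorm(D, x[1], x[2], log = TRUE))
  # First element: the log density used by the sampler;
  # remaining elements: derived quantities stored with the draws
  list(ll, sigma_log = log(x[2]))
}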