diff --git a/R/pglmm-utils.R b/R/pglmm-utils.R index c821fbd..0f0ca71 100755 --- a/R/pglmm-utils.R +++ b/R/pglmm-utils.R @@ -1291,10 +1291,7 @@ predict.communityPGLMM <- function(object, newdata = NULL, ...) { #' @export #' simulate.communityPGLMM <- function(object, nsim = 1, seed = NULL, - use.u = FALSE, re.form = NULL, ...) { - if(use.u & object$bayes) { - stop("Sorry, simulate.communityPGLMM currently doesn't support use.u = TRUE, but we are working on it!") - } + re.form = NULL, ...) { if(!is.null(seed)) set.seed(seed) #sim <- INLA::inla.posterior.sample(nsim, object$inla.model) @@ -1304,14 +1301,14 @@ simulate.communityPGLMM <- function(object, nsim = 1, seed = NULL, # for gaussion, binomial, and poisson distributions. nn <- nrow(object$iV) - if(is.null(re.form) | use.u){ + if(is.null(re.form)){ sim <- pglmm_predicted_values(object, re.form = NULL, type = "link")$Y_hat sim <- sim %*% matrix(1, 1, nsim) if(object$family == "gaussian") sim <- sim + sqrt(object$s2resid) * matrix(rnorm(nsim * nn), nrow = nn) } else { re.form = deparse(NA) - if(deparse(re.form) == "~0" | deparse(re.form) == "NA" | !use.u){ + if(deparse(re.form) == "~0" | deparse(re.form) == "NA"){ # condition on none of the random effects sim <- (object$X %*% object$B) %*% matrix(1, 1, nsim) chol.V <- backsolve(chol(object$iV), diag(nn)) @@ -1332,7 +1329,7 @@ simulate.communityPGLMM <- function(object, nsim = 1, seed = NULL, sim <- apply(mu_sim, MARGIN = 2, FUN = function(x) rbinom(length(x), Ntrials, x)) } } else { # beyes version - if(deparse(re.form) == "~0" | deparse(re.form) == "NA" | !use.u) + if(deparse(re.form) == "~0" | deparse(re.form) == "NA") warning("re.form = NULL is the only option for bayes models at this moment", immediate. = TRUE) mu_sim <- do.call(rbind, lapply(object$inla.model$marginals.fitted.values, diff --git a/README.Rmd b/README.Rmd index a84bbb4..84f4a91 100644 --- a/README.Rmd +++ b/README.Rmd @@ -23,7 +23,7 @@ To install this package: devtools::install_github("daijiang/phyr") # or install from binary file (may not be the latest version) # macOS -install.packages("https://raw.githubusercontent.com/daijiang/phyr/master/phyr_0.1.6.tgz", repos = NULL) +install.packages("https://raw.githubusercontent.com/daijiang/phyr/master/phyr_1.0.3.tgz", repos = NULL) # Windows install.packages("https://raw.githubusercontent.com/daijiang/phyr/master/phyr_0.1.6.zip", repos = NULL) ``` diff --git a/README.md b/README.md index 2bda332..6e1ce0f 100644 --- a/README.md +++ b/README.md @@ -14,7 +14,7 @@ To install this package: devtools::install_github("daijiang/phyr") # or install from binary file (may not be the latest version) # macOS -install.packages("https://raw.githubusercontent.com/daijiang/phyr/master/phyr_0.1.6.tgz", repos = NULL) +install.packages("https://raw.githubusercontent.com/daijiang/phyr/master/phyr_1.0.3.tgz", repos = NULL) # Windows install.packages("https://raw.githubusercontent.com/daijiang/phyr/master/phyr_0.1.6.zip", repos = NULL) ``` diff --git a/docs/articles/benchmarks.html b/docs/articles/benchmarks.html index ed3553f..5318809 100644 --- a/docs/articles/benchmarks.html +++ b/docs/articles/benchmarks.html @@ -94,7 +94,7 @@
vignettes/benchmarks.Rmd
benchmarks.Rmd
vignettes/pglmm.Rmd
pglmm.Rmd
“pc.prior.auto” is a good choice to generate priors for binomial models. If you are interested in the details of this kind of prior (known as a penalizing complexity prior), check out this paper: https://arxiv.org/abs/1403.4630.
The results of this model are consistent with the ML model, which is good to see. Now we also have lower and upper credible intervals. We can look at the full approximate marginal posterior distributions of the random effects and fixed effects with the plot_bayes
function.
What we are looking for is that the posterior distribution mode is well away from zero, and that it looks relatively symmetrical. If it were skewed and crammed up against the left side of the plot, near zero, we would conclude that the effect is weak (remembering that variance components cannot be less than or equal zero, so there will always be some positive probability mass). The most obvious effects (well away from zero) are again the species random effect, the species-by-disturbance random effect, and the nested phylogenetic effect. In this plot, the 95% credible interval is also plotted, along with the posterior mean (the point and bar at the bottom of each density). For the random effects these aren’t too meaningful, but they can help distinguish genuine effects in the fixed effects. If these credible intervals overlap zero in the fixed effects, the density will be plotted with a lighter color to suggest it is not a strong effect (although this is not relevant for this dataset, because both fixed effects are strong).
-mod_bayes
as an exampleThe next thing we might want to check is whether assumptions of the model are met by the data. The typical way to do this is by plotting and/or analyzing the model residuals. In non-Gaussian models such as this one, this can be less straightforward. However, phyr
output supports the DHARMa
package, which can generated a generalized type of residual known as randomized quantile residuals (or sometimes Dunn-Smyth residuals). These can be calculated and inspected for nearly any error distribution. We can produce standard diagnostic plots for our pglmm
model by simply calling DHARMa::simulateResiduals
on the model object.
resids <- DHARMa::simulateResiduals(mod_bayes, plot = FALSE) @@ -306,13 +306,20 @@DHARMa::plotResiduals(resids, mod_bayes$data$disturbance)
Obviously no problems there.
+Obviously no problems there. This is also the case for the model fitted by Maximum Likelihood method (mod
).
+resids_mod <- DHARMa::simulateResiduals(mod, plot = FALSE) +#> Warning in checkModel(fittedModel): DHARMa: fittedModel not in class of +#> supported models. Absolutely no guarantee that this will work! +plot(resids_mod) +
Another question about our model is simply how well it fits the data. A metric appealing in its simplicity is the classic R2 metric, which purports to estimate the proportion of the variance in the response explained by the model predictors. Once again, though, when using non-Gaussian errors this can become a bit tricky. An additional complication is created by model with random effects. Given that random effects are very flexible model components (for example, nothing stops you from fitting a random effect for each observation in your dataset), a straight-up calculation of variance explains isn’t meaningful. That said, methods that can produce a useful R2 metric in the complex situation have been developed. The package rr2
is able to calculate several flavors of R2, and supports phyr
’s pglmm
model object. Let’s try it!
+rr2::R2(mod) #> Models of class pglmm do not have a R2_resid method. #> R2_lik R2_pred @@ -321,7 +328,7 @@#> Models of class pglmm do not have a R2_resid method. #> Models of class pglmm with bayes = TRUE do not have a R2_lik method. #> R2_pred -#> 0.4577072 +#> 0.4577102
There we go! R2 = 0.44 for
mod
and R2 = 0.46 formod_bayes
! We can think of this as saying roughly 44% (or 46%) of our response’s variance has been explained by our model, taking into account the various sources of covariance we modeled, subject to a boatload of assumption and caveats of course. That’s handy! See Ives (2018) for the full description of how this R2 works.
Assessments of predictive accuracy of Species Distribution Models are generally based on confusion matrices that record the numbers of true positive, false positive, true negative, and false negative. Such matrices are straightforward to construct for models providing presence-absence predictions, which is true for most SDMs. Commonly used measures for SDMs include Specificity (true negative rate), Sensitivity (true positive rate), True Skill Statistic (TSS = Sensitivity + Specificity - 1), and area under the receiver operating characteristic (ROC) curve (AUC) (Allouche et al. 2006). Combining the observed values with the predictions generated by the pglmm_predicted_values()
function, we can calculate such measures to evaluate the performance of our models.
+pred_mod = phyr::pglmm_predicted_values(mod)$Y_hat obs_mod = mod$data$pres roc_mod <- pROC::roc(obs_mod, pred_mod, percent = T, direction = "<", levels = c(0,1)) @@ -345,7 +352,7 @@
We can see that the AUC is pretty high (0.9273677) with the model fitted with maximum likelihood framework (
-mod
). True positive rates and true negative rates are also both reasonably high. What about the model fitted with Bayesian framework (mod_bayes
)? Not surprising, the results are similar to those ofmod
.+pred_mod_bayes = phyr::pglmm_predicted_values(mod_bayes)$Y_hat obs_mod_bayes = mod_bayes$data$pres roc_mod_bayes <- pROC::roc(obs_mod_bayes, pred_mod_bayes, percent = T, @@ -363,7 +370,7 @@
Now let’s compare the model that does not account for phylogenetic relationships, that is, the regular joint species distribution models.
-+mod_bayes_no_phy <- phyr::pglmm(pres ~ disturbance + (1 | sp) + (1 | site) + (disturbance | sp), data = oldfield$data, diff --git a/docs/articles/phyr_example_empirical_files/figure-html/call_dharma-1.png b/docs/articles/phyr_example_empirical_files/figure-html/call_dharma-1.png index feb1c06..11de495 100644 Binary files a/docs/articles/phyr_example_empirical_files/figure-html/call_dharma-1.png and b/docs/articles/phyr_example_empirical_files/figure-html/call_dharma-1.png differ diff --git a/docs/articles/phyr_example_empirical_files/figure-html/call_dharma_mod-1.png b/docs/articles/phyr_example_empirical_files/figure-html/call_dharma_mod-1.png new file mode 100644 index 0000000..9015ab9 Binary files /dev/null and b/docs/articles/phyr_example_empirical_files/figure-html/call_dharma_mod-1.png differ diff --git a/docs/articles/phyr_example_empirical_files/figure-html/perfBayes-1.png b/docs/articles/phyr_example_empirical_files/figure-html/perfBayes-1.png index 0f3b2a4..56f4553 100644 Binary files a/docs/articles/phyr_example_empirical_files/figure-html/perfBayes-1.png and b/docs/articles/phyr_example_empirical_files/figure-html/perfBayes-1.png differ diff --git a/docs/articles/phyr_example_empirical_files/figure-html/plot_bayes-1.png b/docs/articles/phyr_example_empirical_files/figure-html/plot_bayes-1.png index 9f45ece..ef8c3f2 100644 Binary files a/docs/articles/phyr_example_empirical_files/figure-html/plot_bayes-1.png and b/docs/articles/phyr_example_empirical_files/figure-html/plot_bayes-1.png differ diff --git a/docs/articles/plot-re.html b/docs/articles/plot-re.html index 7b2ec36..43e65c4 100644 --- a/docs/articles/plot-re.html +++ b/docs/articles/plot-re.html @@ -94,7 +94,7 @@Plot random terms of communityPGLMM
Daijiang Li
-2020-07-15
+2020-07-18
Source:vignettes/plot-re.Rmd
diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml index 092d715..4827779 100644 --- a/docs/pkgdown.yml +++ b/docs/pkgdown.yml @@ -2,5 +2,5 @@ pandoc: 2.7.3 pkgdown: 1.5.1.9000 pkgdown_sha: caf7cc74e008e8d0d6e53eeda93cd3cad3545bf2 articles: [] -last_built: 2020-07-16T01:00Z +last_built: 2020-07-18T19:08Z diff --git a/docs/reference/fixef.html b/docs/reference/fixef.html index 1776580..6ebcb9f 100644 --- a/docs/reference/fixef.html +++ b/docs/reference/fixef.html @@ -142,7 +142,7 @@plot-re.Rmd
Extract fixed-effects estimates
# S3 method for communityPGLMM -fixef(object, ...)+fixef(object, ...)Arguments
diff --git a/docs/reference/pglmm-plot-data.html b/docs/reference/pglmm-plot-data.html index 0e04256..22275f4 100644 --- a/docs/reference/pglmm-plot-data.html +++ b/docs/reference/pglmm-plot-data.html @@ -143,7 +143,7 @@
Plot the original dataset and predicted values (optional)
function requires the packagesggplot2
andggridges
to be installed. -plot_data( +plot_data( x, sp.var = "sp", site.var = "site", @@ -155,7 +155,7 @@+plot_bayes(x, n_samp = 1000, sort = TRUE, ...)Plot the original dataset and predicted values (optional)
) # S3 method for communityPGLMM -plot_bayes(x, n_samp = 1000, sort = TRUE, ...)Arguments
diff --git a/docs/reference/pglmm-plot-re.html b/docs/reference/pglmm-plot-re.html index 1617740..10dcea8 100644 --- a/docs/reference/pglmm-plot-re.html +++ b/docs/reference/pglmm-plot-re.html @@ -149,7 +149,7 @@
Visualize random terms of communityPGLMMs
save time. -pglmm_plot_ranef( +pglmm_plot_ranef( formula = NULL, data = NULL, family = "gaussian", @@ -171,7 +171,7 @@Visualize random terms of communityPGLMMs
... ) -communityPGLMM.show.re( +communityPGLMM.show.re( formula = NULL, data = NULL, family = "gaussian", @@ -193,7 +193,7 @@Visualize random terms of communityPGLMMs
... ) -pglmm_plot_re( +pglmm_plot_re( formula = NULL, data = NULL, family = "gaussian", @@ -215,7 +215,7 @@Visualize random terms of communityPGLMMs
... ) -communityPGLMM.plot.re( +communityPGLMM.plot.re( formula = NULL, data = NULL, family = "gaussian", @@ -334,7 +334,7 @@Arg
+ random.effects Optional pre-build list of random effects. If
NULL
(the default), -the functionprep_dat_pglmm
will prepare the random effects for you from the information +the functionprep_dat_pglmm
will prepare the random effects for you from the information informula
,data
, andcov_ranef
.random.effect
allows a list of pre-generated random effects terms to increase flexibility; for example, this makes it possible to construct models with both phylogenetic correlation and spatio-temporal autocorrelation. diff --git a/docs/reference/pglmm-predicted-values.html b/docs/reference/pglmm-predicted-values.html index a654579..f048b12 100644 --- a/docs/reference/pglmm-predicted-values.html +++ b/docs/reference/pglmm-predicted-values.html @@ -149,6 +149,8 @@Predicted values of PGLMM
x, cpp = TRUE, gaussian.pred = c("nearest_node", "tip_rm"), + re.form = NULL, + type = c("link", "response"), ... ) @@ -174,6 +176,17 @@Arg
+ When family is gaussian, which type of prediction to calculate? Option nearest_node will predict values to the nearest node, which is same as lme4::predict or fitted. Option tip_rm will remove the point then predict the value of this point with remaining ones.
+ +re.form ++ (formula,
NULL
, orNA
) specify which random effects to condition on when predicting. +IfNULL
, include all random effects (i.e Xb + Zu); +ifNA
or~0
, include no random effects (i.e. Xb).+ type +character string - either
"link"
, the default, or +"response"
indicating the type of prediction object returned.... diff --git a/docs/reference/pglmm.html b/docs/reference/pglmm.html index bee629e..a055023 100644 --- a/docs/reference/pglmm.html +++ b/docs/reference/pglmm.html @@ -688,7 +688,7 @@Examp pglmm(Y ~ 1 + (1|sp__) + (1|site) + (1|sp__@site), data = d, cov_ranef = list(sp = phy))
#> Linear mixed model fit by restricted maximum likelihood #> #> Call:Y ~ 1 -#> <environment: 0x7fc4a64e8190> +#> <environment: 0x7ff1b3fd2570> #> #> logLik AIC BIC #> -43.14 98.28 98.48 @@ -801,7 +801,7 @@Examp mod.f
#> Linear mixed model fit by restricted maximum likelihood #> #> Call:Y ~ 1 -#> <environment: 0x7fc4a64e8190> +#> <environment: 0x7ff1b3fd2570> #> #> logLik AIC BIC #> -680.7 1379.4 1402.5 diff --git a/docs/reference/pglmm_compare.html b/docs/reference/pglmm_compare.html index b579afb..0160487 100644 --- a/docs/reference/pglmm_compare.html +++ b/docs/reference/pglmm_compare.html @@ -465,7 +465,7 @@Examp pglmm_compare(Y ~ X1, family = "binomial", phy = phy, data = sim.dat)
#> Generalized linear mixed model for binomial data fit by restricted maximum likelihood #> #> Call:Y ~ X1 -#> <environment: 0x7fc491145850> +#> <environment: 0x7ff198f534b0> #> #> logLik AIC BIC #> -58.2 124.4 130.2 @@ -485,7 +485,7 @@Examp ape::binaryPGLMM(Y ~ X1, phy = phy, data = sim.dat)
#> #> #> Call:Y ~ X1 -#> <environment: 0x7fc491145850> +#> <environment: 0x7ff198f534b0> #> #> Random effect (phylogenetic signal s2): #> s2 Pr diff --git a/docs/reference/prep_dat_pglmm.html b/docs/reference/prep_dat_pglmm.html index 68a6f40..7819055 100644 --- a/docs/reference/prep_dat_pglmm.html +++ b/docs/reference/prep_dat_pglmm.html @@ -151,7 +151,7 @@Prepare data for
prep.re.effects = TRUE, family = "gaussian", add.obs.re = TRUE, - bayes + bayes = FALSE )pglmm
Arguments
diff --git a/docs/reference/simulate.communityPGLMM.html b/docs/reference/simulate.communityPGLMM.html index e0028f9..7dc0fa7 100644 --- a/docs/reference/simulate.communityPGLMM.html +++ b/docs/reference/simulate.communityPGLMM.html @@ -142,7 +142,7 @@Simulate from a communityPGLMM object
# S3 method for communityPGLMM -simulate(object, nsim = 1, seed = NULL, use.u = FALSE, re.form = NA, ...)+simulate(object, nsim = 1, seed = NULL, re.form = NULL, ...)Arguments
@@ -160,21 +160,11 @@
Arg
- an optional seed to be used in
set.seed
immediately before the simulation so as to generate a reproducible sample.- use.u -- (logical) if
TRUE
, generate a simulation - conditional on the current random-effects estimates; ifFALSE
- generate new Normally distributed random-effects values. (Redundant - withre.form
, which is preferred:TRUE
corresponds to -re.form = NULL
(condition on all random effects), while -FALSE
corresponds tore.form = ~0
(condition on none - of the random effects).)re.form -+ formula for random effects to condition on. If -
NULL
, condition on all random effects; ifNA
or~0
, - condition on no random effects. See Details.(formula,
NULL
, orNA
) specify which random effects to condition on when predicting. +IfNULL
, include all random effects and the conditional modes of those random effects will be included in the deterministic part of the simulation (i.e Xb + Zu); +ifNA
or~0
, include no random effects and new values will be chosen for each group based on the estimated random-effects variances (i.e. Xb + Zu * u_random).... diff --git a/man/simulate.communityPGLMM.Rd b/man/simulate.communityPGLMM.Rd index bc41e92..3e96a60 100644 --- a/man/simulate.communityPGLMM.Rd +++ b/man/simulate.communityPGLMM.Rd @@ -4,7 +4,7 @@ \alias{simulate.communityPGLMM} \title{Simulate from a communityPGLMM object} \usage{ -\method{simulate}{communityPGLMM}(object, nsim = 1, seed = NULL, use.u = FALSE, re.form = NULL, ...) +\method{simulate}{communityPGLMM}(object, nsim = 1, seed = NULL, re.form = NULL, ...) } \arguments{ \item{object}{A fitted model object with class 'communityPGLMM'.} @@ -14,14 +14,6 @@ \item{seed}{an optional seed to be used in \code{\link{set.seed}} immediately before the simulation so as to generate a reproducible sample.} -\item{use.u}{(logical) if \code{TRUE}, generate a simulation - conditional on the current random-effects estimates; if \code{FALSE} - generate new Normally distributed random-effects values. (Redundant - with \code{re.form}, which is preferred: \code{TRUE} corresponds to - \code{re.form = NULL} (condition on all random effects), while - \code{FALSE} corresponds to \code{re.form = ~0} (condition on none - of the random effects).)} - \item{re.form}{(formula, \code{NULL}, or \code{NA}) specify which random effects to condition on when predicting. If \code{NULL}, include all random effects and the conditional modes of those random effects will be included in the deterministic part of the simulation (i.e Xb + Zu); if \code{NA} or \code{~0}, include no random effects and new values will be chosen for each group based on the estimated random-effects variances (i.e. Xb + Zu * u_random).} diff --git a/phyr_0.1.6.tgz b/phyr_0.1.6.tgz deleted file mode 100644 index cd2e575..0000000 Binary files a/phyr_0.1.6.tgz and /dev/null differ diff --git a/phyr_1.0.3.tgz b/phyr_1.0.3.tgz new file mode 100644 index 0000000..9ecf211 Binary files /dev/null and b/phyr_1.0.3.tgz differ diff --git a/vignettes/phyr_example_empirical.Rmd b/vignettes/phyr_example_empirical.Rmd index b914aaa..2410e26 100644 --- a/vignettes/phyr_example_empirical.Rmd +++ b/vignettes/phyr_example_empirical.Rmd @@ -142,7 +142,7 @@ plot_bayes(mod_bayes, sort = TRUE) What we are looking for is that the posterior distribution mode is well away from zero, and that it looks relatively symmetrical. If it were skewed and crammed up against the left side of the plot, near zero, we would conclude that the effect is weak (remembering that variance components cannot be less than or equal zero, so there will always be some positive probability mass). The most obvious effects (well away from zero) are again the species random effect, the species-by-disturbance random effect, and the nested phylogenetic effect. In this plot, the 95% credible interval is also plotted, along with the posterior mean (the point and bar at the bottom of each density). For the random effects these aren't too meaningful, but they can help distinguish genuine effects in the fixed effects. If these credible intervals overlap zero in the fixed effects, the density will be plotted with a lighter color to suggest it is not a strong effect (although this is not relevant for this dataset, because both fixed effects are strong). -## Model Assumption Checks: `mod_bayes` as an example +## Model Assumption Checks The next thing we might want to check is whether assumptions of the model are met by the data. The typical way to do this is by plotting and/or analyzing the model residuals. In non-Gaussian models such as this one, this can be less straightforward. However, `phyr` output supports the `DHARMa` package, which can generated a generalized type of residual known as randomized quantile residuals (or sometimes Dunn-Smyth residuals). These can be calculated and inspected for nearly any error distribution. We can produce standard diagnostic plots for our `pglmm` model by simply calling `DHARMa::simulateResiduals` on the model object. @@ -157,7 +157,12 @@ The residual plots look pretty good, though some of the tests failed. Specifical DHARMa::plotResiduals(resids, mod_bayes$data$disturbance) ``` -Obviously no problems there. +Obviously no problems there. This is also the case for the model fitted by Maximum Likelihood method (`mod`). + +```{r call_dharma_mod, fig.width=16, fig.height=9, out.width="90%"} +resids_mod <- DHARMa::simulateResiduals(mod, plot = FALSE) +plot(resids_mod) +``` ## Goodness of Fit