# Suggested Notation
Joram Soch edited this page Jan 17, 2025 · 5 revisions
By nature, the proofs and definitions in "The Book of Statistical Proofs" use mathematical notation. On this page, we list recommendations on how to denote certain statistical objects, e.g. probabilities, distributions and models, organized by StatProofBook chapter. Generally speaking, StatProofBook contributors are not bound to this notation – that is, submitting at all is regarded as more desirable than submitting in the suggested form – but contributors should try to adhere to the suggested notation as closely as possible.
For more information, see the guidelines on using LaTeX and MathJax.
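For instance, a distributional assumption written in the suggested notation would appear in a proof's LaTeX/MathJax source roughly like this (the concrete symbols here are purely illustrative):

```latex
% illustrative only: a random variable X following a parametrized
% distribution, with its value range, in the suggested notation
X \sim \mathcal{N}(\mu, \sigma^2) , \quad x \in \mathcal{X} = \mathbb{R}
```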
- $E$ – single random event
- $A, B, C$ – multiple random events
- $A_1, \ldots, A_k$ – mutually exclusive random events
- $\bar{A}, \bar{B}, \bar{C}$ – complements of random events
- $X, Y, Z$ – scalar random variables, random vectors or random matrices
- $x, y, z$ – realizations or values of random variables (exception: random matrices)
- $\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ – sets of possible values of random variables
- $x \in \mathcal{X}, y \in \mathcal{Y}, z \in \mathcal{Z}$ – indexing all possible values
- $P(A), P(B), P(C)$ – axiomatic definition of probability
- $p(x), q(x)$ – (marginal) probability densities or probability masses
- $\mathrm{Pr}(X=a), \mathrm{Pr}(X \in A)$ – specific statements about random variables
- $p(x,y)$ – joint probability
- $p(x|y)$ – conditional probability
- $f_X(x)$ – probability density function (PDF) or probability mass function (PMF)
- $F_X(x)$ – cumulative distribution function (CDF)
- $Q_X(p)$ – quantile function (QF), a.k.a. inverse CDF
- $M_X(t)$ – moment-generating function (MGF)
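As an illustration (not itself part of the notation list), these functions are related by the standard identities, assuming an invertible CDF for the quantile function:

```latex
% standard relations between PDF, CDF, QF and MGF (illustrative)
F_X(x) = \int_{-\infty}^{x} f_X(z) \, \mathrm{d}z , \qquad
Q_X(p) = F_X^{-1}(p) , \qquad
M_X(t) = \mathrm{E}\left( e^{tX} \right)
```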
- $\mathrm{E}(X)$ – expected value (mean)
- $\mathrm{Var}(X)$ – variance
- $\mathrm{Skew}(X)$ – skewness
- $\mathrm{Cov}(X,Y)$ – covariance
- $\mathrm{Corr}(X,Y)$ – correlation
- $\Sigma_{XX}$ – covariance matrix
- $C_{XX}$ – correlation matrix
- $x = \left\lbrace x_1, \ldots, x_n \right\rbrace$ – sample
- $\bar{x}$ – sample mean
- $s^2, s_x^2$ – sample variance
- $\hat{s}, \hat{s}_x$ – sample skewness
- $s_{xy}$ – sample covariance
- $r_{xy}$ – sample correlation
- $S, S_{xy}$ – sample covariance matrix
- $R, R_{xy}$ – sample correlation matrix
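For reference, the sample statistics above are commonly defined as follows (one convention among several; the unbiased variance estimator is shown):

```latex
% common definitions of sample mean, sample variance and
% sample correlation (illustrative; unbiased variance shown)
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i , \qquad
s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 , \qquad
r_{xy} = \frac{s_{xy}}{s_x \, s_y}
```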
- $\mathrm{median}(X)$ – median
- $\mathrm{mode}(X)$ – mode
- $\sigma(X)$ – standard deviation
- $\mathrm{FWHM}(X)$ – full width at half maximum
- $\mathrm{min}, \mathrm{max}$ – minimum, maximum
- $\mu_n(c)$ – $n$-th moment about $c$
- $\mu_n'$ – $n$-th raw moment
- $\mu_n$ – $n$-th central moment
- $\mu_n^{*}$ – $n$-th standardized moment
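The four moment notations above are connected by the standard relations (illustrative, not part of the list itself):

```latex
% relations between moments about c, raw, central and
% standardized moments (illustrative)
\mu_n(c) = \mathrm{E}\left[ (X - c)^n \right] , \qquad
\mu_n' = \mu_n(0) , \qquad
\mu_n = \mu_n\!\left( \mathrm{E}(X) \right) , \qquad
\mu_n^{*} = \frac{\mu_n}{\sigma(X)^n}
```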
- $\mathrm{H}(X)$ – (Shannon) entropy
- $\mathrm{H}(X|Y)$ – conditional entropy
- $\mathrm{H}(X,Y)$ – joint entropy (of two random variables)
- $\mathrm{H}(P,Q)$ – cross-entropy (of two probability distributions)
- $\mathrm{h}(X)$ – differential entropy
- $\mathrm{h}(X|Y)$ – conditional differential entropy
- $\mathrm{h}(X,Y)$ – joint differential entropy (of two random variables)
- $\mathrm{h}(P,Q)$ – differential cross-entropy (of two probability distributions)
- $\mathrm{I}(X,Y)$ – mutual information
- $\mathrm{KL}[P||Q]$ – Kullback-Leibler divergence (between two probability distributions)
- $\mathrm{KL}[p(x)||q(x)]$ – Kullback-Leibler divergence (between two PMFs or PDFs)
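As an example of these symbols in use, the discrete-case definitions read (illustrative sketch):

```latex
% discrete-case KL divergence and the entropy form of
% mutual information (illustrative)
\mathrm{KL}[p(x)||q(x)] = \sum_{x \in \mathcal{X}} p(x) \, \log \frac{p(x)}{q(x)} , \qquad
\mathrm{I}(X,Y) = \mathrm{H}(X) - \mathrm{H}(X|Y)
```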
- $\lambda$ – hyper-parameters, parameters of a distribution
- $\mathcal{D}(\lambda)$ – parametrized probability distribution
- $X \sim \mathcal{D}(\lambda)$ – random variable following probability distribution
- $f_X(x) = \mathcal{D}(x; \lambda)$ – PDF or PMF of probability distribution
- $F_X(x) = \int_{-\infty}^x \mathcal{D}(z; \lambda) \, \mathrm{d}z$ – CDF of probability distribution
- $Y = \sum_{i=1}^p a_i X_i$ – linear combination of random variables
- $Y = AX + b$ – linear transformation of random variable(s)
- $\mathrm{E}(X)$ – expected value of random variable
- $\mathrm{median}(X)$ – median of random variable
- $\mathrm{mode}(X)$ – mode of random variable
- $\mathrm{Var}(X)$ – variance of random variable
- $\mathrm{Cov}(X)$ – covariance of random vector
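To show how these pieces combine, a linear transformation of a multivariate normal random vector could be stated as (illustrative example, not part of the list):

```latex
% distribution, linear transformation and resulting moments
% combined in the suggested notation (illustrative)
X \sim \mathcal{N}(\mu, \Sigma) , \quad Y = AX + b
\quad \Rightarrow \quad
\mathrm{E}(Y) = A \mu + b , \quad \mathrm{Cov}(Y) = A \Sigma A^\mathrm{T}
```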
- $\mathcal{U}(a, b)$ – discrete uniform distribution
- $\mathrm{Bern}(p)$ – Bernoulli distribution
- $\mathrm{Bin}(n, p)$ – binomial distribution
- $\mathrm{BetBin}(n,\alpha,\beta)$ – beta-binomial distribution
- $\mathrm{Poiss}(\lambda)$ – Poisson distribution
- $\mathrm{Cat}([p_1,\ldots,p_k])$ – categorical distribution
- $\mathrm{Mult}(n,[p_1,\ldots,p_k])$ – multinomial distribution
- $\mathcal{U}(a, b)$ – continuous uniform distribution
- $\mathcal{N}(\mu, \sigma^2)$ – univariate normal distribution
- $t(\nu)$ – univariate t-distribution
- $\mathrm{Gam}(a,b)$ – gamma distribution
- $\mathrm{Exp}(\lambda)$ – exponential distribution
- $\ln \mathcal{N}(\mu, \sigma^2)$ – log-normal distribution
- $\chi^2(k)$ – chi-squared distribution
- $F(d_1, d_2)$ – F-distribution
- $\mathrm{Bet}(\alpha, \beta)$ – beta distribution
- $\mathrm{Wald}(\gamma, \alpha)$ – Wald distribution
- $\mathrm{ex-Gaussian}(\mu, \sigma, \lambda)$ – ex-Gaussian distribution
- $\mathcal{N}(\mu, \Sigma)$ – multivariate normal distribution
- $t(\mu, \Sigma, \nu)$ – multivariate t-distribution
- $\mathrm{NG}(\mu, \Lambda, a, b)$ – normal-gamma distribution
- $\mathrm{Dir}(\alpha)$ – Dirichlet distribution
- $\mathcal{MN}(M, U, V)$ – matrix-normal distribution
- $\mathcal{W}(V, n)$ – Wishart distribution
- $\mathrm{NW}(M, U, V, \nu)$ – normal-Wishart distribution
- $y$ – measured data
- $m$ – generative model
- $\theta$ – model parameters
- $\lambda$ – model hyper-parameters
- $\mathcal{L}_m(\theta)$ – likelihood function
- $p(y|\theta,m)$ – likelihood function
- $\mathrm{LL}(\theta)$ – log-likelihood function
- $\hat{\theta}$ – estimated model parameters (maximum likelihood)
- $\hat{\theta}_\mathrm{MAP}$ – estimated model parameters (maximum-a-posteriori)
- $\hat{y}$ – fitted/predicted data
- $p(\theta|m)$ – prior distribution
- $p(\theta|y,m)$ – posterior distribution
- $p(y_\mathrm{new}|m)$ – prior predictive distribution
- $p(y_\mathrm{new}|y,m)$ – posterior predictive distribution
- $p(y|m)$ – marginal likelihood
- $\log p(y|m)$ – log model evidence
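Written out in this notation, Bayes' theorem ties the quantities above together (illustrative; the integral form assumes continuous parameters):

```latex
% Bayes' theorem: posterior from likelihood, prior and
% marginal likelihood, in the suggested notation (illustrative)
p(\theta|y,m) = \frac{p(y|\theta,m) \, p(\theta|m)}{p(y|m)} , \qquad
p(y|m) = \int p(y|\theta,m) \, p(\theta|m) \, \mathrm{d}\theta
```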
- $y, Y$ – univariate/multivariate measured data
- $x, X$ – single predictor/design matrix
- $\beta, B$ – univariate/multivariate regression coefficients
- $\varepsilon, E$ – univariate/multivariate noise
- $\sigma^2, \Sigma$ – noise variance/measurement covariance
- $I_n$ – noise covariance matrix (i.i.d.)
- $V$ – noise covariance matrix (not i.i.d.)
- $n$ – number of observations
- $v$ – number of measurements
- $p$ – number of regressors
- $y_i$ – $i$-th observation (univariate GLM)
- $y_{ij}$ – $i$-th observation of $j$-th measurement (multivariate GLM)
- $y_{ij}$ – $j$-th observation of $i$-th category (one-way ANOVA)
- $y_{ijk}$ – $k$-th observation of $(i,j)$-th cell (two-way ANOVA)
- $y = \left\lbrace y_1, \ldots, y_n \right\rbrace$ – data set consisting of $n$ data points
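Putting these symbols together, the univariate general linear model would be written as (illustrative sketch; $V = I_n$ in the i.i.d. case):

```latex
% the univariate general linear model in the suggested
% notation (illustrative)
y = X \beta + \varepsilon , \qquad
\varepsilon \sim \mathcal{N}(0, \sigma^2 V)
```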
- $y$ – measured data
- $m$ – generative model
- $f$ – generative model family
- $n$ – number of observations
- $k$ – number of free model parameters
-
\sigma^2
– noise variance -
\hat{\sigma}^2
– residual variance -
R^2
– coefficient of determination -
R^2_\mathrm{adj}
– adjusted coefficient of determination -
\mathrm{SNR}
– signal-to-noise ratio -
\mathrm{MLL}{m}
– maximum log-likelihood -
\mathrm{IC}{m}
– information criterion
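As one concrete instance of an information criterion, the Akaike information criterion could be expressed via the maximum log-likelihood $\mathrm{MLL}(m)$ and the number of free parameters $k$ (illustrative):

```latex
% Akaike information criterion as an example information
% criterion (illustrative)
\mathrm{AIC}(m) = -2 \, \mathrm{MLL}(m) + 2 k
```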
- $p(y|m)$ – model evidence
- $\mathrm{LME}(m)$ – log model evidence
- $\mathrm{Acc}(m)$ – (Bayesian) model accuracy (term)
- $\mathrm{Com}(m)$ – (Bayesian) model complexity (penalty)
- $m \in f$ – indexing all models in a family
- $p(y|f)$ – family evidence
- $\mathrm{LFE}(f)$ – log family evidence
- $\mathrm{BF}_{12}$ – Bayes factor
- $\mathrm{LBF}_{12}$ – log Bayes factor
- $p(m|y)$ – posterior model probability
- $p(\theta|y)$ – marginal posterior distribution
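These model-selection quantities relate to each other as follows, under the accuracy/complexity decomposition of the log model evidence used e.g. in variational Bayesian inference (illustrative sketch):

```latex
% log Bayes factor from log model evidences; log model evidence
% decomposed into accuracy and complexity (illustrative)
\mathrm{LBF}_{12} = \mathrm{LME}(m_1) - \mathrm{LME}(m_2) , \qquad
\mathrm{LME}(m) = \mathrm{Acc}(m) - \mathrm{Com}(m)
```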