Probability theory
Random variables are usually written in upper case roman letters: X, Y, etc.
Particular realizations of a random variable are written in corresponding lower case letters. For example, {\displaystyle x_{1},x_{2},\ldots ,x_{n}} could be a sample corresponding to the random variable X. A cumulative probability is formally written {\displaystyle P(X\leq x)} to differentiate the random variable from its realization.
The probability measure is sometimes written {\displaystyle \mathbb {P} } to distinguish it from other functions and from a generic measure P, which avoids having to state “P is a probability”. {\displaystyle \mathbb {P} (X\in A)} is short for {\displaystyle P(\{\omega \in \Omega :X(\omega )\in A\})}, where {\displaystyle \Omega } is the sample space and {\displaystyle X(\omega )} is the value that the random variable X takes at the outcome ω. The notation {\displaystyle \Pr(A)} is used as an alternative.
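The shorthand for the probability that X lies in A can be made concrete on a small finite sample space. The following is only an illustrative sketch: the two-dice sample space, the uniform measure, and the names `omega_space`, `prob`, `X` and `A` are assumptions chosen for the example, not part of the notation itself.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space Omega: ordered outcomes of two fair six-sided dice.
omega_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform measure on omega_space."""
    return Fraction(len(event), len(omega_space))

def X(omega):
    """A random variable: the sum of the two dice for outcome omega."""
    return omega[0] + omega[1]

# A is a set of values of X, not a set of outcomes.
A = {7, 11}

# The event {omega in Omega : X(omega) in A}, written out explicitly.
preimage = [omega for omega in omega_space if X(omega) in A]

# P(X in A) is by definition the probability of that event.
p_X_in_A = prob(preimage)
print(p_X_in_A)  # 2/9
```

Here 8 of the 36 outcomes have a sum of 7 or 11, so the probability is 8/36 = 2/9.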
{\displaystyle \mathbb {P} (A\cap B)} or {\displaystyle \mathbb {P} [B\cap A]} indicates the probability that events A and B both occur. The joint probability distribution of random variables X and Y is denoted as {\displaystyle P(X,Y)}, while the joint probability mass function or probability density function is denoted as {\displaystyle f(x,y)} and the joint cumulative distribution function as {\displaystyle F(x,y)}.
{\displaystyle \mathbb {P} (A\cup B)} or {\displaystyle \mathbb {P} [B\cup A]} indicates the probability of either event A or event B occurring (“or” in this case means one or the other or both).
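The intersection and union notation maps directly onto set operations when events are represented as sets of outcomes. The sketch below assumes a two-dice sample space with a uniform measure; the events `A` and `B` are hypothetical choices made only for illustration. It also checks the standard inclusion-exclusion identity for the union.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered outcomes of two fair six-sided dice.
omega_space = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform measure on omega_space."""
    return Fraction(len(event), len(omega_space))

# Hypothetical events, each a subset of the sample space.
A = {w for w in omega_space if w[0] % 2 == 0}      # first die is even
B = {w for w in omega_space if w[0] + w[1] >= 10}  # sum is at least 10

p_a = prob(A)
p_b = prob(B)
p_both = prob(A & B)    # P(A ∩ B): both events occur
p_either = prob(A | B)  # P(A ∪ B): one or the other or both

# Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
print(p_either == p_a + p_b - p_both)  # True
```

Because intersection and union are symmetric, swapping the order of A and B gives the same probabilities, which is why both orderings appear in the notation.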
σ-algebras are usually written in uppercase calligraphic letters (e.g. {\displaystyle {\mathcal {F}}} for the set of sets on which we define the probability P).
Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. {\displaystyle f(x)} or {\displaystyle f_{X}(x)}.
Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. {\displaystyle F(x)} or {\displaystyle F_{X}(x)}.
Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative: {\displaystyle {\overline {F}}(x)=1-F(x)}, or denoted as {\displaystyle S(x)}.
In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
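As a sketch of how the pdf, cdf and survival function relate for the standard normal distribution, the functions below use the closed forms in terms of the exponential and the error function. The Python names `phi`, `Phi` and `survival` are simply chosen to mirror φ, Φ and the overbar notation; they are not a standard API.

```python
import math

def phi(z):
    """Standard normal pdf, written phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf, written Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def survival(z):
    """Survival function 1 - Phi(z), computed with erfc for accuracy in the tail."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = 1.0
# The cdf and the survival function sum to one (up to rounding).
print(abs(Phi(z) + survival(z) - 1.0) < 1e-12)

# The pdf is the derivative of the cdf; check with a central difference.
h = 1e-5
print(abs((Phi(z + h) - Phi(z - h)) / (2 * h) - phi(z)) < 1e-8)
```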
Some common operators:
- E[X] : expected value of X
- var[X] : variance of X
- cov[X, Y] : covariance of X and Y
- X is independent of Y is often written {\displaystyle X\perp Y} or {\displaystyle X\perp \!\!\!\perp Y}, and X is independent of Y given W is often written {\displaystyle X\perp \!\!\!\perp Y\,|\,W} or {\displaystyle X\perp Y\,|\,W}
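The operators listed above can be evaluated exactly on a finite sample space. This is a minimal sketch under assumed conditions: a uniform measure on two dice, with hypothetical random variables `X` (first die), `Z` (second die) and `Y` (their sum). The function names `E`, `var` and `cov` are chosen to mirror the notation.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair dice, each outcome with probability 1/36.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

def E(f):
    """Expected value E[f] of a random variable f on the sample space."""
    return sum(f(w) * p for w in outcomes)

def var(f):
    """Variance var[f] = E[(f - E[f])^2]."""
    m = E(f)
    return E(lambda w: (f(w) - m) ** 2)

def cov(f, g):
    """Covariance cov[f, g] = E[(f - E[f]) (g - E[g])]."""
    mf, mg = E(f), E(g)
    return E(lambda w: (f(w) - mf) * (g(w) - mg))

def X(w):
    return w[0]          # first die

def Z(w):
    return w[1]          # second die

def Y(w):
    return w[0] + w[1]   # sum of both dice

print(E(X))       # 7/2
print(var(X))     # 35/12
print(cov(X, Y))  # 35/12
print(cov(X, Z))  # 0
```

In this construction X and Z are independent, which is consistent with their zero covariance. The converse does not hold in general: zero covariance alone does not establish independence.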
{\displaystyle \textstyle P(A\mid B)}, the conditional probability, is the probability of {\displaystyle \textstyle A} given {\displaystyle \textstyle B}, i.e., the probability of {\displaystyle \textstyle A} after {\displaystyle \textstyle B} is observed.
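For events with nonzero probability, the conditional probability is usually computed as the probability of the intersection divided by the probability of the conditioning event. The sketch below assumes a two-dice sample space with a uniform measure; the events `A` and `B` and the helper `conditional` are hypothetical names used only for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered outcomes of two fair six-sided dice.
omega_space = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform measure on omega_space."""
    return Fraction(len(event), len(omega_space))

def conditional(a, b):
    """P(a | b) = P(a ∩ b) / P(b), defined only when P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(a & b) / prob(b)

A = {w for w in omega_space if w[0] + w[1] == 8}  # sum equals 8
B = {w for w in omega_space if w[0] % 2 == 0}     # first die is even

print(prob(A))            # 5/36
print(conditional(A, B))  # 1/6
```

Observing B raises the probability of A from 5/36 to 1/6 in this example, because three of the five outcomes with sum 8 have an even first die.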
Statistics
Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
A tilde (~) denotes “has the probability distribution of”.
Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., {\displaystyle {\widehat {\theta }}} is an estimator for {\displaystyle \theta }.
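To illustrate the tilde and hat notation together, the sketch below draws a sample from a normal distribution with assumed parameters and computes estimates of them. The parameter values, the seed, the sample size and the names `mu_hat` and `sigma2_hat` are all arbitrary choices for the example. In practice the true parameters are unknown and only the estimates are available.

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible

# Assumed "true" parameters: X ~ N(mu, sigma^2).
mu, sigma = 3.0, 2.0
n = 10_000

# x_1, ..., x_n: realizations of X.
sample = [rng.gauss(mu, sigma) for _ in range(n)]

# mu_hat estimates mu (sample mean).
mu_hat = sum(sample) / n

# sigma2_hat estimates sigma^2 (sample variance with n - 1 denominator).
sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / (n - 1)

print(mu_hat, sigma2_hat)
```

With a sample this large, the estimates should land close to the assumed values 3.0 and 4.0, though any single run differs from them by sampling error.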
The arithmetic mean of a series of values {\displaystyle x_{1},x_{2},\ldots ,x_{n}} is often denoted by placing an “overbar” over the symbol, e.g. {\displaystyle {\bar {x}}}, pronounced “x bar”.
Some commonly used symbols for sample statistics are given below:
- the sample mean {\displaystyle {\bar {x}}},
- the sample variance {\displaystyle s^{2}},
- the sample standard deviation s,
- the sample correlation coefficient r,
- the sample cumulants {\displaystyle k_{r}}.
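Several of the sample statistics above can be computed with Python's standard library. This is a sketch on made-up data: the lists `x` and `y` are illustrative values only, and `sample_correlation` is a helper written for this example rather than a library function.

```python
import math
import statistics

# Made-up paired observations, purely for illustration.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 9.0]

x_bar = statistics.mean(x)   # sample mean
s2 = statistics.variance(x)  # sample variance s^2 (n - 1 denominator)
s = statistics.stdev(x)      # sample standard deviation s

def sample_correlation(a, b):
    """Sample correlation coefficient r of two equal-length samples."""
    a_bar, b_bar = statistics.mean(a), statistics.mean(b)
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)

r = sample_correlation(x, y)
print(x_bar, s2, s, r)
```

The sample cumulants are omitted here, since the standard library has no direct helper for them.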
Some commonly used symbols for population parameters are given below:
- the population mean μ,
- the population variance {\displaystyle \sigma ^{2}},
- the population standard deviation σ,
- the population correlation ρ,
- the population cumulants {\displaystyle \kappa _{r}}.
{\displaystyle x_{(k)}} is used for the {\displaystyle k^{\text{th}}} order statistic, where {\displaystyle x_{(1)}} is the sample minimum and {\displaystyle x_{(n)}} is the sample maximum from a total sample size n.
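Order statistics are the sample values sorted into ascending order, so the k-th order statistic can be sketched as a lookup in the sorted sample. The helper name `order_statistic` and the data are hypothetical. Note that k is 1-based in the notation, while Python lists are 0-based.

```python
def order_statistic(sample, k):
    """Return x_(k), the k-th smallest value of the sample (k is 1-based)."""
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return sorted(sample)[k - 1]

# Made-up sample, purely for illustration.
sample = [3.1, 0.4, 2.7, 5.9, 1.8]
n = len(sample)

x_min = order_statistic(sample, 1)  # x_(1): sample minimum
x_max = order_statistic(sample, n)  # x_(n): sample maximum

print(x_min, x_max)  # 0.4 5.9
```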