Notation in probability and statistics

Probability theory

Random variables are usually written in upper-case Roman letters: X, Y, etc.

Particular realizations of a random variable are written in the corresponding lower-case letters. For example, x₁, x₂, …, xₙ could be a sample corresponding to the random variable X. A cumulative probability is formally written $P(X \leq x)$ to differentiate the random variable from its realization.

The probability is sometimes written $\mathbb{P}$ to distinguish it from other functions and from a generic measure P, which avoids having to state that “P is a probability”. $\mathbb{P}(X \in A)$ is short for $P(\{\omega \in \Omega : X(\omega) \in A\})$, where $\Omega$ is the sample space and $X(\omega)$ is the value of the random variable X at the outcome ω. The notation $\Pr(A)$ is used as an alternative.
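As a minimal sketch of how $\mathbb{P}(X \in A)$ unpacks into the probability of a set of outcomes, the Python snippet below uses a hypothetical finite example that is not part of the notation itself: a fair six-sided die as the sample space, with X mapping each face to its parity. The names `omega_space`, `P`, `X` and `prob_X_in` are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical sample space (an assumption for illustration): a fair six-sided die.
omega_space = [1, 2, 3, 4, 5, 6]

def P(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), len(omega_space))

def X(omega):
    """A random variable: maps each outcome to 1 (odd face) or 0 (even face)."""
    return omega % 2

def prob_X_in(A):
    """P(X in A), computed as P({omega in Omega : X(omega) in A})."""
    preimage = {omega for omega in omega_space if X(omega) in A}
    return P(preimage)

print(prob_X_in({1}))     # probability that the face is odd
print(prob_X_in({0, 1}))  # X always lands in {0, 1}, so this is the whole sample space
```

The point of the sketch is that the shorthand hides a preimage: the probability measure is applied to outcomes ω, never directly to values of X.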

$\mathbb{P}(A \cap B)$ or $\mathbb{P}[B \cap A]$ indicates the probability that events A and B both occur. The joint probability distribution of random variables X and Y is denoted $P(X, Y)$, the joint probability mass function or probability density function $f(x, y)$, and the joint cumulative distribution function $F(x, y)$.

$\mathbb{P}(A \cup B)$ or $\mathbb{P}[B \cup A]$ indicates the probability of either event A or event B occurring (“or” in this case means one or the other or both).

σ-algebras are usually written with upper-case calligraphic letters (e.g. $\mathcal{F}$ for the set of sets on which we define the probability P).

Probability density functions (pdfs) and probability mass functions are denoted by lower-case letters, e.g. $f(x)$ or $f_X(x)$.

Cumulative distribution functions (cdfs) are denoted by upper-case letters, e.g. $F(x)$ or $F_X(x)$.

Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative distribution function, $\overline{F}(x) = 1 - F(x)$, or are denoted $S(x)$.

In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
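To make the pdf, cdf and survival-function notation concrete, here is a short Python sketch for the standard normal distribution using only the standard library. The function names `phi`, `Phi` and `survival` are chosen here to mirror the symbols φ, Φ and S; they are illustrative, not a fixed convention.

```python
import math

def phi(z):
    """Standard normal pdf, phi(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf, Phi(z), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def survival(z):
    """Survival function S(z) = 1 - Phi(z), written with erfc to limit
    cancellation error in the upper tail."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

for z in (-1.0, 0.0, 1.96):
    print(f"z={z:5.2f}  phi={phi(z):.4f}  Phi={Phi(z):.4f}  S={survival(z):.4f}")
```

For every z the last two columns sum to 1 up to floating-point rounding, which is the identity $\overline{F}(x) = 1 - F(x)$ specialised to the standard normal.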

Some common operators:

  • X is independent of Y is often written $X \perp Y$ or $X \perp\!\!\!\perp Y$, and X is independent of Y given W is often written $X \perp\!\!\!\perp Y \mid W$ or $X \perp Y \mid W$.

  • $P(A \mid B)$, the conditional probability, is the probability of A given B, i.e., the probability of A after B is observed.
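The conditional-probability and independence notation can be illustrated with a small Python sketch. It assumes a hypothetical sample space of two fair coin flips and, to stay short, works with events rather than random variables. It relies on the standard definition P(A | B) = P(A ∩ B) / P(B) for P(B) > 0 and on the product rule P(A ∩ B) = P(A) P(B) for independent events. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space (an assumption): two fair coin flips, equally likely outcomes.
sample_space = set(product("HT", repeat=2))

def P(event):
    """Probability of an event under the uniform measure on sample_space."""
    return Fraction(len(event & sample_space), len(sample_space))

def P_given(A, B):
    """Conditional probability P(A | B) = P(A and B) / P(B), for P(B) > 0."""
    if P(B) == 0:
        raise ValueError("conditioning event has probability zero")
    return P(A & B) / P(B)

def independent(A, B):
    """Events A and B are independent when P(A and B) = P(A) * P(B)."""
    return P(A & B) == P(A) * P(B)

A = {w for w in sample_space if w[0] == "H"}  # first flip is heads
B = {w for w in sample_space if w[1] == "H"}  # second flip is heads
C = {w for w in sample_space if "H" in w}     # at least one head

print(P(A), P_given(A, B), independent(A, B))  # conditioning on B leaves P(A) unchanged
print(P(A), P_given(A, C), independent(A, C))  # conditioning on C changes P(A)
```

In the first case observing B carries no information about A, which is what the ⊥ notation expresses. In the second case the conditional probability differs from the unconditional one, so A and C are not independent.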


Statistics

Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).

A tilde (~) denotes “has the probability distribution of”.

Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., $\widehat{\theta}$ is an estimator for $\theta$.
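The relationship between a parameter θ, the "~" notation and an estimator can be sketched with a small simulation. The example below is an assumption made for illustration: it takes X₁, …, Xₙ ~ Bernoulli(θ) with θ = 0.3 and uses the sample proportion as the estimator. In real data θ would be unknown; it is fixed here only so that the estimate can be compared with it.

```python
import random

theta = 0.3             # "true" parameter; known here only because the data are simulated
rng = random.Random(0)  # fixed seed so that the run is reproducible

# X_1, ..., X_n ~ Bernoulli(theta): each draw is 1 with probability theta, else 0.
n = 10_000
sample = [1 if rng.random() < theta else 0 for _ in range(n)]

# theta_hat: the sample proportion, used here as an estimator of theta.
theta_hat = sum(sample) / n

print(f"theta = {theta}, theta_hat = {theta_hat:.4f}")
```

The estimate is a function of the sample, so a different seed gives a slightly different `theta_hat`, while `theta` stays fixed.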

The arithmetic mean of a series of values x₁, x₂, …, xₙ is often denoted by placing an “overbar” over the symbol, e.g. $\bar{x}$, pronounced “x bar”.

Some commonly used symbols for sample statistics are given below:

the sample mean $\bar{x}$,

the sample variance s²,

the sample standard deviation s,

the sample correlation coefficient r,

the sample cumulants kᵣ.
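As a hedged illustration of the first four symbols in this list, the Python sketch below computes the sample mean, variance, standard deviation and correlation coefficient for two small made-up data sets. It assumes the common n − 1 denominator for s², which is what `statistics.variance` uses, and the Pearson definition of r. The data and the helper `sample_correlation` are hypothetical.

```python
import math
import statistics

# Made-up paired observations, for illustration only.
x = [2, 4, 4, 4, 5, 5, 7, 9]
y = [1, 3, 2, 5, 4, 6, 8, 9]

x_bar = statistics.mean(x)   # sample mean
s2 = statistics.variance(x)  # sample variance s^2 (n - 1 denominator)
s = statistics.stdev(x)      # sample standard deviation s = sqrt(s^2)

def sample_correlation(xs, ys):
    """Pearson sample correlation coefficient r of two equal-length samples."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r = sample_correlation(x, y)

print(f"x_bar = {x_bar}, s^2 = {s2:.4f}, s = {s:.4f}, r = {r:.4f}")
```

The correlation is written out by hand rather than taken from a library call so that the sketch does not depend on a particular Python version.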

Some commonly used symbols for population parameters are given below:

the population mean μ,

the population variance σ²,

the population standard deviation σ,

the population correlation ρ,

the population cumulants κᵣ.

$x_{(k)}$ is used for the $k^{\text{th}}$ order statistic, where $x_{(1)}$ is the sample minimum and $x_{(n)}$ is the sample maximum from a total sample size n.
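Order statistics are simply the sample values sorted in increasing order, which the following minimal Python sketch makes explicit. The helper name `order_statistic` and the sample values are illustrative assumptions; note the shift from the 1-based index k of the notation to Python's 0-based indexing.

```python
def order_statistic(sample, k):
    """Return x_(k), the k-th smallest value of the sample (k is 1-based)."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must satisfy 1 <= k <= n")
    return sorted(sample)[k - 1]

sample = [7, 2, 9, 4, 4]  # made-up sample of size n = 5
n = len(sample)

print(order_statistic(sample, 1))  # x_(1): the sample minimum
print(order_statistic(sample, n))  # x_(n): the sample maximum
```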
