Multivariate Normal Distribution PDF
Summary
This document provides a detailed explanation of multivariate normal distributions. It covers univariate and bivariate cases, explaining the concepts and formulas involved.
Full Transcript
**Multivariate Normal Distribution**

Just as the univariate normal distribution tends to be the most important statistical distribution in univariate statistics, the multivariate normal distribution is the most important distribution in multivariate statistics. The question one might ask is, "Why is the multivariate normal distribution so important?" There are three reasons why this might be so:

1. *Mathematical Simplicity*. This distribution is relatively easy to work with, so it is easy to obtain multivariate methods based on it.
2. *Multivariate version of the Central Limit Theorem*. You might recall from the univariate course that we had a central limit theorem for the sample mean for large samples of random variables. A similar result is available in multivariate statistics: if we have a collection of random vectors $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ that are independent and identically distributed, then the sample mean vector, $\bar{\mathbf{x}}$, is approximately multivariate normally distributed for large samples.
3. Many natural phenomena may also be modeled using this distribution, just as in the univariate case.

**Univariate Normal Distributions**

Before defining the multivariate normal distribution we will visit the univariate normal distribution. A random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$ if it has the probability density function

$$\phi(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\}$$

This is the usual bell-shaped curve that you see throughout statistics. The expression contains the squared difference between the variable $x$ and its mean, $\mu$, which is minimized when $x$ is equal to $\mu$. The quantity $-\sigma^{-2}(x-\mu)^2$ therefore takes its largest value when $x$ is equal to $\mu$, and because the exponential function is monotone increasing, the normal density also takes its maximum value when $x$ is equal to $\mu$. The variance $\sigma^2$ defines the spread of the distribution about that maximum.
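
The behaviour of the univariate density described here can be checked numerically. The following is a minimal sketch, assuming the standard density $\frac{1}{\sqrt{2\pi\sigma^2}}\exp\{-(x-\mu)^2/(2\sigma^2)\}$; the function name `normal_pdf` and the chosen values of the mean and variance are illustrative assumptions, not part of the original notes.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Density of a normal distribution with mean mu and variance sigma2 at x.

    Hypothetical helper for illustration; sigma2 is the variance,
    not the standard deviation.
    """
    if sigma2 <= 0:
        raise ValueError("variance must be positive")
    exponent = -((x - mu) ** 2) / (2.0 * sigma2)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * sigma2)


mu = 1.0  # illustrative mean

# The density is largest at x = mu and symmetric about it.
peak = normal_pdf(mu, mu, 1.0)
print(peak > normal_pdf(mu + 0.5, mu, 1.0))
print(math.isclose(normal_pdf(mu - 0.5, mu, 1.0), normal_pdf(mu + 0.5, mu, 1.0)))

# A larger variance lowers the peak and puts more density far from the mean.
print(normal_pdf(mu, mu, 4.0) < peak)
print(normal_pdf(mu + 3.0, mu, 4.0) > normal_pdf(mu + 3.0, mu, 1.0))
```

Each comparison should print `True`, reflecting that the density is maximized at the mean and that the variance controls the spread about it.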
If $\sigma^2$ is large, the spread is large; if $\sigma^2$ is small, the spread is small.

As shorthand notation we may write

$$X \sim N(\mu, \sigma^2)$$

indicating that $X$ is distributed according to (denoted by the wavy symbol "tilde") a normal distribution (denoted by $N$), with mean $\mu$ and variance $\sigma^2$.

**Bivariate Normal Distribution**

To further understand the multivariate normal distribution it is helpful to look at the bivariate normal distribution, where we can draw pictures of what the distribution looks like. We have just two variables, $X_1$ and $X_2$, and these are bivariate normally distributed with mean vector components $\mu_1$ and $\mu_2$ and variance-covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$$

The variances of the two variables are on the diagonal, and the covariance between the two variables is on the off-diagonal. This covariance is equal to the correlation $\rho$ times the product of the two standard deviations. The determinant of the variance-covariance matrix is simply equal to the product of the variances times 1 minus the squared correlation:
$$|\Sigma| = \sigma_1^2\sigma_2^2(1-\rho^2)$$

The inverse of the variance-covariance matrix takes the form

$$\Sigma^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{pmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{pmatrix}$$

**Joint Probability Density Function for Bivariate Normal Distribution**

Substituting the expressions for the determinant and the inverse of the variance-covariance matrix, we obtain, after some simplification, the joint probability density function of $(X_1, X_2)$ for the bivariate normal distribution:

$$\phi(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right) + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\}$$

**Multivariate Normal Distributions**

If we have a $p \times 1$ random vector $\mathbf{X}$ that is distributed according to a multivariate normal distribution with population mean vector $\boldsymbol{\mu}$ and population variance-covariance matrix $\Sigma$, then this random vector has the joint density function

$$\phi(\mathbf{x}) = \left(\frac{1}{2\pi}\right)^{p/2} |\Sigma|^{-1/2} \exp\left\{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right\}$$

Here $|\Sigma|$ denotes the determinant of the variance-covariance matrix $\Sigma$, and $\Sigma^{-1}$ is its inverse. Again, the density takes its maximum value when the vector $\mathbf{x}$ is equal to the mean vector $\boldsymbol{\mu}$, and decreases away from that maximum.

If $p$ is equal to 2, then we have a bivariate normal distribution, and this yields a bell-shaped surface in three dimensions.

The shorthand notation, similar to the univariate version above, is

$$\mathbf{X} \sim N(\boldsymbol{\mu}, \Sigma)$$

We say that the vector $\mathbf{X}$ "is distributed as" multivariate normal with mean vector $\boldsymbol{\mu}$ and variance-covariance matrix $\Sigma$.

**Some things to note about the multivariate normal distribution:**

1. The following term appearing inside the exponent of the multivariate normal density is a quadratic form:

$$(\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})$$

2. If the variables are uncorrelated, then the variance-covariance matrix is a diagonal matrix with the variances of the individual variables on the main diagonal and zeros everywhere else:

$$\Sigma = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_p^2 \end{pmatrix}$$
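
The relationships described above for the case $p = 2$ can be checked numerically. The sketch below uses only the Python standard library and assumes the standard forms of the formulas: the determinant identity, the general density written with the quadratic form, and the bivariate closed form. All function names and parameter values are hypothetical and chosen for illustration. The final check, that the density factorizes into a product of univariate normal densities when the correlation is zero, is a well-known consequence of the diagonal case rather than something stated in the text.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Univariate normal density with mean mu and variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def bivariate_cov(sigma1, sigma2, rho):
    """2x2 variance-covariance matrix: variances on the diagonal,
    rho * sigma1 * sigma2 on the off-diagonal."""
    cov12 = rho * sigma1 * sigma2
    return [[sigma1 ** 2, cov12], [cov12, sigma2 ** 2]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix with positive determinant."""
    d = det2(m)
    if d <= 0.0:
        raise ValueError("determinant must be positive (requires |rho| < 1)")
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def quadratic_form(x, mu, cov):
    """(x - mu)' Sigma^{-1} (x - mu) for 2-dimensional x."""
    inv = inv2(cov)
    d = [x[0] - mu[0], x[1] - mu[1]]
    return sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))


def mvn_pdf_2d(x, mu, cov):
    """General multivariate normal density formula, specialized to p = 2."""
    p = 2
    norm = (2.0 * math.pi) ** (-p / 2.0) / math.sqrt(det2(cov))
    return norm * math.exp(-0.5 * quadratic_form(x, mu, cov))


def bivariate_pdf(x, mu, sigma1, sigma2, rho):
    """Bivariate closed form in terms of sigma1, sigma2 and rho."""
    z1 = (x[0] - mu[0]) / sigma1
    z2 = (x[1] - mu[1]) / sigma2
    norm = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(1.0 - rho ** 2)
    return math.exp(-(z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (2.0 * (1.0 - rho ** 2))) / norm


# Illustrative parameter values (assumptions, not from the notes).
mu = [1.0, -2.0]
sigma1, sigma2, rho = 2.0, 0.5, 0.6
cov = bivariate_cov(sigma1, sigma2, rho)
x = [2.0, -1.5]

# Determinant equals the product of the variances times (1 - rho^2).
print(math.isclose(det2(cov), sigma1 ** 2 * sigma2 ** 2 * (1.0 - rho ** 2)))
# General formula with p = 2 agrees with the bivariate closed form.
print(math.isclose(mvn_pdf_2d(x, mu, cov), bivariate_pdf(x, mu, sigma1, sigma2, rho)))
# The density is larger at the mean vector than at another point.
print(mvn_pdf_2d(mu, mu, cov) > mvn_pdf_2d(x, mu, cov))
# With rho = 0 the matrix is diagonal and the joint density factorizes.
diag_cov = bivariate_cov(sigma1, sigma2, 0.0)
product = normal_pdf(x[0], mu[0], sigma1 ** 2) * normal_pdf(x[1], mu[1], sigma2 ** 2)
print(math.isclose(mvn_pdf_2d(x, mu, diag_cov), product))
```

Each line should print `True`. For general $p$ a linear algebra library would normally replace the hand-written 2x2 determinant and inverse; they are written out here only to mirror the bivariate formulas.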