Fun with Multivariate Gaussians

So I was listening to a lecture on multivariate Gaussians and thinking to myself: do I really understand them? This post contains some interesting tidbits of information that I discovered by Googling and by writing some Python code, available here: https://github.com/mariakesa/Neural-Data-Analysis/blob/master/ZebraFish/GaussianZebraFish.ipynb

The first thing that caught my attention while reading a Medium post on Gaussians (1) is that apparently you can decompose the covariance matrix using eigendecomposition. So I decided to get myself a covariance matrix: I took the covariance of 100 random cells in the Misha Ahrens whole-brain zebrafish data (3). I then eigendecomposed this covariance and ran a for loop to store the rank-one building blocks, each obtained by taking the outer product of an eigenvector with itself and multiplying by the corresponding eigenvalue. Next I built a multivariate normal distribution with mean 0 and covariance equal to a single rank-one block, sampled from it, and reduced the dimensionality of the sampled points to two using PCA. Samples from a Gaussian distribution with a rank-one covariance lay on a line. I then did the same for the sum of two rank-one blocks: those samples spread out into 2D and 3D under PCA, and I was surprised that in three PC dimensions the data didn't lie in a plane.
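Here is a minimal sketch of that procedure in Python. It uses random stand-in data instead of the zebrafish recordings (the actual analysis is in the notebook linked above), but the eigendecomposition, the rank-one building blocks, the sampling and the PCA step are the same idea.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Stand-in data: 1000 time points x 100 "cells" (the real analysis uses the zebrafish recordings)
X = rng.standard_normal((1000, 100))
cov = np.cov(X, rowvar=False)                     # 100 x 100 covariance matrix

# Eigendecomposition of the symmetric covariance: cov = V diag(w) V^T
w, V = np.linalg.eigh(cov)
w, V = w[::-1], V[:, ::-1]                        # sort eigenvalues in descending order

# Rank-one building blocks: lambda_i * v_i v_i^T
blocks = [w[i] * np.outer(V[:, i], V[:, i]) for i in range(len(w))]
assert np.allclose(sum(blocks), cov)              # the blocks sum back to the full covariance

def sample_and_project(cov_k, n=500, n_components=3):
    """Sample from N(0, cov_k) and reduce the samples with PCA."""
    samples = rng.multivariate_normal(np.zeros(cov_k.shape[0]), cov_k,
                                      size=n, check_valid='ignore')  # covariance is rank-deficient
    return PCA(n_components=n_components).fit_transform(samples)

proj_rank1 = sample_and_project(blocks[0])              # samples from a rank-one covariance
proj_rank2 = sample_and_project(blocks[0] + blocks[1])  # samples from a rank-two covariance
```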

Reading a textbook on linear algebra by the wonderful MIT professor Gilbert Strang (4), I came across a very insightful sentence on diagonalization of the covariance matrix: "Diagonalizing the covariance matrix V means finding M independent experiments as combinations of M original experiments." What a deep sentence! 
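To see Strang's sentence in action, here is a tiny self-contained check (with random correlated data standing in for the M original experiments): multiplying the data by the eigenvector matrix V gives new variables, each a combination of the originals, whose covariance is diagonal, so they are uncorrelated (and, for a Gaussian, independent).

```python
import numpy as np

rng = np.random.default_rng(1)

# Random correlated "experiments": 1000 observations of M = 5 variables
X = rng.standard_normal((1000, 5)) @ rng.standard_normal((5, 5))
w, V = np.linalg.eigh(np.cov(X, rowvar=False))

Y = X @ V                               # M new experiments, each a combination of the originals
cov_Y = np.cov(Y, rowvar=False)
print(np.allclose(cov_Y, np.diag(w)))   # True: diagonal covariance, entries are the eigenvalues
```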

Even more insightful was the paragraph from (2):
"In this case, the eigenvectors points along the axes of the ellipse that is the cross-section of the Gaussian. Each eigenvalue is the variance of the data along the direction of its corresponding eigenvector. This means the following: if all the data were to be perpendicularly projected onto a line defined by the eigenvector, the corresponding eigenvalue would be the variance of the projection. The square root of the variance is the standard deviation, and useful contours of the Gaussian are defined via the standard deviation."

Also according to (2), the matrix of eigenvectors is a rotation matrix!
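That checks out numerically too: because the covariance is symmetric, its eigenvectors can be chosen orthonormal, so the eigenvector matrix V satisfies V Vᵀ = I and has determinant ±1 (a rotation, possibly combined with a reflection). A quick sketch on a random covariance:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
cov = A @ A.T                            # a random symmetric positive semi-definite matrix
_, V = np.linalg.eigh(cov)

print(np.allclose(V @ V.T, np.eye(4)))   # True: V is orthogonal
print(np.linalg.det(V))                  # +1 -> pure rotation, -1 -> rotation plus a reflection
```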

Wow for linear algebra :-)
 
References
(4) "Linear Algebra and Learning from Data", Gilbert Strang, https://math.mit.edu/~gs/learningfromdata/
(5) Udacity course on Eigenvalues and Eigenvectors https://classroom.udacity.com/courses/ud104/
