Maximum Likelihood Estimation

The animation (generated with Cinderella) visualizes the effect of the covariance matrix of the observations on the Maximum Likelihood estimation of a parameter.

The animation shows the estimation of a single parameter θ, i.e. we assume E(y)=xθ with the design matrix X=x=[x1,x2], |x|=1, so that the estimated point xθ lies on a line (white). The estimate is based on the observed point y=[y1,y2], which does not sit on the line. The simple least squares solution xθ|I2 is the point on the line closest to y. If the two coordinates y1 and y2 have the joint covariance matrix Σ, the best point is xθ|Σ. The tangent point T of the (yellow) tangent to the ellipse (representing Σ) which is parallel to the line yields the direction from y to the optimal point.
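For reference, under a Gaussian error model with known covariance Σ the Maximum Likelihood estimate is the generalized least squares solution θ = (xᵀΣ⁻¹x)⁻¹ xᵀΣ⁻¹y. The following minimal Python sketch (not part of the animation; function names and numbers are illustrative) contrasts it with the simple least squares solution for Σ=I2:

```python
import numpy as np

def ml_estimate(x, y, Sigma):
    """Estimate theta in E(y) = x*theta with Cov(y) = Sigma (single parameter)."""
    W = np.linalg.inv(Sigma)            # weight matrix Sigma^{-1}
    return (x @ W @ y) / (x @ W @ x)    # (x' W x)^{-1} x' W y

x = np.array([1.0, 2.0])
x = x / np.linalg.norm(x)               # direction of the white line, |x| = 1
y = np.array([3.0, 1.0])                # observed point, not on the line

theta_I = ml_estimate(x, y, np.eye(2))  # simple least squares, Sigma = I2
Sigma   = np.array([[2.0, 1.2],
                    [1.2, 1.0]])        # some positive definite covariance (illustrative)
theta_S = ml_estimate(x, y, Sigma)

print("x*theta | I2    :", x * theta_I)
print("x*theta | Sigma :", x * theta_S)
```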

You may change the configuration by moving the white line at the point x or by moving the point y. The semi-axes s1 and s2 of the covariance matrix can be changed by moving the red points in the upper left corner, and the direction of the major semi-axis can be changed by rotating the red arrow at M. Changing the covariance matrix changes the estimated point. You may adapt the notation, substituting (θ,X,y) by (x,A,l).
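A small sketch of how the three ellipse parameters (semi-axes s1, s2 and the direction φ of the s1-axis) determine a covariance matrix, assuming the usual interpretation of the semi-axes as standard deviations along the principal directions (the function name is illustrative):

```python
import numpy as np

def covariance_from_ellipse(s1, s2, phi):
    """Sigma whose error ellipse has semi-axes s1, s2, the s1-axis rotated by phi (radians)."""
    R = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    return R @ np.diag([s1**2, s2**2]) @ R.T

print(covariance_from_ellipse(2.0, 1.0, np.deg2rad(30)))
```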

In general, changing the three parameters of the covariance matrix also changes the estimated point xθ|Σ. The difference from the simple least squares solution with Σ=I2 vanishes only if the major semi-axis of Σ is parallel or perpendicular to the direction x. So: rotate the ellipse such that a semi-axis is parallel to x; then changing the semi-axes s1 or s2 does not change the estimate. This is the result of Rao's lemma 5a (1967, p. 10).
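This can be checked numerically, continuing the sketches above (reusing ml_estimate and covariance_from_ellipse; all values are illustrative). Aligning one semi-axis of the ellipse with x and then varying s1 and s2 leaves the estimate at the simple least squares value:

```python
x = np.array([1.0, 1.0]) / np.sqrt(2)   # |x| = 1
y = np.array([3.0, 0.5])
phi = np.arctan2(x[1], x[0])            # s1-axis aligned with x

theta_ls = x @ y                        # simple least squares estimate (Sigma = I2, |x| = 1)
for s1, s2 in [(3.0, 0.5), (0.2, 5.0)]:     # major axis parallel resp. perpendicular to x
    Sigma = covariance_from_ellipse(s1, s2, phi)
    theta = ml_estimate(x, y, Sigma)
    print(f"s1={s1}, s2={s2}: theta = {theta:.6f}  (LS: {theta_ls:.6f})")
```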

Changing the line to y2=y1, hence x ∝ [1,1], we can realize the general (weighted) mean of the two observations. Choosing a point on the y1-axis as observation, say y=[3,0], we can find a covariance matrix of the observations which leads to an estimated point with y2=y1 outside the interval [0,3]. However, when assuming no correlation between y1 and y2, i.e. φ=0 or φ=90°, the mean always stays within the interval [y1,y2].
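A short sketch of this effect, again continuing the snippets above (the ellipse values are illustrative): with y=[3,0], a tilted error ellipse can push the generalized mean outside [0,3], while φ=0° or 90° keeps it inside.

```python
one = np.array([1.0, 1.0])              # design vector for a common mean, x ∝ [1, 1]
y = np.array([3.0, 0.0])                # observation on the y1-axis

for phi_deg in (0.0, 90.0, 60.0):
    Sigma = covariance_from_ellipse(3.0, 0.5, np.deg2rad(phi_deg))
    mean = ml_estimate(one, y, Sigma)   # generalized weighted mean of y1 and y2
    print(f"phi = {phi_deg:5.1f} deg: estimated mean = {mean:.3f}")
# phi = 0 or 90 deg: mean stays inside [0, 3]; the tilted ellipse pushes it outside.
```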