Monday, August 11, 2025

Covariance



Significance of covariance & meaning of high vs low values

Covariance measures how two features vary together:

  • Positive covariance → When feature F₁ is above its mean, feature F₂ tends to also be above its mean. (They move in the same direction.)

  • Negative covariance → When F₁ is above its mean, F₂ tends to be below its mean. (They move in opposite directions.)

  • Near zero covariance → No consistent linear relationship — knowing one feature’s deviation from the mean tells you nothing about the other.
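The three cases above fall straight out of the sample covariance formula, where each term is the product of the two features' deviations from their own means:

```latex
\operatorname{cov}(F_1, F_2) = \frac{1}{n-1} \sum_{i=1}^{n} \left(f_{1,i} - \bar{F}_1\right)\left(f_{2,i} - \bar{F}_2\right)
```

When both deviations share a sign, a term is positive; when they have opposite signs, it is negative; and if the signs are unrelated, the terms cancel toward zero.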

Numerically:

  • Large magnitude (positive or negative) suggests a strong linear relationship — but covariance is scale-dependent. Multiply one feature by 10 and its covariances grow by 10 too, so magnitudes are only comparable between features on similar scales. To compare strength across feature pairs, use correlation, which normalizes covariance by the two standard deviations.

  • Small magnitude means a weak linear relationship — or simply features measured in small units.
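A quick way to see both the sign behavior and the scale dependence is to compute covariances directly. This is a minimal sketch with made-up data (the arrays `f1`, `f2`, `f3` are hypothetical, not from the post):

```python
import numpy as np

# Hypothetical features: f2 moves with f1, f3 moves against it
f1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
f2 = 2 * f1 + np.array([0.1, -0.1, 0.0, 0.1, -0.1])  # same direction
f3 = -f1                                             # opposite direction

# np.cov returns a 2x2 covariance matrix; [0, 1] is the cross term
cov_pos = np.cov(f1, f2)[0, 1]   # > 0: deviations share a sign
cov_neg = np.cov(f1, f3)[0, 1]   # < 0: deviations have opposite signs

# Scale dependence: rescaling a feature rescales the covariance,
# but correlation is unchanged because it normalizes by std devs
cov_scaled = np.cov(10 * f1, f2)[0, 1]       # 10x larger than cov_pos
corr = np.corrcoef(10 * f1, f2)[0, 1]        # same as corrcoef(f1, f2)
```

The sign of the covariance tells you the direction of co-movement; only the correlation tells you its strength in a unit-free way.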


Why covariance matters in PCA

  • The covariance matrix encodes all pairwise relationships between features.

  • If features are highly correlated (large positive or negative covariance), PCA will combine them into a principal component that captures their shared variation, so you don’t have redundancy.

  • If covariances are near zero, features are largely independent; PCA will mostly keep them separate unless variances are drastically different.


💡 Analogy
Think of the covariance matrix as a “map” of how all features move together.
The eigenvectors are “routes” through this map that maximize variance.
PCA rotates your view to look along those routes, and you project the original data (not the covariance matrix itself) into that rotated view.
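The analogy can be made concrete with a small sketch: build the covariance "map" from two strongly correlated features, find the eigenvector "routes", and project the centered data onto them. The synthetic data and the 0.9 coupling coefficient are assumptions for illustration, not anything from the post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two highly correlated features -> redundancy PCA can compress
f1 = rng.normal(size=200)
f2 = 0.9 * f1 + 0.1 * rng.normal(size=200)
X = np.column_stack([f1, f2])

# Center the data, then build the covariance matrix (the "map")
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)

# Eigenvectors of C are the "routes"; sort by descending eigenvalue
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project the original centered data (not the covariance matrix itself)
scores = Xc @ eigvecs

# Fraction of total variance captured along each route
explained = eigvals / eigvals.sum()
```

Because the two features share most of their variation, the first principal component absorbs nearly all of the variance, and the projected components are uncorrelated with each other — exactly the redundancy-removal described above.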

