Bigdata and data science by Kartheek Dachepalli: How R2 is different from RMSE, MAE

Let’s unpack this step by step. R², RMSE, and MAE are all built from the difference between actual and predicted values, but they measure different things and answer different questions.

1. R² (Coefficient of Determination) — Variance Captured

Think of your target values (y_actual) as having some spread (variance) around their mean.
If you didn’t have a model and just guessed the mean for everyone, that’s your baseline.
R² asks:

"How much better is my model compared to just guessing the mean every time?"

Formula

R^2 = 1 - \frac{SSE_{model}}{SSE_{mean}}

Where SSE is a sum of squared errors:

SSE_model = Σ(Actual − Predicted)² (errors of your model)
SSE_mean = Σ(Actual − Mean)² (errors of always predicting the mean)

Intuition

R² = 1.0 → Model perfectly predicts all values (100% of variance explained).
R² = 0.0 → Model is no better than guessing the mean.
R² < 0.0 → Model is worse than guessing the mean (ouch).
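
The three regimes above can be checked with a few lines of plain Python. This is a minimal sketch, not a library implementation: the function name r_squared and the toy values are assumptions made for illustration.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SSE_model / SSE_mean (illustrative helper, not a library call)."""
    mean = sum(actual) / len(actual)
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_mean = sum((a - mean) ** 2 for a in actual)
    # Undefined when every actual value is identical (sse_mean would be 0).
    return 1 - sse_model / sse_mean


actual = [1, 2, 3, 4, 5]  # toy targets, mean = 3

perfect = r_squared(actual, [1, 2, 3, 4, 5])    # predictions match exactly
mean_only = r_squared(actual, [3, 3, 3, 3, 3])  # always guess the mean
reversed_ = r_squared(actual, [5, 4, 3, 2, 1])  # trend is backwards

print(perfect, mean_only, reversed_)  # 1.0 0.0 -3.0
```

The reversed predictions show why R² has no lower bound: SSE_model can grow far beyond SSE_mean.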

Example in Credit Limit Prediction

Let’s say actual limits for 5 customers are:

Actual:  10k, 12k, 15k, 20k, 25k
Mean:    16.4k

Variance is the spread around 16.4k.

Case A: Terrible Model

Predicted: 16.4k for everyone (mean model) →
SSE_model = SSE_mean → R² = 0.

Case B: Decent Model

Predicted: 9k, 13k, 14k, 21k, 26k →
SSE_model = 5 (every prediction is off by 1k) and SSE_mean = 149.2 (both in k²), so R² = 1 − 5/149.2 ≈ 0.97.
This means the model explains about 97% of the variation in limits between customers.
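
To make the arithmetic for Case B reproducible, here is a short sketch that computes each piece of the formula from the five values above. Variable names are illustrative, and amounts are in thousands.

```python
actual = [10, 12, 15, 20, 25]     # actual credit limits, in thousands
predicted = [9, 13, 14, 21, 26]   # Case B predictions, in thousands

mean = sum(actual) / len(actual)                                   # 16.4
sse_mean = sum((a - mean) ** 2 for a in actual)                    # about 149.2
sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))   # 5

r2 = 1 - sse_model / sse_mean

print(f"mean={mean:.1f}  SSE_mean={sse_mean:.1f}  SSE_model={sse_model}  R2={r2:.3f}")
```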

2. RMSE & MAE — Error Magnitude

These do not compare to a baseline. They tell you how far off predictions are, on average.
RMSE penalizes large mistakes more heavily than MAE (because it squares the errors before averaging).
Both are reported in the same unit as the target, not relative to its variance.
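
In formula terms, MAE = (1/n) Σ|Actual − Predicted| and RMSE = sqrt((1/n) Σ(Actual − Predicted)²). The sketch below illustrates the squaring effect with two made-up sets of predictions that have the same total absolute error. The helper names mae and rmse and the prediction values are assumptions for illustration only.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the misses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: squares each miss before averaging."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


actual = [10, 12, 15, 20, 25]   # in thousands
steady = [12, 14, 17, 22, 27]   # hypothetical: every prediction off by 2
spiky = [10, 12, 15, 20, 35]    # hypothetical: four exact, one off by 10

print(mae(actual, steady), rmse(actual, steady))  # 2.0 2.0
print(mae(actual, spiky), rmse(actual, spiky))    # MAE is still 2.0, RMSE is about 4.47
```

Both sets of predictions have the same MAE, but the single 10k miss more than doubles the RMSE.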

Example with Same Data

If predictions are:

Actual:    10k, 12k, 15k, 20k, 25k
Predicted: 9k, 13k, 14k, 21k, 26k

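Errors (Actual − Predicted): +1k, −1k, +1k, −1k, −1k, so every absolute error is 1k.

MAE = (1k + 1k + 1k + 1k + 1k) / 5 = 1k
RMSE = sqrt((1² + 1² + 1² + 1² + 1²) / 5) = 1k
MAE and RMSE are equal here because every error has the same size.
R² = 1 − 5/149.2 ≈ 0.97, which is very high because the 1k errors are small next to the spread of the actual limits around their mean.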

Key Difference

R²: “How much of the pattern in the data did I capture?”
RMSE / MAE: “How far off am I, in the actual unit (e.g., $)?”

You can have:

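High R² but high RMSE → You’re good at ranking & trend, but still making large dollar errors. This tends to happen when the target spans a wide range.
Low R² but low RMSE → Everyone gets about the same prediction, close to average, and the dollar errors are small because the actual values are tightly clustered, but the model doesn’t capture much variation between people.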
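
Both combinations can be reproduced with small synthetic datasets. The numbers below are invented purely for illustration (amounts in thousands), and the helpers are the same plain-Python sketches of the formulas, not library functions.

```python
import math


def r_squared(actual, predicted):
    mean = sum(actual) / len(actual)
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_mean = sum((a - mean) ** 2 for a in actual)
    return 1 - sse_model / sse_mean


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Wide-ranging limits: the trend is captured, yet each miss is 20k.
wide_actual = [100, 200, 300, 400, 500]
wide_pred = [120, 180, 320, 380, 520]

# Tightly clustered limits: predicting 10k for everyone is close in dollars
# but explains none of the (small) variation between customers.
tight_actual = [10.0, 10.2, 9.8, 10.1, 9.9]
tight_pred = [10.0] * 5

print(f"wide:  R2={r_squared(wide_actual, wide_pred):.2f}  RMSE={rmse(wide_actual, wide_pred):.2f}k")
print(f"tight: R2={r_squared(tight_actual, tight_pred):.2f}  RMSE={rmse(tight_actual, tight_pred):.2f}k")
```

The first case gives R² of about 0.98 with an RMSE of 20k; the second gives R² of about 0 with an RMSE of roughly 0.14k. This is why the two kinds of metric are best reported together.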


Sunday, August 10, 2025
