Let’s go through ROC-AUC just like we did for KS: an intuitive explanation, the formulas, and a step-by-step example using 10 observations.
What is ROC-AUC?
ROC = Receiver Operating Characteristic Curve
It plots:
- X-axis: False Positive Rate (FPR) = FP / (FP + TN)
- Y-axis: True Positive Rate (TPR) = TP / (TP + FN)
Each point on the ROC curve represents a threshold on the predicted probability.
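To make the two axes concrete, here is a minimal sketch that computes one (FPR, TPR) point for a single threshold. The function name `rates_at_threshold`, the four toy observations, and the convention that a score at or above the threshold counts as a positive prediction are all assumptions made for illustration.

```python
def rates_at_threshold(y_true, scores, threshold):
    """Return (FPR, TPR), treating score >= threshold as a positive prediction."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold  # assumed convention
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    fpr = fp / (fp + tn)  # False Positive Rate
    tpr = tp / (tp + fn)  # True Positive Rate
    return fpr, tpr


# Toy data (made up for this sketch): two positives, two negatives.
toy_actual = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.2]

fpr, tpr = rates_at_threshold(toy_actual, toy_scores, threshold=0.5)
print(fpr, tpr)  # 0.5 0.5
```

Sweeping the threshold from high to low and collecting these points is what traces out the ROC curve.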
AUC = Area Under the Curve
- AUC = Probability that the model ranks a random positive higher than a random negative
- AUC ranges from:
  - 1.0 → perfect model
  - 0.5 → random guessing
  - < 0.5 → worse than random
✅ ROC-AUC Formula (Conceptually)
There are two main interpretations:
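1. Integral of the ROC Curve:
   AUC = ∫ TPR d(FPR), with FPR running from 0 to 1
2. Rank-Based Interpretation (Used in practice):
   AUC = (correctly ranked pairs + 0.5 × tied pairs) / (P × N), where P and N are the numbers of positives and negatives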
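One common way to get the rank-based value without comparing every pair is the rank-sum (Mann-Whitney U) identity: rank all scores in ascending order, add up the ranks of the positives, and subtract P(P+1)/2. The sketch below is one possible pure-Python version, not a reference implementation; the function name `auc_from_ranks` and the toy data are assumptions, and tied scores get their average rank.

```python
def auc_from_ranks(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores that starts at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2  # number of "won" pairs
    return u / (n_pos * n_neg)


# Toy data (made up for this sketch): 3 of the 4 pairs are ranked correctly.
print(auc_from_ranks([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.2]))  # 0.75
```

Sorting costs O(n log n), which is why this form tends to be preferred over the O(P × N) pair comparison on large datasets.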
Example: 10 Observations
We'll reuse your 10 data points:
| Obs | Actual (Y) | Predicted Score | 
|---|---|---|
| 1 | 1 | 0.95 | 
| 2 | 0 | 0.90 | 
| 3 | 1 | 0.85 | 
| 4 | 0 | 0.80 | 
| 5 | 0 | 0.70 | 
| 6 | 1 | 0.60 | 
| 7 | 0 | 0.40 | 
| 8 | 0 | 0.30 | 
| 9 | 1 | 0.20 | 
| 10 | 0 | 0.10 | 
- Total Positives (P) = 4
- Total Negatives (N) = 6
Step-by-Step: Rank-Based AUC Calculation
Let’s form every (positive, negative) score pair and score each one:
- Positive score > Negative score → Correct (1 credit)
- Positive score == Negative score → 0.5 credit
- Positive score < Negative score → Wrong (0 credit)
Step 1: List All Positive-Negative Pairs
Positive scores: 0.95, 0.85, 0.60, 0.20
Negative scores: 0.90, 0.80, 0.70, 0.40, 0.30, 0.10
Total Pairs = 4 × 6 = 24
Step 2: Count Favorable Pairs
| Pos Score | Compared to Neg Scores | Wins | 
|---|---|---|
| 0.95 | > all (0.90 ... 0.10) | 6 | 
| 0.85 | > all except 0.90 | 5 | 
| 0.60 | > 0.40, 0.30, 0.10 | 3 | 
| 0.20 | > 0.10 only | 1 | 
| Total | 6 + 5 + 3 + 1 | 15 |
No ties, so:
AUC = 15 / 24 = 0.625
Interpretation:
- The model has a 62.5% chance of ranking a randomly chosen defaulter (positive) higher than a randomly chosen non-defaulter (negative).
- Better than random (0.5), but not great.
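The pair counting in Step 2 can be double-checked with a short brute-force script. This is only a sketch: the function name `pairwise_auc` is made up here, and the two lists are simply the Actual and Predicted Score columns from the table of 10 observations.

```python
# Actual labels and predicted scores for observations 1..10 (from the table).
actual = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]


def pairwise_auc(y_true, y_score):
    """Compare every positive with every negative: 1 per win, 0.5 per tie."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    credit = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                credit += 1.0
            elif p == n:
                credit += 0.5
    return credit / (len(pos) * len(neg))


auc = pairwise_auc(actual, scores)
print(auc)  # 0.625  (15 wins out of 24 pairs)
```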
ROC Curve (Optional Idea):
If we plot TPR vs FPR at various thresholds:
- Start at (0,0)
- End at (1,1)
- The area under that curve will match AUC = 0.625
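As a rough check of that claim, the sketch below builds the ROC points for the same 10 observations and adds up the area with the trapezoid rule. The helper names `roc_points` and `trapezoid_area` are made up for this example, and using each distinct score as a threshold (score >= threshold means predicted positive) is an assumed convention.

```python
actual = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]


def roc_points(y_true, y_score):
    """Return (FPR, TPR) points, starting at (0, 0) and ending at (1, 1)."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold one distinct score at a time.
    for t in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(actual, scores)
area = trapezoid_area(points)
print(round(area, 3))  # 0.625
```

The result agrees with the rank-based count of 15 / 24, up to floating-point rounding.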
 