The KS (Kolmogorov-Smirnov) Statistic is a powerful and commonly used evaluation metric for binary classification models, especially in finance, credit scoring, and risk modeling.
What is the KS Statistic?
The KS statistic measures the maximum difference between the cumulative distribution functions (CDFs) of the predicted scores for the positive class (events) and negative class (non-events).
Formula:

$$\mathrm{KS} = \max_{s} \left| F_1(s) - F_0(s) \right|$$

Where:
- $F_1(s)$: Cumulative distribution of positive class (e.g., default)
- $F_0(s)$: Cumulative distribution of negative class (e.g., non-default)
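As a rough illustration of the definition, the sketch below computes KS as the largest gap between the empirical cumulative distributions of the scores in each class. The function name `ks_statistic`, the 0/1 label encoding, and the sample data are assumptions made for this example, not part of any particular library.

```python
from bisect import bisect_right


def ks_statistic(y_true, y_score):
    """Largest absolute gap between the empirical CDFs of the two classes.

    Assumes y_true holds 1 for the positive class (events) and 0 for the
    negative class (non-events); y_score holds the model's predicted scores.
    """
    pos = sorted(s for t, s in zip(y_true, y_score) if t == 1)
    neg = sorted(s for t, s in zip(y_true, y_score) if t == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")

    ks = 0.0
    # The empirical CDFs only change at observed scores, so checking
    # those points is enough to find the maximum gap.
    for threshold in sorted(set(y_score)):
        cdf_pos = bisect_right(pos, threshold) / len(pos)
        cdf_neg = bisect_right(neg, threshold) / len(neg)
        ks = max(ks, abs(cdf_pos - cdf_neg))
    return ks


# Made-up labels and scores, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

ks = ks_statistic(y_true, y_score)
print(f"KS = {ks:.2f}")  # KS = 0.75
```

If SciPy is available, `scipy.stats.ks_2samp` applied to the positive-class and negative-class scores should report the same statistic.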
Intuition:
- It tells how well the model separates the two classes.
- A higher KS value means better separation of good and bad cases.
- KS = 0: no separation (useless model)
- KS = 1: perfect separation (ideal but unrealistic)
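The two extremes can be checked on toy data. The sketch below uses a hypothetical helper, `ks_from_groups`, that takes the positive-class and negative-class scores as separate lists; the score values are invented for illustration.

```python
def ks_from_groups(pos_scores, neg_scores):
    """Largest gap between the two groups' empirical CDFs."""

    def ecdf(sample, x):
        # Fraction of the sample with a score at or below x.
        return sum(s <= x for s in sample) / len(sample)

    points = set(pos_scores) | set(neg_scores)
    return max(abs(ecdf(pos_scores, x) - ecdf(neg_scores, x)) for x in points)


# Both classes receive the same scores: the model cannot tell them apart.
no_separation = ks_from_groups([0.2, 0.5, 0.8], [0.2, 0.5, 0.8])

# Every negative scores below every positive: the classes never overlap.
perfect = ks_from_groups([0.7, 0.8, 0.9], [0.1, 0.2, 0.3])

# Overlapping scores land somewhere in between.
partial = ks_from_groups([0.4, 0.7, 0.9], [0.1, 0.3, 0.6])

print(no_separation, perfect, round(partial, 3))
```

Real models fall in the overlapping case, which is why a KS of exactly 1 is not expected in practice.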
Usage by Domain
Domain | Why KS is Used |
---|---|
Banking / Credit Risk | Industry standard for measuring discriminatory power between defaulters and non-defaulters |
Insurance | Distinguishing claimants vs non-claimants |
Fraud Detection | Separating fraudulent from legitimate transactions |
Marketing | Used less commonly; better-suited metrics include precision@k and lift |
✅ Typical KS Value Interpretation:
KS Score | Model Quality |
---|---|
< 0.2 | Poor |
0.2 - 0.3 | Fair |
0.3 - 0.4 | Good |
> 0.4 | Excellent |
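The bands in the table can be turned into a small lookup, sketched below. The function name `ks_quality` is hypothetical, and because the table's ranges share their endpoints, the handling of exact boundary values (0.2 counts as Fair, 0.3 and 0.4 as Good) is an assumption of this example.

```python
def ks_quality(ks):
    """Map a KS score to the rule-of-thumb quality band from the table."""
    if not 0.0 <= ks <= 1.0:
        raise ValueError("KS must lie between 0 and 1")
    if ks < 0.2:
        return "Poor"
    if ks < 0.3:  # assumption: exactly 0.2 is treated as Fair
        return "Fair"
    if ks <= 0.4:  # assumption: exactly 0.3 and 0.4 are treated as Good
        return "Good"
    return "Excellent"


for score in (0.15, 0.25, 0.35, 0.55):
    print(score, ks_quality(score))
```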