The Kolmogorov-Smirnov (KS) statistic is a widely used evaluation metric for binary classification models, especially in finance, credit scoring, and risk modeling.
What is the KS Statistic?
The KS statistic measures the maximum difference between the cumulative distribution functions (CDFs) of the predicted scores for the positive class (events) and negative class (non-events).
Formula:
$$KS = \max_{s} \left| F_{1}(s) - F_{0}(s) \right|$$
Where:
- $F_{1}(s)$: Cumulative distribution of positive class (e.g., default) at score $s$
- $F_{0}(s)$: Cumulative distribution of negative class (e.g., non-default) at score $s$
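As a rough illustration of the definition, the maximum gap between the two cumulative distributions can be computed in a single pass over the sorted scores. This is a minimal sketch, not a reference implementation: the function name `ks_statistic` and the sample scores and labels are hypothetical, and it assumes labels are coded 1 for the positive class and 0 for the negative class.

```python
def ks_statistic(scores, labels):
    """Return the maximum gap between the two classes' empirical CDFs.

    scores: predicted scores (hypothetical input)
    labels: 1 for the positive class (event), 0 for the negative class
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Sort by score so both CDFs can be accumulated in one pass.
    pairs = sorted(zip(scores, labels))

    cum_pos = cum_neg = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        # Consume every observation tied at this score before comparing,
        # so the CDFs are only evaluated at distinct score values.
        s = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            i += 1
        ks = max(ks, abs(cum_pos / n_pos - cum_neg / n_neg))
    return ks


# Made-up example data: one negative case (0.6) is scored above a positive (0.4).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(ks_statistic(scores, labels))  # 0.75
```

Handling tied scores as a group matters: comparing the running totals in the middle of a tie could report a gap that does not correspond to any actual score threshold.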
Intuition:
- It tells how well the model separates the two classes.
- A higher KS value means better separation of good and bad cases.
- KS = 0: no separation (useless model)
- KS = 1: perfect separation (ideal but unrealistic)
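The two extremes can be checked on toy data. The sketch below uses a deliberately simple (and slow) helper, hypothetically named `ks`, that evaluates both empirical CDFs at every distinct score. The scores are invented purely to illustrate the boundary cases.

```python
def ks(scores, labels):
    """Brute-force KS: largest CDF gap over all distinct score values."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]

    def ecdf(sample, t):
        # Fraction of the sample with a score at or below threshold t.
        return sum(1 for s in sample if s <= t) / len(sample)

    return max(abs(ecdf(pos, t) - ecdf(neg, t)) for t in set(scores))


# Every negative scores below every positive: the classes never overlap.
perfect = ks([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])

# Every case gets the same score: the model cannot tell the classes apart.
useless = ks([0.5, 0.5, 0.5, 0.5], [0, 1, 0, 1])

print(perfect, useless)  # 1.0 0.0
```

Real models fall between these two values, since some overlap between the score distributions of the two classes is almost always present.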
Usage by Domain
| Domain | Why KS is Used | 
|---|---|
| Banking / Credit Risk | Industry standard for measuring discriminatory power between defaulters and non-defaulters | 
| Insurance | Distinguishing claimants vs non-claimants | 
| Fraud Detection | Separating fraudulent from legitimate transactions | 
| Marketing | Used less commonly; better-suited metrics include precision@k and lift | 
✅ Typical KS Value Interpretation:
| KS Score | Model Quality | 
|---|---|
| < 0.2 | Poor | 
| 0.2 - 0.3 | Fair | 
| 0.3 - 0.4 | Good | 
| > 0.4 | Excellent | 
 