Bigdata and data science by Kartheek Dachepalli: KS Calculation

Sunday, August 3, 2025

Let's walk through a step-by-step example of the KS statistic using 10 observations with 4 positives (Y = 1) and 6 negatives (Y = 0), ranked by model score from highest to lowest:

🧾 Sample Data: 10 Observations

Rank	Actual (Y)	Score	Cumulative Positive Rate	Cumulative Negative Rate	(+ve%) - (-ve%)
1	1	0.95	1 / 4 = 0.25	0 / 6 = 0.00	0.25
2	0	0.90	1 / 4 = 0.25	1 / 6 = 0.167	0.083
3	1	0.85	2 / 4 = 0.50	1 / 6 = 0.167	0.333
4	0	0.80	2 / 4 = 0.50	2 / 6 = 0.333	0.167
5	0	0.70	2 / 4 = 0.50	3 / 6 = 0.500	0.00
6	1	0.60	3 / 4 = 0.75	3 / 6 = 0.500	0.25
7	0	0.40	3 / 4 = 0.75	4 / 6 = 0.667	0.083
8	0	0.30	3 / 4 = 0.75	5 / 6 = 0.833	-0.083
9	1	0.20	4 / 4 = 1.00	5 / 6 = 0.833	0.167
10	0	0.10	4 / 4 = 1.00	6 / 6 = 1.000	0.00
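
The table above can be reproduced with a few lines of Python. This is only a minimal sketch, not a library routine: the function name `ks_table` and the variable names are made up for illustration, and it assumes labels are coded 0/1 (1 = positive) and that no two observations share the same score.

```python
def ks_table(labels, scores):
    """Return one row per observation:
    (rank, label, score, cum_pos_rate, cum_neg_rate, difference)."""
    # Sort observations from highest score to lowest, as in the table.
    ordered = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    total_pos = sum(labels)               # 4 positives in this example
    total_neg = len(labels) - total_pos   # 6 negatives in this example

    rows, cum_pos, cum_neg = [], 0, 0
    for rank, (label, score) in enumerate(ordered, start=1):
        if label == 1:
            cum_pos += 1
        else:
            cum_neg += 1
        pos_rate = cum_pos / total_pos
        neg_rate = cum_neg / total_neg
        rows.append((rank, label, score, pos_rate, neg_rate, pos_rate - neg_rate))
    return rows


# The 10 observations from the table.
labels = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]

for rank, label, score, pos_rate, neg_rate, diff in ks_table(labels, scores):
    print(f"{rank}\t{label}\t{score:.2f}\t{pos_rate:.3f}\t{neg_rate:.3f}\t{diff:.3f}")
```

Each printed row corresponds to one rank in the table, with the last column holding the difference between the two cumulative rates.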

Look for the maximum difference between the cumulative positive rate and the cumulative negative rate.

The maximum value in the last column ((Cumulative positives %) - (Cumulative negatives %)) is:

0.333 at Rank 3 (score = 0.85)

KS = 0.333 → The maximum separation between defaulters and non-defaulters occurs when the score threshold is around 0.85.
At that point:
- You've captured 50% of defaulters
- You've included only 16.7% of non-defaulters
This is the score threshold at which the model discriminates most strongly between the two classes, i.e. where the gap between the two cumulative rates is widest.
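
To find the KS value and its threshold directly, without building the whole table, something like the following could work. Treat it as a sketch: `ks_statistic` is a hypothetical helper name, labels are assumed to be 0/1 integers with 1 = defaulter, and scores are assumed to have no ties. It takes the absolute gap, which is the usual KS convention; in this example that coincides with the largest value in the last column.

```python
def ks_statistic(labels, scores):
    """Return (ks, threshold, pos_rate, neg_rate) at the point of maximum separation."""
    # Walk through the observations from highest score to lowest.
    ordered = sorted(zip(scores, labels), reverse=True)
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos

    cum_pos = cum_neg = 0
    best = (0.0, None, 0.0, 0.0)
    for score, label in ordered:
        cum_pos += label        # adds 1 for a defaulter
        cum_neg += 1 - label    # adds 1 for a non-defaulter
        pos_rate = cum_pos / total_pos
        neg_rate = cum_neg / total_neg
        gap = abs(pos_rate - neg_rate)
        if gap > best[0]:
            best = (gap, score, pos_rate, neg_rate)
    return best


labels = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]

ks, threshold, pos_rate, neg_rate = ks_statistic(labels, scores)
print(f"KS = {ks:.3f} at score {threshold:.2f} "
      f"(defaulters captured: {pos_rate:.1%}, non-defaulters: {neg_rate:.1%})")
```

On the 10 observations above, this picks out the same point as the table: score 0.85, with 50% of defaulters and 16.7% of non-defaulters captured.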