From Data to Decisions

Think ROC-AUC Fails on Imbalanced Data? Think Again

Why the myth that “ROC-AUC fails with imbalanced data” persists, even among experts.

Samuele Mazzanti
Jul 16, 2025

Data science is full of myths. One surprisingly persistent myth is that the ROC-AUC metric (Area Under the ROC Curve) is unreliable on imbalanced datasets, that is, datasets where positive cases make up a small fraction of the total, such as 1% or 5%.
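To make the claim concrete, here is a minimal sketch that computes ROC-AUC on a balanced and on a 1%-positive sample scored by the same model. Everything in it is an illustrative assumption rather than something taken from this post: the `roc_auc` and `simulate` helpers are hypothetical names, and the "model" is simulated by drawing scores from N(1, 1) for positives and N(0, 1) for negatives. It relies on the standard rank-based definition of ROC-AUC: the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation, ties count half."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at sorted position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank shared by the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate(n, positive_rate, rng):
    """Hypothetical scorer (an assumption): positives ~ N(1, 1), negatives ~ N(0, 1)."""
    labels, scores = [], []
    for _ in range(n):
        y = 1 if rng.random() < positive_rate else 0
        labels.append(y)
        scores.append(rng.gauss(1.0 if y else 0.0, 1.0))
    return labels, scores


rng = random.Random(0)
auc_balanced = roc_auc(*simulate(50_000, 0.50, rng))    # 50% positives
auc_imbalanced = roc_auc(*simulate(50_000, 0.01, rng))  # 1% positives

print(f"balanced:   {auc_balanced:.3f}")
print(f"imbalanced: {auc_imbalanced:.3f}")
```

Under these assumptions both values should land close to each other. The imbalanced estimate is noisier because it rests on only a few hundred positives, but the change in class balance alone does not push it systematically up or down.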

Since many real-world datasets are indeed imbalanced, if this myth were true, we’d rarely b…

© 2025 Samuele Mazzanti