How We Track $1.09 Trillion in Medicaid Spending

Medicaid is the largest source of federal funding to states, covering over 90 million Americans. In fiscal year 2023, the program distributed $1.09 trillion in provider payments. Yet until now, there has been no single, searchable public platform to explore where that money goes. That's why we built OpenMedicaid.

The Data Behind the Platform

OpenMedicaid ingests provider-level spending data published by the U.S. Department of Health and Human Services (HHS). This dataset covers every provider who billed Medicaid — doctors, hospitals, pharmacies, clinics, and specialists — across all 50 states, D.C., and U.S. territories.

We process more than 227 million records spanning multiple years, linking provider identities, specialties, geographic regions, procedure codes, and payment amounts into a unified, searchable database.

The raw HHS data arrives as massive flat files — difficult to query, impossible to visualize, and practically unusable for journalists, researchers, or concerned citizens. Our pipeline normalizes this data, resolves provider identities across years, and builds the indexes that make instant search possible.
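To make the normalization step concrete, here is a minimal sketch of how flat-file rows might be cleaned and grouped by provider across years. It is an illustration under stated assumptions, not our production pipeline: the column names (provider_id, provider_name, state, year, paid_amount) and the sample rows are hypothetical, and the real HHS files may use different headers and identifiers.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names and made-up rows; real HHS headers may differ.
SAMPLE = """provider_id,provider_name,state,year,paid_amount
1001, Acme Clinic ,tx,2022,"1,250.00"
1001,ACME CLINIC,TX,2023,980.50
2002,River Pharmacy,ca,2023,300
"""

def normalize_row(row):
    """Trim whitespace, standardize case, and parse the amount as a float."""
    return {
        "provider_id": row["provider_id"].strip(),
        "provider_name": " ".join(row["provider_name"].split()).title(),
        "state": row["state"].strip().upper(),
        "year": int(row["year"]),
        "paid_amount": float(row["paid_amount"].replace(",", "")),
    }

def build_index(csv_text):
    """Group normalized records by provider identifier across years."""
    index = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = normalize_row(row)
        index[record["provider_id"]].append(record)
    return index

index = build_index(SAMPLE)
# Total paid per provider across all years in the sample.
totals = {pid: sum(r["paid_amount"] for r in recs) for pid, recs in index.items()}
```

Grouping on a stable identifier rather than on the name is the design choice that matters here: the two spellings of "Acme Clinic" end up under the same key, so a provider's history stays together even when the name is formatted differently from year to year.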

Fraud Detection: Statistical and ML Approaches

Medicaid fraud costs taxpayers an estimated $80–100 billion per year. OpenMedicaid doesn't just display data — it actively flags anomalies.

Our fraud detection system uses a layered approach:

  • Statistical outlier detection: Providers billing significantly above peer averages for the same procedures in the same state are flagged. We use z-score analysis and interquartile range methods to identify billing patterns that deviate from the norm.
  • Specialty-based benchmarking: A cardiologist billing 10x the national average for echocardiograms isn't necessarily fraudulent — but it warrants scrutiny. We benchmark within specialty and geography.
  • Machine learning classification: Using historical patterns of confirmed fraud cases, we train models to recognize suspicious billing combinations — things like unusually high volumes of expensive procedures, billing for incompatible services, or sudden spikes in claims.
  • Network analysis: Some fraud involves rings of providers referring patients to each other. We map referral networks and flag unusually dense clusters.
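The first layer, statistical outlier detection, can be sketched in a few lines. The example below applies a z-score test and an interquartile range test to one peer group (same procedure, same state). The provider labels, the amounts, and the cutoffs (z > 3.0 and 1.5 × IQR) are illustrative assumptions, not the thresholds used on the platform.

```python
import statistics

def zscore_outliers(amounts, threshold=3.0):
    """Flag providers whose billing is more than `threshold` population
    standard deviations above the peer-group mean."""
    values = list(amounts.values())
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return set()  # every peer billed the same amount; nothing stands out
    return {p for p, a in amounts.items() if (a - mean) / stdev > threshold}

def iqr_outliers(amounts, k=1.5):
    """Flag providers billing above Q3 + k * IQR for the peer group."""
    q1, _, q3 = statistics.quantiles(list(amounts.values()), n=4)
    upper_fence = q3 + k * (q3 - q1)
    return {p for p, a in amounts.items() if a > upper_fence}

# Made-up peer group: eleven providers near 100, one far above.
peer_amounts = {
    "P01": 90, "P02": 95, "P03": 100, "P04": 105, "P05": 110, "P06": 98,
    "P07": 102, "P08": 97, "P09": 103, "P10": 99, "P11": 101, "P12": 1000,
}

flagged_by_z = zscore_outliers(peer_amounts)
flagged_by_iqr = iqr_outliers(peer_amounts)
```

Both tests flag only P12 in this sample. One caveat worth knowing: with a population standard deviation, no value in a group of n peers can score above sqrt(n - 1), so a z > 3 cutoff can never fire in a group of ten or fewer. The IQR test does not have that limitation, which is one reason to run the two methods side by side.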

So far, 1,860 providers have been flagged. Together they billed $229.6 billion, an amount that may include fraud, waste, or abuse and deserves investigation.

Key Findings

Our analysis has uncovered striking patterns:

  • The top 1% of providers by billing volume account for a disproportionate share of total Medicaid spending
  • Certain procedure codes show extreme geographic variation — suggesting either different medical practices or different billing practices
  • Provider turnover rates in some states correlate with higher fraud flag rates, suggesting "pop-up" billing operations
  • 1,860 providers have been flagged across our detection systems, collectively billing $229.6 billion
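For readers who want to reproduce a concentration measure like the first finding on their own data, here is a minimal sketch of how the share of spending held by the top 1% of providers could be computed. The function name, the provider labels, and the amounts are made up for illustration; the resulting number says nothing about actual Medicaid data.

```python
def top_share(billing_by_provider, fraction=0.01):
    """Return the share of total billing attributable to the top
    `fraction` of providers, ranked by amount billed."""
    amounts = sorted(billing_by_provider.values(), reverse=True)
    total = sum(amounts)
    if not amounts or total == 0:
        return 0.0
    # Always count at least one provider, even in very small groups.
    top_n = max(1, int(len(amounts) * fraction))
    return sum(amounts[:top_n]) / total

# Made-up data: 99 providers billing 1,000 each and one billing 101,000.
billing = {f"P{i}": 1_000.0 for i in range(99)}
billing["P99"] = 101_000.0

share = top_share(billing)  # share of the total held by the single top provider
```

In this sample the top 1% is a single provider out of 100, and it accounts for 101,000 of the 200,000 billed, a share of 0.505.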

What You Can Do With OpenMedicaid

The platform is built for multiple audiences:

  • Journalists can search any provider, see their billing history, compare them to peers, and identify stories
  • Researchers can analyze spending patterns by state, specialty, procedure, or year
  • Policymakers can see where Medicaid dollars flow and identify areas for reform
  • Citizens can look up their own providers and understand how public money is spent

Every search, every filter, every visualization is free and open to everyone. No paywall, no login required.

Explore the Data

Visit openmedicaid.org to start exploring. Search by provider name, state, specialty, or procedure. View fraud risk scores, billing trends, and peer comparisons. All data is sourced directly from HHS and updated regularly.

OpenMedicaid is one of 134 data platforms built by TheDataProject.AI — all dedicated to making public data usable, searchable, and accessible to everyone.
