We Found Insurance Fraud in Our Crash Data

April 22, 2025

When we set out to build geospatial risk scores for vehicle crashes at Matrisk AI, we never expected that a side-by-side look at Vehicle Identification Numbers (VINs) and crash timelines would hint at possible insurance fraud. But data sometimes surprises you. Below, I’ll walk through how we stumbled upon this discovery, what we found, and why it might matter for anyone insuring vehicles.


A curious hunch

Our main focus has been risk scores: Where are crashes most frequent? What are the road conditions? Which stretches of highway see severe outcomes? Yet, a handful of states in our dataset disclose VINs for each crash, and that simple addition changed everything.

Why VINs are a big deal

I still remember the moment this hit home. Some years ago, a police officer casually told me, “You should get a dashcam, insurance fraud is common around here.” His offhand comment stuck with me, but life moved on. Fast forward to our modern data pipeline, and suddenly those words didn’t seem so casual after all.


Linking crashes together

After filtering out invalid VINs, we narrowed the dataset to roughly 15 million crashes. (We also removed all drug- and alcohol-related crashes, since it’s unlikely someone committing insurance fraud would be under the influence.) From there, our pipeline:

  1. Counts how often the same VIN appears in a short interval (e.g., 6–12 months).
  2. Flags overlapping or “back-to-back” insurance coverage for the same VIN.
  3. Identifies repeated patterns of single-vehicle collisions, nighttime crashes, and reported injuries.
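The post does not say how invalid VINs were filtered out before these steps. One common sanity check for North American VINs is the check digit in position 9, sketched below. The function name `is_plausible_vin` is hypothetical, and vehicles built before 1981 or sold outside North America may not follow this scheme, so treat it as an illustration rather than the filter actually used.

```python
# Sketch of a VIN sanity check based on the position-9 check digit.
# Assumption: the real pipeline's validation rules are not published.

# Letter-to-number transliteration; the letters I, O and Q never appear in a VIN.
TRANSLITERATION = dict(zip("ABCDEFGH", range(1, 9)))
TRANSLITERATION.update(zip("JKLMN", range(1, 6)))
TRANSLITERATION.update({"P": 7, "R": 9})
TRANSLITERATION.update(zip("STUVWXYZ", range(2, 10)))
TRANSLITERATION.update({str(d): d for d in range(10)})

# Positional weights; position 9 (the check digit itself) carries weight 0.
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def is_plausible_vin(vin: str) -> bool:
    """Return True if the VIN has 17 legal characters and a matching check digit."""
    vin = vin.strip().upper()
    if len(vin) != 17 or any(ch not in TRANSLITERATION for ch in vin):
        return False
    total = sum(TRANSLITERATION[ch] * weight for ch, weight in zip(vin, WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected
```

A check like this catches typos and placeholder values, but it cannot tell whether a well-formed VIN belongs to a real vehicle.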

We’re not conducting a law enforcement sting, but we do want to spot anomalies that might warrant a closer look.

While most VINs have ≤2 crashes, a small subset had ≥5 crashes within a 12-month rolling window.
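The rolling-window count can be sketched in a few lines. This is a minimal illustration, assuming crash records arrive as `(vin, crash_date)` pairs and that a "12-month window" means any span shorter than 365 days; the function names and the threshold default are hypothetical, not taken from the production pipeline.

```python
from collections import defaultdict
from datetime import date, timedelta


def max_crashes_in_window(crash_dates, window_days=365):
    """Largest number of crashes that fall inside any window of `window_days` days."""
    dates = sorted(crash_dates)
    best = 0
    start = 0
    for end, current in enumerate(dates):
        # Move the left edge forward until the window is shorter than window_days.
        while (current - dates[start]).days >= window_days:
            start += 1
        best = max(best, end - start + 1)
    return best


def flag_frequent_vins(records, window_days=365, threshold=5):
    """Map each VIN whose peak rolling count reaches `threshold` to that count."""
    by_vin = defaultdict(list)
    for vin, crash_date in records:
        by_vin[vin].append(crash_date)
    flagged = {}
    for vin, dates in by_vin.items():
        count = max_crashes_in_window(dates, window_days)
        if count >= threshold:
            flagged[vin] = count
    return flagged


# Toy data: VIN "A" crashes every 60 days, VIN "B" crashes twice.
records = [("A", date(2018, 1, 1) + timedelta(days=60 * i)) for i in range(5)]
records += [("B", date(2018, 2, 1)), ("B", date(2018, 9, 1))]
flagged = flag_frequent_vins(records)
```

The two-pointer sweep keeps the work per VIN linear after sorting, which matters when the input is millions of crashes.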


Suspicious indicators

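Industry white papers have pointed out the same red flags time and again. For example, the FBI highlighted a ring in Connecticut where one group of vehicles participated in over 50 staged accidents in just a few years. We’ve found similar signals in our data: repeated crashes in short windows, multiple insurers for the same VIN, reported injuries, late-night collisions, and single-vehicle crashes.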

We combined these factors into a weighted propensity score to prioritize which VINs were worth a closer inspection. It’s not a definitive fraud meter, just a helpful starting point.

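| VIN  | Max Crashes (6m) | Max Insurers (6m) | Max Injuries (12m) | Late Night Collisions (12m) | Single-Vehicle Crashes (12m) | Propensity Score |
|------|------------------|-------------------|--------------------|-----------------------------|------------------------------|------------------|
| VIN1 | 3                | 3                 | 2                  | 3                           | 3                            | 7.6              |
| VIN2 | 3                | 3                 | 0                  | 0                           | 3                            | 6.8              |
| VIN3 | 3                | 3                 | 0                  | 0                           | 3                            | 6.8              |
| VIN4 | 3                | 2                 | 0                  | 2                           | 3                            | 6.5              |
| VIN5 | 3                | 3                 | 0                  | 1                           | 2                            | 6.5              |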
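To make the idea of a weighted score concrete, here is a minimal sketch that ranks the five VINs from the table with a plain weighted sum. The weights are hypothetical (the post does not publish the real ones, or say whether features are scaled first), so the numbers it produces will not match the Propensity Score column; only the mechanics are illustrated.

```python
# Hypothetical weights, chosen for illustration only.
WEIGHTS = {
    "max_crashes_6m": 1.0,
    "max_insurers_6m": 1.0,
    "max_injuries_12m": 0.5,
    "late_night_12m": 0.3,
    "single_vehicle_12m": 0.5,
}

# Feature values copied from the table above (published scores left out).
FEATURES = {
    "VIN1": {"max_crashes_6m": 3, "max_insurers_6m": 3, "max_injuries_12m": 2,
             "late_night_12m": 3, "single_vehicle_12m": 3},
    "VIN2": {"max_crashes_6m": 3, "max_insurers_6m": 3, "max_injuries_12m": 0,
             "late_night_12m": 0, "single_vehicle_12m": 3},
    "VIN3": {"max_crashes_6m": 3, "max_insurers_6m": 3, "max_injuries_12m": 0,
             "late_night_12m": 0, "single_vehicle_12m": 3},
    "VIN4": {"max_crashes_6m": 3, "max_insurers_6m": 2, "max_injuries_12m": 0,
             "late_night_12m": 2, "single_vehicle_12m": 3},
    "VIN5": {"max_crashes_6m": 3, "max_insurers_6m": 3, "max_injuries_12m": 0,
             "late_night_12m": 1, "single_vehicle_12m": 2},
}


def propensity_score(features, weights=WEIGHTS):
    """Weighted sum of the indicator counts for one VIN."""
    return sum(weights[name] * features[name] for name in weights)


def rank_vins(feature_table, weights=WEIGHTS):
    """Return (vin, score) pairs sorted from highest to lowest score."""
    scores = {vin: propensity_score(f, weights) for vin, f in feature_table.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_vins(FEATURES)
```

Even with made-up weights, VIN1 comes out on top because it is the only row with reported injuries and the most late-night collisions, which is the kind of prioritization a score like this is meant to provide.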

Insurance switching

Public crash data typically doesn’t provide official coverage start/end dates. As a workaround, we inferred coverage intervals based on the first and last incident listing a particular insurer for a VIN. If two or more such intervals overlapped (or happened back to back within days), we flagged it.

It’s not ironclad evidence of fraud, but seeing multiple carriers handle the same VIN in a tight window definitely raised some eyebrows.
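The interval inference described above can be sketched as follows. This is an illustration under stated assumptions: incidents arrive as `(vin, insurer, crash_date)` tuples, "within days" is taken to mean a gap of at most 7 days (the `max_gap_days` default is a guess), and the function names and sample insurers are hypothetical.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations


def infer_coverage(incidents):
    """Build {vin: {insurer: (first_seen, last_seen)}} from crash records."""
    coverage = defaultdict(dict)
    for vin, insurer, crash_date in incidents:
        first, last = coverage[vin].get(insurer, (crash_date, crash_date))
        coverage[vin][insurer] = (min(first, crash_date), max(last, crash_date))
    return coverage


def flag_overlaps(coverage, max_gap_days=7):
    """List (vin, insurer_a, insurer_b) whose inferred intervals overlap or nearly touch."""
    flagged = []
    for vin, intervals in coverage.items():
        for (ins_a, (a_start, a_end)), (ins_b, (b_start, b_end)) in combinations(
            sorted(intervals.items()), 2
        ):
            # Zero or negative when the intervals overlap; otherwise the gap in days.
            gap = (max(a_start, b_start) - min(a_end, b_end)).days
            if gap <= max_gap_days:
                flagged.append((vin, ins_a, ins_b))
    return flagged


# Made-up incidents: VIN "X" changes insurer three days after a crash.
incidents = [
    ("X", "InsurerA", date(2018, 1, 10)),
    ("X", "InsurerA", date(2018, 3, 1)),
    ("X", "InsurerB", date(2018, 3, 4)),
    ("Y", "InsurerA", date(2018, 1, 1)),
    ("Y", "InsurerC", date(2018, 9, 1)),
]
suspicious = flag_overlaps(infer_coverage(incidents))
```

One limitation of this workaround is visible in the code: an insurer that appears on a single crash yields a zero-length interval, so true coverage periods are almost certainly longer than the inferred ones.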

[Figure: overlapping insurance] See the overlap? Frequent insurance switching and coverage starting just days before a crash seem almost premeditated.


A deeper look

VIN1 caught our eye, so we stepped through the collision history:

  1. Short intervals between reported crashes
    • 2018-05-20 (late-night collision, property damage, single vehicle)
    • 2018-06-30 (late night again, minor injury, single vehicle)
    • 2018-10-10 (early morning, minor injury, single vehicle)
  2. Insurance switching
    • Allstate covered the first incident. Progressive coverage began just before the first crash and ended after the second crash, at which point State Farm took over.
  3. Suspicious environments
    • The collisions happened at times and locations with low traffic and no eyewitnesses.
  4. Single-vehicle collisions
    • Each crash was conveniently a single-vehicle collision (phantom deer).
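To put numbers on "short intervals", the gaps between the three crash dates listed for VIN1 can be computed directly. The helper name `days_between` is hypothetical; the dates are the ones given in the list above.

```python
from datetime import date

# Crash dates for VIN1, as listed above.
crash_dates = [date(2018, 5, 20), date(2018, 6, 30), date(2018, 10, 10)]


def days_between(dates):
    """Days elapsed between each pair of consecutive dates."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


gaps = days_between(crash_dates)                    # [41, 102]
span = (max(crash_dates) - min(crash_dates)).days   # 143
```

All three crashes fall within 143 days, which is consistent with the table showing three crashes for VIN1 inside a single 6-month window.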

Does this prove fraud? Absolutely not. It might be a string of bad luck, or a driver who’s genuinely error prone. But if I were in claims management, I’d be curious to dig further.


Why this matters

Fraud is a tricky subject. By some estimates, fraudulent claims cost the industry billions of dollars each year. We think there’s value in showing how simple VIN analysis, layered with location and insurer data, can highlight “risky” patterns early on.

[Figure: crash map] Note: All vehicle and insurance details are anonymized for illustration.


Footnotes

If you’re curious about how we generate our risk scores—or if you just want to compare notes—feel free to reach out to our team. We’d love to hear your thoughts.