What actually predicts vulnerability exploitation?
Almost every organization prioritizes vulnerabilities by CVSS severity. Almost every organization is doing it wrong. Here is the evidence, measured over a large corpus of CVEs, using CISA KEV (the Known Exploited Vulnerabilities catalog) as ground truth for "actually exploited in the wild."
Base rate to keep in mind: only about one CVE in two hundred is in KEV. Any useful signal has to beat that base rate by a lot. Most don't. A few do, dramatically.
Key findings (the short version)
- EPSS is far better than CVSS at predicting exploitation. High-EPSS CVEs are exploited at a much higher rate than critical-CVSS ones; the gap is roughly an order of magnitude.
- CVSS is nearly redundant once you have EPSS. Combining them barely beats EPSS alone.
- An exploit-tagged reference is a strong filter. Nearly every exploited CVE has one; among CVEs without one, exploitation is vanishingly rare.
- Exploitation has a CWE fingerprint. Memory-corruption primitives like type confusion lead; deserialization, auth-bypass, and OS-command-injection also run well above base rate.
- And an ecosystem fingerprint. NuGet/.NET and Maven/Java packages are the most weaponized; Rust (crates.io) the least.
- Time-to-exploit has collapsed. The median gap from CVE publication to KEV listing has fallen from years for older CVEs to a handful of days for recent ones. The patch window is now days, not months.
- One intuitive hypothesis was false: "low-EPSS memory-safety bugs are a hidden exploited blind spot." The data says no.
The rest of this page shows the work.
EPSS vs CVSS: it's not close
EPSS (the Exploit Prediction Scoring System, from FIRST) estimates the probability that a CVE will be exploited in the next 30 days. CVSS estimates severity: how bad it would be if exploited. People routinely use CVSS as a stand-in for "should I worry," and the data says that's a mistake.
Bucket CVEs by EPSS score and the exploitation rate climbs steeply and monotonically: the high-EPSS band is exploited at many times the base rate, while the low-EPSS long tail is exploited far below it. Bucket the same CVEs by CVSS and the gradient is shallow by comparison; even the critical band sits only modestly above base rate. The high-EPSS lift is roughly an order of magnitude larger than the critical-CVSS lift. That is the entire argument against CVSS-driven patching in one comparison.
And CVSS adds almost nothing on top of EPSS. Among CVEs that are both high-EPSS and critical-CVSS, the exploitation rate is barely above that of high-EPSS alone. Once you know the exploitation probability, the severity score is close to redundant for the "what's likely to be hit" question. (It still matters for impact; see prioritization.)
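The bucket-and-compare measurement described above can be sketched in a few lines. This is a minimal illustration, not the pipeline behind the findings: the record layout (`epss`, `in_kev`), the function names, and the bucket edges are all assumptions made for the example, and the sample records are invented.

```python
from collections import defaultdict

def exploitation_rate_by_bucket(records, key, edges):
    """Return {bucket_label: (n_cves, n_exploited, rate)} for one score field.

    records: iterable of dicts holding a numeric score under `key`
             and a bool under "in_kev" (hypothetical layout).
    edges:   ascending bucket boundaries, e.g. [0.0, 0.01, 0.1, 1.0].
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [total, exploited]
    for rec in records:
        score = rec[key]
        for lo, hi in zip(edges, edges[1:]):
            # Buckets are [lo, hi); the last one also includes its upper edge.
            if lo <= score < hi or (hi == edges[-1] and score == hi):
                label = f"{lo:g}-{hi:g}"
                counts[label][0] += 1
                counts[label][1] += int(rec["in_kev"])
                break
    return {label: (n, k, k / n) for label, (n, k) in counts.items()}

def lift(rate, base_rate):
    """How many times above (or below) the base rate a bucket sits."""
    return rate / base_rate

# Invented toy records, only to show the call shape.
sample = [
    {"epss": 0.001, "in_kev": False},
    {"epss": 0.005, "in_kev": False},
    {"epss": 0.05, "in_kev": False},
    {"epss": 0.5, "in_kev": True},
    {"epss": 0.9, "in_kev": False},
    {"epss": 1.0, "in_kev": True},
]
base_rate = sum(r["in_kev"] for r in sample) / len(sample)
for label, (n, k, rate) in exploitation_rate_by_bucket(
    sample, "epss", [0.0, 0.01, 0.1, 1.0]
).items():
    print(label, n, k, round(lift(rate, base_rate), 2))
```

Running the same function with `key="cvss"` and CVSS band edges gives the comparison gradient; the two lifts can then be set side by side.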
The single best filter: is there a public exploit reference?
NVD tags the references attached to a CVE. One tag, Exploit, turns out to be the strongest single signal in the corpus when used as a filter:
- Nearly every exploited (KEV) CVE has an exploit-tagged reference.
- Among the CVEs without one, exploitation is vanishingly rare: a literal handful of cases.
In other words: no public exploit reference ≈ not exploited. Filtering to CVEs that have one shrinks the haystack by most of its volume while keeping nearly all of the real threats. It's a necessary-condition filter you can apply before anything else.
(Caveat, stated honestly: this signal is partly contemporaneous with exploitation becoming known. NVD often adds the tag as the exploit surfaces, so it's powerful for triaging the existing backlog but weaker as a pure leading indicator for a brand-new CVE. We tested whether NVD's edit history leads exploitation; it doesn't.)
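As a rough sketch of how that filter could be applied and checked, the snippet below assumes each CVE is a dict with an `id` and a `references` list whose entries may carry a `tags` list, which is how the NVD CVE API 2.0 lays out its `cve` objects as far as we know. The function names, the placeholder IDs, and the sample data are invented for illustration.

```python
def has_exploit_reference(cve):
    """True if any reference on the CVE carries the 'Exploit' tag."""
    return any(
        "Exploit" in (ref.get("tags") or [])
        for ref in cve.get("references", [])
    )

def filter_stats(cves, kev_ids):
    """Measure how much the filter keeps, and how much of KEV survives it."""
    kept = [c for c in cves if has_exploit_reference(c)]
    exploited_total = sum(1 for c in cves if c["id"] in kev_ids)
    exploited_kept = sum(1 for c in kept if c["id"] in kev_ids)
    return {
        # Fraction of the corpus left after filtering (smaller haystack).
        "kept_fraction": len(kept) / len(cves),
        # Fraction of known-exploited CVEs that pass the filter.
        "kev_recall": exploited_kept / exploited_total if exploited_total else None,
    }

# Placeholder records; IDs and URLs are not real.
cves = [
    {"id": "CVE-0000-0001",
     "references": [{"url": "https://example.invalid/a", "tags": ["Exploit"]}]},
    {"id": "CVE-0000-0002",
     "references": [{"url": "https://example.invalid/b", "tags": ["Vendor Advisory"]}]},
    {"id": "CVE-0000-0003", "references": []},
    {"id": "CVE-0000-0004",
     "references": [{"url": "https://example.invalid/c"}]},
]
print(filter_stats(cves, kev_ids={"CVE-0000-0001"}))
```

A filter is only useful here if `kev_recall` stays near 1 while `kept_fraction` drops well below 1; both numbers should be reported together.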
Exploitation has a CWE fingerprint
Some weakness classes get weaponized far more than others. Ranked roughly by exploitation rate, the classes that run well above base rate are:
- CWE-843 type confusion (the highest)
- CWE-288 auth bypass via alternate path
- CWE-502 deserialization of untrusted data
- CWE-78 OS command injection
- CWE-306 missing authentication
- CWE-416 use-after-free
- CWE-94 code injection
- CWE-787 out-of-bounds write
Memory-corruption primitives (type confusion, UAF, OOB write) and injection/auth classes dominate. This is a usable prior: a missing-auth or deserialization bug on a reachable surface deserves more of your attention than its CVSS might suggest.
And an ecosystem fingerprint
Ranked by exploitation rate, package ecosystems (OSV data) fall out roughly in this order, most-weaponized first: NuGet (.NET) and Maven (Java) at the top, then Packagist (PHP) and RubyGems, then PyPI (Python), npm and Go, with crates.io (Rust) at the bottom by a wide margin.
Enterprise, compiled, long-lived stacks get weaponized; memory-safe Rust sits at the bottom. Useful context when you're triaging a polyglot SBOM.
The big one: time-to-exploit has collapsed
This is the finding with the most operational weight. Track the median gap from a CVE's publication to its KEV listing by publication year, and it falls off a cliff: for CVEs published several years ago the median ran into years; for recently published CVEs it is down to a handful of days.
So where a disclosure once bought you a couple of years before known exploitation, the recent median is a single-digit number of days. (Honest caveat: recent years are right-censored, because CVEs that will be exploited later aren't in KEV yet. But the fast tail is undeniable, and some CVEs are KEV-listed before NVD even finishes publishing them.)
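The per-year median is straightforward to compute once publication and KEV-listing dates are joined on CVE ID. The sketch below assumes two already-parsed mappings from CVE ID to `datetime.date` (for instance built from NVD's `published` field and KEV's `dateAdded` field; that mapping is our assumption). The IDs and dates are invented.

```python
from collections import defaultdict
from datetime import date
from statistics import median

def median_days_to_kev(published, kev_added):
    """Return {publication_year: median days from publication to KEV listing}.

    published: {cve_id: date the CVE was published}
    kev_added: {cve_id: date the CVE was added to KEV}
    """
    gaps = defaultdict(list)
    for cve_id, added in kev_added.items():
        pub = published.get(cve_id)
        if pub is None:
            continue  # KEV entry with no matching publication record
        # A negative gap means KEV listed the CVE before it was published.
        gaps[pub.year].append((added - pub).days)
    return {year: median(days) for year, days in sorted(gaps.items())}

# Invented dates, only to show the shape of the calculation.
published = {
    "A": date(2015, 1, 1), "B": date(2015, 6, 1),
    "C": date(2024, 3, 1), "D": date(2024, 3, 10), "E": date(2024, 5, 1),
}
kev_added = {
    "A": date(2017, 1, 1), "B": date(2016, 6, 1),
    "C": date(2024, 3, 4), "D": date(2024, 3, 9), "E": date(2024, 5, 8),
}
print(median_days_to_kev(published, kev_added))
```

Because of the right-censoring noted above, medians for the most recent years can only grow as later listings arrive, so they are best read as a lower bound.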
The implication is blunt: a vulnerability feed that is a week stale is now structurally too slow. Freshness stopped being hygiene and became the product.
The hypothesis the data killed
It's worth showing a negative result, because it's the kind of plausible idea that ships as a feature without anyone checking it.
Hypothesis: "EPSS under-rates memory-safety bugs. A low-EPSS but memory-corruption CVE is a hidden exploited blind spot worth flagging."
Reality: Low-EPSS memory-safety CWEs (OOB write, UAF, type confusion, etc.) are exploited at a rate that sits below the overall base rate: no lift at all. Memory-safety CWEs only dominate the low-EPSS-exploited set by raw count (there are simply a lot of them), not by rate. The apparent lift, when we dug in, came entirely from the exploit tag, not the CWE class.
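The count-versus-rate trap is easy to reproduce with arithmetic. The sketch below uses invented numbers, constructed purely to show the effect (they are not the measured figures), and a hypothetical `memory_safety` flag on each record.

```python
def count_share_and_rate(records, in_group):
    """Compare a group's share of exploited CVEs with its exploitation rate.

    Returns (share_of_exploited, group_rate, base_rate).
    """
    group = [r for r in records if in_group(r)]
    exploited = [r for r in records if r["in_kev"]]
    group_exploited = [r for r in group if r["in_kev"]]
    share = len(group_exploited) / len(exploited)  # dominance by raw count
    rate = len(group_exploited) / len(group)       # exploitation rate in group
    base = len(exploited) / len(records)           # overall base rate
    return share, rate, base

# Invented corpus: 800 memory-safety CVEs (3 exploited), 200 others (2 exploited).
records = (
    [{"memory_safety": True, "in_kev": i < 3} for i in range(800)]
    + [{"memory_safety": False, "in_kev": i < 2} for i in range(200)]
)
share, rate, base = count_share_and_rate(records, lambda r: r["memory_safety"])
print(f"share of exploited: {share:.0%}, group rate: {rate:.4f}, base rate: {base:.4f}")
```

In this toy corpus the group accounts for most exploited CVEs yet is exploited at a rate below the base rate, which is the same shape as the negative result: a large class can dominate a list without carrying any lift.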
So "flag low-EPSS memory-corruption bugs" would have produced noise dressed up as insight. Worth knowing before building it. (This proof-first reflex of measuring the premise cheaply before you build runs through all our experiments.)
A practical signal hierarchy
Putting it together, here's the order of operations the data supports for "should I care about this CVE?":
- Filter on a public exploit reference. No exploit reference → almost certainly not exploited. (Cuts most of the corpus.)
- Rank by EPSS, not CVSS. The high-EPSS band is your shortlist.
- Layer in KEV / SSVC "active" as ground-truth confirmation. If CISA says it's exploited, it's exploited.
- Use CWE and ecosystem as priors for the un-scored or freshly-published long tail (where EPSS hasn't warmed up).
- Weight recency hard. Given the time-to-exploit collapse, a fresh high-EPSS CVE in your stack is a this-week problem, not a this-quarter one.
- Use CVSS for impact, not likelihood. It answers "how bad if," not "how likely."
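One way the hierarchy above could be expressed in code is sketched below. Everything here is an assumption made for illustration: the `Cve` fields, the tier names, and especially the `HIGH_EPSS` and `FRESH_DAYS` thresholds, which are hypothetical placeholders rather than values derived from the data. The sketch also checks KEV first, treating it as an override, which is a design choice rather than something the list prescribes.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_EPSS = 0.1   # hypothetical cut-off for the "high-EPSS band"
FRESH_DAYS = 14   # hypothetical definition of "freshly published"

TIER_ORDER = [
    "confirmed-exploited", "this-week", "shortlist",
    "needs-prior", "backlog", "deprioritized",
]

@dataclass
class Cve:
    cve_id: str
    has_exploit_ref: bool
    epss: Optional[float]      # None when EPSS has not scored it yet
    in_kev: bool
    cvss: Optional[float]
    days_since_published: int

def likelihood_tier(cve: Cve) -> str:
    if cve.in_kev:
        return "confirmed-exploited"   # ground truth overrides every other signal
    if not cve.has_exploit_ref:
        return "deprioritized"         # no exploit reference: almost never exploited
    if cve.epss is None:
        return "needs-prior"           # fall back to CWE and ecosystem priors
    if cve.epss >= HIGH_EPSS:
        # Recency weighting: fresh and high-EPSS is a this-week problem.
        return "this-week" if cve.days_since_published <= FRESH_DAYS else "shortlist"
    return "backlog"

def sort_key(cve: Cve):
    # Likelihood tier first; CVSS only breaks ties, as an impact measure.
    return (TIER_ORDER.index(likelihood_tier(cve)), -(cve.cvss or 0.0))

# Placeholder CVEs with invented values.
queue = [
    Cve("X-2", False, 0.6, False, 9.8, 3),
    Cve("X-1", True, 0.6, False, 7.5, 3),
    Cve("X-3", False, 0.01, True, 5.0, 400),
]
for cve in sorted(queue, key=sort_key):
    print(cve.cve_id, likelihood_tier(cve))
```

Note that the critical-CVSS entry with no exploit reference sorts last: severity only reorders CVEs that already share a likelihood tier.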
FAQ
Is EPSS better than CVSS for prioritization? For exploitation likelihood, decisively yes, by a wide margin in this data. CVSS still matters for impact. The right model uses EPSS (and KEV/SSVC) for likelihood and CVSS for severity, not CVSS for both.
Does a public proof-of-concept mean a CVE will be exploited? Not on its own, but its absence is highly predictive of non-exploitation: nearly all in-the-wild-exploited CVEs have an exploit-tagged reference, and CVEs without one are almost never exploited. Treat "has a public exploit" as a necessary-condition filter.
Which vulnerability types are most exploited? By rate: memory-corruption primitives like type confusion (CWE-843) lead, followed by auth-bypass (CWE-288), deserialization (CWE-502), OS command injection (CWE-78), and missing authentication (CWE-306).
How fast are vulnerabilities exploited after disclosure? The median has collapsed from years for older CVEs to a handful of days for recent ones. Plan for a patch window of days, not months.
What is CISA KEV? The Known Exploited Vulnerabilities catalog, CISA's authoritative public list of CVEs confirmed exploited in the wild. It's the ground truth used throughout this analysis.
Next: how to turn these signals into a priority ranking that beats CVSS, and whether an LLM can reason over this corpus better than a base model.