
The False Positive Tax: Why eslint-plugin-security Might Be Costing You More Than It Saves
Every false positive is a withdrawal from your team's trust account
Here's a scenario that plays out on engineering teams constantly: a security linter fires on a code review. The developer investigates, confirms it's a false positive, adds a suppression comment, and moves on. Repeat this enough times, and something subtle but serious happens — engineers stop reading security alerts. The linter is still running. The CI badge is still green. But the humans have checked out.
This is the false positive tax, and a new empirical analysis of eslint-plugin-security puts hard numbers on what most developers have only felt intuitively.
---
What the benchmark actually measured
The analysis takes a disciplined 1:1 approach: for every true positive a tool catches, how many false positives does it generate alongside it? This framing matters because raw catch counts are meaningless without precision context. A tool that flags 50 issues, 40 of which are noise, isn't better than a tool that flags 15 issues cleanly.
The results are striking. Across the tested plugins:
- `eslint-plugin-security`: 4 false positives, 84% precision
- `secure-coding`: 0 false positives, 100% precision
An 84% precision rate sounds reasonable until you do the math at scale. If your codebase generates 100 security alerts per sprint, you're manually triaging ~16 phantom issues. That's real engineering time spent on noise — time that could go toward fixing actual vulnerabilities.
> "A tool that flags many things but misses important vulnerability patterns gives a false sense of safety. Lint noise is not security value."
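The arithmetic behind these figures can be sketched in a few lines. This is a minimal illustration, not the benchmark's own code: the function names are hypothetical, and the count of 21 true positives is only inferred from "4 false positives, 84% precision" (21 / 25 = 0.84), since the true positive count is not stated.

```typescript
// Precision = true positives / (true positives + false positives).
function precision(truePositives: number, falsePositives: number): number {
  const flagged = truePositives + falsePositives;
  if (flagged === 0) {
    throw new Error("precision is undefined when nothing was flagged");
  }
  return truePositives / flagged;
}

// Expected number of noise alerts in a batch, given a precision in [0, 1].
function expectedFalseAlerts(totalAlerts: number, p: number): number {
  return totalAlerts * (1 - p);
}

// ASSUMPTION: 21 true positives is inferred from the reported 84%, not a published number.
const observed = precision(21, 4);
console.log(observed.toFixed(2)); // "0.84"
console.log(Math.round(expectedFalseAlerts(100, observed))); // 16
```

Plugging in your own counts from a sample audit gives a quick estimate of how many alerts per sprint are likely to be noise.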
---
Where eslint-plugin-security tends to misfire
The false positives in `eslint-plugin-security` often stem from rules that are syntactically aware but semantically naive. The classic example is the `detect-object-injection` rule, which flags bracket notation access on objects:
```typescript
// Declarations added so the snippet type-checks; the names are illustrative.
declare const obj: Record<string, unknown>;
declare const userInput: string;
interface User { name: string; email: string; role: string }

// eslint-plugin-security will flag this
const value = obj[userInput];

// But it also flags this, where the key is constrained to three known fields
const KEYS = ['name', 'email', 'role'] as const;
type SafeKey = typeof KEYS[number];

function getSafeField(key: SafeKey, user: User) {
  return user[key]; // ← flagged, despite being type-safe
}
```

The rule can't distinguish between a genuinely dangerous dynamic property lookup and one that's been constrained by TypeScript's type system, a validated enum, or runtime checks. It pattern-matches the syntax and fires regardless.
This is a fundamental limitation of pattern-based linting versus semantic analysis. ESLint rules operate on the AST with limited data-flow awareness. They see what code looks like, not what it does.
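To make the "runtime checks" case concrete, here is a minimal sketch of an allowlist guard in front of a dynamic lookup. The `User` shape, `ALLOWED_KEYS`, `isSafeKey`, and `readField` are hypothetical names chosen for illustration. The lookup is still written with bracket notation, so a rule that only inspects syntax may still report it even though the key can no longer be arbitrary.

```typescript
// Hypothetical record shape, used only for this illustration.
interface User {
  name: string;
  email: string;
  role: string;
}

const ALLOWED_KEYS = ["name", "email", "role"] as const;
type SafeKey = (typeof ALLOWED_KEYS)[number];

// Type guard: narrows an untrusted string to SafeKey only if it is allowlisted.
function isSafeKey(key: string): key is SafeKey {
  return (ALLOWED_KEYS as readonly string[]).includes(key);
}

// Returns the field for allowlisted keys and undefined for anything else,
// so inputs such as "__proto__" or "constructor" never reach the lookup.
function readField(user: User, untrustedKey: string): string | undefined {
  if (!isSafeKey(untrustedKey)) {
    return undefined;
  }
  return user[untrustedKey]; // bracket access, but the key is already validated
}
```

A reviewer can verify the guard in a few seconds, which is the kind of context a suppression comment on this line should record.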
---
The deeper problem: false positives compound
Noise doesn't just waste time linearly — it compounds. Here's what typically happens on teams that tolerate a noisy security linter:
1. Suppression creep: Developers add `// eslint-disable` comments liberally. Some of those suppressions cover real issues that happen to sit near false-positive-prone patterns.
2. Rule weakening: The team disables the noisiest rules at the config level, reducing coverage broadly.
3. Alert fatigue: Engineers develop a mental filter that deprioritizes security findings, including the real ones.
4. False security: The tool is still active, reports still show findings, but the actual signal-to-noise ratio has degraded to the point where coverage is largely theater.
This is why precision matters as much as recall. A tool with 100% recall (catches everything) but 30% precision (most alerts are noise) is, in practice, less secure than one with 80% recall and 90% precision — because the high-noise tool trains your team to ignore it.
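The difference in alert volume between those two profiles is easy to underestimate. The sketch below derives it from the definitions of recall and precision, assuming a hypothetical codebase with 100 real issues. The `alertVolume` helper is illustrative only; it shows how much noise each profile produces, not how a team will react to it.

```typescript
interface AlertVolume {
  truePositives: number;
  falsePositives: number;
  alertsPerRealFinding: number;
}

// recall = TP / realIssues, precision = TP / (TP + FP).
// Solving for the counts gives TP = realIssues * recall and TP + FP = TP / precision.
function alertVolume(realIssues: number, recall: number, precision: number): AlertVolume {
  if (precision <= 0 || precision > 1 || recall < 0 || recall > 1) {
    throw new RangeError("recall must be in [0, 1] and precision in (0, 1]");
  }
  const truePositives = realIssues * recall;
  const totalAlerts = truePositives / precision;
  return {
    truePositives,
    falsePositives: totalAlerts - truePositives,
    alertsPerRealFinding: 1 / precision,
  };
}

// ASSUMPTION: 100 real issues, chosen only to make the numbers easy to read.
const noisy = alertVolume(100, 1.0, 0.3);
const quiet = alertVolume(100, 0.8, 0.9);
console.log(Math.round(noisy.falsePositives)); // 233
console.log(Math.round(quiet.falsePositives)); // 9
```

Under these assumptions the high-recall tool asks engineers to read more than three alerts for every real finding, while the quieter tool asks for just over one.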
---
Practical approach: measure precision in your own codebase
Benchmark results are directionally useful, but your codebase is not the benchmark. The right approach is to run a precision audit on your own code before standardizing on any security plugin:
```bash
# Run your security linter and capture output.
# Assumes your ESLint config already loads eslint-plugin-security as "security".
# The original --rulesdir flag is dropped: it loads custom local rules, which this audit does not need.
npx eslint --format json src/ \
  --rule '{"security/detect-object-injection": "warn"}' \
  > security-audit.json

# Count findings by rule
jq '[.[].messages[]] | group_by(.ruleId) | map({rule: .[0].ruleId, count: length})' security-audit.json
```

Then manually review a sample, say 20 alerts per rule. Classify each as true positive, false positive, or ambiguous. If a rule is hitting more than 20-30% false positives in your stack, it's worth either disabling it or scoping it tightly to the files where it's genuinely relevant.
```javascript
// .eslintrc.js: scoping a noisy rule to only untrusted-input handlers
module.exports = {
  // Registers eslint-plugin-security so the rule ID resolves
  // (can be omitted if a shared config you extend already loads it).
  plugins: ['security'],
  overrides: [
    {
      files: ['src/api/handlers/**/*.ts', 'src/middleware/**/*.ts'],
      rules: {
        'security/detect-object-injection': 'error',
      },
    },
  ],
  rules: {
    // Disabled globally due to 35% FP rate in our codebase
    'security/detect-object-injection': 'off',
  },
};
```

This isn't silencing security concerns. It's concentrating the rule's attention where it actually adds value.
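The manual review step described earlier (sample alerts per rule, label each one, compare against a threshold) could be tallied with something like the following. This is a sketch under stated assumptions: the `ReviewedAlert` shape and `summarize` function are hypothetical, ambiguous alerts are excluded from the rate, and the default threshold of 0.3 is taken from the upper end of the 20-30% range mentioned above.

```typescript
type Verdict = "true_positive" | "false_positive" | "ambiguous";

interface ReviewedAlert {
  ruleId: string;
  verdict: Verdict;
}

interface RuleSummary {
  ruleId: string;
  decided: number; // alerts labeled true or false positive
  falsePositiveRate: number;
  exceedsThreshold: boolean;
}

// Computes the false positive rate per rule from manually labeled alerts.
// ASSUMPTION: "ambiguous" alerts are left out of the denominator.
function summarize(alerts: ReviewedAlert[], threshold = 0.3): RuleSummary[] {
  const counts = new Map<string, { fp: number; decided: number }>();
  for (const alert of alerts) {
    const entry = counts.get(alert.ruleId) ?? { fp: 0, decided: 0 };
    if (alert.verdict !== "ambiguous") {
      entry.decided += 1;
      if (alert.verdict === "false_positive") {
        entry.fp += 1;
      }
    }
    counts.set(alert.ruleId, entry);
  }
  return Array.from(counts.entries()).map(([ruleId, { fp, decided }]) => {
    const falsePositiveRate = decided === 0 ? 0 : fp / decided;
    return {
      ruleId,
      decided,
      falsePositiveRate,
      exceedsThreshold: falsePositiveRate > threshold,
    };
  });
}
```

Rules whose summary has `exceedsThreshold` set are the candidates for disabling globally and re-enabling only in the directories that handle untrusted input.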
---
How to think about security linting in your stack
ESLint security plugins occupy a specific and limited position in a defense-in-depth strategy. They're fast, developer-proximate, and good at catching a category of risky patterns early in the workflow. But they are not a substitute for:
- SAST tools with data-flow analysis (Semgrep, CodeQL, Snyk Code)
- Dependency vulnerability scanning (npm audit, Socket, Dependabot)
- Code review with security-aware reviewers
- Runtime protections (CSP, input validation, output encoding)
The benchmark's broader finding — that some plugins catch vulnerabilities while generating noise, others are cleaner but less exhaustive — reinforces that there's no single right answer. The question is which tradeoffs fit your team's workflow and risk tolerance.
> Security linting is most valuable when it's trusted. A linter your team ignores is worse than no linter, because it creates the illusion of coverage.
---
Why this matters
The false positive tax isn't just an annoyance metric. It's a security metric in disguise. Teams that invest time in tuning their security linters — measuring precision, scoping noisy rules, pairing lint with deeper analysis — end up with security tooling that developers actually engage with. That engagement is what makes the whole system work.
The practical next steps are straightforward:
- Run your current security linter on production code and sample-audit the output. What's your actual precision rate?
- Compare plugins before committing. The benchmark shows meaningful differences in precision across eslint-plugin-security, SonarJS, Microsoft SDL, and others.
- Disable broadly, enable narrowly for rules with documented false positive problems in your stack.
- Track suppression comments over time — suppression creep is an early warning sign that a rule has stopped earning its place.
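Tracking suppressions can start as a simple text scan. The following is a rough sketch, not a full comment parser: `countSuppressions` is a hypothetical helper that counts `eslint-disable`, `eslint-disable-line`, and `eslint-disable-next-line` directives naming at least one rule with a given prefix. It assumes one directive per line and does not count blanket disables that list no rules.

```typescript
// Counts eslint-disable style directives that name a rule starting with rulePrefix.
// ASSUMPTION: rules are comma-separated on the same line as the directive.
function countSuppressions(source: string, rulePrefix = "security/"): number {
  const directive = /eslint-disable(?:-next-line|-line)?[ \t]+([^\n]*)/g;
  let count = 0;
  let match: RegExpExecArray | null;
  while ((match = directive.exec(source)) !== null) {
    const rules = match[1].split(",").map((rule) => rule.trim());
    if (rules.some((rule) => rule.startsWith(rulePrefix))) {
      count += 1;
    }
  }
  return count;
}

const example = [
  "// eslint-disable-next-line security/detect-object-injection -- key is allowlisted",
  "const field = user[key];",
  "// eslint-disable-next-line no-console",
  "console.log(field);",
].join("\n");

console.log(countSuppressions(example)); // 1
```

Running a count like this over the repository at each release and plotting the total per rule is usually enough to spot suppression creep before it becomes habit.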
The best security linter isn't the one that shouts the most. It's the one your team still trusts six months from now.
