audytx vs Checkov, Trivy, KICS & Terrascan
Five scanners run against 28 real-world AWS Terraform corpora: 7 with known security findings (to measure recall) and 21 production-grade community modules expected to be clean (to measure precision). Every number below is measured, scored by a deterministic script, and reproducible from a public repo. Headline: audytx and Checkov are the only two tools that detect all 31 IAM privilege-escalation paths, and on the 21 clean modules audytx fires ~36× fewer false positives than Checkov. audytx also logs the fewest clean-module false positives of all five tools (33, just ahead of KICS's 34), and KICS reaches that count only while detecting 3% of the privesc paths to audytx's 100%.
Table 1 — IAM privilege-escalation: precision & recall
Corpus: BishopFox iam-vulnerable, which contains 31 documented AWS IAM privilege-escalation paths, one Terraform file per path. A tool "detects" a path if it fires at least one HIGH/CRITICAL finding on the file implementing it. audytx is scored on HIGH findings and KICS and Trivy on HIGH+CRITICAL; Checkov's offline run emits no severity, so all of its failed checks are counted (see footnote ¹).
| Tool | HIGH findings | TP | FP | FN | Precision | Recall |
|---|---|---|---|---|---|---|
| audytx | 135 | 31 | 104 | 0 | 23% | 100% |
| Checkov | 269 ¹ | 31 | 238 | 0 | 12% | 100% |
| KICS | 9 | 1 | 8 | 30 | 11% | 3% |
| Terrascan | DNF ² | — | — | — | — | — |
| Trivy | 7 | 0 | 7 | 31 | 0% | 0% |
¹ Checkov run offline (no Bridgecrew API) emits no per-check severity, so this
is total failed checks, not HIGH-only — generous for Checkov's recall,
conservative for its precision.
² Terrascan exceeded the 5-minute timeout on this corpus (large module graph).
audytx and Checkov are the only two tools that detect all 31 paths.
audytx does it with ~2× the precision and half the alert volume. Most of
audytx's 104 "FP" here are legitimate secondary detections (e.g. AWS_OPS_038
firing on the same privesc file the primary rule already claimed) plus the
corpus's own intentional FP-test fixtures — TP matching counts only one finding
per path, which structurally undercounts audytx precision.
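The derived columns of Table 1 follow from the finding count and the number of detected paths alone. A minimal sketch of that arithmetic, assuming precision is TP divided by total findings and recall is TP divided by the 31 paths (the `Row` class and `table1_summary` helper are illustrative names, not part of the published scorer):

```python
from dataclasses import dataclass

TOTAL_PATHS = 31  # documented privesc paths in iam-vulnerable


@dataclass(frozen=True)
class Row:
    tool: str
    findings: int  # HIGH findings (all failed checks for Checkov, footnote 1)
    tp: int        # paths detected; at most one TP is counted per path

    @property
    def fp(self) -> int:
        return self.findings - self.tp

    @property
    def fn(self) -> int:
        return TOTAL_PATHS - self.tp

    @property
    def precision(self) -> float:
        return self.tp / self.findings if self.findings else 0.0

    @property
    def recall(self) -> float:
        return self.tp / TOTAL_PATHS


# Finding counts and TPs copied from Table 1 (Terrascan omitted: DNF).
ROWS = [
    Row("audytx", 135, 31),
    Row("Checkov", 269, 31),
    Row("KICS", 9, 1),
    Row("Trivy", 7, 0),
]


def table1_summary():
    """Return {tool: (FP, FN, precision %, recall %)} rounded to whole percents."""
    return {
        r.tool: (r.fp, r.fn, round(r.precision * 100), round(r.recall * 100))
        for r in ROWS
    }


if __name__ == "__main__":
    for tool, cols in table1_summary().items():
        print(tool, cols)
```

Because only one finding per path can count as a TP, every secondary detection on the same file lands in the FP column, which is why the precision figures above read as a lower bound.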
Table 2 — False positives on 21 clean production modules
This is the wedge. Each corpus is a well-regarded, actively maintained AWS community Terraform module with an expected high-severity count of 0. Lower is better: every HIGH finding here is noise a reviewer has to triage. Raw counts are shown as measured (nothing subtracted).
| Corpus (clean module) | audytx | Checkov | Trivy | KICS | Terrascan |
|---|---|---|---|---|---|
| cloudposse-s3-bucket | 1 | 36 | 1 | 0 | 2 |
| terraform-aws-alb | 4 | 54 | 13 | 2 | 6 |
| terraform-aws-apigateway-v2 | 0 | 20 | 7 | 1 | 5 |
| terraform-aws-autoscaling | 0 | 11 | 6 | 0 | 1 |
| terraform-aws-cloudfront | 0 | 24 | 8 | 0 | 6 |
| terraform-aws-ecr | 0 | 5 | 1 | 2 | 0 |
| terraform-aws-ecs | 0 | 86 | 16 | 2 | 16 |
| terraform-aws-eks | 6 | 88 | 38 | 1 | 1 |
| terraform-aws-eventbridge | 7 | 57 | 18 | 12 | 24 |
| terraform-aws-iam | 3 | 287 | 1 | 0 | 4 |
| terraform-aws-kms | 0 | 1 | 0 | 0 | 11 |
| terraform-aws-lambda | 6 | 112 | 23 | 7 | 9 |
| terraform-aws-rds | 2 | 124 | 7 | 1 | 10 |
| terraform-aws-s3-bucket | 3 | 129 | 18 | 4 | 348 ³ |
| terraform-aws-secure-baseline | 1 | 107 | 8 | 2 | 8 |
| terraform-aws-security-group | 0 | 10 | 2 | 0 | 0 |
| terraform-aws-sns | 0 | 4 | 0 | 0 | 0 |
| terraform-aws-sqs | 0 | 1 | 0 | 0 | 0 |
| terraform-aws-step-functions | 0 | 6 | 1 | 0 | 1 |
| terraform-aws-vpc | 0 | 25 | 3 | 0 | 1 |
| trussworks-s3-private | 0 | 6 | 4 | 0 | 1 |
| Total | 33 | 1,193 | 175 | 34 | 454 |
³ Terrascan fires 348 HIGH alerts on a single module (terraform-aws-s3-bucket) —
a mass-rule blowup that alone accounts for 77% of its total.
audytx's 33 = ~36× fewer than Checkov (1,193),
14× fewer than Terrascan (454), and 5× fewer than Trivy (175) — and the fewest
raw false positives of all five, just ahead of KICS (34), which reaches that only
by detecting 3% of the privesc paths (Table 1) to audytx's 100%. Several of
audytx's 33 trace to documented justified exceptions (real issues in the modules'
own example code, tracked in clean-modules.yaml); the rest are new
rules (EKS secrets-encryption, deprecated Lambda runtimes) correctly flagging
issues in the modules' examples/ — see the honest column below.
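The multipliers quoted for Table 2 can be re-derived from the Total row. A small sketch of that arithmetic, with the numbers copied from the table (the variable and function names are illustrative only):

```python
# audytx column of Table 2, in table order (21 clean modules).
AUDYTX_PER_MODULE = [1, 4, 0, 0, 0, 0, 0, 6, 7, 3, 0, 6, 2, 3, 1, 0, 0, 0, 0, 0, 0]

# "Total" row of Table 2.
TOTALS = {"audytx": 33, "Checkov": 1193, "Trivy": 175, "KICS": 34, "Terrascan": 454}


def ratios_vs_audytx(totals):
    """How many times more clean-module findings each tool fires than audytx."""
    base = totals["audytx"]
    return {tool: round(n / base, 1) for tool, n in totals.items() if tool != "audytx"}


RATIOS = ratios_vs_audytx(TOTALS)

# Share of Terrascan's total that comes from terraform-aws-s3-bucket alone.
S3_BLOWUP_SHARE = round(348 / TOTALS["Terrascan"] * 100)

if __name__ == "__main__":
    print(sum(AUDYTX_PER_MODULE), RATIOS, S3_BLOWUP_SHARE)
```

The unrounded ratios are about 36.2 (Checkov), 13.8 (Terrascan) and 5.3 (Trivy), which the text rounds to ~36×, 14× and 5×.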
Table 3 — Recall corpora (raw HIGH counts)
Six additional corpora with deliberately insecure configurations. We have no path-level ground truth here beyond iam-vulnerable, so these are raw HIGH counts, not scored precision/recall — more is not automatically better, since a chunk of any tool's count is noise. Shown for completeness.
| Corpus | audytx | Checkov | Trivy | KICS | Terrascan |
|---|---|---|---|---|---|
| KaiMonkey | 40 | 109 | 112 | 0 | 21 |
| iam-role-chain | 4 | 9 | 0 | 1 | 0 |
| learn-terraform-provision-eks-cluster | 2 | 3 | 3 | 0 | 0 |
| sadcloud | 47 | 201 | 26 | 53 | 58 |
| terraform-aws-eks-blueprints | 29 | 210 | DNF ⁴ | 13 | 7 |
| terragoat | 52 | 466 | 93 | 70 | 35 |
⁴ Trivy timed out on eks-blueprints (5-min limit); Terrascan DNF on iam-vulnerable (Table 1).
How the precision gap happens — a worked example
The Table 2 result is not fewer rules — it's cross-resource reasoning. audytx pre-computes relationship graphs and suppresses findings that context proves benign, showing the rationale instead of dropping them silently. Here's the mechanism on one fixture (testbed #11), illustrative of why the clean-module counts diverge so far.
Serverless messaging — SQS DLQ chain, sync + polled-async Lambdas, TTL'd DynamoDB
| Single-resource scanner | audytx, same resources with context |
|---|---|
| Lambda DLQ missing | Suppressed: sync via API Gateway; a Lambda DLQ only fires on async invokes, so it would never receive an event |
| Lambda DLQ missing | Suppressed: polled-async via SQS event-source mapping; failures handled by the queue's redrive_policy, not a function DLQ |
| Point-in-time recovery not enabled | Suppressed: TTL configured for ephemeral request logs; PITR is mismatched for data that self-expires |
| Queue has no DLQ of its own | Suppressed: this queue is the dead-letter queue; requiring it to have its own DLQ is infinite regress |
Multiply this across DLQ identity, Lambda invocation graphs, encryption variants, data lifetime, network exposure, IAM trust/policy reachability, tag environment and IMDSv2 inheritance (17 reasoning axes in the live engine in total) and you get the 33-vs-1,193 gap in Table 2.
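To make the mechanism concrete, here is a minimal sketch of context-aware suppression on a toy version of the fixture above. It is not audytx's engine: the resource model, the `scan` function and the rule names are all hypothetical, and each suppression condition is a deliberate simplification of the rationale in the worked example.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    resource: str
    rule: str  # hypothetical rule id, not a real audytx rule name
    suppressed: bool = False
    rationale: str = ""


# Hypothetical, heavily simplified model of the serverless fixture.
FIXTURE = {
    "api_handler": {"type": "lambda", "invoked_by": "api_gateway"},
    "worker": {"type": "lambda", "invoked_by": "sqs:jobs"},
    "request_log": {"type": "dynamodb", "ttl": True, "pitr": False},
    "jobs": {"type": "sqs", "redrive_to": "jobs_dlq"},
    "jobs_dlq": {"type": "sqs"},
}


def scan(resources):
    # Relationship pass: which queues are the redrive target of another queue?
    dlq_names = {
        r["redrive_to"]
        for r in resources.values()
        if r["type"] == "sqs" and "redrive_to" in r
    }
    findings = []
    for name, r in resources.items():
        if r["type"] == "lambda" and "dlq" not in r:
            f = Finding(name, "lambda-dlq-missing")
            source = r.get("invoked_by", "")
            queue = resources.get(source[4:]) if source.startswith("sqs:") else None
            if source == "api_gateway":
                f.suppressed = True
                f.rationale = "sync invoke; a function DLQ only receives async events"
            elif queue is not None and "redrive_to" in queue:
                f.suppressed = True
                f.rationale = "failures handled by the source queue's redrive policy"
            findings.append(f)
        elif r["type"] == "dynamodb" and not r.get("pitr"):
            f = Finding(name, "dynamodb-pitr-disabled")
            if r.get("ttl"):
                f.suppressed = True
                f.rationale = "TTL configured; the data self-expires"
            findings.append(f)
        elif r["type"] == "sqs" and "redrive_to" not in r:
            f = Finding(name, "sqs-no-dlq")
            if name in dlq_names:
                f.suppressed = True
                f.rationale = "this queue is itself a dead-letter queue"
            findings.append(f)
    return findings


if __name__ == "__main__":
    for f in scan(FIXTURE):
        print(f.resource, f.rule, "suppressed" if f.suppressed else "OPEN", f.rationale)
```

Suppressed findings stay in the result with their rationale attached rather than being dropped, mirroring the "show the reasoning" behavior described above. A resource with no exonerating context, such as a Lambda invoked by something other than API Gateway or a redrive-backed queue, still yields an open finding.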
Methodology
Everything needed to reproduce the run, exactly as it was performed.
| Tool | Version | How it was run |
|---|---|---|
| audytx | 0.14.6 | Live GitHub App scan → Code Scanning SARIF (version-verified) |
| Checkov | 3.2.520 | pip install · checkov -d <dir> --framework terraform -o json |
| Trivy | 0.71.0 | trivy config <dir> --severity HIGH,CRITICAL |
| KICS | 2.1.20 | kics scan -p <dir> -t Terraform |
| Terrascan | 1.19.9 | terrascan scan -i terraform -d <dir> |
Corpus: 28 AWS-Terraform repos/modules in the public
audytx-testbed, each
on a bench/<name> branch. 5-minute timeout per tool per corpus
(timeouts = DNF). No suppression files for any tool. Scoring is
scripts/score.py (Python-3 stdlib only) — given the same inputs it
produces a byte-identical scorecard every run. TP matching: a finding counts
once per ground-truth path if its file and resource/rule reference that path;
extra findings on the same path count as FP (conservative for audytx).
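The matching rule can be sketched as follows. This is an illustration of the rule as stated, not the contents of scripts/score.py: it assumes ground truth is a mapping from path id to the Terraform file implementing it, and it matches on file name only, whereas the real scorer also checks the resource/rule reference.

```python
def score(findings, ground_truth):
    """Score one tool on one corpus.

    findings:     iterable of (file, rule_id) tuples (hypothetical shape)
    ground_truth: dict mapping path id -> file implementing that path
    Returns (tp, fp, fn).
    """
    file_to_path = {file: path for path, file in ground_truth.items()}
    claimed = set()  # paths that already have their single TP
    tp = fp = 0
    for file, _rule in findings:
        path = file_to_path.get(file)
        if path is not None and path not in claimed:
            claimed.add(path)
            tp += 1
        else:
            # Off-path finding, or an extra finding on an already-claimed path.
            fp += 1
    fn = len(ground_truth) - len(claimed)
    return tp, fp, fn


if __name__ == "__main__":
    truth = {"path-01": "privesc1.tf", "path-02": "privesc2.tf"}
    results = [("privesc1.tf", "R1"), ("privesc1.tf", "R2"), ("other.tf", "R3")]
    print(score(results, truth))
```

In the usage example the second finding on privesc1.tf counts as an FP even though it points at a genuinely vulnerable file, which is the conservative behavior described above.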
Where audytx is weaker — the honest column
This is a benchmark, not a sales sheet. The places audytx loses:
- Checkov has more raw coverage. Many legitimate Checkov findings (X-Ray tracing, code signing, function-in-VPC, reserved concurrency, TLS-version pinning) are real concerns audytx does not yet flag. If you want breadth-first "tell me everything potentially wrong," Checkov has more rules. audytx's catalog is a curated set focused on patterns it understands deeply enough to reason about — depth over breadth, by design.
- The IAM precision number is honest, not flattering. On the deliberately-vulnerable iam-vulnerable corpus, audytx fires 135 HIGH findings for 31 paths. Even accounting for legitimate secondary detections and the corpus's own FP-test fixtures, that is a lot of alerts — appropriate for a corpus that is wall-to-wall privesc, but it's not a "low volume" story there. The low-volume story is Table 2.
- Clean-module false positives: 27 (v0.4.1) → 33 (v0.14.8). Investigating that rise surfaced a real bug: AWS_OPS_010 (public Lambda Function URL) had inverted match logic and fired on phantom module-synthesized URLs; it was fixed in v0.14.8, which removed 15 of them. The remaining handful are new rules (EKS secrets-encryption, deprecated Lambda runtimes) correctly flagging real issues in the modules' own examples/ code, not bogus matches. Net, audytx is again the lowest-false-positive tool of the five (33), ~36× below Checkov.
- The ground truth was authored by us. The iam-vulnerable path list follows directly from BishopFox's upstream docs and audytx rules were not tuned against it, but it is our scoring file. The unmatched-findings audit is published for independent checking.
- Single run, AWS-only. Each tool was scanned once (Terrascan in particular shows timeout variance on large corpora), and the whole corpus is AWS Terraform — this says nothing about multi-cloud or CloudFormation, which audytx deliberately does not cover.
Reproduce it yourself
The corpus, the ground truth, and the scorer are public. You do not have to take our numbers on faith.
```bash
# 1. Clone the public benchmark corpus
git clone https://github.com/victorsinha/audytx-testbed
cd audytx-testbed

# 2. Run any competitor on a corpus (example: Checkov on a clean module)
checkov -d corpus/terraform-aws-rds --framework terraform -o json | jq '.summary'

# 3. audytx numbers come from the live Code Scanning SARIF on each bench branch
gh api "repos/victorsinha/audytx-testbed/code-scanning/analyses?ref=refs/heads/bench/terraform-aws-rds" \
  --jq '[.[] | select(.tool.name=="audytx")] | sort_by(.created_at) | last | .id'

# 4. Re-score everything deterministically
python3 scripts/score.py results ground-truth
```
Full write-up — every table, footnote and caveat — lives in
docs/benchmark-v1.md in the engine repo.
audytx vs Checkov: AWS Terraform security scanner comparison
The short version: both detect 100% of documented IAM privilege-escalation paths. audytx fires 36× fewer false positives on clean production modules (33 vs 1,193). Checkov has more raw rule coverage. For teams whose scanner is muted because of noise, audytx's precision matters more — for teams that want maximum breadth and tolerate triage work, Checkov delivers more rules.
Choose audytx when
- Your team has turned off or started ignoring another scanner due to alert fatigue
- You need 100% IAM privesc recall AND low noise (audytx is the only tool that delivers both)
- You want the reasoning behind each suppressed finding — not just a pass/fail
- You use AI coding agents and need an MCP server for pre-PR checks
- You want free PR comments without a Bridgecrew account or API key
Choose Checkov when
- You need maximum rule breadth: X-Ray tracing, code signing, function-in-VPC, TLS version pinning, reserved concurrency — Checkov has these, audytx doesn't yet
- You're already on the Bridgecrew/Prisma Cloud platform and want native integration
- You run multi-cloud or CloudFormation (audytx is AWS + Terraform only by design)
- You want a broad "tell me everything possibly wrong" sweep rather than a high-precision review
audytx vs Trivy: Terraform IaC scanner comparison
Trivy is a multi-purpose security scanner (containers, images, SBOMs, IaC). Its Terraform coverage focuses on common misconfigurations and has 0% recall on IAM privilege-escalation paths — Trivy fires 7 HIGH findings on iam-vulnerable, none of which are correct privilege-escalation detections. audytx has 5× fewer false positives on clean modules (33 vs 175) while detecting all 31 IAM privesc paths Trivy misses entirely.
Choose audytx when
- IAM security is a priority — Trivy has no IAM attack-path detection
- You want cross-resource reasoning and context-aware suppression
- You're Terraform-on-AWS focused and want depth over breadth
- You need MCP server integration for AI coding agents
Choose Trivy when
- You need a single tool covering containers, images, SBOMs, and IaC together
- You scan multiple cloud providers or CloudFormation (Trivy supports both)
- You want Kubernetes manifest and Helm chart scanning alongside Terraform
- You need offline / airgapped scanning with self-contained binaries
audytx vs KICS: Terraform security tool comparison
KICS (Keeping Infrastructure as Code Secure, by Checkmarx) scores close to audytx on clean-module false positives (34 vs 33) — but reaches that only by detecting 3% of IAM privilege-escalation paths (1 of 31) versus audytx's 100%. KICS trades recall for precision. audytx achieves both: the lowest false-positive count and full IAM attack-path coverage.
Choose audytx when
- You need both low false positives AND comprehensive IAM privesc detection
- You want each suppressed finding shown with its rationale — not just fewer alerts
- You want a GitHub App (install in 60s, no CI step) instead of a CLI tool
- MCP server support for AI coding agents matters
Choose KICS when
- You need multi-cloud support: KICS covers Azure, GCP, Kubernetes, Docker, Ansible, CloudFormation
- You're already on the Checkmarx platform for SAST and want a unified tool
- IAM attack-path detection is not a priority and low alert volume is
- You prefer a self-hosted CLI with no external calls
audytx vs Terrascan: Terraform static analysis comparison
Terrascan (by Tenable) timed out on the iam-vulnerable corpus (5-minute limit exceeded) and produced 454 false positives on 21 clean modules — 14× more than audytx. A single module (terraform-aws-s3-bucket) triggered 348 HIGH alerts in a mass-rule blowup, accounting for 77% of Terrascan's total clean-module count.
Choose audytx when
- Scan time reliability matters — Terrascan timed out on large module graphs
- You need IAM privilege-escalation path detection (Terrascan DNF'd on this corpus)
- 14× lower false-positive volume is a meaningful team-productivity gain
- You want a GitHub App with PR comments and SARIF upload, not a local CLI
Choose Terrascan when
- You need broad multi-cloud coverage (Azure, GCP, Docker, Kubernetes) alongside Terraform
- You want OPA-based custom policies with Rego
- You're on the Tenable platform and want native integration
- You need airgapped / fully self-hosted scanning
See your own numbers
One click to install. Free on every repo, public or private — there's no plan to choose.
Install audytx →