Roadmap
What audytx is building and why. Phases are ordered by leverage — each one deepens the moat rather than expanding the surface.
The thesis: every incumbent (Checkov, Trivy, Snyk, KICS) wins on breadth and rule count, produces real recurring false positives, and flags without fixing. That leaves a precise, defensible lane: the zero-setup AWS + Terraform reviewer that is accurate (context-aware, low false-positive rate), reasons about IAM the way attackers do, and teaches you why — running in the cloud so a solo dev on a Chromebook gets the same review as a funded startup.
The bet behind the sequencing: more and more of the world's Terraform is written by coding agents — and when the author is an agent, the reviewer has to be consumable by the agent too. Context-aware cross-resource reasoning matters more in that world, because no human on the team holds the context. That's why the agent surface ships first, and why the test corpus of the future is AI-generated Terraform.
Foundation — benchmark, fixes, IAM depth, guardrails
Shipped · v0.2–v0.3
The original four phases, complete: prove the accuracy moat with a measured benchmark, make every finding fixable and teachable, go deep on IAM the way attackers do, and guard against the credential / bill-shock incident.
What shipped
- 5-tool benchmark on 28 corpora — 100% recall on 31 IAM privilege-escalation paths (tied with Checkov; KICS 3%, Trivy 0%) at ~36× fewer false positives than Checkov on 21 clean production modules (33 vs 1,193 — the fewest of the five). See the full comparison.
- One-click GitHub suggestions — single-line and multi-line fixes anchored to the exact offending lines, plus plain-English "why this matters" per finding
- The IAM attack graph: privilege-escalation catalog gated on exploitability, trust-graph reasoning, multi-hop role chaining, and cross-resource attack paths (ATTACK_PATH_001–008)
- Secrets + bill-shock fusion: hardcoded-credential detection and cost×security signals (GPU + admin IAM, hardcoded keys + expensive compute)
- 17 context-reasoning axes — the false-positive suppression layer, with every suppression surfaced with its rationale
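Multi-hop role chaining can be read as reachability over a trust graph. The sketch below is a minimal illustration, not audytx's engine: the principal names and the `can_assume` edges are hypothetical, and a real analysis would also gate each edge on exploitability.

```python
from collections import deque

# Hypothetical trust graph: principal -> roles it is allowed to assume.
can_assume = {
    "user/dev": ["role/ci"],
    "role/ci": ["role/deploy"],
    "role/deploy": ["role/admin"],
    "role/readonly": [],
}

def escalation_path(start, target, edges):
    """Breadth-first search: shortest assume-role chain from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

path = escalation_path("user/dev", "role/admin", can_assume)
print(path)
```

Each hop on its own looks harmless; the finding is the chain, which is why a per-resource rule cannot report it.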
The agent surface — MCP + autofix loop
Shipped · v0.4.0
When the code author is an agent, the reviewer must be consumable by the agent. audytx is now an MCP server: the same engine the GitHub App runs, callable before the PR exists. One line of config, no CI, no token:
claude mcp add --transport http audytx https://audytx.com/mcp
- scan_terraform: findings with file:line evidence, severity, fix snippets, and the context-suppressed findings with their rationale
- autofix_terraform: the server applies the precisely-anchored sound fixes, re-scans, and loops, then returns the fixed files plus what remains. Same soundness bar as GitHub one-click suggestions: never a corrupting edit
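The apply, re-scan, loop behaviour described for autofix_terraform can be sketched as a bounded fixpoint loop. This is an illustration under stated assumptions, not the server's code: `toy_scan`, `toy_apply`, the rule names, and the finding fields are all hypothetical stand-ins.

```python
def toy_scan(files):
    """Hypothetical scanner: one fixable rule, one rule with no sound fix."""
    findings = []
    for name, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if 'acl = "public-read"' in line:
                findings.append({"file": name, "line": lineno, "rule": "PUBLIC_ACL",
                                 "sound_fix": ('"public-read"', '"private"')})
            if 'actions = ["*"]' in line:
                findings.append({"file": name, "line": lineno, "rule": "WILDCARD_ACTION",
                                 "sound_fix": None})
    return findings

def toy_apply(files, finding):
    """Apply a fix only on the exact line the finding is anchored to."""
    old, new = finding["sound_fix"]
    lines = files[finding["file"]].splitlines()
    idx = finding["line"] - 1
    lines[idx] = lines[idx].replace(old, new)
    patched = dict(files)
    patched[finding["file"]] = "\n".join(lines)
    return patched

def autofix_loop(files, scan, apply_fix, max_rounds=5):
    """Apply sound fixes, re-scan, and repeat until nothing fixable remains."""
    for _ in range(max_rounds):
        fixable = [f for f in scan(files) if f["sound_fix"] is not None]
        if not fixable:
            break
        for finding in fixable:
            files = apply_fix(files, finding)
    return files, scan(files)

source = {"main.tf": 'acl = "public-read"\nactions = ["*"]'}
fixed, remaining = autofix_loop(source, toy_scan, toy_apply)
```

The wildcard finding has no sound fix in this toy, so it comes back in `remaining` instead of being edited speculatively.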
Benchmark v1.0 — ground truth
Shipped
Turn the internal benchmark into a publishable, attack-proof artifact: label ground truth on the IAM corpus against its documented privilege-escalation paths so every tool gets a true precision/recall score (not a raw finding count), re-pull all corpora from SARIF like-for-like, and publish the harness with the write-up.
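To make the difference between a raw finding count and a scored result concrete, here is a minimal sketch of precision and recall against labeled ground truth. The path and finding identifiers are hypothetical, and treating an empty report as precision 1.0 is one convention among several.

```python
def precision_recall(reported, ground_truth):
    """Score a tool's findings against labeled ground truth."""
    reported, ground_truth = set(reported), set(ground_truth)
    true_positives = len(reported & ground_truth)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(ground_truth) if ground_truth else 1.0
    return precision, recall

# Hypothetical labels: four documented escalation paths in a corpus.
truth = {"path-1", "path-2", "path-3", "path-4"}
# Hypothetical tool output: two real paths plus one false positive.
tool_output = {"path-1", "path-2", "noise-1"}

precision, recall = precision_recall(tool_output, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A raw count would credit this tool with three findings; the scored view shows one of them is noise and half the real paths were missed.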
Never silent, never blind
Shipped · v0.5
A check-run per scan so a failed or skipped scan is a visible status instead of silence, and real usage telemetry (installs → scans → outcomes). A reviewer you can't tell is working is a reviewer you stop trusting.
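As an illustration of "a visible status instead of silence", the sketch below builds the request body for GitHub's create-check-run endpoint (`POST /repos/{owner}/{repo}/check-runs`). The field names follow the GitHub REST API; the outcome labels and their mapping to conclusions are assumptions made for this example, not audytx's documented behaviour.

```python
# Assumed mapping from a scan outcome to a GitHub check-run conclusion.
CONCLUSIONS = {
    "clean": "success",
    "findings": "neutral",
    "skipped": "skipped",
    "error": "failure",
}

def check_run_payload(head_sha, outcome, summary):
    """Build the JSON body for a completed check run; every outcome gets a status."""
    if outcome not in CONCLUSIONS:
        raise ValueError(f"unknown scan outcome: {outcome}")
    return {
        "name": "audytx",
        "head_sha": head_sha,
        "status": "completed",
        "conclusion": CONCLUSIONS[outcome],
        "output": {"title": f"audytx scan: {outcome}", "summary": summary},
    }

payload = check_run_payload("abc123", "error", "Parser failed on main.tf")
```

The point of the mapping is that there is no outcome without a check run: an error or a skip is reported as explicitly as a clean scan.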
Findable everywhere
Deferred
GitHub Marketplace listing, the published benchmark, and the MCP endpoint announced where agent builders look. Zero-setup only matters if you can find the thing to not-set-up.
The AI-generated-Terraform corpus
Shipped
Generate Terraform from frontier models against realistic infra prompts, catalog the characteristic failure modes (plausible-but-overbroad IAM, hallucinated module arguments, missing companion resources), calibrate the engine against them, and publish "what AI-generated Terraform gets wrong." The test corpus of the market that's coming, re-run per model generation.
Parser ceiling — module calls + expressions
Shipped · v0.6–v0.7
Expand registry module calls (starting with the most-used terraform-aws-modules), resolve variable defaults / tfvars / locals, and handle count/for_each. Real repos, and AI-generated ones especially, compose modules heavily; this raises the ceiling of every reasoning axis and attack path on real-world code.
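A minimal sketch of what resolving variables and expanding count/for_each means in practice. The dict-based resource shape and the `resolve` / `expand_instances` helpers are hypothetical simplifications; a real parser works on HCL expressions, not bare `var.NAME` strings.

```python
def resolve(value, variables):
    """Replace a bare 'var.NAME' reference with its resolved value."""
    if isinstance(value, str) and value.startswith("var."):
        return variables[value[len("var."):]]
    return value

def expand_instances(resource, defaults, tfvars=None):
    """Return the concrete instance addresses a resource block expands to."""
    variables = {**defaults, **(tfvars or {})}  # tfvars override variable defaults
    base = f'{resource["type"]}.{resource["name"]}'
    if "count" in resource:
        n = resolve(resource["count"], variables)
        return [f"{base}[{i}]" for i in range(n)]
    if "for_each" in resource:
        keys = resolve(resource["for_each"], variables)
        return [f'{base}["{k}"]' for k in sorted(keys)]
    return [base]

web = {"type": "aws_instance", "name": "web", "count": "var.replicas"}
logs = {"type": "aws_s3_bucket", "name": "logs", "for_each": "var.envs"}

web_instances = expand_instances(web, {"replicas": 1}, tfvars={"replicas": 2})
log_buckets = expand_instances(logs, {"envs": ["prod", "dev"]})
```

Once instances have concrete addresses, cross-resource reasoning can refer to `aws_instance.web[1]` specifically instead of an unexpanded block.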
IAM v2 — the effective-permission engine
Shipped · v0.8–v0.10
Wildcard action expansion against a real service-action table, Condition / NotAction / permission-boundary math, identity×resource-policy intersection. Attack paths become graph reachability over effective permissions instead of curated patterns: found by search, not by hand-written rules.
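The following sketch shows the shape of the computation: expand wildcards against an action table, honour NotAction and explicit Deny, then intersect the identity policy with the permission boundary. It is a simplification under loud assumptions: `ACTION_TABLE` is a tiny hypothetical stand-in for a real service-action table, Resource and Condition are ignored, and `fnmatch` accepts `[...]` patterns that IAM does not.

```python
from fnmatch import fnmatchcase

# Hypothetical, tiny stand-in for a full service-action table.
ACTION_TABLE = [
    "iam:PassRole", "iam:CreateAccessKey", "iam:GetRole",
    "s3:GetObject", "s3:PutObject", "ec2:RunInstances",
]

def expand_actions(patterns):
    """Expand wildcard patterns to concrete actions (IAM actions are case-insensitive)."""
    return {a for a in ACTION_TABLE
            for p in patterns if fnmatchcase(a.lower(), p.lower())}

def allowed_by(statements):
    """Actions a policy allows after its own explicit denies are removed."""
    allowed, denied = set(), set()
    for s in statements:
        if "NotAction" in s:
            actions = set(ACTION_TABLE) - expand_actions(s["NotAction"])
        else:
            actions = expand_actions(s["Action"])
        (allowed if s["Effect"] == "Allow" else denied).update(actions)
    return allowed - denied

def effective_permissions(identity, boundary):
    """An action is effective only if both the identity policy and the boundary allow it."""
    return allowed_by(identity) & allowed_by(boundary)

identity = [{"Effect": "Allow", "Action": ["iam:*", "s3:Get*"]}]
boundary = [{"Effect": "Allow", "NotAction": ["iam:Create*"]}]
effective = effective_permissions(identity, boundary)
```

Here the identity policy alone suggests full IAM access, but the boundary removes `iam:CreateAccessKey`, so a path that depends on it is not reachable.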
Suppression integrity
Shipped · v0.11
Every context-suppression is a potential false negative, so the suppressions get their own adversarial test suite: for each axis, real HCL where the suppression must not fire, published as a false-negative rate alongside the false-positive benchmark. Precision and recall of the reasoning itself.
Opt-in plan ingestion
Shipped · v0.12
Accept terraform plan JSON via an Actions artifact as a secondary, never-required input. It resolves what HCL heuristics can't (final computed values, count expansion). Mandatory plan ingestion stays off the table: it kills zero-setup.
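For reference, plan JSON (the output of `terraform show -json <planfile>`) lists each planned instance under `resource_changes`, with its resolved values in `change.after`. The sketch below indexes those values by address; the sample document is a hand-written, heavily trimmed illustration, and `resolved_values` is a hypothetical helper, not part of audytx.

```python
import json

# Hand-written, trimmed example of plan JSON; real output has many more fields.
PLAN_JSON = """
{
  "format_version": "1.2",
  "resource_changes": [
    {"address": "aws_instance.web[0]", "type": "aws_instance",
     "change": {"actions": ["create"], "before": null,
                "after": {"instance_type": "p4d.24xlarge"}}},
    {"address": "aws_instance.web[1]", "type": "aws_instance",
     "change": {"actions": ["create"], "before": null,
                "after": {"instance_type": "p4d.24xlarge"}}},
    {"address": "aws_s3_bucket.old", "type": "aws_s3_bucket",
     "change": {"actions": ["delete"], "before": {"bucket": "old"}, "after": null}}
  ]
}
"""

def resolved_values(plan_text):
    """Map each planned instance address to its final values, skipping deletions."""
    plan = json.loads(plan_text)
    resolved = {}
    for rc in plan.get("resource_changes", []):
        after = rc["change"].get("after")
        if after is not None:
            resolved[rc["address"]] = after
    return resolved

values = resolved_values(PLAN_JSON)
```

The count expansion and the final `instance_type` arrive already computed, which is exactly the information static HCL heuristics have to approximate.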
Cost×security depth → team governance
Horizon
Before/after cost delta fused with security context ("expensive AND cryptojacking-shaped"), then the org layer: merge gates on Critical findings, cross-PR baselines, compliance evidence export, multi-repo visibility. Explicitly demand-gated: individual devs and agents have to love it first.
Who this is for
Early-stage startups
3–10 people, no dedicated security or platform engineer. Needs PR-level guidance without a CI pipeline config or a week of setup.
Solo developers
Refuse the overhead of maintaining local CLI scanners. The cloud-executes-it-for-you value prop: install the GitHub App once, and every future PR is reviewed.
Students / resource-constrained devs
Can't run Checkov + tfsec + Infracost locally on every push. A Chromebook gets the same quality review as a funded team's workstation.
Coding agents
Writing more of the world's Terraform every month. One MCP endpoint gives the agent the full context-aware review — and the autofix loop — before the PR exists.
What's deliberately not on the roadmap
- Multi-cloud (Azure, GCP). AWS-deep is the v0.x wedge. Every resource-type and reasoning axis is AWS-specific by design — breadth across clouds at the cost of IAM depth isn't the trade.
- CloudFormation, Pulumi, CDK input. Terraform-only today. The parser crate has a CloudFormation stub; it isn't wired to the PR path.
- Unsupervised commits to your branches. The MCP autofix loop returns fixed files to the agent that asked for them, gated to precisely-anchored sound fixes — audytx never pushes commits to your repo. On the PR surface, fixes stay one-click suggestions you apply.
- Mandatory terraform plan ingestion. It needs cloud credentials and CI wiring, exactly the setup burden audytx exists to remove. An opt-in plan-JSON artifact is Phase 9.
- SAML SSO / enterprise onboarding. Won't build the paperwork until the product is sticky with individuals. Enterprise procurement can wait.
Follow the build
Install audytx and every new axis, rule, and phase lands on your next PR — no upgrade step.