A year ago, announcing an autonomous agent for compliance work got you a headline. Today it gets you a shrug. In June 2025, one of the SMB-focused audit tools shipped an agent that answers security questionnaires on its own. By that August, another had shipped an agent that scores vendor risk. By this spring, the same tools were demoing agents that draft policies, map controls, and chase evidence without a human in the loop. The companies built for founders doing their first attestation now ship the same agent vocabulary as the platforms built for enterprise GRC teams.
A compliance agent is no longer a differentiator. It is the price of being in the room.
That’s not a complaint. It is the most useful thing that has happened to this category in years, because it forces everyone to stop competing on whether they have agents and start answering a harder question: what are the agents reading when they reason?
What Agents Are Good At Now
The capability is real, and today's agents do a few things well. They read a 200-question security questionnaire and draft answers in minutes instead of days. They take a control objective and propose policy language that is close enough to edit rather than write from scratch. They watch a vendor's posture over time and flag the one change that matters out of a hundred that do not. One vendor reports its agent drafts more than 80% of questionnaire responses with a 95% acceptance rate.
For a GRC team that spends up to 70% of its day collecting evidence and rebuilding the same spreadsheets, that speed is worth a lot. The work that used to eat a week now happens while you are in a meeting about something else. This is the part the demos get right.
The trouble is that speed is the easy half of the problem.
{{ banner-image }}
The Failure Mode the Demos Skip
But here is what doesn’t make it into the launch video. An agent that drafts a questionnaire answer in thirty seconds is just as fast when the answer is wrong. A generative model produces text that is fluent and confident by design, which means a wrong answer looks exactly like a right one. If your agent states that you run quarterly penetration tests because a stale policy doc said so, and you actually run them annually, you have not saved a week. You have signed a false attestation faster than a human could have made the same mistake.
And this is not a hypothetical. Late last year Deloitte Australia delivered a roughly 237-page independent assurance report to a government client, and a researcher at the University of Sydney found the document was salted with fabricated citations and a misquoted court judgment. Deloitte agreed to refund part of the fee (Fortune). That was a global firm with a reputation to protect, and the model still produced something plausible and untrue inside a deliverable meant to certify the truth. An agent reasoning over the wrong inputs fails fluently, and someone signs it.
The reason is almost always upstream of the model. The agent answered from a screenshot somebody uploaded eight months ago, a spreadsheet that was accurate the day it was exported, or a policy that describes the company as it wished to be rather than as it is. The model did its job. The inputs lied to it.
Agentic GRC Is Only as Honest as Its Data Layer
This is the line that separates the products, and it has nothing to do with the agent. Agentic GRC splits cleanly into two architectures once you look past the demo.
In the first, the agent reasons over artifacts people uploaded: screenshots, exported spreadsheets, PDFs, attestation letters typed into a portal. Every one of those is a snapshot of a moment that has already passed, with no way for the agent to know whether the underlying control still exists. The agent is fast, articulate, and working from a photograph of the past.
In the second, the agent reasons over structured data pulled directly from the systems that run your business: your cloud accounts, your identity provider, your code repositories, your ticketing system. The evidence arrives with full metadata and timestamps, refreshed continuously, traceable back to the source that produced it. When the agent says a control is in place, it is reading the live state of the environment, not a story about it.
The second kind is the only one a Big Four auditor or a firm like Schellman will actually trust, because the data carries its own proof of where it came from. It is also the only kind that supports continuous compliance instead of a point-in-time snapshot, because source-connected data updates itself while uploaded artifacts rot the moment they are saved. And the market is echoing this. One competitor's own 2026 messaging now draws the distinction between assessments built on "live trust artifacts" and ones built on "stale uploads." When your rivals start arguing your point for you, the argument is settled.
For an enterprise running ISO 27001, PCI DSS, HIPAA, and FedRAMP at once across several subsidiaries, the gap compounds with every framework and every entity. An agent on uploaded artifacts gives you four confident answers you cannot defend. An agent on source-connected data gives you four answers an auditor will sign.
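One way to picture the difference between the two architectures is as a data structure. The sketch below is illustrative only: the `Evidence` record, its field names, the `"upload"` and `"source_system"` labels, and the 24-hour freshness window are all assumptions made for this example, not any vendor's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record an agent might reason over."""
    control_id: str         # illustrative identifier, e.g. "CTRL-1"
    claim: str              # what the evidence is said to show
    origin: str             # assumed labels: "upload" or "source_system"
    source: str             # file name, or the system that produced the data
    collected_at: datetime  # timezone-aware collection timestamp


def is_defensible(ev: Evidence, now: datetime,
                  max_age: timedelta = timedelta(hours=24)) -> bool:
    """True only if the evidence came from a source system and is recent.

    The 24-hour default is an arbitrary assumption for the sketch.
    """
    fresh = (now - ev.collected_at) <= max_age
    return ev.origin == "source_system" and fresh


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
screenshot = Evidence("CTRL-1", "MFA is enforced", "upload",
                      "mfa-settings.png", now - timedelta(days=240))
live_read = Evidence("CTRL-1", "MFA is enforced", "source_system",
                     "identity-provider", now - timedelta(hours=1))

print(is_defensible(screenshot, now))
print(is_defensible(live_read, now))
```

Both records make the same claim. Only the one that carries a source system and a recent timestamp passes the check, which is the property the second architecture is built around.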
How To Evaluate AI Agents in GRC
So when a vendor shows you an agent, watch the demo, but interrogate the data layer.
- Where does this answer come from? Ask the agent to justify a single claim and trace it. If the trail ends at an upload, you are looking at a photograph. If it ends at a source system with a timestamp, you are looking at evidence.
- What happens when the environment changes? Revoke an access grant or change a configuration, then ask the agent the same question an hour later. Source-connected data reflects the change. Uploaded data does not know it happened.
- Would an auditor accept this without a follow-up request? If the evidence needs a human to explain its provenance, the agent did not finish the job.
- Does it hold up across frameworks and entities at once? Single-framework, single-entity answers are easy. Enterprise reality is not.
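The second check in the list above, changing the environment and asking again, can be sketched as a small simulation. Everything here is hypothetical: `LiveSource`, `UploadedSnapshot`, and the `env` dictionary stand in for a real identity provider and a real uploaded export, and the point is only to show why one reader notices a revoked grant and the other cannot.

```python
import copy


class LiveSource:
    """Reads the current state of the (simulated) environment on every call."""

    def __init__(self, env: dict):
        self._env = env  # keeps a reference, so later changes are visible

    def has_access(self, user: str) -> bool:
        return user in self._env["admins"]


class UploadedSnapshot:
    """Reads a copy frozen at upload time, like a screenshot or an export."""

    def __init__(self, env: dict):
        self._frozen = copy.deepcopy(env)  # later changes are never seen

    def has_access(self, user: str) -> bool:
        return user in self._frozen["admins"]


def detects_revocation(reader, env: dict, user: str) -> bool:
    """Ask, revoke the grant in the environment, then ask again."""
    before = reader.has_access(user)
    env["admins"].discard(user)  # the environment changes
    after = reader.has_access(user)
    return before and not after


env_a = {"admins": {"alice", "bob"}}
env_b = {"admins": {"alice", "bob"}}

print(detects_revocation(LiveSource(env_a), env_a, "alice"))
print(detects_revocation(UploadedSnapshot(env_b), env_b, "alice"))
```

In a real evaluation the revocation would happen in the vendor's connected system rather than in a dictionary, but the pass condition is the same: the answer has to change when the environment does.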
The agent is now a commodity. The thing worth paying for sits underneath it, doing the unglamorous work of making sure the answers are true. Anyone can ship an agent that talks fast. The question is whether it’s trustworthy, and that was always a question about the data.
Anecdotes runs agents on audit-grade data pulled directly from your source systems, the kind of evidence the world's top auditors trust. See how it works at anecdotes.ai.





