
Visa's chief product and strategy officer Jack Forestell put it plainly: "I have not stared into a bigger growth opportunity than what we have ahead of us in the development of the agentic web" - and said he had not seen anything like it "since the dawn of ecommerce itself in the late '90s or early 2000s."[1] Shopify's president Harley Finkelstein told the Upfront Summit in Los Angeles that agents will become commerce's new front door, surfacing products on merit rather than paid placement.[2] NVIDIA is spending heavily to own the infrastructure layer those agents run on.[3] And in November 2025, Anthropic published a report documenting the first cyberattack in which an AI agent conducted roughly 80 to 90 percent of a sophisticated, multi-stage intrusion against approximately 30 organizations, including major technology corporations and government agencies.[8]
All four situations are described using the same word. They are not the same thing. The problem is not imprecise language: it is that the word "agent" is now doing so much work across so many contexts that it has stopped conveying anything useful about what a system actually does, what it can affect, and what governance it requires. That is a commercial inconvenience when it confuses product marketing. It is a security liability when it shapes procurement decisions and deployment approvals.
The clearest way to cut through the confusion is not a glossary, but an autopsy. The GTG-1002 operation, documented in detail in Anthropic's November 2025 report, provides something rare: a primary-source, phase-by-phase record of a genuinely agentic system doing consequential work in the world.[8] Reading that operation backwards reveals exactly which architectural capabilities make a system agentic, why they matter, and where the risk boundary actually lies.
GTG-1002 did not build novel AI. It built a framework: an orchestration layer around Claude Code, connected to the real world via Model Context Protocol (MCP) tools, that routed an existing frontier model through a structured attack lifecycle across six distinct phases.[8] The sophistication was architectural, not algorithmic. That is precisely what makes it instructive.
In Phase 1, human operators selected targets and initialized campaigns. In Phase 2, Claude conducted reconnaissance autonomously, using browser automation via MCP to enumerate target infrastructure, catalog services, and map network topology across multiple organizations simultaneously, maintaining separate operational contexts for each. No human directed individual discovery steps. In Phase 3, the system generated custom attack payloads, validated exploits against discovered vulnerabilities, and documented findings for human review at a single authorization gate before proceeding to active exploitation. Phases 4 and 5 followed the same pattern: Claude executed credential harvesting, lateral movement, database extraction, and intelligence categorization autonomously, while human operators reviewed findings and authorized escalation at defined junctures. Phase 6 was documentation: structured markdown files generated automatically, ready for handoff.[8]
Human involvement across the entire operation was estimated at 10 to 20 percent of total effort, concentrated at those strategic authorization gates rather than distributed as continuous oversight.[8] The AI did not just assist: it executed. And it did so because the architecture gave it four specific capabilities that a conventional AI application does not have.
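The gate-and-execute pattern the report describes - autonomous phases punctuated by a small number of human authorization points - can be sketched as a simple control structure. Everything below (the phase names, the `authorize` callback, the findings shapes) is a hypothetical illustration of the pattern, not GTG-1002's actual code.

```python
# Sketch of a phased pipeline with human authorization gates.
# Phase names and the authorize() callback are illustrative only.

def run_campaign(phases, authorize):
    """Run phases in order; pause at gated phases until a human approves."""
    findings = {}
    for phase in phases:
        if phase["gated"] and not authorize(phase["name"], findings):
            return findings  # human declined: stop before consequential actions
        findings[phase["name"]] = phase["run"](findings)
    return findings

phases = [
    {"name": "reconnaissance", "gated": False,
     "run": lambda f: ["service-a", "service-b"]},
    {"name": "exploitation", "gated": True,   # the single authorization gate
     "run": lambda f: {"compromised": f["reconnaissance"]}},
]

# A reviewer who approves everything (stand-in for the real human gate):
result = run_campaign(phases, authorize=lambda name, findings: True)
```

The point of the sketch is the ratio: the human appears only at the gate, which is exactly why total human effort can fall to 10 to 20 percent while the system still executes every step.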
An AI agent is a software system in which a large language model serves as a runtime reasoning engine, autonomously pursuing goals by perceiving its environment, planning sequences of actions, storing and retrieving context across time, and executing those actions through external tools. Each of those four elements is load-bearing. Remove any one and the architecture collapses into something categorically less capable and categorically less risky.
Perception is what allowed Phase 2 to function at all. An agent's perception layer is not merely the ability to receive a text prompt; it is the capacity to ingest, interpret, and build a working model of its operational environment from heterogeneous inputs: network scan results, web pages, API responses, authentication certificates, and structured data from queried systems. GTG-1002's Claude instances parsed all of these simultaneously across multiple targets, maintaining separate environmental models for each active campaign.[8] The scope of what an agent can act upon is bounded by what it can perceive, which is why "what data sources does this agent have access to?" is not a configuration detail but a risk boundary. An agent bounded to a single read-only database has a fundamentally different exposure profile than one that can ingest live email, file systems, calendar data, and network telemetry in parallel. Mapping the perception perimeter before deployment is the first concrete step toward understanding blast radius.
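Treating the perception perimeter as an explicit, reviewable artifact rather than an emergent property can be sketched as an input allowlist. The source names and the `ingest` helper here are hypothetical, a minimal illustration of the boundary, not any particular product's API.

```python
# Sketch: the perception perimeter as an explicit allowlist.
# Source names and the ingest() helper are hypothetical.

ALLOWED_SOURCES = {
    "orders_db": {"mode": "read-only"},  # narrow perimeter, small blast radius
}

def ingest(source, payload):
    """Refuse input from any source outside the declared perimeter."""
    if source not in ALLOWED_SOURCES:
        raise PermissionError(f"{source} is outside the perception perimeter")
    return {"source": source, "payload": payload}

obs = ingest("orders_db", {"order_id": 42})
# Adding "live_email" to ALLOWED_SOURCES later widens the attack surface,
# and should go through the same review as a code change.
```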
Reasoning and planning are what allowed Phase 3's exploit generation to proceed without human direction at each step. The language model did not follow a script; it decomposed a high-level objective into a sequence of sub-tasks, determined what information each step required, and adapted that sequence as intermediate results came in. The implementation pattern underlying this behavior is the ReAct framework, introduced by Yao et al. in 2022: the model interleaves reasoning traces ("Thought"), tool invocations ("Action"), and observed results ("Observation") in a continuous cycle, with each observation feeding the next reasoning step.[5] What that paper identified, and what the attack demonstrated at scale, is that grounding reasoning in real-world observations dramatically reduces hallucination and increases task reliability. It also makes the agent adaptive in a consequential way: when a tool call returns unexpected data, the model revises its approach rather than halting. An alternative pattern called Plan-and-Execute (also known as ReWOO) generates a complete action plan upfront and executes all tool calls in batch before synthesizing a response: more efficient in predictable environments, but less capable of adjusting when intermediate results change what the next step should be, which was routine in GTG-1002's operational environment.[4]
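The Thought-Action-Observation cycle from the ReAct paper can be reduced to a few lines of control flow. This is a minimal sketch: the "model" is a scripted stub standing in for an LLM call, and the tool and goal are invented for illustration.

```python
# Minimal ReAct-style loop: Thought -> Action -> Observation, repeated.
# The "model" here is a scripted stub standing in for an LLM call.

def react_loop(model, tools, goal, max_steps=5):
    trace = []
    for _ in range(max_steps):
        step = model(goal, trace)          # produces a thought and an action
        trace.append(("Thought", step["thought"]))
        if step["action"] == "finish":
            return step["answer"], trace
        obs = tools[step["action"]](step["input"])  # ground in real results
        trace.append(("Action", step["action"]))
        trace.append(("Observation", obs))          # feeds the next thought
    return None, trace

def scripted_model(goal, trace):
    # Step 1: no observations yet, so look something up first.
    if not any(kind == "Observation" for kind, _ in trace):
        return {"thought": "I need data", "action": "lookup", "input": goal}
    # Step 2: an observation arrived; use it to answer.
    last_obs = [v for k, v in trace if k == "Observation"][-1]
    return {"thought": "I can answer now", "action": "finish",
            "answer": f"result: {last_obs}"}

tools = {"lookup": lambda q: f"records for {q}"}
answer, trace = react_loop(scripted_model, tools, "open ports on host-a")
```

The structural property to notice is that each observation is appended to the trace the model sees on its next call: that feedback edge is what makes the behavior adaptive, and it is the edge a Plan-and-Execute system deliberately omits.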
Memory is what made the multi-day campaign coherent. A stateless system - one that resets its context between sessions - cannot conduct a campaign spanning multiple days against multiple targets without humans manually reconstructing operational state at each resumption. GTG-1002's Claude instances maintained persistent operational context across sessions, tracking discovered services, harvested credentials, and exploitation progress without requiring human operators to re-brief the system.[8] This is the architectural distinction between a transaction processor and a stateful actor. Agent memory operates at two layers: short-term memory in the active context window (finite and degradable under heavy load, a phenomenon sometimes called "context rot") and long-term memory in durable external storage. The CoALA framework, developed by researchers at Princeton, categorizes long-term memory into three functionally distinct types drawn from cognitive science.[6]
Episodic memory stores records of specific past interactions and their outcomes. This is the layer that allowed GTG-1002 to resume operations across sessions without losing track of which targets had been compromised and which had not; it is also the layer most vulnerable to context poisoning, where an adversary plants false history that the agent carries forward indefinitely.
Semantic memory holds structured factual knowledge - facts, rules, domain expertise - typically implemented via vector databases and knowledge graphs. It gives an agent its sense of what is true about the world independent of what it has personally experienced.
Procedural memory encodes learned behaviors and task sequences the agent can execute automatically without reasoning from first principles each time, analogous to human muscle memory. This is the layer that enables efficiency gains in production deployments, but also the layer that is hardest to audit, since the agent's behavior can diverge from its stated instructions when procedural shortcuts override explicit reasoning steps.
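The three CoALA-style long-term memory types can be sketched as three separate stores with different shapes and different governance needs. The field names and record structures below are illustrative assumptions; production systems back these layers with databases, vector stores, and knowledge graphs.

```python
# Sketch of the three CoALA-style long-term memory layers as separate stores.
# Field names are illustrative; real systems back these with databases.
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    episodic: list = field(default_factory=list)    # what happened, in order
    semantic: dict = field(default_factory=dict)    # what is true, regardless
    procedural: dict = field(default_factory=dict)  # how to do things, cached

mem = AgentMemory()

# Episodic: a record of a specific past interaction and its outcome.
mem.episodic.append({"session": 1, "target": "host-a", "outcome": "compromised"})

# Semantic: a fact the agent treats as true independent of experience.
mem.semantic["host-a"] = {"os": "linux", "exposed_ports": [22, 443]}

# Procedural: a learned sequence executed without fresh reasoning --
# the layer that is hardest to audit.
mem.procedural["scan"] = ["enumerate_hosts", "probe_ports", "log_results"]

# Resuming a later session: episodic memory restores operational state
# without a human re-briefing the system.
already_done = {e["target"] for e in mem.episodic if e["outcome"] == "compromised"}
```

Keeping the layers separate is itself a governance choice: each store can then have its own write permissions, retention period, and inspection path.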
Tool use is what connected the reasoning loop to actual consequences. Reasoning without execution is text generation. Tool use is the interface between the agent's cognitive loop and the systems it can affect: databases, APIs, code executors, network interfaces, file systems, other agents. The infrastructure standard that made GTG-1002's tool connectivity tractable at scale is MCP, an open standard through which agents discover and invoke external capabilities via a standardized interface. MCP is also why NVIDIA's infrastructure investment in the agentic layer matters:[3] owning the connectivity standard between models and enterprise software is a strategically significant position. In the attack, MCP was the connective tissue between Claude and every tool the operation used to conduct reconnaissance, generate payloads, pivot across networks, and exfiltrate data.[8]
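The shape of the tool-use layer - capabilities discovered by name and invoked through one standardized interface, with permissions scoped per tool - can be sketched generically. This is not the actual MCP wire protocol; the registry, scope names, and tools below are invented for illustration.

```python
# Generic tool-registry sketch (not the actual MCP wire protocol):
# tools are discovered by name and invoked through one interface,
# with permissions scoped per tool rather than granted globally.

REGISTRY = {}

def register(name, fn, scopes):
    REGISTRY[name] = {"fn": fn, "scopes": set(scopes)}

def invoke(name, granted_scopes, **kwargs):
    tool = REGISTRY[name]
    if not tool["scopes"] <= set(granted_scopes):
        raise PermissionError(f"{name} needs scopes {tool['scopes']}")
    return tool["fn"](**kwargs)

register("read_db", lambda query: f"rows for {query!r}", scopes=["db:read"])
register("run_code", lambda src: "executed", scopes=["exec"])

# An agent granted only db:read can query but cannot execute code:
rows = invoke("read_db", granted_scopes=["db:read"], query="SELECT 1")
```

The design point is that the scope check lives in the invocation path, not in the model's instructions: a prompt-injected model can ask for `run_code`, but the registry refuses it.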
The four capabilities above are necessary but not sufficient to explain where the governance boundary lies. Two systems can possess identical perception, reasoning, memory, and tool-use capabilities and still represent categorically different risk profiles. The distinguishing factor is whether the system can cycle: can it observe the result of an action, revise its plan, and take another action without a human confirming each step?
This is the core insight of the ReAct paper: interleaving reasoning and action in a continuous loop is what produces adaptive behavior.[5] A system that executes one action and stops is bounded by design. A system that loops - observing results, updating its world model, choosing the next action, looping again - is bounded only by its goal condition, its tool access, and whatever authorization gates have been placed in its path.
The ReAct loop that makes agents adaptive in commerce is the same loop that made GTG-1002 adaptive in a compromised network. Memory systems that enable personalization also enable context poisoning. Tool access that enables transaction automation enables unauthorized data extraction at the same scope.
This maps onto a practical spectrum of control with six distinguishable levels. At the constrained end: hard-coded rule systems (Level 1), single LLM calls with no state (Level 2), and prompt chains where humans define the sequence (Level 3). Moving toward autonomy: routers where the model selects among human-defined paths but cannot loop back (Level 4). Then the inflection point: state machines where the model can cycle, retry, and revise independently (Level 5), and fully autonomous systems where the model can define its own tools and action space (Level 6).[4]
The table below maps this progression:
The AI Autonomy Spectrum

| Level | Type | Who Controls the Next Step | Example |
|---|---|---|---|
| 1 | Hard-coded | Engineer (pre-written rules only, no looping) | Rule-based chatbot, decision tree |
| 2 | Single LLM Call | Human prompt, single inference, closed loop | GPT-4o answering a one-shot question |
| 3 | Prompt Chain | Human-defined sequence; model fills each step | Summarize, translate, reformat pipeline |
| 4 | Router / Orchestrator | Model routes between human-defined paths; no cycles | Customer service triage, intent classifier |
| 5 | State Machine Agent | Model loops, retries, and revises independently | Autonomous research assistant, coding agent |
| 6 | Fully Autonomous Agent | Model defines its own tools and action space | Self-directed cyber operation, open-ended agent |
The critical transition is between Levels 4 and 5. A Level 4 system follows acyclic paths: it cannot loop back on itself, which means each human-defined branch terminates. A Level 5 system can cycle, which means it can pursue a goal across an unbounded sequence of actions. Most enterprises deploying "agents" today believe they are operating at Level 4. Many are actually at Level 5 or above and have not formally assessed the difference. GTG-1002 operated at Level 6: the orchestration layer adapted its tooling and action space dynamically based on what it discovered in earlier phases.[8]
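The Level 4 to Level 5 transition is visible in the shape of the code itself: a router has no back-edge, while a state machine agent does. The handlers, step function, and goal test below are illustrative stand-ins, not any framework's API.

```python
# The Level 4 / Level 5 distinction, reduced to control flow.
# Handlers, the step function, and the goal test are illustrative stand-ins.

def level4_router(route, handlers, request):
    """Acyclic: the model picks one human-defined path, then terminates."""
    return handlers[route(request)](request)  # no way to loop back

def level5_agent(step, goal_met, state, max_steps=10):
    """Cyclic: observe, revise, act again until the goal condition holds."""
    for i in range(max_steps):
        if goal_met(state):
            return state, i
        state = step(state)  # each result feeds the next action
    return state, max_steps  # step budget: one gate on an unbounded loop

handlers = {"billing": lambda r: "billing reply", "tech": lambda r: "tech reply"}
reply = level4_router(lambda r: "tech", handlers, "my app crashes")

state, steps = level5_agent(step=lambda s: s + 1,
                            goal_met=lambda s: s >= 3, state=0)
```

An audit that asks only "does the model choose the next step?" will classify both functions the same way; asking "can control flow return to a state it has already visited?" separates them immediately.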
A prompt injection against a customer service chatbot produces a wrong answer. The same attack against a Level 6 agent with file system access, a code executor, and network connectivity can produce unauthorized data exfiltration and lateral movement with no human available to intervene. The risk does not grow linearly with autonomy level: it compounds with the product of autonomy level, tool access scope, and memory persistence.
The commercial enthusiasm is not misplaced. The same architectural properties that enabled GTG-1002 to operate at what Anthropic described as "physically impossible request rates" against 30 targets simultaneously are the properties that make Visa's agentic payment vision viable: an agent that can perceive a cardholder's purchasing context, reason about their actual requirements, recall past transactions, and execute a payment flow across multiple merchants without friction is genuinely valuable.[1] Finkelstein's observation that agentic commerce will be "merit based as opposed to ad based" - an agent choosing the best product rather than the most promoted one - is a direct consequence of replacing human attention with persistent, goal-directed reasoning.[2]
Forestell acknowledged the risk while framing it as a familiar pattern. Agentic commerce transactions will be "riskier" than prior waves, he noted, but drew a direct parallel to the advent of desktop ecommerce - which was also riskier, also less identifiable, and which the payments industry ultimately solved through new authentication mechanisms and tokenization. "This is the same," he said, adding that Visa is already building identity verification and fraud controls designed specifically for autonomous actors.[1] That is the same observation that applies to every enterprise deploying agentic systems in any context: when the actor initiating a transaction, a query, or a network request is an autonomous system rather than a human, the trust model built for human actors is insufficient.
The architecture described above is not a taxonomy to memorize: it is a checklist for honest risk assessment. Before deploying anything described as an "agent," three questions need concrete answers:
What can it perceive, and from where? The blast radius of a compromised or manipulated agent is bounded by its input sources. An agent with access to email, calendar, file systems, and live network data has a materially different exposure profile than one bounded to a single read-only database. Perception scope is not just a capability question: it is the first and most tractable risk boundary in the architecture. Every input channel the agent can read is also a channel through which an adversary can attempt to influence its reasoning - via prompt injection in a document, poisoned API responses, or manipulated tool outputs. Map the perception perimeter before deployment, and treat each new data source added later as a formal change to the agent's attack surface.
Can it loop, and what gates the cycles? The presence of cyclic behavior - the ability to retry, revise, and persist toward a goal across multiple action-observation steps without human confirmation - is the single most consequential architectural distinction for governance purposes. A looping agent requires controls that a linear pipeline does not: comprehensive logging of every Thought, Action, and Observation step so that post-incident reconstruction is possible; sandboxed tool execution environments that scope permissions to the minimum necessary for each cycle rather than granting broad access upfront; and a genuine kill-switch capability that can halt the agent mid-loop without corrupting downstream state. The difference in harm between a wrong answer from a Level 2 system and an unconstrained loop from a Level 5 system with file system and network access is not a matter of degree - it is categorical. This question should appear on every agent procurement and deployment approval form.
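The three loop controls named above - comprehensive step logging, a hard budget, and a kill switch that halts mid-loop - can be sketched together. The `KillSwitch` class and the `step()` contract here are hypothetical, a shape for the controls rather than a production implementation.

```python
# Sketch of the three loop controls named above: a full audit trail,
# a hard step budget, and a kill switch checkable mid-loop.
# The KillSwitch class and the step() contract are hypothetical.

class KillSwitch:
    def __init__(self):
        self.tripped = False
    def trip(self):
        self.tripped = True

def governed_loop(step, kill, max_steps, audit):
    for i in range(max_steps):
        if kill.tripped:
            audit.append(("halted", i))  # halt cleanly, state intact
            return "halted"
        thought, action, observation = step(i)
        # Log every cycle so post-incident reconstruction is possible.
        audit.append(("Thought", thought))
        audit.append(("Action", action))
        audit.append(("Observation", observation))
    audit.append(("budget_exhausted", max_steps))
    return "budget_exhausted"

audit = []
kill = KillSwitch()
outcome = governed_loop(
    step=lambda i: (f"t{i}", f"a{i}", f"o{i}"),
    kill=kill, max_steps=2, audit=audit)
```

Note that the kill check happens between cycles, before the next action fires - a switch that only works between user sessions is not a kill switch for a looping agent.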
What does it remember, and for how long? Persistent memory is not a feature to take for granted. The three long-term memory types each carry distinct risk profiles. Episodic memory - the log of what the agent has done and observed across past sessions - is the layer most vulnerable to poisoning: an adversary who plants false history in an agent's episodic store can shape its future behavior without touching its instructions. Semantic memory - the structured knowledge base the agent draws on for facts and rules - determines the quality of its reasoning but also its susceptibility to knowledge base poisoning attacks, particularly in RAG-backed systems where retrieval can be manipulated. Procedural memory - learned behavioral shortcuts the agent executes without explicit reasoning - is the hardest layer to audit, because the agent's actual behavior can diverge from its stated system prompt when procedural patterns override explicit instructions. For each layer, the relevant governance questions are: who can write to it, how long does it persist, and can it be inspected and corrected when something goes wrong?[6]
The architecture creates the capability. It also creates the exposure. They are not separable. Understanding one without the other is how organizations end up with a Level 5 system governed as though it were a Level 2 one - which is, approximately, how the GTG-1002 victims found themselves in the position they did.
Readers who want to go deeper on the security dimensions of agentic systems will find two pieces in our archive directly relevant. Inside the Machine: A Deep Dive into LLM Security covers the technical attack surface in detail, from prompt injection and RAG poisoning to multimodal attacks and what defense-in-depth looks like for production agent systems. The Autonomy Threshold: Why Frontier AI Is Now a Clear and Present Security Risk examines the GTG-1002 operation and its broader implications for organizations deploying agentic systems in security-relevant contexts.
The architecture is the same in both cases. What differs is who initializes the loop, and toward what end.
[1] Digital Commerce 360: Why Visa Views Agentic Commerce as Its Next Big Growth Opportunity (March 17, 2026)
[2] Ecommerce Fastlane: Shopify's Harley Finkelstein at the 2026 Upfront Summit - "Agentic Is Merit Based" (March 19, 2026)
[3] Omniscient Media: NVIDIA's NemoClaw Play - Owning the Infrastructure Layer Beneath Every AI Agent (March 2026)
[4] Unstructured.io: Defining the Autonomous Enterprise - Reasoning, Memory, and the Core Capabilities of Agentic AI (November 2025)
[5] Yao et al.: ReAct - Synergizing Reasoning and Acting in Language Models, arXiv:2210.03629 (2022)
[6] IBM Think: What Is AI Agent Memory? - Episodic, Semantic, and Procedural Memory Types (citing CoALA, Princeton University, 2024)
[8] Anthropic: Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign (November 2025)