
Tuesday, June 2, 2026
Everyone is building "agents" - but Visa's payment agent, a customer service bot, and the AI system behind the first documented autonomous cyberattack are not the same thing. A dissection of what genuinely agentic architecture looks like, and why the distinction is a governance question, not a technical one.
Cerebras and AWS are deploying CS-3 wafer-scale systems inside Amazon data centers, pairing them with Trainium in a disaggregated inference architecture available through Amazon Bedrock. The setup targets the memory-bandwidth bottleneck that limits GPU-based decode, promising thousands of output tokens per second for agentic workloads.
A prompt injection hidden in a GitHub README was enough to compromise Snowflake's Cortex coding agent, bypass its human-approval system, escape its sandbox, and wipe a victim's entire Snowflake database. The attack, now patched, exposes structural vulnerabilities common to agentic AI systems far beyond Snowflake.
Last December, Anthropic asked 80,508 Claude users across 159 countries what they actually want from AI. The findings are both clarifying and unsettling - and reveal a design brief most AI labs aren't executing against.
Every time you use a chatbot or ask an AI to generate an image, you are interacting with the same underlying idea: a transformer. This is a complete guide to the architecture that made modern AI possible, written for anyone curious enough to want to understand what is actually happening inside these systems.
Moonshot AI's Kimi team proposes replacing transformer residual connections with a lightweight attention mechanism over prior layer outputs. The result: training performance equivalent to a standard baseline that uses 1.25 times the compute, with gains confirmed across model sizes. It is the cleanest architectural challenge to a foundational LLM assumption in years.