The state of LLMs — June 2026

We’re six months into 2026 and the LLM landscape has accelerated past every prediction made in January. New models. New pricing. A $965 billion valuation. Agent platforms launching weekly. Here’s where things stand as of early June.

The big three just got bigger

Anthropic released Claude Opus 4.8 on May 28. It’s an incremental upgrade over Opus 4.7 — stronger coding, better agentic task handling, more consistency on long-running work. The pattern is clear: Anthropic is shipping Opus point-releases every 4-6 weeks, each one incrementally better at the same things. No paradigm shift. Just relentless iteration.

On the same day, Anthropic announced a $65 billion Series H at a $965 billion post-money valuation. They also acquired Stainless, an API infrastructure company, signaling that the API business (not just the models) is the long-term play. KPMG deployed Claude across 276,000 employees. The enterprise land grab is in full swing.

Google shipped Gemini 3.5 at I/O on May 19. The headline is “frontier intelligence with action” — models that don’t just think but do. Gemini 3.5 Pro and Flash, plus Gemini Omni for multimodal tasks. Google’s positioning is distinct from Anthropic and OpenAI: they’re building models into every product (Search, Workspace, Android) rather than selling API access as the primary business.

OpenAI upgraded to GPT-5.5 as their flagship. The pricing page shows GPT-5.5 at $5/M input tokens, GPT-5.5-pro at $30/M, and a full cascade of 5.4 variants (standard, mini, nano) for different cost tiers. Plus new reasoning models: o3-pro, o4-mini, o4-mini-deep-research. OpenAI’s strategy is coverage — a model at every price point, from nano to pro.

The new tier: “agent platforms”

The most significant shift this month isn’t a model release — it’s the emergence of agent platforms as products.

Mistral launched Vibe on May 22-28. It’s not a model. It’s a platform where AI agents perform tasks across tools — search, code execution, API calls — while the user watches. Powered by Mistral Medium 3.5, Vibe represents the shift from “ask the AI a question” to “give the AI a job.”

Claude Code and Cursor Agent have been doing this for months. OpenAI Codex does it from the terminal. The agent pattern is settling: give the AI a goal, let it explore, propose a plan, and execute with human approval at each gate.

The distinction between “model company” and “platform company” is collapsing. Anthropic sells an API and an agent. Mistral sells an API and an agent. Google embeds agents into every product. The model is the engine; the agent is the product.

Open-source: the quiet acceleration

Qwen 3.5 and 3.6 are live. Alibaba’s Qwen team continues to ship at a pace that rivals the frontier labs, and the models are competitive with GPT-4-class performance. Qwen 3.5-omni handles multimodal, Qwen 3.5-max-preview pushes the performance ceiling. The gap between open-source and proprietary is now measured in weeks, not months.

Meta’s research releases continue: Muse Spark (personal superintelligence research), TRIBE v2 (brain-process modeling), SAM 3.1 (real-time video detection). None of these are consumer products, but they’re laying groundwork for the next generation.

DeepSeek hasn’t shipped a new model since mid-2025. R1 and V3 remain competitive, but the lack of updates is notable in a field where everyone else ships monthly.

What’s actually changed since January

In January 2026, the conversation was about whether LLMs would plateau. The answer six months in: no. They accelerated. But the acceleration isn’t in raw benchmark scores — it’s in capabilities that matter for real work:

Agentic behavior went from experimental to production. Every major model can now execute multi-step tasks with tools. The question isn’t “can it reason” — it’s “can it do something useful with the reasoning.”
Pricing collapsed. GPT-5.4-nano at under $1/M tokens. Gemini Flash at consumer-accessible pricing. The cost of running an AI agent for an hour is approaching the cost of a cup of coffee.
The enterprise pivot is complete. Anthropic’s KPMG deal. Google’s Workspace integration. OpenAI’s enterprise contracts. These companies aren’t selling to developers anymore — they’re selling to Fortune 500 procurement departments.
Open-source is no longer a differentiator. When Qwen 3.6 matches GPT-4 quality and Mistral’s models are competitive at every tier, being “the open-source option” is table stakes, not an advantage.

What I’m watching for July-December

Opus 5.0. The point releases can’t continue forever. At some point, Anthropic ships a generational upgrade. When and what changes?
Apple’s entry. Apple Intelligence has been quiet since the initial rollout. WWDC 2026 is this month. If Apple makes a serious AI move, it reshapes the consumer landscape overnight.
The first $1 trillion AI company. Anthropic at $965B is close. OpenAI’s next round will likely cross the line. We’re watching the creation of the most valuable companies in history — in real time.
Regulation. The EU AI Act’s next phase. US federal AI legislation (or lack of it). China’s model registration requirements. The legal framework is forming while the technology races ahead.

The LLM story of 2026 so far isn’t “AGI is here.” It’s “AI is everywhere, and it actually works.” That’s less dramatic and more important.

Want this analysis every month? Subscribe to the newsletter. One email when there’s something worth saying — no daily cadence, no algorithm-chasing.

Building with LLMs? The FastAPI Pro Starter gives you a production-grade backend that Claude and GPT-5 actually understand. Docker, CI, structured logging. $29, one-time.