Opus 4.8 🧠, Anthropic at $965B 💰, Microsoft's coding model 👨‍💻
MCP goes stateless, DeepSeek's permanent price cut, Cognition raises $1B for Devin, and 25+ more stories
This Week’s Big Stories
Must-Read
Anthropic Raised $65B in Series H at $965B Valuation — Anthropic announced a $65 billion Series H round at a $965 billion post-money valuation, making it the most valuable private AI company in history. The company cites $47 billion in run-rate revenue, strong enterprise adoption, and plans to expand compute capacity and research. For context: this is nearly double the ~$500B valuation range that was circulating just months ago. The trajectory from a $61B valuation in March 2025 to $965B in May 2026 is unlike anything we’ve seen in enterprise software.
Opus 4.8 — Anthropic released Claude Opus 4.8 with benchmark improvements across the board, adjustable effort controls that let users dial reasoning depth up or down, dynamic workflows in Claude Code, and a fast mode that is now significantly cheaper. The effort controls are the most interesting detail: they address the recurring complaint that frontier models burn too many tokens on simple tasks. Paired with the dynamic workflows announcement, Opus 4.8 is as much an infrastructure upgrade as a model upgrade.
TL;DR (30-sec)
Anthropic raises $65B Series H at $965B valuation — $47B run-rate revenue, now the most valuable private AI company
Opus 4.8 ships with adjustable effort controls, dynamic workflows, and a cheaper fast mode
MCP spec release candidate introduces stateless core, OAuth/OIDC authorization, and breaking changes — ships July 28
Cognition raises $1B+ at $26B valuation — Devin cutting project times for Mercedes-Benz, Itaú
xAI lawyers warn staffers to limit contact with Cursor employees — standard during acquisitions, but coming late
DeepSeek makes V4 Pro 75% discount permanent — pricing now below GPT-5, Opus 4.7, and Gemini 3.5 Flash
Anthropic prepares Mythos 1 for broader availability in Claude Code and Claude Security
ElevenLabs releases Music v2 — genre-switching mid-track while maintaining compositional coherence
Anthropic’s SpaceX lease is actually a 180-day agreement with 90-day mutual cancellation — Musk downplayed it
Dynamic workflows rewrote Bun from Zig to Rust — ~960K lines in 6 days, 99.8% test suite pass rate
The 2026-07-28 MCP Specification Release Candidate — The largest revision of the Model Context Protocol since launch. The release candidate introduces a stateless core that scales on ordinary HTTP infrastructure, an extensions system, authorization that aligns with OAuth and OpenID Connect, and a formal deprecation policy. The release candidate contains breaking changes. If you maintain MCP servers or clients, start planning now: the final spec ships July 28.
Cognition Raised Over $1B at $26B Valuation — Cognition raised over $1 billion at a $26 billion valuation to expand Devin, its AI software engineer. Devin has cut project times for clients like Mercedes-Benz and Itaú. A $26B valuation for a coding agent company tells you where the market thinks software development is headed — and it’s not a world where developers write every line themselves.
xAI Warns Staffers to Limit Contact With Cursor Employees — xAI’s top lawyer warned employees to carefully moderate interactions with Cursor workers. Such warnings are standard during acquisitions, but this one is coming weeks late: employees from both companies have already been working alongside each other. Any accusation that the two sides improperly commingled their businesses could put the deal in jeopardy.
DeepSeek Made Its 75% Discount Permanent — DeepSeek permanently cut V4 Pro prices by 75%. The promotion was originally scheduled to expire at month’s end. DeepSeek’s pricing now sits below OpenAI’s GPT-5, Anthropic’s Claude Opus 4.7, and Google’s Gemini 3.5 Flash. The gap is widest against the frontier reasoning models that enterprise customers rely on for demanding workloads. This is no longer a promotional stunt — it’s a pricing strategy.
Anthropic’s SpaceX Lease — Opinions Vary — The Anthropic-SpaceX compute deal from earlier this month is actually a 180-day lease with a 90-day mutual cancellation clause. Elon Musk downplayed the deal, saying SpaceX hadn’t committed to a multi-year compute lease. The short-term structure was SpaceX’s request, which suggests they may want that compute capacity back.
Anthropic Prepares Mythos 1 for Claude Code and Claude Security — Anthropic appears to be moving Claude Mythos to broader availability. Traces of the model have surfaced on Google Cloud and AWS through vulnerability discovery programs. A general release for Mythos 1 seems imminent. Combined with Opus 4.8, Anthropic is running a two-model strategy: Mythos for security-critical tasks, Opus for everything else.
Quick question — this week was heavily Anthropic-dominated (valuation, Opus 4.8, SpaceX lease, Mythos 1). Do you want more coverage depth on the competitors, or does the coverage match where the actual news was? Hit reply.
Build Tips & Engineering
Introducing Dynamic Workflows in Claude Code — Jarred Sumner used dynamic workflows to rewrite Bun from Zig to Rust — roughly 960,000 lines in six days with a 99.8% test suite pass rate. Dynamic workflows let Claude break a task into subtasks and run agents on them in parallel until the results converge. The Bun rewrite is the most compelling case study yet for agentic coding at scale.
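To make the "parallel subtasks until results converge" idea concrete, here is a minimal Python sketch of a fan-out-and-retry loop. It is not how Claude Code implements dynamic workflows. The `run_agent` function, the subtask names, and the rule that "hard-" subtasks need two attempts are all hypothetical stand-ins chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(subtask: str, attempt: int) -> bool:
    """Hypothetical stand-in for one agent working on one subtask.

    Returns True when the subtask's checks pass. Subtasks named 'hard-*'
    only pass from the second attempt on, so the retry loop is visible.
    """
    if subtask.startswith("hard-"):
        return attempt >= 2
    return True


def run_until_converged(subtasks, max_rounds=5):
    """Fan subtasks out to parallel workers and retry failures.

    Stops when nothing is left pending (converged) or max_rounds is hit.
    """
    pending = list(subtasks)
    done = []
    rounds = 0
    while pending and rounds < max_rounds:
        rounds += 1
        with ThreadPoolExecutor(max_workers=4) as pool:
            # pool.map preserves input order, so results line up with pending.
            results = list(pool.map(lambda t: run_agent(t, rounds), pending))
        done += [t for t, ok in zip(pending, results) if ok]
        pending = [t for t, ok in zip(pending, results) if not ok]
    return done, pending, rounds


done, pending, rounds = run_until_converged(["parser", "hard-runtime", "cli"])
print(done, pending, rounds)
```

In this toy run the two easy subtasks finish in the first round and the hard one finishes in the second. A real system would replace `run_agent` with an actual agent call and use something like a test suite as the convergence check.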
DeepSWE: A Benchmark for Long-Horizon Software Engineering — Datacurve introduces a benchmark spanning 91 repositories in five languages, designed to be contamination-free and reflect real-world complexity. DeepSWE separates coding agents more sharply than SWE-Bench Pro, where most models’ scores cluster together. If you’re evaluating coding agents, this is the new bar.
The Cursor Developer Habits Report — Models now use more context to understand codebases, which reduces costs as input and cache-read tokens are cheaper than output tokens. The context-driven approach improves code calibration and increases diff survival rates. The key insight: better context utilization is a cost reduction strategy, not just a quality one.
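A rough way to see the cost argument is to price two requests by token type. The Python sketch below uses made-up per-million-token prices and made-up token counts. Neither comes from the Cursor report or from any vendor's price list, and the assumption that richer context leads to less output is part of the scenario, not a measured result.

```python
# Illustrative prices in USD per million tokens. These are assumptions
# for the example, not any vendor's actual rates.
PRICE_PER_MTOK = {"input": 3.00, "cache_read": 0.30, "output": 15.00}


def request_cost(input_tokens: int, cache_read_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for one request, summed across token types."""
    total = (
        input_tokens * PRICE_PER_MTOK["input"]
        + cache_read_tokens * PRICE_PER_MTOK["cache_read"]
        + output_tokens * PRICE_PER_MTOK["output"]
    )
    return total / 1_000_000


# Scenario A (hypothetical): little context, so the model writes more output.
low_context = request_cost(5_000, 0, 12_000)

# Scenario B (hypothetical): far more context, mostly cache reads, less output.
high_context = request_cost(10_000, 80_000, 4_000)

print(f"low context:  ${low_context:.3f}")   # $0.195
print(f"high context: ${high_context:.3f}")  # $0.114
```

Under these assumed numbers, scenario B reads 18 times as many tokens as scenario A yet costs less, because output tokens dominate the bill.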
Sakana Labs: Training Without End-to-End Backpropagation — Sakana Labs breaks neural networks into blocks and trains them independently by treating the forward pass like a diffusion model denoising a signal. This slashes the memory needed to train deep models. If this scales, it could change who can afford to train frontier models.
Evaluating Multi-Agent Systems at Scale — OpenAI outlines practical approaches for evaluating multi-agent systems at production scale. Covers task design, metric selection, and the gap between benchmark performance and real-world reliability. Essential reading as agent deployments move from demos to production.
NVIDIA γ-World: Multi-Agent World Models — NVIDIA introduces a generative world model that supports independently controllable, permutation-symmetric agents. Designed for simulating multi-agent environments with physically realistic interactions.
🧬 Model Releases
Opus 4.8 — Anthropic’s latest flagship with benchmark improvements, adjustable effort controls, dynamic workflows in Claude Code, and a cheaper fast mode. The effort controls let users trade reasoning depth for speed and cost.
ElevenLabs Music v2 — ElevenLabs ships a music generation model capable of switching genres mid-track while maintaining vocal and compositional coherence. The genre-switching capability is the headline feature — most music models break coherence on style transitions.
Apex: Specialized Model for React Native — A React Native coding model trained to build apps by analyzing architecture decisions, fixing framework-specific issues, and reasoning about constraints. Built by Callstack for the React Native ecosystem specifically.
Tools & Product Updates
Introducing Grok Build CLI — xAI launches a new coding agent and CLI in beta for SuperGrok and X Premium Plus subscribers. Supports plan mode reviews, integrates with user conventions, and offers headless mode with specialized subagents for parallel automation.
Secure MCP Tunnel — OpenAI releases a tunnel client that enables connecting private MCP servers to OpenAI products without exposing them to the internet. Uses outbound HTTPS paths for request handling while maintaining server privacy. Enterprise-friendly networking for private MCP infrastructure.
Quick Bits
GPT-5.6 Leaks: Coming in June — Leaks suggest OpenAI’s next model focuses on stronger multi-step reasoning, better agentic workflows, and improved frontend generation.
China Expands Travel Curbs to Top AI Talent — China restricted overseas travel for top AI professionals at private firms, including startup founders, researchers, and executives.
ByteDance Designing Own AI Chips — ByteDance approached external partners to co-design a new chip for its AI infrastructure — part of the broader trend of major AI players building custom silicon.
Mistral Exploring Own Chips — Mistral AI plans to design custom chips to control infrastructure and lower deployment costs as it expands compute capacity.
Claude Mythos Solves Erdős Problem — Mythos’ own solution was slightly worse than OpenAI’s, but Mythos reportedly also found OpenAI’s solution, with a simpler proof.
Anthropic AI Fluency Scorecard — Anthropic plans to introduce a scorecard in Claude that evaluates user interaction skills across 11 behavioral indicators.
OpenAI Frontier Governance Framework — OpenAI published a framework describing how its safety and security practices align with emerging regulations — covering risk management, model reporting, and incident response.
Harvey Legal Agent Benchmark — Harvey baselined frontier models on its Legal Agent Benchmark under an “all-pass” standard. Claude Opus 4.7 led at 7.1% — a reminder of how far agents are from reliable legal work.
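One reading of an "all-pass" standard is that a task only counts if every one of its criteria passes. That reading is an assumption here, not Harvey's published scoring method. Under that assumption, the Python sketch below shows why all-pass scores can sit far below per-criterion pass rates. The result data is invented for illustration.

```python
def all_pass_rate(results):
    """Share of tasks where every criterion passed (assumed 'all-pass' rule)."""
    return sum(all(task) for task in results) / len(results)


def mean_criterion_rate(results):
    """Share of individual criteria passed, pooled across all tasks."""
    checks = [c for task in results for c in task]
    return sum(checks) / len(checks)


# Hypothetical results: one list of pass/fail criteria per task.
results = [
    [True, True, True, True],
    [True, True, True, False],
    [True, False, True, True],
    [True, True, False, True],
]

print(all_pass_rate(results))        # 0.25
print(mean_criterion_rate(results))  # 0.8125
```

In this made-up data the agent passes about 81% of individual criteria but only 25% of tasks, since a single miss fails the whole task.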
Deep Dives
Anthropic and OpenAI Have Found Product-Market Fit — Both companies are aggressively pricing their APIs because they’ve found product-market fit with coding and general-purpose agent products. Enterprise customers spending $200+ per month per user cover costs far better than $10-20/month consumer subscriptions do. This piece reframes the AI pricing narrative from “race to zero” to “race to $200/seat.”
Notes on Pope Leo XIV’s Encyclical on AI — The Pope’s document covers the environmental impact of AI, risks of algorithmic decision-making, and how the technology amplifies the power of those with resources. The writing style is surprisingly approachable. Worth reading regardless of your religious background — it’s one of the more grounded ethical frameworks to emerge from a non-technical institution.
How Far Behind Are Open Models? — Open models are generally four to six months behind the best closed models on public benchmarks. The gap was smallest around the time of DeepSeek R1 — it has been growing since. Important context for anyone betting on open-weight models catching up.
Measuring LLMs’ Ability to Develop Exploits — Anthropic’s red team shows that Mythos Preview is capable of developing working exploits. The security implications are significant — this is why Anthropic is running Mythos as a restricted model through vulnerability discovery programs rather than making it broadly available.
That’s a wrap for this week.
If this was useful, the best thing you can do is share it with someone who’d get value from it too. Forward this email, drop it in your team’s Slack, or just send it to that one friend who’s always asking “what happened in AI this week?” Every share helps us keep this going.
Hit reply if I missed something — I read everything. See you next week. Stay curious.


