$45B in compute. $3B in ARR. One hire that broke X.
SpaceX sells compute. Cursor outsells Spotify. Karpathy picks a side.
This Week’s Big Stories
Must-Read
Anthropic to Pay SpaceX Nearly $45 Billion for Computing Deal — Anthropic commits roughly $1.25B per month to SpaceX for three years of compute — the single largest AI infrastructure deal ever signed. Anthropic gets access to over 300 MW of capacity and 220,000+ NVIDIA GPUs across SpaceX’s Colossus data centers, including next-gen GB200 hardware. This is a deliberate diversification play: Anthropic already runs on AWS (Trainium) and GCP (TPUs), and now adds a third compute provider that isn’t a hyperscaler. The agreement runs through May 2029 with either side able to exit on 90 days’ notice. Anthropic also expressed interest in developing multi-gigawatt compute capacity in space with SpaceX.
Karpathy Joins Anthropic — Andrej Karpathy — founding member of OpenAI, former Tesla AI lead, and creator of some of the most-watched deep learning courses — joins Anthropic’s pretraining team. This is the biggest AI talent move of the year. Karpathy left OpenAI a second time in early 2024, and his return to frontier R&D signals that Anthropic’s research trajectory was compelling enough to pull him back into a full-time lab role. Combined with the SpaceX compute deal, Anthropic is stacking infrastructure and talent at an unusual pace.
TL;DR (30-sec)
Anthropic signs $45B compute deal with SpaceX — $1.25B/month for three years, the largest AI infrastructure contract ever
Karpathy joins Anthropic’s pretraining team — biggest AI talent move of 2026
Cursor hits $3B ARR with 3,000+ enterprise customers; SpaceX holds a $60B acquisition option exercisable when Cursor goes public
Anthropic and Microsoft in talks for Maia chip deal backed by $5B investment — diversifying compute beyond AWS and GCP
Gemini 3.5 Flash launches at Google I/O — designed for agentic workflows and long-horizon tasks
OpenAI moves toward IPO filing, potentially as early as September
Anthropic acquires Stainless, the SDK platform used by OpenAI, Google, and Cloudflare
Anthropic on pace for first profitable quarter — revenue set to reach $10.9B in Q2
OpenAI Q1 revenue hit $5.7B, maintaining a roughly $1B lead over Anthropic
Manus weighs raising $1B to unwind its Meta takeover
Cursor Hits $3 Billion Annual Sales Rate — Cursor crossed $3B ARR with over 3,000 enterprise customers. The number that matters more: SpaceX holds a $60B acquisition option exercisable when Cursor goes public. That prices a developer tooling company at 20x revenue — a multiple that only makes sense if you believe AI-assisted coding becomes the default, not the exception. Cursor’s cloud agent environments (shipped last week) and Composer 2.5 (shipped this week) are both bets on that future.
Anthropic, Microsoft in Talks for AI Chip Deal After $5B Investment — Microsoft is in talks to supply its custom Maia AI chips to Anthropic, backed by a $5B investment. This gives Anthropic a fourth compute substrate alongside AWS Trainium, GCP TPUs, and NVIDIA GPUs — and gives Microsoft a customer for Maia outside its own Azure workloads. The deal would mark the first time a frontier lab runs on custom silicon from a hyperscaler it doesn’t primarily host on.
Gemini 3.5 Flash — Google launched Gemini 3.5 Flash at I/O 2026, purpose-built for agentic workflows and long-horizon tasks. The model is optimized for tool-calling reliability and multi-step reasoning across extended contexts. Google simultaneously announced that its APIs now process 3.2 quadrillion tokens monthly — a 7x increase year-over-year — with 8.5 million developers building on Gemini APIs.
Quick question — which compute provider are you most interested in tracking: the hyperscalers (AWS, GCP, Azure), the chip-makers (NVIDIA, Cerebras), or the new entrants (SpaceX, custom silicon)? Hit reply — helps me calibrate the depth on infrastructure coverage.
OpenAI Reportedly Moves Toward IPO — OpenAI is preparing to file for an IPO, potentially as early as September. The timing follows the dismissal of Musk’s lawsuit and a strong Q1 ($5.7B revenue). An OpenAI IPO would be the most scrutinized tech listing since Meta — especially given the open question of whether current pricing is sustainable at scale.
Anthropic Acquires SDK Startup Stainless — Anthropic acquired Stainless, the SDK generation platform that powers the official SDKs for OpenAI, Google, Cloudflare, and dozens of other API-first companies. This means Anthropic now controls the tooling layer that its direct competitors depend on for developer experience. Stainless auto-generates type-safe, idiomatic SDKs from OpenAPI specs — it’s how most developers interact with these APIs in practice.
Anthropic on Pace for First Profitable Quarter — Anthropic revenue is projected to hit $10.9B in Q2 — enough for their first profitable quarter. Revenue is now growing faster than Google or Facebook grew at the same stage pre-IPO. Combined with the SpaceX compute deal and Stainless acquisition, Anthropic is executing a full-stack strategy: own the models, own the SDKs, diversify the compute, and get profitable before the IPO window opens.
Build Tips & Engineering
Lessons Learned from Building Cloud Agents — Cursor shares hard-won lessons from shipping cloud agent environments. Covers agent state management, failure recovery, and the architecture decisions that make persistent cloud agents viable. Practical reading for anyone building long-running agentic workflows.
How Claude Code Works in Large Codebases — Anthropic details how Claude Code navigates large codebases — context management, file discovery, and the strategies that keep it effective at scale. Useful whether you’re using Claude Code directly or designing similar agent architectures.
Using Claude Code: The Unreasonable Effectiveness of HTML — A counterintuitive finding from the Claude Code team: asking the agent to produce HTML artifacts during development — even for non-web projects — dramatically improves output quality. HTML gives the model a concrete rendering target that constrains its output in productive ways.
Notes on Pretraining Parallelisms and Failed Training Runs — Dwarkesh Patel digs into why pretraining runs fail, covering expert routing, token allocation, and the parallelization approaches (data parallel, FSDP) that can go wrong. Good context for anyone thinking about training infrastructure at scale.
Lighthouse Attention: 17x Faster at 512K Context — Nous Research open-sources a hierarchical attention mechanism that achieves 17x speedup over standard attention at 512K context and 1.4-1.7x end-to-end pretraining speedup. It works by scanning compressed summaries to select key segments, then processing them with FlashAttention. Tested on 530M params / 50B tokens with no quality loss.
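To make the select-then-attend idea concrete, here is a minimal single-query sketch in plain Python. It is not Nous Research's implementation. Mean-pooling each segment into a summary, the `segment_size` and `top_k` parameters, and every function name are assumptions for illustration, and ordinary softmax attention stands in for FlashAttention.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    # Assumed compression step: average the keys in a segment into one summary.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    # Plain scaled dot-product attention for one query (stand-in for FlashAttention).
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def select_segments(query, keys, segment_size, top_k):
    # Score the query against each segment summary and keep the top_k segment starts.
    starts = range(0, len(keys), segment_size)
    scored = [(dot(query, mean_vector(keys[s:s + segment_size])), s) for s in starts]
    best = sorted(scored, reverse=True)[:top_k]
    return sorted(s for _, s in best)


def hierarchical_attend(query, keys, values, segment_size, top_k):
    # Attend only over the tokens inside the selected segments.
    chosen = select_segments(query, keys, segment_size, top_k)
    idx = [i for s in chosen for i in range(s, min(s + segment_size, len(keys)))]
    out = attend(query, [keys[i] for i in idx], [values[i] for i in idx])
    return out, chosen


keys = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
values = [[float(i)] for i in range(8)]
output, chosen = hierarchical_attend([0.0, 1.0], keys, values, segment_size=4, top_k=1)
```

With `top_k` set to the total number of segments, the sketch reduces to full attention. The savings come from keeping `top_k * segment_size` far below the sequence length.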
Agent Evaluation: A Detailed Guide — Comprehensive walkthrough of how to evaluate AI agents — covering task design, metric selection, and the gap between benchmark performance and real-world reliability. Essential reading as agent deployments move from demos to production.
🧬 Model Releases
Gemini 3.5 Flash — Google’s latest model optimized for agentic workflows, long-horizon tasks, and tool-calling reliability. Released at I/O 2026.
Stable Audio 3.0 — Stability AI ships the next generation of their audio model with improved music generation quality, longer output durations, and better prompt adherence.
Qwen3.7: The Agent Frontier — Alibaba releases Qwen3.7-Max with a 1M-token context window and agent-first design. Headline demo: a 35-hour autonomous kernel optimization run with 1,158 tool calls. Priced at $2.50/$7.50 per million tokens — roughly 6x cheaper than Claude Opus on input.
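For a rough sense of what those prices mean for a workload, here is a small cost sketch. It assumes the quoted $2.50/$7.50 is input/output pricing in that order, and the workload size is hypothetical. The comparison price is only what the "roughly 6x" claim implies, not a quoted rate.

```python
def request_cost_usd(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Cost in USD of a workload, given per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_mtok
    output_cost = (output_tokens / 1_000_000) * output_price_per_mtok
    return input_cost + output_cost


# Prices from the item above; the input/output ordering is an assumption.
QWEN_INPUT_PRICE = 2.50
QWEN_OUTPUT_PRICE = 7.50

# "Roughly 6x cheaper on input" implies a comparison input price near this value.
implied_comparison_input_price = QWEN_INPUT_PRICE * 6

# Hypothetical agent workload: 10M input tokens, 2M output tokens.
workload_cost = request_cost_usd(10_000_000, 2_000_000, QWEN_INPUT_PRICE, QWEN_OUTPUT_PRICE)
print(f"Workload cost: ${workload_cost:.2f}")  # Workload cost: $40.00
```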
Cursor Composer 2.5 — Major update to Cursor’s multi-file editing engine with improved context awareness, better diff application, and tighter integration with the new cloud agent environments.
NVIDIA LongLive 1.0 — NVIDIA open-sources a real-time long video generation model. Designed for coherent long-form video synthesis beyond the typical few-second clips.
HRM-Text 1B — A 1B-parameter text model trained on 8 H100s for approximately $800. Proof that meaningful language models can be built on modest hardware budgets.
Tools & Product Updates
OpenAI Guaranteed Capacity — OpenAI launches reserved compute capacity for enterprise customers — guaranteed API throughput with SLA-backed latency commitments. Designed for production workloads that can’t tolerate rate limits.
Gemini Extended Thinking Levels — Google rolls out configurable thinking depth in the Gemini app plus third-party integrations — letting users dial reasoning effort up or down depending on the task.
ChatGPT Personal Finance — OpenAI adds personal finance features to ChatGPT — budgeting, spending analysis, and financial planning tools directly in the chat interface.
Codex Computer Use — OpenAI previews desktop computer use for Codex — the coding agent will soon be able to control other applications beyond the terminal, expanding from code-only to full desktop automation.
Google Agent Executor — Google Cloud launches Agent Executor, a distributed runtime for AI agents. Handles agent orchestration, state management, and fault tolerance at scale — the infrastructure layer for running production agents on GCP.
Introducing the Ettin Reranker Family — Six new open-source rerankers (17M to 1B params) built on ModernBERT with 8K-token context, Apache 2.0 licensed. Full training data and recipe included. Immediate drop-in for RAG pipelines.
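If "drop-in for RAG" is unfamiliar, the reranking stage usually sits between retrieval and generation, as sketched below. This assumes the reranker is used as a pairwise (query, document) scorer. The `rerank` helper and the toy `token_overlap_score` are hypothetical stand-ins, not the Ettin API. In practice, `score_pair` would call the actual model.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_pair: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Score each retrieved candidate against the query and keep the best top_n."""
    scored = [(doc, score_pair(query, doc)) for doc in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


def token_overlap_score(query: str, doc: str) -> float:
    """Toy stand-in scorer: fraction of query tokens that appear in the document."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


# Candidates as they might come back from a first-stage retriever.
candidates = [
    "how to bake bread at home",
    "reranker models for retrieval pipelines",
    "retrieval pipelines overview",
]
top = rerank("reranker for retrieval pipelines", candidates, token_overlap_score, top_n=2)
```

Because the scorer is passed in as a plain callable, swapping the toy function for a real model does not change the rest of the pipeline.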
Oz: Multi-Harness Control Plane for Cloud Agents — Warp ships an orchestration layer for managing multiple cloud agents simultaneously — routing tasks, handling failures, and maintaining state across agent instances.
xAI Launches Skills for Grok — xAI adds a Skills system to Grok, allowing users to create reusable instruction sets that persist across conversations.
Quick Bits
Microsoft Cancels Claude Code Licenses — Microsoft shifts internal developers from Claude Code to GitHub Copilot CLI. Likely a financial play rather than a technical one — but it signals the competitive lines hardening.
Anthropic’s Consulting Venture Makes First Acquisition — Anthropic’s consulting arm acquires Fractional AI — expanding from pure model provider to implementation partner.
Jury Dismisses All Claims in Musk v. OpenAI — The lawsuit that hung over OpenAI for over two years is done. All claims dismissed. Clears the path for the IPO filing.
Manus Weighs Raising $1B to Unwind Meta Takeover — Manus explores a massive raise to buy back its independence from Meta — a rare reverse-acquisition move in AI.
OpenAI Q1 Revenue: $5.7B — OpenAI maintained a roughly $1B revenue lead over Anthropic in Q1. Both growing fast, but the gap is narrowing.
NVIDIA Vera CPU Arrives — NVIDIA ships its first CPU designed specifically for AI agent workloads at frontier labs.
Alibaba Unveils Zhenwu M890 AI Chip — Alibaba’s latest custom AI chip handles both training and inference — part of the broader trend of frontier labs building their own silicon.
Cerebras Runs Kimi K2.6 at ~1,000 Tokens/sec — Cerebras serves the trillion-parameter open-weight model at 981 tokens/sec — 6.7x faster than the next-fastest GPU cloud and 23x faster than the median provider.
OpenAI Quietly Bought Weights.gg — OpenAI acquired voice-cloning startup Weights.gg and folded the team into its audio/voice division.
Advancing Content Provenance — OpenAI joins C2PA and adds Google’s SynthID watermarks — dual-layer provenance for AI-generated images across ChatGPT and Codex.
Deep Dives
AI Solves a Longstanding Geometry Conjecture — AI systems have now cracked an 80-year-old Erdős problem in combinatorial geometry. This isn’t a benchmark win — it’s a genuine mathematical contribution that human researchers hadn’t been able to close. Worth the read for the methodology alone.
What Political Censorship Looks Like Inside an LLM’s Weights — A detailed forensic analysis of how political censorship is implemented inside Qwen3.5-9B’s weights. The author traces specific knowledge suppression and refusal patterns to identifiable parameter clusters. A sobering look at how model-level content restrictions work in practice.
AI’s Plummeting Prices Are a Software Story, Not a Hardware One — The common narrative is that AI prices drop because chips get cheaper. This analysis argues the real driver is software efficiency — distillation, quantization, inference optimization — and that hardware costs are declining much more slowly than API prices suggest. Important framing for anyone modeling AI economics.
Portability Is a Myth: Why AI Stacks Will Never Be Hardware-Agnostic — A contrarian argument that the dream of portable AI workloads across hardware providers is fundamentally flawed. Each chip architecture demands different optimization strategies, and the performance gaps between optimized and portable code are too large to ignore. Relevant reading alongside the Anthropic-SpaceX and Microsoft-Maia deals this week.
Tokenomics: The 62.5-Minute Rule for Claude’s Cache — Detailed breakdown of Anthropic’s caching economics — when caching pays for itself, the break-even thresholds, and how to structure API calls to maximize cache hits. Practical cost optimization for anyone running Claude at scale.
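The break-even logic can be sketched in a few lines. This does not reproduce the article's 62.5-minute figure. It only shows the general shape of the calculation, using price multipliers that are assumptions to check against Anthropic's current pricing: cache writes at 1.25x the base input price for a 5-minute cache, 2x for a 1-hour cache, and cache reads at 0.1x. Costs are expressed relative to one uncached pass over the shared prefix.

```python
import math


def cached_cost(reads, write_multiplier, read_multiplier):
    """Relative cost of one cache write followed by `reads` cache hits."""
    return write_multiplier + reads * read_multiplier


def uncached_cost(reads):
    """Relative cost of sending the same prefix uncached (1 + reads) times."""
    return 1.0 + reads


def min_reads_to_break_even(write_multiplier, read_multiplier):
    """Smallest number of cache hits for which caching is strictly cheaper.

    Solves write + n * read < 1 + n for n.
    """
    if not 0 <= read_multiplier < 1:
        raise ValueError("read_multiplier must be in [0, 1)")
    threshold = (write_multiplier - 1.0) / (1.0 - read_multiplier)
    return max(0, math.floor(threshold) + 1)


# Assumed multipliers; verify against the current pricing page before relying on them.
five_minute_reads = min_reads_to_break_even(write_multiplier=1.25, read_multiplier=0.1)
one_hour_reads = min_reads_to_break_even(write_multiplier=2.0, read_multiplier=0.1)
```

Under those assumed multipliers, a 5-minute cache pays for itself on the first hit, while a 1-hour cache needs two hits before it beats sending the prefix uncached.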
That’s a wrap for this week.
If this was useful, the best thing you can do is share it with someone who’d get value from it too. Forward this email, drop it in your team’s Slack, or just send it to that one friend who’s always asking “what happened in AI this week?” Every share helps us keep this going.
Hit reply if I missed something — I read everything. See you next week. Stay curious.

