Cursor → xAI, Claude Patches Code, Anthropic Eyes $900B

Plus: Claude connects to Adobe and Blender, OpenAI lands on AWS, and missed targets put the IPO in doubt

May 04, 2026

Hi Everyone

If you’re new here, every week I read through the noise so you don’t have to. The best stories in AI, scored and ranked, with notes on what actually matters for engineers building with these tools. If someone forwarded this to you and you want in, subscribe below. And if you’ve been reading for a while — thank you. It means a lot.

Let’s get into it.

The week’s big story isn’t a model release — it’s consolidation. Cursor, the most operationally successful AI coding tool ever built, sold to xAI for $60 billion to escape negative-margin dependency on the very model labs it was helping compete. Meanwhile Anthropic’s valuation jumped from $350B to a reported $900B+ in the span of a single fundraise. OpenAI ended Azure exclusivity, landed on AWS, and missed its own targets anyway. The model layer is becoming infrastructure. The app layer is folding back into it.

TL;DR (30-sec)

Cursor agreed to a $60B acquisition by SpaceX/xAI — the most operationally successful AI coding tool needed compute relief from its -23% gross margins
Anthropic is closing a ~$50B round that could push its valuation past $900B, with revenue nearing a $40B run rate
Claude Security entered public beta — Opus 4.7 autonomously scans and patches codebases for Enterprise customers
OpenAI ended Azure exclusivity; AWS customers now access OpenAI models via Bedrock Managed Agents
OpenAI missed its own revenue and user targets, raising questions about its Q4 2026 IPO timeline
Google committed up to $40B in Anthropic — $5B confirmed, the rest performance-pegged
Grok 4.3 launched with a better intelligence-to-cost ratio than Grok 4.20 0309 v2
Seven model releases shipped this week: Mistral Medium 3.5, Granite 4.1, Nemotron 3 Nano Omni, Laguna XS.2/M.1, MiMo-V2.5-Pro, GLM-5V-Turbo
China halted Meta’s $2B Manus acquisition; Microsoft’s OpenAI agreement now runs to 2032 with the AGI clause removed
Cursor Security Review launched in beta — always-on PR scanner and codebase vulnerability agent, same week as Claude Security
Claude Code v2.1.126 ships xhigh effort for Opus 4.7, Auto mode without a flag, and fixes subagent MCP inheritance
B200 GPU spot prices surged 114% in six weeks as GPT-5.5 demand hit the market

📰 This Week’s Big Stories

Must-Read

Cursor’s War Chest, xAI’s Redemption — Cursor is the most operationally successful software company of the AI era. Its founders looked at the path to $100B and decided they weren’t willing to underwrite it. They sold to xAI for $60B. The deal gives xAI an application surface to put in front of public market investors before the SpaceX IPO, and it gives Cursor a sponsor with compute and a non-competing model lab. What does it mean when a company doing $2.7B in annualized revenue has gross margins of -23%? It means power users consume more model capacity than the margins can absorb — and Colossus is the fix. (Analysis: Cursor’s $60 Billion Escape Hatch)
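To make the margin figure concrete, here is a back-of-the-envelope sketch. It assumes the $2.7B and -23% figures cover the same annualized period and that "gross margin" carries its standard definition, (revenue - cost of revenue) / revenue. The function name is illustrative.

```python
def cost_of_revenue(revenue, gross_margin):
    """Invert gross_margin = (revenue - cost) / revenue to recover cost."""
    return revenue * (1 - gross_margin)


revenue_b = 2.7        # annualized revenue in $B, from the item above
gross_margin = -0.23   # gross margin, from the item above

cost_b = cost_of_revenue(revenue_b, gross_margin)
gross_loss_b = cost_b - revenue_b

print(f"cost of revenue: ${cost_b:.2f}B, gross loss: ${gross_loss_b:.2f}B")
```

Under those assumptions the company spends about $1.23 to serve every $1.00 of revenue, before any operating expenses.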

Claude Security Is Now in Public Beta — Claude Security, available now to Claude Enterprise customers, uses Opus 4.7 to identify and patch software vulnerabilities. Integrated into tools used by Microsoft Security and Palo Alto Networks, it enables continuous code scanning without custom API integration. Feedback from hundreds of organizations shaped its capabilities before public launch. Autonomous security research was a theoretical use case eighteen months ago. It’s now a product.

Anthropic Nears $900B Valuation Round — Anthropic reportedly moved to close a ~$50B round at a valuation of $900B or higher, driven by strong investor demand and revenue rapidly approaching a $40B run rate. The round is expected to close within two weeks. This comes days after Google confirmed it would invest up to $40B: the lower tranche ($5B) is confirmed, with the remainder pegged to compute scaling targets. The money is earmarked for closing the gap between compute demand and supply.

An Interview with OpenAI CEO Sam Altman and AWS CEO Matt Garman About Bedrock Managed Agents — Azure exclusivity was actively damaging Microsoft’s investment in OpenAI, so Microsoft gave it up. The amended agreement drops the exclusivity, adds multi-cloud support, and caps revenue-sharing through 2030. OpenAI also released Microsoft from the AGI clause — their agreement now runs to 2032 regardless of whether AGI is achieved. AWS customers will access OpenAI models and the new Bedrock Managed Agents powered by OpenAI in the next few weeks.

OpenAI Misses Key Revenue, User Targets in High-Stakes Sprint Toward IPO — OpenAI missed its own targets for new users and revenue, raising concern among company leaders about whether it can support its massive data center spending. The CFO is reportedly excluded from financial discussions about server procurement and has raised concerns about funding future compute contracts. Board directors have been questioning CEO Sam Altman’s push for more compute despite the business slowdown. The Q4 2026 IPO timeline now looks unlikely. (Deeper take)

Google Will Invest as Much as $40 Billion in Anthropic — Google will invest between $10B and $40B in Anthropic, depending on whether Anthropic meets certain performance targets. Anthropic recently received a $5B investment from Amazon on a similar performance structure. These investments collectively value Anthropic at $350B — the floor before this week’s $900B round closes. The funds are earmarked for compute to close the gap between training and inference demand.

💻 Build Tips & Engineering

KV Cache Locality: The Hidden Variable in Your LLM Serving Cost — The same GPUs, serving the same model, handling the same traffic can produce measurably different throughput and latency depending on which GPU gets which request. “Balanced” and “efficient” are not the same thing when every request carries thousands of tokens that might already be cached somewhere in the cluster. Covers the cost of recomputation, how to measure it, and what changes when load balancers understand token locality.
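A toy simulation shows why "balanced" and "efficient" can diverge. This is an illustration only, not the article's method: the `Worker` class, the 4-token block size, and the unbounded per-worker cache are all assumptions chosen to keep the sketch short. It counts how many tokens must be recomputed when the same traffic is routed round-robin versus to the worker that already holds the longest matching prefix.

```python
from itertools import cycle

BLOCK = 4  # tokens per cache block (illustrative value)


def prefix_keys(tokens):
    """Cumulative prefixes of the request, one key per full block."""
    return [tuple(tokens[:end]) for end in range(BLOCK, len(tokens) + 1, BLOCK)]


class Worker:
    """Hypothetical GPU worker with an unbounded prefix cache."""

    def __init__(self):
        self.cached = set()
        self.served = 0

    def cached_blocks(self, tokens):
        """Number of leading blocks of this request already in cache."""
        hits = 0
        for key in prefix_keys(tokens):
            if key not in self.cached:
                break
            hits += 1
        return hits

    def serve(self, tokens):
        """Serve a request; return how many tokens had to be recomputed."""
        recomputed = len(tokens) - self.cached_blocks(tokens) * BLOCK
        self.cached.update(prefix_keys(tokens))
        self.served += 1
        return recomputed


def round_robin(requests, n_workers=2):
    workers = [Worker() for _ in range(n_workers)]
    turn = cycle(workers)
    return sum(next(turn).serve(r) for r in requests)


def prefix_aware(requests, n_workers=2):
    workers = [Worker() for _ in range(n_workers)]
    total = 0
    for r in requests:
        # Prefer the longest cached prefix; break ties by lowest load.
        best = max(workers, key=lambda w: (w.cached_blocks(r), -w.served))
        total += best.serve(r)
    return total


# Two shared 8-token system prompts, each followed by a unique 4-token suffix.
PROMPT_A, PROMPT_B = list(range(8)), list(range(100, 108))
REQUESTS = [p + [1000 + i] * 4
            for i, p in enumerate([PROMPT_A, PROMPT_A, PROMPT_B, PROMPT_B] * 2)]

print("round-robin recomputed tokens: ", round_robin(REQUESTS))
print("prefix-aware recomputed tokens:", prefix_aware(REQUESTS))
```

On this input, round-robin recomputes 64 tokens and prefix-aware routing recomputes 48, because round-robin prefills each shared prompt on both workers. A production router would also need cache eviction and a cap on per-worker load, which this sketch leaves out.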

What You’re Actually Writing When You Write a SKILL.md — A breakdown of the runtime behavior behind skill files and why understanding it changes every decision at the surface level. Required reading if you’re building on Claude Code or any agent framework that uses structured capability prompts.

AI Evals Are Becoming the New Compute Bottleneck — Evaluation costs have escalated to become a compute bottleneck comparable to or exceeding training costs, with some runs costing tens of thousands of dollars. The field has uneven cost distributions across models and tasks, which creates access inequality and hinders external validation. The case for standardized eval documentation and data reuse.

Lessons on Building MCP Servers — Models don’t plan — they look at the conversation, scan the tool list, and grab whatever looks most probable. Making effective MCP chains means making sure the server makes the next call blindingly obvious at every step. Practical framework for building MCP toolchains where servers do the heavy lifting.
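One way to make the next call obvious is to have every tool result name the suggested follow-up. The sketch below is a hypothetical illustration in plain Python, not the MCP SDK and not the article's code: the tool names (`find_order`, `get_customer`, `list_orders`), the `next` field, and the in-memory data are all assumptions.

```python
# Hypothetical in-memory data; a real server would query a backend.
ORDERS = {"A-17": {"customer_id": "c-42", "status": "shipped"}}


def find_order(order_id: str) -> dict:
    """Tool handler whose result always carries an explicit next step."""
    order = ORDERS.get(order_id)
    if order is None:
        # On failure, point the model at the recovery path instead of
        # leaving it to guess which tool to try.
        return {
            "ok": False,
            "error": f"No order with id {order_id!r}.",
            "next": {
                "tool": "list_orders",
                "args": {},
                "why": "List valid order ids, then retry find_order.",
            },
        }
    # On success, pre-fill the arguments of the most likely follow-up call.
    return {
        "ok": True,
        "order": order,
        "next": {
            "tool": "get_customer",
            "args": {"customer_id": order["customer_id"]},
            "why": "Fetch the customer who placed this order.",
        },
    }


print(find_order("A-17")["next"]["tool"])
print(find_order("missing")["next"]["tool"])
```

The design choice is that the server, not the model, carries the workflow: the model only has to follow the most probable continuation, which the result itself spells out.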

Codex Symphony Agent Orchestration — OpenAI’s Symphony is an open-source spec that turns issue trackers into control planes for coding agents, reducing context switching and increasing PR throughput by up to 5x.

Compressing AI Vectors to 2–4 Bits Per Number Without Losing Accuracy — TurboQuant compresses each coordinate in large tables of high-dimensional vectors to 2–4 bits with provably near-optimal distortion, no memory overhead for scale factors, and no training or calibration. Between four and six orders of magnitude faster than alternatives at 4-bit indexing.

Scaling Long-Horizon Coding Agents — A framework from Meta for test-time scaling in coding agents that summarizes past rollouts into structured representations, enabling better selection and reuse to improve benchmark performance.

Stash — Persistent Memory for Agents — Open-source, self-hosted tool that gives agents persistent memory: remember, recall, consolidate, and learn across sessions. Works with any MCP-compatible agent.

🧬 Model Releases

xAI Launches Grok 4.3 — Grok 4.3 improves on cost-per-intelligence relative to Grok 4.20 0309 v2. Higher Intelligence Index score, lower benchmark suite cost. Performs strongly on instruction following and agentic customer support tasks. (May 1)

GLM-5V-Turbo — Integrates multimodal perception directly into reasoning and tool use, improving performance on coding, visual tasks, and agent workflows across heterogeneous inputs.

Granite 4.1 LLMs: How They’re Built — IBM’s Granite 4.1 uses a dense decoder-only architecture at 3B, 8B, and 30B parameters, trained on 15 trillion tokens across five pre-training phases. The 8B model matches the previous 32B Mixture-of-Experts model via a multi-stage RL pipeline. Designed for enterprise reliability and cost efficiency.

Mistral Medium 3.5 Powers Remote Vibe Agents — A 128B dense model powering Vibe remote agents for long asynchronous coding tasks in the cloud, starting from the CLI or Le Chat. Runs efficiently on four GPUs. Le Chat’s new Work mode uses this model for complex, multi-step task execution. Competitive SWE-Bench Verified scores.

Introducing NVIDIA Nemotron 3 Nano Omni — Multimodal model for document, audio, and video analysis. Hybrid Mamba-Transformer architecture with specialized vision and audio encoders. Best-in-class on MMLongBench-Doc and VoiceBench. Targets long-context multimodal enterprise workflows.

Laguna XS.2 and M.1: A Deeper Dive — Poolside’s agentic coding models built for long-horizon work. M.1 is the foundation; XS.2 is the smaller but still capable variant. Both free via Poolside’s API and OpenRouter for a limited time. XS.2 weights released under Apache 2.0.

MiMo-V2.5-Pro — Xiaomi open-sourced a 1.02T-parameter Mixture-of-Experts model showing significant advances in agentic tasks, software engineering, and long-horizon coherence.

🛠️ Tools & Product Updates

Claude Connectors for Creative Tools — Anthropic introduced connectors integrating Claude with Adobe, Blender, and Autodesk, enabling natural-language workflows, automation, and cross-tool pipelines for design, 3D, and audio production. (Apr 29)

Anthropic Launches Memory in Claude Agents for Enterprise — Claude Managed Agents can now remember and use information from prior sessions and accumulate knowledge over time without manual prompt updates. Memory is filesystem-based — data stored as files, exportable, managed via APIs, and scoped with permissions. Available in public beta to all Managed Agents users.

Perplexity Expands Enterprise AI Workflows — Perplexity added workflows, enterprise data connectors, and integrations including Teams and Excel to its AI system, targeting structured business tasks and continuous automation.

Cursor Security Review Is Now in Beta — Cursor launched always-on security agents for Teams and Enterprise: a Security Reviewer that checks every PR for vulnerabilities, auth regressions, data-handling risks, agent auto-approvals, and prompt injection — leaving inline comments at the exact diff location with severity and remediation — and a Vulnerability Scanner that runs scheduled codebase scans and pushes findings to Slack. Same week Claude Security went public beta. The security agent category opened up fast.

Continually Improving the Cursor Agent Harness — Cursor’s approach to improving model performance inside the agent harness: vision-driven development, A/B testing, and dynamic context adaptation. Worth reading to understand how the most-used AI coding tool thinks about the gap between raw model capability and practical performance.

Claude Code v2.1.126: xhigh Effort + Auto Mode — Opus 4.7 gets a new xhigh effort level — a tier between high and max — accessible via /effort (now opens an interactive slider with arrow-key navigation when called without arguments) and the model picker. Auto mode for Max subscribers no longer requires --enable-auto-mode. Bug fixes that matter in production: subagents now correctly inherit MCP tools from dynamically-injected servers, /stats was undercounting tokens by excluding subagent usage, and the autocompact thrash loop is fixed.

Anthropic Tests New Bugcrawl Tool for Claude Code — Bug Crawl lets users scan repositories for bugs and get fix suggestions. Still in testing but signals where Claude Code’s utility is heading: not just generation, but ongoing codebase maintenance. (Apr 27)

ElevenLabs Launches Agent Templates — ElevenAgents now ships pre-built frameworks for quick deployment of AI agents, reducing the scaffolding overhead for teams building voice-driven workflows.

Google Prepares Credits System for Gemini — Google is working on a monthly credit allowance model for Gemini, with top-up capability. Makes heavy-workload budgeting more predictable. OpenAI, Anthropic, and Notion already use a similar consumption model.

DeepMind ProEval for GenAI Evaluation — Open-source framework that reduces generative AI evaluation costs while identifying failure modes using surrogate models and transfer learning across benchmarks.

⚡ Quick Bits

Reverse Engineering With AI Unearths High-Severity GitHub Bug — GitHub disclosed CVE-2026-3854, a high-severity remote code execution vulnerability in GitHub Enterprise Server, discovered via AI-assisted reverse engineering of git push options.

OpenAI Codex System Prompt Includes Explicit Directive to “Never Talk About Goblins” — OpenAI appears to be fighting a new failure mode in GPT-5.1 where the model focuses on goblins in unrelated conversations. The Codex system prompt now explicitly prohibits it. The team traced the quirk to reward signals from personality tuning — small incentives shaping model behavior in unexpected directions.

China Blocks Meta Manus Acquisition — China halted Meta’s $2B acquisition of agentic AI startup Manus, ordering the deal unwound after a months-long regulatory probe. Complicates Meta’s push into AI agents.

DeepSeek Cuts V4-Pro Prices by 75% — DeepSeek slashed V4-Pro pricing 75% and cut input cache hit costs by 90%, maintaining competitive pressure on US frontier labs in a tense geopolitical backdrop.

GPU Spot Prices Surge 114% in Six Weeks — NVIDIA’s B200 rental price hit $4.95/hour, up 114%, driven by GPT-5.5 demand. The supply crunch is expected to worsen next year.

Former Google DeepMind Researcher’s AI Startup Raises Record $1.1B Seed — David Silver, former DeepMind RL lead, raised a record seed round for a superintelligence-focused startup. Backed by Nvidia and Google.

Google Grants DoD Broad AI Access — Google agreed to provide the US Department of Defense access to its AI on classified networks for broad lawful uses, after Anthropic declined a similar arrangement.

Ex-Twitter CEO’s AI Startup Raises at $2B Valuation — Parallel Web Systems, founded by Parag Agrawal, raised funds for a platform enabling AI agents to search the web.

AI Has Made Memory Chips One of the World’s Most Profitable Products — Samsung reported Q1 net profit equivalent to $30B+, blowing away its prior quarterly record. The supply crunch is expected to grow worse next year.

Cohere and Aleph Alpha Join Forces — Cohere and Aleph Alpha are partnering to create a sovereign, enterprise-grade AI alternative combining Canadian scale with German research expertise.

Meta’s Loss Is Thinking Machines’ Gain — Thinking Machines Lab has been hiring more from Meta than any other single employer. It just signed a multibillion-dollar cloud deal with Google for access to GB300 chips.

📌 Deep Dives

The World Can’t Keep Up With AI Labs — Coding agents are the first AI product people are paying for at volume and regularly. But compute demand is now growing faster than anyone can build it out. The industry isn’t ready for the agent boom. The most obvious moves for AI labs: cut limits, raise prices. Worth reading for the infrastructure constraint framing.

The Moat or the Commons — American AI was financed on the bet that frontier models would be the next great monopoly business. That assumption is breaking as open-weight models commoditize the capability that the private moat was supposed to protect. The gap between open and closed frontiers is closing. Countries and companies face the same question: subsidize the private moat or the open commons?

Opus 4.7’s New Tokenizer: What It Actually Costs — Anthropic improved Opus 4.7’s input understanding with a new tokenizer. The per-token price hasn’t changed, but the same inputs now cost 12–27% more than they did with previous models, except for short prompts, which became more cost-efficient. If you’re running Opus at scale, measure your token counts before assuming cost parity.
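For a quick budget check before you measure real counts, the 12–27% range can be applied to your existing token volume. This is a rough sketch under stated assumptions: the function name, the token volume, and the $5-per-million price are placeholders, not Anthropic's pricing, and the range is taken straight from the item above. It does not model the short-prompt exception.

```python
def input_cost_range(old_tokens, price_per_mtok, low=0.12, high=0.27):
    """Return (baseline, low_estimate, high_estimate) input cost.

    old_tokens is the volume measured with the previous tokenizer;
    the per-million-token price is assumed unchanged.
    """
    baseline = old_tokens / 1_000_000 * price_per_mtok
    return baseline, baseline * (1 + low), baseline * (1 + high)


# Placeholder inputs: 1B input tokens per month at a hypothetical $5 per million.
baseline, low_est, high_est = input_cost_range(1_000_000_000, 5.0)
print(f"baseline ${baseline:,.0f}, projected ${low_est:,.0f} to ${high_est:,.0f}")
```

Treat the output as a planning range only; re-tokenizing a sample of your actual prompts will give a tighter number.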

Your AI Might Be Lying to Your Boss — It’s very hard to measure AI’s contribution to a codebase. The best use cases often produce no code at all — just better decisions. Lines of code isn’t a good proxy for code quality, and it’s hard to separate engineer work from model output. The bias appears to be toward reporting higher AI percentages, which is useful for AI companies but can skew incentives.

We're now on Instagram too. Follow @ampli.ai for the visual recap every week.

That’s a wrap for this week.

If this was useful, the best thing you can do is share it with someone who’d get value from it too. Forward this email, drop it in your team’s Slack, or just send it to that one friend who’s always asking “what happened in AI this week?” Every share helps us keep this going, and I’m genuinely grateful for each one.

Hit reply if I missed something.

See you next week. Stay curious.

Attia

Ampli AI
