Cursor 3: A Unified Workspace for Agent-Driven Development
Cursor's redesigned interface focuses on multi-repo workflows and coordination between local and cloud agents for agent-driven development.
Five different articles today on model coordination and integration architecture. That's not a coincidence.
We've moved past "can AI generate code" to "can AI systems work together reliably." Cursor's multi-repo focus, Weaviate's persistent context experiments, the MCP integration benchmark showing a 25-point accuracy gap: same problem, different angles. Orchestration is where the real work is now.
The MCP benchmark hits hardest. Five integration approaches, most failing silently on real prompts. This mirrors what I see in my own workflows: the demo runs flawlessly, then production breaks in unpredictable ways. Tool calling still feels like duct tape and prayer.
Simon's scan-for-secrets piece couldn't come at a better time. More people shipping faster means more surface area for accidentally leaked API keys. If you're not scanning before every commit, you're playing with fire.
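To make "scanning before every commit" concrete, here is a minimal sketch of a pattern-based secret scan. It is not the tool from Simon's piece and makes no claim about how that tool works; the names `PATTERNS`, `scan_text`, and `scan_directory` are hypothetical, and the two regexes are illustrative assumptions that a real scanner would replace with a much larger rule set.

```python
import re
from pathlib import Path

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
PATTERNS = {
    # AWS access key IDs conventionally start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A quoted value of 16+ characters assigned to a name like api_key, secret, or token.
    "generic_api_key": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token)\b\s*[:=]\s*['\"]([A-Za-z0-9_\-]{16,})['\"]"
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


def scan_directory(root):
    """Scan every readable UTF-8 text file under root; return {path: hits}."""
    findings = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        hits = scan_text(text)
        if hits:
            findings[str(path)] = hits
    return findings
```

One way to use something like this is from a pre-commit hook that calls `scan_directory(".")` and exits non-zero when the result is non-empty, so a flagged file blocks the commit until someone looks at it. Expect false positives and misses from patterns this simple.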
Qwen3.6-Plus delivers improved multimodal reasoning and a transformative 'vibe coding' experience as a foundation for native multimodal agents.
Google DeepMind's new generation of open models, optimized for reasoning and agent workflows and released under the Apache 2.0 license.
Weaviate's Engram shows how persistent context built on vector search improves agent workflows while revealing challenges in reliable tool usage.
Open models like GLM-5 and MiniMax M2.7 now match frontier models for agent tasks at lower cost and latency, making real-world workflows viable.
A new Python scanning tool helps developers check their directories for accidentally exposed API keys and secrets before publishing.
Simon Willison is redesigning his LLM library to handle new vendor features like server-side tool execution across hundreds of different models.
A benchmark of five MCP integration approaches shows a 25-point accuracy gap, with most approaches failing silently on real-world prompts.
Faster-than-expected progress in agentic coding over the past few months is moving AI timeline predictions forward, with automated AI R&D coming soon.
Apollo Research argues that funders should heavily incentivize AI safety work that spends compute budgets of $100M+ on automated AI labor for safety research.
Mental models for accepting that AI progress is remarkably regular and rapid, despite many people's skepticism about exponential improvement data.
Historical analysis of how military technology changes, from riflemen to drones, fundamentally alter the relationship between citizens and state power.
The efficient market hypothesis is wrong, there are no adults in charge, and low-hanging fruit is everywhere if you pay attention to your comparative advantage.
Historical analysis of why philosophers and intellectuals failed to anticipate the Industrial Revolution and what that means for predicting transformative change.
First in a series explaining mean field theory as an approach to understanding and interpreting AI model internals, combining explanation with original research.
Engineered yeast now produces realistic vegan egg whites that foam and set like traditional eggs, showing how bioengineering transforms food production.