
Your AI agents need a terminal, not just a vector database
DCI lets AI agents grep, trace, and verify data directly — no embeddings needed. Researchers say it's faster and cheaper than vector search for complex tasks.
Ben Dickson
A 0.12% parameter add-on gives AI agents the working memory RAG can't
A new memory module lets AI agents retain context across long interactions — adding just 0.12% of model parameters with no architectural changes.
Ben Dickson
How RecursiveMAS speeds up multi-agent inference by 2.4x and reduces token usage by 75%
A new framework from UIUC and Stanford lets AI agents share embeddings instead of text — slashing token usage and cutting training costs by more than half.
Ben Dickson
Frontier AI models don't just delete document content — they rewrite it, and the errors are nearly impossible to catch
Weaker AI models delete document content when they fail. Frontier models rewrite it — subtly and silently, making errors far harder for human reviewers to catch.
Ben Dickson
How Sakana trained a 7B model to orchestrate GPT, Claude and Gemini LLMs
A 7B model that learns to route tasks across GPT-5, Claude Sonnet 4, and Gemini 2.5 Pro — using RL instead of hardcoded workflows.
Ben Dickson
Alibaba's Metis agent cuts redundant AI tool calls from 98% to 2% — and gets more accurate doing it
Most AI agents call tools even when they don't need to. Alibaba's new RL framework teaches them when to stop — and accuracy goes up, not down.
Ben Dickson
How to build custom reasoning agents with a fraction of the compute
The technique, called Reinforcement Learning with Verifiable Rewards with Self-Distillation (RLSD), combines the reliable performance tracking of reinforcement learning with the granular feedback of self-distillation.
Ben Dickson
New AI framework autonomously optimizes training data, architectures and algorithms — outperforming human baselines
A self-improving AI framework beat human-designed baselines across data, architecture, and reinforcement learning — with no manual intervention.
Ben Dickson
Are you paying an AI ‘swarm tax’? Why single agents often beat complex systems
New Stanford research challenges the assumption that more agents means better AI — and introduces a simple compute-budget fix that changes the calculus.
Ben Dickson
Train-to-Test scaling explained: How to optimize your end-to-end AI compute budget for inference
AI reasoning does not necessarily require spending huge amounts on frontier models. Instead, smaller models can yield stronger performance on complex tasks while keeping per-query inference costs manageable.
Ben Dickson
Meta researchers introduce 'hyperagents' to unlock self-improving AI for non-coding tasks
Creating self-improving AI systems is an important step toward deploying agents in dynamic environments, especially enterprise production environments, where tasks are not always predictable or consistent.

New framework lets AI agents rewrite their own skills without retraining the underlying model
A multi-university research team built a framework that teaches agents to fix their own failure modes — no human intervention required.
Ben Dickson