Multi-agent debate needs a boring data-cleaning cop
Multi-agent debate hurt generation by up to 15.5 points in data cleaning, but a grounded critic rescued detection and repair.
Lead writer and data engineer
Lars Cornelissen is an enterprise data engineer and the lead writer at Data Today. He builds cloud data platforms at Alliander on Snowflake, Matillion, Python, and AWS, and runs the data and AI studio Datastudy. He turns primary AI and data sources into charts and operator-grade conclusions for the people who build and ship with the technology.
Story tips, dataset suggestions, and corrections are welcome at [email protected]. More on how Data Today works.
Google's AI opt-out rules in the UK give publishers control over AI Search, but the hard choice is whether to trade traffic for leverage.
Lars Cornelissen
The unit distance proof gives OpenAI an AI math win: n^1.014 unit pairs, with humans still doing the verification work.
Lars Cornelissen
Vibe coding dominates AI discourse, but most professional developers still avoid it. The 2025 survey data shows why: the output is almost right too often.
Lars Cornelissen
Model capability is improving about 15.5 ECI a year and the rate rose after early 2024. The expected plateau never arrived, which complicates every roadmap built around one.
Lars Cornelissen
Frontier AI spending has shifted from clever algorithms to power and concrete. A single gigawatt data center now costs about 30 billion dollars to build.
Lars Cornelissen
AI agents are not yet mainstream, and the developers who use them report personal speed but little team-wide gain. The fix is teaching agents when to stop.
Lars Cornelissen
The performance gap between the best AI models and the rest is collapsing. Aggregate leaderboard scores now hide more than they reveal about real strengths.
Lars Cornelissen
AI chip performance per dollar improves about 37 percent a year, yet each new flagship costs more upfront. The GB300 delivers 24 times the value of a P100 at nine times the price.
Lars Cornelissen
Organisational AI use jumped to 78 percent in 2024, yet demand for human judgment rose alongside it. The productivity paradox is back in a new form.
Lars Cornelissen
The United States holds about 75 percent of global GPU cluster performance. That concentration shapes pricing, latency, and policy for everyone building elsewhere.
Lars Cornelissen
Training compute for frontier models has grown about 5 times a year since 2020. The scatter looks clean, but every data point sits to the left of the real question.
Lars Cornelissen
Frontier training costs climb about 3.5 times a year while algorithms get 3 times more efficient. The two trends are racing, and the gap decides who can still compete.
Lars Cornelissen
Building a one-person data studio used to be impossible on the economics alone. The cost of inference at a fixed quality level fell roughly 280-fold in two years, and that changed the math.
Lars Cornelissen
Gartner expects over 40 percent of agentic AI projects to be cancelled by end of 2027. The drop-off from demo to production is where the budgets quietly die.
Lars Cornelissen
The largest AI data center already rivals 700,000 H100 chips, and a 5-million-equivalent campus is due by 2027. Power and concrete, not chips, set the new pace.
Lars Cornelissen
Open-weight models went from a budget compromise to a default choice. On some benchmarks the gap to the best closed models shrank from 8 points to 1.7 in a year.
Lars Cornelissen
Frontier AI capability reaches consumer hardware in about eight months. The shrinking gap turns today's hosted-only features into tomorrow's on-device default.
Lars Cornelissen
LLM context windows have expanded about 30 times a year since 2023, from a few thousand tokens to over a million. The change quietly rewrites how RAG systems should be built.
Lars Cornelissen
The installed stock of AI chips is growing 3.4 times a year, doubling every seven months. The capacity question has quietly shifted from chips to power and buildings.
Lars Cornelissen
LLM inference costs fell by a factor of between 9 and 900 a year, depending on the task. The cheapest gains landed on easy work, while frontier reasoning barely moved.
Lars Cornelissen