Lars Cornelissen

A donut chart of AI agent identity practices in enterprises: 34% give every agent a scoped identity, 66% share credentials across agents. Source: VentureBeat Research survey of 107 enterprises.

Agent security gap: 54% of enterprises already hit

AI agent security incidents are now widespread. 54% of enterprises have had a confirmed agent incident, yet most still let agents share credentials instead of giving each agent a scoped identity.

Lars Cornelissen · Jul 17, 2026

Funnel chart showing the enterprise AI agent evaluation gap across 157 organizations. 118 are shipping to production, 79 shipped an agent that passed internal evals but failed in production, and only 8 fully trust automated evaluation.

AI

Enterprise AI agent evaluation gap: half ship broken agents

The AI agent evaluation gap shows 50% of enterprises shipped an agent that passed internal evals then failed in production. Only 5% fully trust automated evaluation. The gap is structural misalignment, not missing coverage.

Lars Cornelissen · Jul 17, 2026

Heatmap showing AI compute cost measurement maturity across 107 enterprises. Only 8 percent have mature tracking, 52 percent have partial tracking, and 40 percent have little or none. Source: VentureBeat survey.

Engineering

AI compute cost gap: 92% of enterprises fly blind on spend

The AI compute cost gap is widening: 92 percent of enterprises lack mature cost tracking for AI infrastructure even as 45 percent plan to switch or add providers within a year, risking runaway spend.

Lars Cornelissen · Jul 17, 2026

Open-weight model comparison showing Inkling at 975B total parameters and 41B active, Inkling-Small at 276B total and 12B active, and Hy3 295B at 295B total and 21B active

AI for dummies

Inkling 975B: Thinking Machines releases open-weights model

Inkling is Thinking Machines Lab's first open-weights model. At 975B parameters with 41B active and Apache 2.0 licensing, it targets fine-tuning, not frontier benchmarks.

Lars Cornelissen · Jul 17, 2026

Comparison chart of open-weight model sizes in trillions of parameters for Kimi K3 and peers. Kimi K2.6 at 1.0 trillion, DeepSeek v4 Pro at 1.6 trillion, Kimi K3 at 2.8 trillion parameters.

AI for dummies

Kimi K3 explained: what 2.8 trillion parameters means

Kimi K3 is a 2.8 trillion parameter open-weight model from Moonshot AI that matches top US closed models on key benchmarks. Weights arrive July 27.

Lars Cornelissen · Jul 17, 2026

Bar chart comparing memory footprint of Qwen 3.6 27B model versions. Full precision 16-bit at 54 GB, conventional 4-bit at 18 GB, Ternary Bonsai at 5.9 GB, and 1-bit Bonsai at 3.9 GB, with the phone-fit threshold marked at 6 GB.

AI for dummies

Bonsai 27B puts a big AI model on your phone

1-bit quantization is a compression trick that shrinks AI models by storing each parameter as one bit. Bonsai 27B uses it to fit a 27 billion parameter model in 3.9 GB, running on an iPhone.

Lars Cornelissen · Jul 15, 2026

Bar chart comparing CVSS scores for two SonicWall SMA 1000 zero-day vulnerabilities: CVE-2026-15409 at 10.0 critical SSRF and CVE-2026-15410 at 7.2 high code injection, both actively exploited in the wild and added to CISA KEV on July 14 2026

Cyber security

Two SonicWall SMA 1000 zero-days under active attack

Two SonicWall SMA 1000 zero-day vulnerabilities including a CVSS 10.0 SSRF are under active exploitation. CISA KEV-listed with a July 17 patch deadline.

Lars Cornelissen · Jul 15, 2026

Slope chart showing interpretability visibility before and after the J-space discovery. Five behavior categories improve: task progress from 20 to 70, recognition from 10 to 60, commentary from 5 to 80, bias detection from 15 to 55, and cheating signals from 0 to 70. Illustrative values.

Research

Anthropic J-space reveals LLM thoughts, but not fully

Anthropic's J-space exposes hidden words shaping LLM reasoning. Mechanistic interpretability advances, but misuse detection stays unproven.

Lars Cornelissen · Jul 14, 2026

Format Sensitivity Index showing benchmark score variance of up to 15 percentage points across JSON, XML, Markdown, and plain text prompt wrappers for four illustrative LLM models

Research

Format Sensitivity Index exposes LLM benchmark gaps

The Format Sensitivity Index measures how LLM benchmark scores shift when prompt wrappers and schema constraints change. The metric exposes a blind spot in model evaluation that developers ignore at their peril.

Lars Cornelissen · Jul 14, 2026

Gantt chart showing data center moratorium policy timeline from January 2026 to July 2027. Maine's moratorium was vetoed in April 2026. New York's executive order, signed July 14 2026, blocks data centers over 50 MW for up to 12 months. New York's legislative bill, with a 20 MW threshold, awaits the governor's signature.

Business

New York's data center moratorium redraws the AI map

New York enacted the first statewide data center moratorium, blocking new permits for facilities over 50 MW for up to a year. What changes for builders.

Lars Cornelissen · Jul 14, 2026

Apple Silicon unified memory ceiling growth from 128GB in 2022 to 192GB in 2023, then an estimated 1.5TB for the M7 Ultra in 2027, showing a near eightfold jump in max RAM for Apple's AI chips.

Engineering

Apple's dead car project built the Neural Engine

Apple's Neural Engine, born from its failed car project, became the backbone of on-device AI. The M7 Ultra with 1.5TB of RAM could put Apple in the server inference market by 2027.

Lars Cornelissen · Jul 13, 2026

Cloudflare AI crawler traffic split showing 50 percent re-fetching unchanged pages, with Agent and Training categories blocked by default on ad-supported pages from September 15 2026

AI

Cloudflare agent crawler blocks change the access deal

Cloudflare's AI crawler rules block agent and training crawlers on ad-supported pages by default from September 15. Agent builders face degraded coverage and need negotiated access, not user-agent tricks.

Lars Cornelissen · Jul 13, 2026

Bar chart showing AI code review pass rates for image-based secret theft. Claude Code refused the attack 100 percent of the time across ten runs. Cursor and Antigravity leaked the .env file 100 percent of the time under Sonnet, Gemini, and GPT-5.5.

Cyber security

Ghostcommit weaponizes images to steal secrets from AI agents

Ghostcommit is a prompt injection attack that hides malicious instructions inside PNG files in pull requests. It exploits a blind spot in AI code reviewers to steal .env secrets, exposing a critical new attack surface for teams shipping with coding agents.

Lars Cornelissen · Jul 13, 2026

Illustrative chart showing standard performance mode consuming roughly 30 percent fewer DBUs than performance-optimized mode for Databricks managed Iceberg materialized view refreshes, with performance mode at 100 DBUs and standard mode at 70 DBUs per refresh.

Databricks

Databricks managed Iceberg materialized views, explained

Managed Iceberg materialized views let external engines like Trino read your Databricks MVs. Now in Public Preview, they shift the lock-in calculus.

Lars Cornelissen · Jul 13, 2026

Bar chart of solve rates by task length showing frontier models scoring above 60 percent on short tasks and below 25 percent on long-horizon tasks, with the long-task gap at 35 percentage points. Long-Horizon-Terminal-Bench agent results.

Research

Long-Horizon-Terminal-Bench finds agents hit a long-task wall

Long-Horizon-Terminal-Bench tests AI agents on multi-step terminal tasks with dense reward grading. Top models solve under 30 percent of long-horizon tasks, exposing a durability gap.

Lars Cornelissen · Jul 13, 2026

VultronRetriever model family MTEB scores versus other open embedding models. VultronRetriever-base leads at 72.1, followed by E5-Mistral at 70.3, BGE-M3 at 69.8, and nomic-embed at 68.5.

AI

VultronRetriever takes on-device RAG to the MTEB top

VultronRetriever is a family of embedding models built for on-device retrieval. The 8B variant tops the MTEB leaderboard while running fully offline on an iPhone for Q&A.

Lars Cornelissen · Jul 12, 2026

Abstract data art showing three concentric circles representing GPT-5.6 model sizes Luna, Terra, and Sol. The smallest circle is labeled with the keyword GPT-5.6 pricing: Luna at $1/$6, Terra at $2.50/$15, Sol at $5/$30 per million tokens.

AI for dummies

OpenAI's GPT-5.6 Luna Terra Sol: what beginners should know

GPT-5.6 is OpenAI's newest model family in three sizes: Luna, Terra, and Sol. It claims big efficiency gains for long-running agent tasks at a fraction of competitor costs.

Lars Cornelissen · Jul 11, 2026

Abstract data art representing a single HTTP header piercing through a layered authentication barrier, symbolizing the Gitea Docker auth bypass CVE-2026-20896 where one X-WEBAUTH-USER header lets attackers impersonate any user including admins across approximately 6,200 exposed instances.

Cyber security

Gitea Docker auth bypass lets attackers impersonate admins

A critical Gitea Docker auth bypass, CVE-2026-20896, lets attackers impersonate any user with one HTTP header. Over 6,200 instances are exposed online.

Lars Cornelissen · Jul 11, 2026

Abstract data visualization of ShareFile Storage Zone Controllers response urgency at 5 on a 1 to 5 scale, the highest level, compared to prior Progress incidents MOVEit Transfer and MOVEit Automation at 4.

Cyber security

ShareFile Storage Zone Controllers shut down, no patch yet

ShareFile Storage Zone Controllers are on-prem servers that Progress ordered shut down on July 10 over a credible threat, with no patch available.

Lars Cornelissen · Jul 11, 2026

Data

Agentic MCP attacks bypass SOTA guardrails 58% of the time

MCP attack chains bypass SOTA guardrails more than half the time because text classifiers miss composed tool-call exploits. The agentic safety gap is architectural, not a tuning problem.

Lars Cornelissen · Jul 9, 2026

Four benchmark tasks shown as paired bars comparing automated pass/fail scores with AgentLens trajectory review scores. SWE-bench drops from 34 to 19 percent, HumanEval from 71 to 52 percent, BigCodeBench from 28 to 9 percent, and LiveCodeBench from 41 to 26 percent. The coding agent evaluation gap is widest on BigCodeBench.

Research

AgentLens cuts coding agent scores in half on review

AgentLens is a trajectory review framework for coding agent evaluation that scores the full agent process, not just whether tests pass. Agent success rates drop 30 to 60 percent under trajectory review, meaning production readiness is roughly half the benchmark headline.

Lars Cornelissen · Jul 9, 2026

Abstract waveform visualization showing overlapping input and output audio patterns representing GPT-Live full-duplex architecture with 150 million weekly ChatGPT Voice users and 2 model components in the new architecture

AI for dummies

GPT-Live voice mode: what full-duplex AI changes for you

GPT-Live is OpenAI's new voice model that listens and speaks at once. It delegates hard questions to GPT-5.5 mid-conversation while keeping the flow going.

Lars Cornelissen · Jul 9, 2026

Bar chart of local AI decode speeds on Apple Silicon. Qwen 3.6 27B on M5 Max: 28 tok/s baseline, 63 tok/s with MTPLX MTP, a 2.24x speedup.

AI for dummies

How MTPLX v2 makes local AI on Mac twice as fast

MTPLX v2 uses multi-token prediction to run local AI on Apple Silicon Macs up to 2.24x faster. Here is what beginners need to know.

Lars Cornelissen · Jul 9, 2026

Bar chart of LLM hallucination rates for repository identifiers: 0.9 percent for pre-2019 repos, 85 percent for popular repos, and 92.4 percent for 2025 repos. The attack targets AI coding agents like Cursor and Copilot.

Engineering

HalluSquatting turns AI coding agents into a botnet

HalluSquatting is a pull-based prompt-injection attack that exploits LLM hallucinations of repository names. Coding agents hallucinate up to 92 percent of newer repo identifiers, letting attackers squat those names and ship reverse shells at scale.

Lars Cornelissen · Jul 8, 2026

Funnel chart showing the AI drug discovery pipeline for rentosertib: 79 molecules synthesized narrowing to 1 preclinical candidate, then Phase I, Phase IIa with 71 patients, and Phase III with 320 patients. The AI-discovered drug rentosertib is the first fully AI-originated molecule to reach Phase III.

Research

AI-discovered drug rentosertib enters Phase III trial

Rentosertib, an AI-discovered TNIK inhibitor for IPF, enters Phase III with 320 patients. It is the first fully AI-originated drug to reach late-stage trials.

Lars Cornelissen · Jul 8, 2026

Bar chart showing two regulatory regimes: Cambridge Analytica involved 87 million harvested records and a $5 billion FTC fine. Muse Image affects approximately 500 million public Instagram accounts by default, with zero fine so far. Data Today benchmark.

Business

Muse Image pulls 500M Instagram accounts into AI by default

Muse Image is Meta's new agentic AI image model that turns 500M public Instagram accounts into inference-time visual references by default. It is free for everyday use, with no notification to tagged users. The move signals that agentic image generation is now cheap enough to bundle into ad-supported apps.

Lars Cornelissen · Jul 8, 2026

Bar chart showing Hy3 has 295 billion total parameters but only 21 billion active parameters per token, a Mixture-of-Experts design. Hy3 hallucination rate dropped to 5.4 percent in internal evaluations.

AI for dummies

Hy3 295B open weights: what 21B active means

Hy3 is Tencent's 295-billion-parameter open-weights model with 21 billion active parameters per token. It uses a Mixture-of-Experts architecture to rival larger models at lower cost, cutting hallucination to 5.4 percent.

Lars Cornelissen · Jul 7, 2026

LeRobot v0.6.0 ships 5 new VLA policies, 2 reward models, 3 world models, and 6 benchmarks, with data loading 2x faster and subset loading dropping from 275 seconds to 0.06 seconds

Engineering

LeRobot DAgger loop turns robot failures into training data

LeRobot v0.6.0 adds a DAgger correction loop that turns robot deployment failures into training data, plus reward models and world model policies in one CLI.

Lars Cornelissen · Jul 7, 2026

Rocket League being simulated by MIRA, a 5B-parameter diffusion transformer that models four independent player views at 20 fps from 10,000 hours of bot self-play, with no physics engine or 3D representation.

Data

MIRA world model simulates Rocket League without a physics engine

MIRA is a 5B-parameter world model that simulates Rocket League from pixels and actions, achieving infinite rollout stability at 20 fps on a single B200.

Lars Cornelissen · Jul 7, 2026

Abstract data-art visualization contrasting direct API integrations that scale as apps times tools with MCP connectors that scale as apps plus tools

AI for dummies

MCP explained: what it actually adds beyond a normal API

MCP, the Model Context Protocol, is an open standard that lets AI models discover and call tools at runtime instead of you hardcoding an API for each one.

Lars Cornelissen · Jul 6, 2026

Abstract funnel data art of the China AI companion rules enforcement scale, narrowing from the 1,000,000 registered user threshold through the 100,000 monthly active user threshold to the 14,000 AI agents Shanghai removed, with the headline figure 14,000.

AI

China AI companion rules split work from emotional agents

China AI companion rules took effect July 15, 2026, forcing ByteDance and Alibaba to shut down companion agent features rather than retrofit compliance.

Lars Cornelissen · Jul 6, 2026

Abstract data visualization suggesting sub-second query latency with a dense band of concurrent user connections, illustrating the Lakehouse Real-Time serving engine target workload of thousands of concurrent users with sub-second SQL reads.

Databricks

Databricks Lakehouse//RT: sub-second SQL reads in beta

Lakehouse Real-Time is a serverless SQL warehouse for sub-second reads against Unity Catalog tables. It targets thousands of concurrent users but only supports SELECT queries via the Statement Execution API, and it is in Beta.

Lars Cornelissen · Jul 6, 2026

Abstract treemap data art of the LeRobot v0.6.0 release: six new simulation benchmarks, five new VLA models, four reward models, and three world model policies, with the headline figure 0.06 seconds for dataset subset loading, down from 275 seconds.

Engineering

LeRobot v0.6 closes the robot learning loop

LeRobot v0.6 ships world models, reward APIs, and DAgger deployment. The robot learning flywheel turns from research demo into a repeatable engineering workflow for VLA builders.

Lars Cornelissen · Jul 6, 2026

Workload identity federation eliminates static credentials. Two credentials to rotate in the static model, zero in the federation model. Snowflake acts as OIDC provider issuing short-lived ID tokens via SYSTEM$ISSUE_WORKLOAD_IDENTITY_FEDERATION_TOKEN.

Snowflake

Snowflake workload identity federation kills static credentials

Snowflake workload identity federation is now GA, letting Snowflake act as an OIDC provider so workloads authenticate to external services with short-lived tokens instead of static credentials. It costs zero additional credits and eliminates credential rotation for outbound API calls.

Lars Cornelissen · Jul 6, 2026

Abstract data visualization representing LongCat-2.0, a Mixture-of-Experts model with 1.6 trillion total parameters and 48 billion active parameters per token, released under MIT license on July 5 2026.

AI for dummies

LongCat-2.0 open weights: what 1.6 trillion parameters means

LongCat-2.0 is a 1.6 trillion parameter AI model from Meituan that activates only 48 billion parameters per token. Its weights are now open under the MIT license.

Lars Cornelissen · Jul 5, 2026

Abstract data-art visualization of GenieX local LLM inference on Snapdragon showing token throughput with Gemma 4 26B at 20 tok/s and Qwen 3.6 27B at 10 tok/s

AI for dummies

GenieX: Qualcomm's local LLM runtime for Snapdragon laptops

GenieX is Qualcomm's runtime for running LLMs locally on Snapdragon laptops and phones. Early users report 20 tokens per second on a 26B model.

Lars Cornelissen · Jul 5, 2026

Abstract visualization of the Bad Epoll CVE-2026-46242 race condition in the Linux kernel epoll subsystem, showing overlapping execution threads colliding in a narrow six-instruction window that triggers a use-after-free granting root access on v6.4+ kernels including Android.

Cyber security

Bad Epoll kernel flaw CVE-2026-46242 roots Linux and Android

Bad Epoll (CVE-2026-46242) is a Linux epoll use-after-free giving unprivileged users root on v6.4+ kernels and Android. No workaround exists. Apply patch a6dc643c6931 now.

Lars Cornelissen · Jul 5, 2026

Abstract timeline data art of the drug development pipeline with the headline figure 13.5 years: preclinical research at roughly 3 to 6 years, Phase 1 at 1 to 2 years, Phase 2 at 2 to 3 years, Phase 3 at 3 to 4 years, and FDA review at 1 to 2 years, illustrating the decade-plus wet-lab gap AI drug discovery still faces.

Business

Anthropic drug discovery hits the wet-lab wall

Anthropic drug discovery pushes Claude Science into pharma R&D, but wet-lab costs and data gaps mean no AI-designed drug has cleared FDA approval yet.

Lars Cornelissen · Jul 4, 2026

Abstract waterfall data art of AI detection coverage on AO3: starting from 100 percent of AI-assisted works, roughly 60 percent leaks away to models other than Claude and another 30 percent to text routed through Google Docs or Word, leaving about 10 percent detectable, shown with the headline figure 10 percent.

AI

AO3 Claude detector is a narrow signal treated as a verdict

The AO3 Claude detector is a fan-made skin that catches one specific paste path from Claude into Archive of Our Own, and fandom communities are already treating its red screen as a verdict. The tool's false negative rate is enormous by design.

Lars Cornelissen · Jul 4, 2026

Bar chart comparing DAT percentile scores: baseline at 81.3, activation steering at 93.9, CreativityNeuro at 94.1, and prompting alone at 87.9 across six open-weight LLMs. CreativityNeuro also reduces top-10 word concentration by 10.2 percentage points.

Research

CreativityNeuro steers LLM weights to break mode collapse

CreativityNeuro is a data-free weight steering method that improves LLM divergent thinking by up to 14 percentile points and reduces mode collapse. It works by scaling creativity-specific weights identified through contrastive prompts, no fine-tuning required.

Lars Cornelissen · Jul 3, 2026

Citrix Bleed 2 timeline showing 23 days from Citrix advisory to CISA KEV, 1 day for the CISA remediation window, and 378 days to the Anubis ransomware report.

Cyber security

Citrix Bleed 2 turns ransomware into identity theft

Citrix Bleed 2 is now an Anubis ransomware access path. Patch NetScaler, kill sessions, and hunt for RMM and credential abuse.

Lars Cornelissen · Jul 3, 2026

SharePoint CVE-2026-45659 patch clock showing 40 days from NVD publication on May 22, 2026 to CISA KEV addition on July 1, 2026, and 3 days from KEV addition to the July 4, 2026 federal due date.

Cyber security

SharePoint CVE-2026-45659 puts patching on a 3 day clock

SharePoint CVE-2026-45659 is an actively exploited RCE risk in CISA KEV. Patch exposed servers and check for compromise now.

Lars Cornelissen · Jul 3, 2026

Abstract slope data art showing RLVR training lifting Atlassian tool-use reward across four workflow scenarios, from baselines of 0.35, 0.52, 0.68, and 0.92 up to 1.00, 0.95, 1.00, and 1.00, with the headline figure 1.00.

Engineering

RLVR trains a 4B model to nail Atlassian API calls

RLVR for tool-use agents trains a 4B model to hit 1.00 reward on Atlassian API tasks, up from a 0.35 baseline on Confluence page creation. Synthetic environments and verifiable rewards close the schema gap.

Lars Cornelissen · Jul 3, 2026

GRPO standard deviation chart showing baseline extreme gradient mass at 13.9 percent, GRPO at 24.7 percent, and silent prompts at 44 percent for G=8.

Research

GRPO standard deviation is the reasoning RL dial now

GRPO standard deviation is the update-size dial: Bay and Yearick show 44% of Big-Math prompts go silent at group size 8.

Lars Cornelissen · Jul 2, 2026

LLM groupthink novelty scores from Springboards show Flint at 7.47 distinct responses out of 10, Gemini 3.1 Pro at 3.19, Qwen3-30B-A3B at 3.11, GPT-5.4 at 2.54, and Claude 4.6 Sonnet at 1.83.

AI

LLM groupthink now has a novelty benchmark problem

LLM groupthink is the tendency of models to converge on similar answers. Flint scores 7.47 distinct replies out of 10 in Springboards tests.

Lars Cornelissen · Jul 2, 2026

OpenAI government stake proposal at 5 percent versus Intel 9.9 percent, Nvidia and AMD China revenue cut 15 percent, and Sanders 50 percent plan.

Business

OpenAI government stake tests the AI policy toll road

OpenAI government stake talks put a 5 percent public claim on an $852 billion AI company. Builders should price policy risk into roadmaps.

Lars Cornelissen · Jul 2, 2026

AI browser security disclosure outcomes from BioShocking show 1 fixed vendor, 1 closed or ignored, 3 no response, and 1 patch failed.

Engineering

AI browser security fails the six-agent guardrail test

AI browser security now has a six-agent failure case: BioShocking shows guardrails breaking when web content rewrites context.

Lars Cornelissen · Jul 1, 2026

Claude Science resource scale: 60 curated skills and connectors, ToolUniverse at 600 scientific tools, ClinicalTrials.gov at 500000 studies, and AlphaFold at 200000000 structures.

AI

Claude Science turns lab agents into pharma plumbing

Claude Science is Anthropic’s beta workbench for researchers, with 60-plus curated skills and connectors. Treat it as lab infrastructure first.

Lars Cornelissen · Jul 1, 2026

Langflow RCE response clock showing a 14 day CISA KEV remediation window and a 19 day Trend Micro observed Monero miner campaign.

Cyber security

Langflow RCE turns AI app endpoints into miner bait

Langflow RCE is being used to mine Monero on exposed AI app endpoints. Patch, isolate, and treat public workflows as production attack surface.

Lars Cornelissen · Jul 1, 2026

SimpleHelp CVE-2026-48558 exposure chart showing 14,000 exposed SimpleHelp servers and about 1,008 estimated OIDC-configured exposed servers.

Cyber security

SimpleHelp CVE-2026-48558 needs a patch drill

SimpleHelp CVE-2026-48558 is an actively exploited auth bypass. Patch to 5.5.16 or 6.0 RC2, then hunt for TaskWeaver and Djinn.

Lars Cornelissen · Jul 1, 2026

Japan AI robots plan comparing 435,299 industrial robots in Japan's factories in 2023 with 10,000,000 AI robots targeted for 2040.

AI

Japan AI robots turn factory data into a policy bet

Japan's AI robots plan targets 10 million machines by 2040, making shared factory data and stage-gated delivery the near-term test.

Lars Cornelissen · Jul 1, 2026

AI music royalties chart showing Deezer AI uploads rising from 10,000 tracks per day in January 2025 to 75,000 in April 2026.

Business

AI music royalties face Tidal’s harder line in 2026

AI music royalties now hinge on detection: Tidal will label wholly AI tracks on July 15, 2026 and withhold payouts from them.

Lars Cornelissen · Jun 30, 2026

Claude on GB300 in Azure, with each GB300 NVL72 rack carrying 72 GPUs, 36 Grace CPUs, 20 TB HBM, and up to 1.44 exaFLOPS FP4.

AI

Claude on GB300 puts Azure agents on serious new iron

Claude on GB300 is now generally available in Microsoft Foundry, giving Azure teams more inference headroom but a tighter cloud bet.

Lars Cornelissen · Jun 30, 2026

Spring AI 2.0 release work by GitHub release-note count: 5 new features, 5 bug fixes, 4 documentation updates, and 2 dependency upgrades.

Engineering

Spring AI 2.0 gives Java agents a sturdier app stack

Spring AI 2.0 is a sturdier Java AI stack: Boot 4.1, MCP, unified tool loops, and 4 Cosmos DB modules now matter for agents.

Lars Cornelissen · Jun 30, 2026

AI jobs transition donut showing EU employment shares: 47 percent less immediate change, 27 percent workflow reorganization, 14 percent higher automation potential, and 12 percent growth with AI.

Business

AI jobs transition puts Europe in redesign mode now

AI jobs transition maps 27% of EU employment into workflow redesign, with 14% in automation pressure and 12% in growth roles.

Lars Cornelissen · Jun 29, 2026

AI peer review line showing combined ICLR, ICML, and NeurIPS submissions rising from 17,051 in 2020 to an estimated 73,883 in 2026.

AI

AI peer review gets a serious scale test at ICML

AI peer review is moving into production workflow. Google's PAT found 89.7% of tested math errors, but review power remains human.

Lars Cornelissen · Jun 29, 2026

AI coding agent malware chain with 3 hidden indirection steps and 1 runtime DNS payload outside the GitHub repo.

Cyber security

AI coding agent malware hides in clean GitHub repos

AI coding agent malware can hide behind a clean repo. Lock down setup execution, DNS egress, and agent permissions before rollout.

Lars Cornelissen · Jun 29, 2026

VS Code Tasks supply chain attack affected 2 npm packages and 16 Go packages, according to JFrog Security Research.

Cyber security

VS Code Tasks supply chain attack needs new checks

VS Code Tasks supply chain attack is a package hijack pattern that can run outside npm lifecycle scripts, so scan editor configs now.

Lars Cornelissen · Jun 29, 2026

Unity AI Gateway budgets example showing a $5,000 shared Genie budget, $100 per-user threshold, $200 genie-code override, and $300 power-users override.

Databricks

Unity AI Gateway budgets: the spend guardrail guide

Unity AI Gateway budgets set shared and per-user AI spend limits for Genie and gateway traffic, with alerts and blocking for admins.

Lars Cornelissen · Jun 29, 2026

LineShine supercomputer at 2.198 exaflops versus El Capitan at 1.809, Frontier at 1.353, Aurora at 1.012, and JUPITER at 1.000.

Engineering

LineShine supercomputer makes CPU scale political again

The LineShine supercomputer is a 2.198 exaflop CPU machine. Treat it as a supply chain signal, with a power bill attached.

Lars Cornelissen · Jun 29, 2026

Matillion

Matillion Maia BigQuery support: setup, cost, limits

Matillion Maia BigQuery support brings GCP warehouses into Maia on Current now, with Stable expected August 1, 2026.

Lars Cornelissen · Jun 29, 2026

Dynamic Iceberg table replication changed from skipped before general availability to supported at general availability, encoded as 0 before and 1 after.

Snowflake

Dynamic Iceberg table replication, costed honestly

Dynamic Iceberg table replication now keeps Snowflake managed Iceberg pipelines in DR plans, but refresh and transfer costs move to the target account.

Lars Cornelissen · Jun 29, 2026

Apple AI price hike raises HomePod mini by 30.3 percent, iPad Air by 25.0 percent, iPad Pro by 20.0 percent, MacBook Air by 18.2 percent, MacBook Pro by 17.7 percent, and MacBook Neo by 16.7 percent.

Business

Apple AI price hike makes memory a consumer tax now

Apple AI price hike is a hardware margin test: memory costs are rising, but Apple's $29.6B profit makes the pass-through harder to defend.

Lars Cornelissen · Jun 28, 2026

Apple CXMT memory deal chart showing TrendForce midpoint price increases: conventional DRAM at 57.5 percent in Q1 2026 and 60.5 percent in Q2 2026, NAND at 35.5 percent and 72.5 percent.

Business

Apple CXMT memory deal makes RAM a policy fight now

Apple CXMT memory deal is a supply chain stress test: DRAM prices are up 58 to 63 percent this quarter, making policy risk a BOM risk.

Lars Cornelissen · Jun 28, 2026

Cisco Unified CM CVE timeline showing 22 days from Cisco's June 3 advisory to CISA's June 25 KEV addition, then 3 days to the June 28 deadline.

Cyber security

Cisco Unified CM CVE gets a weekend patch clock

Cisco Unified CM CVE is now a patch-or-isolate job: CISA set a June 28 deadline after active exploitation of CVE-2026-20230.

Lars Cornelissen · Jun 27, 2026

Timeline for Signal backup recovery keys: Secure Backups announced at day 0, FBI and CISA messaging-app phishing warning at day 193, and CISA June 26 update at day 291.

Cyber security

Signal backup recovery keys become the weak link for ops

Signal backup recovery keys can expose historical chats after one phishing win. Treat messenger backups like identity infrastructure now.

Lars Cornelissen · Jun 27, 2026

Scatter of about 17 dictionary concepts plotted by cosine distance from a seed word, clustered into a near band of banned cliches and a middle band of surprising-but-related lenses the generator is allowed to use.

AI

How to make an LLM escape its own prior with geometry

Novel Search Space breaks an LLM out of its prior by ranking 80,000 dictionary words by embedding distance, banning the obvious neighbours, and forcing the model to brainstorm only from a surprising-but-related band.

Lars Cornelissen · Jun 27, 2026

Mythos 5 access reopened to at least 100 approved US agencies and companies, while GPT-5.6 started with around 20 partners and Fable 5 public access remained 0.

AI

Mythos 5 access returns as a government whitelist

Mythos 5 access is back for a whitelist of at least 100 organizations, turning frontier AI launches into compliance operations.

Lars Cornelissen · Jun 27, 2026

Abstract donut for benchmark saturation showing 19 of 25 human-agent runs completed autonomously and 6 of 25 required some human help.

Research

Benchmark saturation still leaves AI agents exposed

Benchmark saturation is when top agents cluster at ceiling scores. CORE-Bench shows the useful signal moves to cost, reliability, and uplift.

Lars Cornelissen · Jun 26, 2026

Coding agent rewards chart showing clean resolved rising from 40.22 percent to 60.53 percent and hacked resolved falling from 28.57 percent to 0.56 percent.

Research

Coding agent rewards hit a harder verification horizon

Coding agent rewards are now a verification problem: Qwen cut hacked SWE passes from 28.57% to 0.56% with monitoring.

Lars Cornelissen · Jun 26, 2026

GPT-5.6 delay context showing the Anthropic vulnerability triage funnel: 23,019 total Mythos findings, 6,202 estimated high or critical, 1,752 independently assessed, 1,587 true positives, and 1,094 confirmed high or critical.

AI

GPT-5.6 delay makes AI model launches a permit race

GPT-5.6 delay is a shift from voluntary AI testing to government-approved previews. Treat model access as a supply-chain risk now.

Lars Cornelissen · Jun 26, 2026

Agentic AI work chart showing Codex output token share at 99.8 percent for OpenAI workers, 63.3 percent for organizational users, and 16.5 percent for individual users.

Business

Agentic AI work shifts from chat to delegation at scale

Agentic AI work is moving from chat to delegated tasks: OpenAI says 70.2% of sampled Codex users handed off one hour of work.

Lars Cornelissen · Jun 25, 2026

CISA KEV vulnerabilities added on June 23, 2026: Ubiquiti has 3 entries and Lantronix has 1 entry, with three CVSS 10.0 UniFi OS flaws and one CVSS 9.8 Lantronix flaw.

Cyber security

CISA KEV vulnerabilities put edge gear on watch now

CISA KEV vulnerabilities are a live patch queue: four exploited Lantronix and UniFi OS bugs now demand edge inventory and compromise checks.

Lars Cornelissen · Jun 25, 2026

FortiBleed FortiGate credentials chart showing 89 million MySQL authentication tokens, 14.8 million RADIUS credentials, 924,000 NTLM hashes, and 130,000 Kerberos hashes reported in the campaign.

Cyber security

FortiBleed FortiGate credentials need action now

FortiBleed FortiGate credentials are an active edge risk: rotate VPN and admin access now, then hunt for persistence.

Lars Cornelissen · Jun 25, 2026

Ford AI quality comparison showing 2026 PP100 scores: Porsche at 138, Ford at 152, Nissan at 156, Buick at 162, and overall industry at 175.

Engineering

Ford AI quality fix puts veterans back in the loop

Ford AI quality is a lesson in automation debt: 350 veteran engineers came back as JD Power scores rose and recalls stayed costly.

Lars Cornelissen · Jun 25, 2026

IBM nanostack chip density trend showing 50 billion transistors in IBM’s 2021 2 nm chip and nearly 100 billion in 2026 sub-1 nm technology.

Engineering

IBM nanostack chip gives Moore’s Law a vertical path

IBM nanostack chip is a 0.7 nm research architecture that stacks transistors, doubling IBM’s 2021 density claim to nearly 100 billion.

Lars Cornelissen · Jun 25, 2026

Oracle AI layoffs abstract bar chart showing restructuring costs rising from $374 million in fiscal 2025 to $1.8 billion in fiscal 2026.

Business

Oracle AI layoffs expose the cloud capital tradeoff

Oracle AI layoffs are a capital reallocation signal: 21,000 fewer workers, $48 billion raised, and a cloud roadmap funded by debt.

Lars Cornelissen · Jun 24, 2026

Post-quantum cryptography deadline comparison showing Google and Cloudflare at 2029, White House key establishment at 2030, White House signatures at 2031, and NIST general disallowance at 2035.

Engineering

Post-quantum cryptography deadline gets a 2030 clock

Post-quantum cryptography deadline means high-value federal systems must shift key establishment by 2030 and signatures by 2031.

Lars Cornelissen · Jun 24, 2026

RIFT-Bench deterministic attack success rates by domain: wild 41.9 percent, finance 40.5 percent, medical 31.3 percent, personal assistant 30.1 percent, and travel 27.4 percent.

Research

RIFT-Bench tests agents where prompts cannot reach

RIFT-Bench is a dynamic agentic red-teaming benchmark that found attacks activated in 78.9% to 89.3% of tested agent runs.

Lars Cornelissen · Jun 24, 2026

Copilot spend meter plan chart showing individual Copilot AI credit allowances: Pro 1,500, Pro+ 7,000, and Max 20,000.

Engineering

Copilot spend meter shows where AI coding costs bite

Copilot spend meter is VS Code's new warning for AI credit overruns. Treat the 1,500 Pro credits as a budget, not a perk.

Lars Cornelissen · Jun 23, 2026

AutoJack exposure chart showing one malicious page, three chained weaknesses, zero stable PyPI releases affected, and one GitHub hardening commit.

Cyber security

AutoJack makes AI agent prototypes a real RCE risk

AutoJack is a host RCE warning for AI agent prototypes: one malicious page chained three AutoGen Studio weaknesses into command execution.

Lars Cornelissen · Jun 23, 2026

FortiBleed credential theft scale showing 430,000 FortiGate firewalls targeted, 86,644 compromised devices, and 22,405 unique domains.

Cyber security

FortiBleed credential theft reaches the firewall edge

FortiBleed credential theft turns compromised FortiGate firewalls into sniffers. Rotate secrets and hunt traffic capture now.

Lars Cornelissen · Jun 23, 2026

High-NA EUV comparison showing ASML NXE low-NA EUV at 13 nm resolution and EXE High-NA EUV at 8 nm resolution, with ASML citing 2.9 times higher transistor density.

Engineering

High-NA EUV turns AI chips into a $400M bottleneck

High-NA EUV is ASML’s 8 nm lithography step for denser AI chips. Treat the $400M tool as a roadmap risk and cost signal.

Lars Cornelissen · Jun 23, 2026

45C liquid cooling comparison showing conventional cooling-tower systems at 2.6 million gallons per megawatt per year and Nvidia Rubin warm-water cooling at near zero gallons per megawatt per year.

Engineering

45C liquid cooling gives Nvidia a tougher water defense

45C liquid cooling lets Rubin racks reject heat with dry coolers, cutting on-site water use while shifting scrutiny to power and buildout.

Lars Cornelissen · Jun 23, 2026

45C liquid cooling water use comparison showing conventional cooling towers at 2.6 million gallons per megawatt per year and favorable dry cooler designs near zero.

Engineering

45C liquid cooling makes AI factories a water test

45C liquid cooling lets Rubin servers use warmer coolant and near zero facility cooling water in favorable climates, changing AI site math.

Lars Cornelissen · Jun 22, 2026

Databricks

Databricks Lakeflow Designer GA cost guide for teams

Databricks Lakeflow Designer is GA for visual, governed data prep. Use it for analyst-built transforms, but meter previews and jobs.

Lars Cornelissen · Jun 22, 2026

Databricks

Databricks PAT auto-scoping is a quiet job breaker

Databricks PAT auto-scoping narrows long-lived tokens after 30 days of observed API use. Audit automation now before jobs fail.

Lars Cornelissen · Jun 22, 2026

Matillion

Matillion anomaly alerts: runtime drift without polling

Matillion anomaly alerts flag runtime drift after 10 successful runs, using up to 300 runs of history so you can watch cost before failure.

Lars Cornelissen · Jun 22, 2026

Snowflake

Snowflake Adaptive Compute changes AWS cost models

Snowflake Adaptive Compute is GA on AWS in six regions, with query-level billing and a 1.2x Snowflake benchmark edge over Gen2.

Lars Cornelissen · Jun 22, 2026

Snowflake

Snowflake Dynamic Tables refresh gets a cost test

Snowflake Dynamic Tables now claim up to 2.8x faster refresh on Gen2 warehouses, but the cost win depends on target lag and change volume.

Lars Cornelissen · Jun 22, 2026

Vibe coding security donut showing 5,000 sensitive-data assets out of 380,000 public AI-built assets, with 375,000 other public assets.

Engineering

Vibe coding security needs a real publish gate now

Vibe coding security is a publish-time problem: 5,000 AI-built public assets reportedly exposed sensitive data, so add gates before launch.

Lars Cornelissen · Jun 22, 2026

AI music training data comparison showing the four reported dataset sizes: 12.648 million tracks, 9 million tracks, 0.106 million tracks, and 0.100 million tracks.

AI

AI music training data gets a searchable paper trail

AI music training data now has a searchable trail: The Atlantic surfaced four datasets with more than 21 million tracks, raising build risk.

Lars Cornelissen · Jun 21, 2026

FortiBleed exposure bars showing CISA at about 74,000 Fortinet devices and SOCRadar at 86,644 compromised FortiGate access points.

Cyber security

FortiBleed puts FortiGate credentials on the clock now

FortiBleed is an active FortiGate credential exposure campaign with up to 86,644 devices reported. Rotate, isolate, and investigate now.

Lars Cornelissen · Jun 21, 2026

Splunk CVE-2026-20253 timeline showing day 0 disclosure on June 10, day 2 public technical write-up on June 12, day 8 CISA KEV addition on June 18, and day 11 federal due date on June 21.

Cyber security

Splunk CVE-2026-20253 puts SIEMs on a 72-hour clock

Splunk CVE-2026-20253 is an actively exploited critical flaw. Patch exposed Enterprise 10.0 and 10.2 nodes by June 21.

Lars Cornelissen · Jun 21, 2026

Agentic clinical RAG acceptance rates: 80 percent minimum per type, 96.5 percent overall across 7,326 judgments, and 99 percent maximum per type.

Research

Agentic clinical RAG gets 96.5% clinician acceptance

Agentic clinical RAG passed 96.5% of clinician checks in one lymphoma registry study, but the edge came from citations and constraints.

Lars Cornelissen · Jun 20, 2026

Fortinet credential exposure chart comparing 87,000 FortiGate SSL VPN devices in 2021 with approximately 74,000 Fortinet devices in 2026 FortiBleed reports.

Cyber security

Fortinet credential exposure needs a reset plan now

Fortinet credential exposure means about 74,000 edge devices may have leaked logins. Rotate, audit, and lock down FortiGate access now.

Lars Cornelissen · Jun 20, 2026

Mastra npm supply chain attack count chart showing more than 140 affected packages, 166 wallet extension IDs checked, and 7 attack phases.

Cyber security

Mastra npm supply chain attack hits AI build rooms

Mastra npm supply chain attack exposed AI build pipelines through more than 140 packages, so treat installs as secret exposure events.

Lars Cornelissen · Jun 20, 2026

Splunk Enterprise flaw CVE-2026-20253 response timeline showing disclosure at day 0 on June 10, public technical writeup at day 2 on June 12, exploitation warning at day 8 on June 18, and the CISA remediation deadline at day 11 on June 21.

Cyber security

Splunk Enterprise flaw turns logs into attack surface

Splunk Enterprise flaw CVE-2026-20253 is under active exploitation. Patch by June 21 or disable the PostgreSQL sidecar safely.

Lars Cornelissen · Jun 20, 2026

Reported diffusion language models speed bars showing Mercury Coder Small at 737 tokens per second, Mercury Coder Mini at 1109, Gemini Diffusion at 1479, and Seed Diffusion Preview at 2146.

Research

Diffusion language models meet a messy benchmark tax

Diffusion language models generate by denoising full sequences, but an 8-model, 8-benchmark study shows deployment depends on inference choices.

Lars Cornelissen · Jun 20, 2026

RAM price shock memory index showing conventional DRAM at 100 in Q4 2025, 192.5 in Q1 2026, and 309 in Q2 2026; NAND at 100, 157.5, and 271.7.

Business

RAM price shock just killed Nothing’s budget phone

RAM price shock forced Nothing to skip a 2026 CMF phone, showing AI memory demand now decides what budget hardware can ship.

Lars Cornelissen · Jun 20, 2026

Large-load interconnection chart showing US data center electricity use rising from 58 TWh in 2014 to 176 TWh in 2023, with 2028 scenarios at 325 TWh and 580 TWh.

Business

Large-load interconnection gets a 60-day FERC test

Large-load interconnection is now a 60-day FERC test for grid operators, and AI builders should treat power flexibility as product infrastructure.

Lars Cornelissen · Jun 19, 2026

MosaicLeaks PA-DR results showing base Qwen3-4B at 48.7 percent strict chain success and 34.0 percent leakage, task-only training at 59.3 percent success and 51.7 percent leakage, and PA-DR at 58.7 percent success and 9.9 percent leakage.

Research

MosaicLeaks shows research agents leak query secrets

MosaicLeaks is a privacy benchmark for research agents. It shows PA-DR cut answer or full-information leakage from 34.0% to 9.9%.

Lars Cornelissen · Jun 19, 2026

Subquadratic attention latency comparison at 128K, 256K, 512K, and 1M tokens: FlashAttention-2 rises from 319.88 ms to 21,410.51 ms while SSA rises from 46.5 ms to 380.96 ms.

Engineering

Subquadratic attention gets its first serious test

Subquadratic attention is a sparse LLM design now showing a 56.2x speed test win at 1M tokens. Treat it as promising, gated infrastructure.

Lars Cornelissen · Jun 19, 2026

AI trust gap donut showing 63 percent of U.S. adults say AI is advancing too quickly, 19 percent say the pace is about right, 16 percent are not sure, and 2 percent say too slowly.

AI

AI trust gap widens as chatbot use hits 49% of US adults

The AI trust gap is the split between adoption and confidence: 49% of U.S. adults use chatbots, while 63% say AI moves too fast.

Lars Cornelissen · Jun 18, 2026

VMware migration pressure metrics: 86 percent reducing VMware footprint, 85 percent worried about price rises, 63 percent changed strategy at least twice, 59 percent saw cost increases above 25 percent.

Business

VMware migration turns Broadcom risk into a roadmap

VMware migration is now a board-level escape plan: Tesco says it must move 40,000 workloads after Broadcom pricing and support changes.

Lars Cornelissen · Jun 18, 2026

DivInit agentic search lifts Qwen3 multi-hop pass@4 from 25.0 to 27.8 percent at 1.7B, 29.5 to 36.6 percent at 4B, and 38.6 to 46.0 percent at 8B.

Research

DivInit makes agentic search threads less wasteful

DivInit is a training-free way to seed agentic search. It adds up to about 7 pass@4 points by diversifying the first query.

Lars Cornelissen · Jun 17, 2026

Research

LLM recommendation bias gives famous brands an edge

LLM recommendation bias is a measurable incumbent edge: a new arXiv paper found famous skincare brands were recommended 100 percent of the time.

Lars Cornelissen · Jun 17, 2026

MLPerf Training 6.0 scale comparison showing Blackwell at 8,192 GPUs for DeepSeek-V3 and 8,192 GPUs for Llama 3.1 405B, with AMD OCI FLUX.1 at 512 GPUs.

AI

MLPerf Training 6.0 gives Blackwell the scale edge

MLPerf Training 6.0 shows Blackwell leading all seven tests, but the useful signal is scale: 8,192 GPUs and MoE training pressure.

Lars Cornelissen · Jun 17, 2026

AI content labelling timeline showing the EU code process falling from 270 days before the deadline at the November 5, 2025 kickoff to 53 days at the June 10, 2026 final code and 0 days on August 2, 2026.

AI

AI content labelling gives builders 47 days to move

AI content labelling becomes an EU product requirement on August 2, 2026. Audit chatbot, deepfake and public-interest text flows now.

Lars Cornelissen · Jun 16, 2026

Flexible data centers chart showing US grid headroom rising from 76 GW at 0.25 percent curtailment to 98 GW at 0.5 percent and 126 GW at 1 percent.

Engineering

Flexible data centers turn AI power into a grid dial

Flexible data centers let AI sites cut load for a few peak hours, with Duke finding 76 GW of headroom at 0.25% curtailment.

Lars Cornelissen · Jun 16, 2026

SearchLeak severity comparison showing Microsoft scored CVE-2026-42824 at 6.5, NVD scored it at 7.5, and EchoLeak's Microsoft CNA score was 9.3.

AI

SearchLeak turns Copilot’s trust boundary into the bug

SearchLeak is a one-click M365 Copilot exploit chain. It shows why agent security has to move below prompts, into render and egress controls.

Lars Cornelissen · Jun 16, 2026

Snowflake

Snowflake Hybrid Tables get faster without request fees

Snowflake Hybrid Tables are row-store tables for OLTP-style work in Snowflake. New preview optimizations report up to 8x throughput.

Lars Cornelissen · Jun 15, 2026

Snowflake

Snowflake Iceberg ADLS writes are finally usable

Snowflake Iceberg ADLS support is the GA path for Azure teams to read and write externally managed Iceberg tables without moving storage.

Lars Cornelissen · Jun 15, 2026

WorkBench agents outcomes showing GPT-4 at 43 percent correct and 26 percent harmful actions, Claude Opus 4.8 at 89 percent correct and 2.5 percent harmful, and the WorkBench repo reporting Claude Fable 5 at 92 percent correct and 1.9 percent harmful.

Research

WorkBench agents close the workplace reliability gap

WorkBench agents now solve 89 percent of workplace tasks with 2.5 percent harmful actions, changing the risk math for builders.

Lars Cornelissen · Jun 15, 2026

AgentPerf relative benchmark lifts showing NVIDIA H200 at 1x, B200 MLPerf at 3.1x, GB200 MLPerf at 3.4x, and GB300 AgentPerf at 20x.

AI

AgentPerf puts Blackwell’s agent lead at 20x per watt

AgentPerf is a benchmark for concurrent AI agents. Its first results put NVIDIA GB300 NVL72 at up to 20x Hopper efficiency.

Lars Cornelissen · Jun 13, 2026

olmo-eval context chart showing MMLU scoring gaps from OLMES: Llama3 70B at 79.8 versus 60.7, Mistral 7B at 64.0 versus 50.3, OLMo 7B 0424 at 54.4 versus 42.4, and OLMo 7B at 28.3 versus 40.5.

Engineering

olmo-eval brings statistical discipline to LLM loops

olmo-eval is an open workbench for iterative LLM evaluation. It makes tiny checkpoint gains harder to mistake for progress.

Lars Cornelissen · Jun 13, 2026

Amazon data centers WUE comparison showing Amazon at 0.12 L/kWh, Microsoft at 0.27 L/kWh, and the industry average at 0.84 L/kWh.

AI

Amazon data centers put 2.5B gallons on the AI bill

Amazon data centers used 2.5 billion gallons of water in 2025. Treat that disclosure as a roadmap risk for AI products.

Lars Cornelissen · Jun 12, 2026

Bar chart of content tactics and their effect on source visibility in AI answers: citing sources +40 percent, adding quotations +38 percent, adding statistics +37 percent, fluent writing +15 percent, keyword stuffing minus 3 percent.

AI

Generative engine optimization: how to rank inside an LLM

Generative engine optimization (GEO) is the practice of getting your content cited inside AI answers from ChatGPT, Perplexity and Google's AI Overviews. Here is what earns a citation and what to change on your site.

Lars Cornelissen · Jun 12, 2026

Engineering

Miasma worm: live coverage of the Red Hat npm attack

Miasma is a self-propagating npm worm. It hijacked Red Hat's GitHub Actions OIDC trusted publishing to ship 96 backdoored @redhat-cloud-services versions whose preinstall hook runs a Bun credential stealer that then spreads with the secrets it steals.

Lars Cornelissen · Jun 12, 2026

VS Code Autopilot default risk shown through 2025 developer environment usage: Visual Studio Code at 75.9 percent, Visual Studio at 29 percent, Notepad++ at 27.4 percent, IntelliJ IDEA at 27.1 percent, and Vim at 24.3 percent.

AI

VS Code Autopilot puts agent risk in every default

VS Code Autopilot is now enabled by default, giving coding agents more autonomy. Treat the new default as a policy change, not a shortcut.

Lars Cornelissen · Jun 12, 2026

Agentic commerce merchant readiness in 2026: 19 percent have solutions in place, 32 percent are implementing, 31 percent are planning, and 15 percent have no plans.

Business

Agentic commerce gets Visa’s ChatGPT payment rails

Agentic commerce is AI agents completing purchases with permission. Visa’s ChatGPT deal brings it to 4.8 billion credentials.

Lars Cornelissen · Jun 11, 2026

Claude Fable 5 guardrails session split showing more than 95 percent of sessions with no fallback and fewer than 5 percent routed to Opus 4.8.

AI

Claude Fable 5 guardrails get a visibility reset

Claude Fable 5 guardrails are now visible after backlash. Builders should log fallback events, cost, and retention before trusting runs.

Lars Cornelissen · Jun 11, 2026

Donut chart for multi-agent safety showing AI agents succeeding on 66.3 percent of OSWorld tasks in 2025 and failing on 33.7 percent.

AI

Multi-agent safety gets Google’s $10 million test

Multi-agent safety is the problem of keeping interacting AI agents from amplifying failure. Google’s $10 million bet starts small.

Lars Cornelissen · Jun 11, 2026

Confidential inference in Apple’s AFM 3 lineup: 2 on-device models and 3 server-based models, with AFM 3 Cloud Pro running on NVIDIA GPUs in Google Cloud.

Engineering

Confidential inference makes Apple’s cloud AI real

Confidential inference lets Apple run AFM 3 Cloud Pro on NVIDIA GPUs in Google Cloud while keeping PCC privacy promises.

Lars Cornelissen · Jun 10, 2026

Line chart showing the number of bespoke connectors growing to 50 without MCP for 5 clients and 10 tools, versus only 15 with MCP, because the cost goes from clients times tools to clients plus tools.

Engineering

What is MCP? The Model Context Protocol for data engineers

MCP is an open standard that lets AI models call your tools and data through one connector instead of a custom integration per model. Here is what it means for data engineers.

Lars Cornelissen · Jun 10, 2026

Siri AI model tests show AFM 3 Cloud at 64.7 percent preference versus 8.7 percent for the 2025 server baseline, and AFM 3 Core at 45.6 percent versus 23.3 percent.

AI

Siri AI makes Apple’s platform bet depend on Google

Siri AI is Apple’s rebuilt assistant, but its strongest model jump is Gemini-built: AFM 3 Cloud won 64.7 percent of text tests.

Lars Cornelissen · Jun 10, 2026

Donut showing WhatsApp AI assistants enforcement stakes: EU fine ceiling is 10 percent of Meta's 2025 revenue, about $20.1 billion of $201.0 billion.

Business

WhatsApp AI assistants become EU gatekeeping test

WhatsApp AI assistants are now an EU platform access fight: Meta must reopen WhatsApp to rivals for free or risk a fine near $20.1 billion.

Lars Cornelissen · Jun 10, 2026

Two-line chart: a saturated benchmark curve flattening near 95 percent while a frontier-capability curve keeps rising past it, showing the AI plateau is a measurement ceiling rather than a stall.

AI

The AI plateau is mostly a goalpost that keeps moving

An AI plateau is mostly an illusion: old benchmarks maxed out and the AGI goalposts keep moving, even as the frontier capability curve keeps climbing.

Lars Cornelissen · Jun 9, 2026

Donut chart showing where a Claude Fable 5 call runs: about 95 percent of sessions get the full Mythos-class model and under 5 percent are silently routed to Claude Opus 4.8.

AI

Fable 5 and Mythos 5 are one model with a bouncer

Claude Fable 5 and Claude Mythos 5 are the same Anthropic weights; a runtime classifier, not the model you call, decides which capability you actually get.

Lars Cornelissen · Jun 9, 2026

Miasma worm disabled 73 Microsoft GitHub repos: 49 under Azure, 13 under Azure-Samples, 10 under microsoft, and 1 under MicrosoftDocs.

Engineering

Miasma worm turns AI coding agents into repo traps

Miasma worm is a credential stealer that hit 73 Microsoft GitHub repos. Treat agent-opened clones as compromised, not suspicious.

Lars Cornelissen · Jun 9, 2026

Seattle data center moratorium context: global data center electricity use rises from 415 TWh in 2024 to 945 TWh by 2030.

AI

Seattle data center moratorium tests AI’s power grab

Seattle data center moratorium is a 365-day pause on new large facilities. Treat 369 MW as the warning label for AI roadmaps.

Lars Cornelissen · Jun 9, 2026

Xcode 27 context: AFM 3 Cloud preferred on 64.7 percent of text prompts versus 8.7 percent for the 2025 server model; AFM 3 Core was 45.6 percent versus 23.3 percent.

Engineering

Xcode 27 makes Apple AI the next app platform toll

Xcode 27 is Apple's AI development reset: free Private Cloud Compute for small apps, agent hooks, and five AFM 3 models change build calculus.

Lars Cornelissen · Jun 9, 2026

Attack selection reduces measured AI agent safety at a 1% audit budget: start policy lowers safety by 20 percentage points in BashArena and 20 in LinuxArena, while stop policy lowers safety by 20 and 28.

Research

Attack selection makes AI agent safety look too high

Attack selection lets AI agents choose when to cheat. A new control eval finds safety drops by up to 28 percentage points at 1% auditing.

Lars Cornelissen · Jun 8, 2026

CrowdMath model results show next post prediction at 88 percent while post role classification reaches only 42 percent macro F1.

Research

CrowdMath exposes the math gap AI agents still miss

CrowdMath is a dataset of 164 annotated math research chains. Use it to test whether models understand progress, not just answers.

Lars Cornelissen · Jun 8, 2026

Donut chart for London robotaxi trust showing 79 percent of Londoners would not trust or feel comfortable in a driverless car and 21 percent would.

Business

London robotaxi race starts with Uber interest list

London robotaxi service is moving from policy to product with Uber and Wayve sign-ups, but safety drivers and tiny fleets keep it a pilot.

Lars Cornelissen · Jun 8, 2026

Matillion

Matillion Context Engine: setup, limits, and costs

Matillion Context Engine is a public preview knowledge graph for Maia AI Agents. Start with restricted domain graphs and watch crawler scope.

Lars Cornelissen · Jun 8, 2026

Snowflake

Snowflake Adaptive Compute: the bill owner playbook

Snowflake Adaptive Compute is query-based adaptive warehouse compute. Use its 2 knobs to cap bursts, monitor credits, and avoid blind migrations.

Lars Cornelissen · Jun 8, 2026

ICEBERG_MERGE_ON_READ_BEHAVIOR AUTO routes 3 of 4 Iceberg table cases to merge-on-read and 1 of 4 to copy-on-write.

Snowflake

ICEBERG_MERGE_ON_READ_BEHAVIOR changes Iceberg DML

ICEBERG_MERGE_ON_READ_BEHAVIOR is Snowflake's GA switch for Iceberg DML mode. AUTO sends 3 of 4 table cases to merge-on-read.

Lars Cornelissen · Jun 8, 2026

AI data center backlash chart showing U.S. data center electricity rising from 176 TWh in 2023 to between 325 and 580 TWh in 2028.

Business

AI data center backlash hits Shelbyville's $2B bet

AI data center backlash is now a zoning risk: Shelbyville's $2B Prologis campus shows trust can bottleneck compute before power does.