AI@X

Sources

The Death of “Tokenmaxxing” and the AI ROI Reckoning — 2026-05-29

Highlights

Today’s discourse is dominated by the sobering economics of generative AI, with a chorus of voices signaling an end to unconstrained enterprise AI spending, a trend newly dubbed the death of “tokenmaxxing”. As companies scrutinize the return on investment for their massive infrastructure deployments, the community is debating whether the American AI bubble is popping and whether foundation models are rapidly commoditizing into low-margin products.

2026-05-21

Simon Willison — 2026-05-21

Highlight

The major news today is the official announcement of Datasette Agent, merging Simon’s three years of work on the LLM library with Datasette to create an extensible, conversational AI assistant for querying data. It represents a huge milestone for his ecosystem, opening the door for users to naturally interrogate their databases and easily build custom tools using a new plugin architecture.

Posts

Datasette Agent · Simon officially announced Datasette Agent, a conversational AI interface for asking questions of the data stored in Datasette. The post features a live demo in which Gemini 3.1 Flash-Lite queries a blog database to find a bird-watching record. He highlights a growing plugin ecosystem (including charts, image generation, and sandbox execution) and notes that tools like Claude Code and OpenAI Codex are proving excellent at writing these extensions. Looking ahead, Simon teased a major refactor for his LLM library, a Claude Artifacts-style plugin, and a personal AI assistant named “Claw” built using his older Dogsheep tools.

2026-04-03

Company@X — 2026-04-03

Signal of the Day

Google reclaimed the open-source spotlight with the release of the Gemma 4 model family, fully licensed under Apache 2.0. The launch was immediately backed by NVIDIA, which released a quantized 31B version, marking a highly coordinated ecosystem push to challenge Chinese open-source dominance.

2026-04-03

Simon Willison — 2026-04-03

Highlight

The overarching theme today is the sudden, step-function improvement in AI-driven vulnerability research. Major open-source maintainers are simultaneously reporting that the era of “AI slop” security reports has ended, replaced by an overwhelming tsunami of highly accurate, AI-generated bug discoveries that are drastically changing the economics of exploit development.

Posts

Vulnerability Research Is Cooked · Source Highlighting Thomas Ptacek’s commentary, Simon notes that frontier models are uniquely suited to exploit development because of their baked-in knowledge of bug classes, their ability to hold large amounts of source code in context, and their pattern-matching capabilities. Because LLMs never get bored of constraint-solving for exploitability, agents that are simply pointed at a source tree and told to search for zero-days are set to drastically alter the security landscape. Simon is tracking this trend closely enough that he has created a dedicated ai-security-research tag to follow it.

2026-04-15

AI Deployment Realities & The Open Source Security Squeeze — 2026-04-15

Highlights

Today’s discourse reveals a sobering maturation in the AI space, shifting the focus from model hype to the gritty mechanics of practical deployment and the resulting friction. While enterprises are defining net-new technical roles and methodologies to integrate agents successfully, the community is simultaneously grappling with a rising backlash against AI “workslop” and the realization that AI-driven automated exploitation is actively forcing companies to close their open-source codebases.

2026-04-18

Engineering @ Scale — 2026-04-18

Signal of the Day

Figma’s implementation of the Model Context Protocol (MCP) demonstrates that reliable LLM-driven features require exposing strict, deterministic APIs for state extraction rather than relying on generative guessing. By injecting capture scripts to extract running DOM data and programmatically mapping it to native canvas layers, they solved the chronic fragility of code-to-design pipelines.
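The contrast between deterministic extraction and generative guessing can be sketched as follows. This is a minimal illustration, not Figma's implementation: it assumes a capture script has already serialized the running DOM into plain records with computed geometry, and every name here (`CapturedNode`, `to_layer`, the layer fields) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CapturedNode:
    """Hypothetical shape of one DOM node as reported by a capture script."""
    tag: str
    x: float
    y: float
    width: float
    height: float
    text: str = ""
    children: list["CapturedNode"] = field(default_factory=list)


def to_layer(node: CapturedNode) -> dict:
    """Map a captured node to a layer dict using fixed rules only (no model call)."""
    if node.text and not node.children:
        kind = "TEXT"
    elif node.children:
        kind = "FRAME"
    else:
        kind = "RECTANGLE"
    layer = {
        "type": kind,
        "x": node.x,
        "y": node.y,
        "width": node.width,
        "height": node.height,
    }
    if kind == "TEXT":
        layer["characters"] = node.text
    if node.children:
        # Recurse so the layer tree mirrors the captured DOM tree.
        layer["children"] = [to_layer(child) for child in node.children]
    return layer
```

Because the mapping is a pure function of the captured data, the same page state always produces the same layer tree, which is the property a generative guess cannot guarantee.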

2026-04-19

Engineering Reads — 2026-04-19

The Big Idea

Software engineering is inherently political, whether you are building capability-based microkernels, managing toxic open-source communities, or resisting corporate exploitation through unionization. True technical excellence cannot exist in a moral vacuum; the legal, social, and labor structures behind the code determine its ultimate value to society.

Deep Reads

Porting Helios to aarch64 for my FOSDEM talk, part one · Drew DeVault · Source The author explains the process of porting the Helios microkernel, written in the Hare language, to aarch64 in order to present a slide deck directly from a Raspberry Pi 4. The initial focus is on the bootloader, which relies on an EFI stub and device trees rather than SoC-specific boot code. A major challenge discussed is the EL2 to EL1 exception level transition on real hardware, which behaved differently from the QEMU emulator defaults. Systems developers working on bare-metal ARM boot sequences should read this to understand practical EFI memory mapping and MMU configuration.

2026-04-27

Engineering @ Scale — 2026-04-27

Signal of the Day

Amazon bridged the semantic gap in product search by using massive LLMs offline to generate a commonsense knowledge graph with 29 million edges, then instruction-tuning a smaller, highly efficient model (COSMO-LM) for real-time production serving. It is a masterclass in treating frontier models as data synthesizers rather than production-serving endpoints.
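The offline-teacher pattern described here can be sketched in a few lines. This is an illustration under stated assumptions, not Amazon's pipeline: `teacher_generate` is a hypothetical stand-in for an offline call to a large model and returns canned text, the `used_for` relation and the sample query are invented for the example, and the tuning step itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One commonsense edge: (head, relation, tail)."""
    head: str
    relation: str
    tail: str


def teacher_generate(query: str, product: str) -> str:
    """Stand-in for an offline large-LLM call (hypothetical; returns canned text)."""
    canned = {("winter camping", "sleeping bag rated to -10C"): "keeping warm overnight"}
    return canned.get((query, product), "")


def build_graph(pairs):
    """Offline step: ask the teacher once per (query, product) pair, keep non-empty answers."""
    edges = set()
    for query, product in pairs:
        tail = teacher_generate(query, product)
        if tail:
            edges.add(Edge(head=product, relation="used_for", tail=tail))
    return edges


def to_instruction_examples(edges):
    """Turn graph edges into prompt/completion records for tuning a smaller model."""
    ordered = sorted(edges, key=lambda e: (e.head, e.relation, e.tail))
    return [
        {
            "prompt": f"What is '{e.head}' {e.relation.replace('_', ' ')}?",
            "completion": e.tail,
        }
        for e in ordered
    ]
```

The expensive model runs only inside `build_graph`, which can be batched offline; the serving path would see only the small model trained on the output of `to_instruction_examples`.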

2026-04-29

Simon Willison — 2026-04-29

Highlight

The standout update today is the alpha release of llm 0.32a0, which introduces a major architectural shift to handle the complex realities of modern frontier models. By moving from a simple text-in/text-out abstraction to one based on message sequences and typed streaming parts, Simon is future-proofing the library to seamlessly support reasoning tokens, server-side tool calls, and multi-modal inputs and outputs.

Posts

LLM 0.32a0 is a major backwards-compatible refactor · Source Simon has released an alpha version of his LLM Python library and CLI tool that significantly refactors how the library represents prompts and responses. Because modern LLMs can reason, execute tool calls, and return images or audio, the original text-in/text-out abstraction was no longer sufficient. The library now models inputs as a sequence of conversational messages and outputs as a stream of typed message parts. Developers can use the new llm.user() and llm.assistant() builder functions to feed in previous conversation turns without relying on SQLite, while the updated streaming interface interleaves text, tool execution requests, and reasoning output. For CLI users, the only visible change is a new -R/--no-reasoning flag that suppresses thinking tokens, and Python API users gain a new built-in serialization mechanism for building their own storage alternatives.
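The shift from text-in/text-out to messages-in/typed-parts-out can be illustrated with a toy model. This sketch deliberately does not use the LLM library or reproduce its API: `Message`, `Part`, `fake_model`, and `render` are hypothetical names chosen for the example, and the `show_reasoning` switch only mimics the idea behind suppressing reasoning output.

```python
from dataclasses import dataclass
from typing import Iterator, Literal


@dataclass
class Message:
    """One conversational turn supplied as input."""
    role: Literal["user", "assistant"]
    content: str


@dataclass
class Part:
    """One typed chunk of streamed output."""
    kind: Literal["reasoning", "text", "tool_call"]
    data: str


def fake_model(messages: list[Message]) -> Iterator[Part]:
    """Stand-in for a model: yields typed parts instead of one flat string."""
    last = messages[-1].content
    yield Part("reasoning", f"The user asked: {last!r}")
    yield Part("tool_call", "lookup(query='pelican')")
    yield Part("text", "Here is what I found.")


def render(parts: Iterator[Part], show_reasoning: bool = True) -> str:
    """Collect a stream into display text, optionally dropping reasoning parts."""
    lines = []
    for part in parts:
        if part.kind == "reasoning" and not show_reasoning:
            continue
        lines.append(f"[{part.kind}] {part.data}")
    return "\n".join(lines)
```

Because each part carries its own type, a consumer can filter or route reasoning, tool requests, and text independently, which a single output string does not allow.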

2026-04-30

Engineering @ Scale — 2026-04-30

Signal of the Day

When processing sensitive data with large language models, decoupling deterministic data extraction from probabilistic structuring is critical to bypass model-level safety interference. Sun Finance attempted to use Anthropic’s Claude to extract data directly from identity documents, but the model’s built-in PII safety protocols actively degraded character recognition, resulting in a poor 61.8% accuracy. By shifting the raw extraction to a traditional OCR layer (Amazon Textract) and restricting the LLM strictly to JSON structuring, they bypassed the safety throttles, pushing extraction accuracy to 90.8% while reducing per-document costs by 91%.
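The two-stage split can be sketched as below. This is a minimal illustration under stated assumptions, not Sun Finance's code: `ocr_extract` is a hypothetical stand-in for a deterministic OCR service call and returns canned placeholder text, and `structure_fields` stands in for the LLM step with a regex parser so the example runs offline. The field names are invented for the example.

```python
import json
import re


def ocr_extract(image_bytes: bytes) -> str:
    """Stand-in for a deterministic OCR call (hypothetical; returns canned text)."""
    return "NAME: JANE EXAMPLE\nDOC NO: X1234567\nDOB: 1990-01-31"


def structure_fields(raw_text: str) -> str:
    """Stand-in for the LLM step: it sees only OCR text and must return JSON.

    A regex parser is used here so the sketch runs offline; a real system would
    send raw_text to a model with a JSON-only instruction.
    """
    fields = dict(re.findall(r"^([A-Z ]+):\s*(.+)$", raw_text, flags=re.MULTILINE))
    return json.dumps({
        "name": fields.get("NAME"),
        "document_number": fields.get("DOC NO"),
        "date_of_birth": fields.get("DOB"),
    })


def process_document(image_bytes: bytes) -> dict:
    """Run extraction and structuring as separate stages, then validate the result."""
    raw_text = ocr_extract(image_bytes)
    record = json.loads(structure_fields(raw_text))
    missing = [key for key, value in record.items() if not value]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record
```

In this arrangement the structuring stage never receives the document image, only the text the OCR layer produced, so character recognition no longer depends on the model.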