2026-05-28

Simon Willison - 2026-05-28

Highlight

Anthropic’s release of Claude Opus 4.8 brings welcome improvements to model honesty and prompt caching, which Simon immediately put to the test using his newly updated llm-anthropic CLI plugin to generate SVGs of pelicans riding bicycles.

Posts

Claude Opus 4.8: “a modest but tangible improvement”

Simon highlights Anthropic’s refreshing honesty in marketing this release as an incremental upgrade, and notes that the model’s lower hallucination rate comes from abstaining when it is uncertain. Key technical changes include a reduced prompt cache minimum of 1,024 tokens and the ability to insert system messages mid-conversation, which preserves cache hits and reduces input costs in agentic loops. He tested the model by generating SVG pelicans riding bicycles at different thinking levels via his LLM CLI, using Opus 4.8 to build the HTML rendering tool and relying on GPT-5.5 as a “code security blanket” to patch XSS vulnerabilities.

Week 17 Summary

Simon Willison - Week of 2026-04-11 to 2026-04-17

Highlight of the Week

This week’s most striking revelation came from Simon’s infamous “pelican riding a bicycle” SVG generation benchmark, where a 21GB quantized local model (Qwen3.6-35B-A3B) unexpectedly outperformed Anthropic’s brand-new Claude Opus 4.7 flagship. Running locally on a MacBook Pro via LM Studio, Qwen generated a better bicycle frame and even won a secret unicycle backup test, leading Simon to conclude that his joke benchmark’s long-standing correlation with general model utility has finally broken down.

Week 19 Summary

Simon Willison - Week of 2026-04-18 to 2026-05-01

Highlight of the Week

The alpha release of llm 0.32a0 marks a foundational architectural pivot for Simon’s ecosystem of CLI tools. By moving away from a simple text-in/text-out abstraction to one that natively models complex message sequences and typed streams, the library is now future-proofed to handle the realities of modern frontier models. This opens the door for seamless integration of server-side tool calls, multi-modal inputs, and reasoning tokens.

2026-04-14

Simon Willison - 2026-04-14

Highlight

Simon highlights a fascinating paradigm shift in AI security: treating vulnerability discovery as an economic “proof of work” equation where spending more tokens yields better hardening. This creates a compelling new argument for the enduring value of open-source libraries in the age of vibe-coding, as the massive cost of AI security reviews can be shared across all of a project’s users.

Posts

datasette PR #2689: Replace token-based CSRF with Sec-Fetch-Site header protection

Simon has replaced Datasette’s cumbersome token-based CSRF protection with a new middleware relying on the Sec-Fetch-Site header, inspired by Filippo Valsorda’s research and recent changes in Go 1.25. This approach eliminates the need to scatter hidden CSRF token inputs throughout templates or selectively disable protection for external APIs. While Claude Code handled the bulk of the commits under Simon’s guidance, with cross-review by GPT-5.4, Simon chose to hand-write the PR description himself as an exercise in conciseness and in keeping himself honest.
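The header-based check described above can be sketched roughly as follows. This is an illustrative sketch only, not Datasette's actual middleware: the function name `is_request_allowed`, the lowercase-keyed `headers` dict, and the fallback to comparing `Origin` against `Host` are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Methods that should never change server state.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def is_request_allowed(method, headers):
    """Decide whether a request passes a Sec-Fetch-Site based CSRF check.

    `headers` is assumed to be a dict with lowercase header names
    (a hypothetical shape chosen for this sketch).
    """
    # Safe methods are always allowed.
    if method.upper() in SAFE_METHODS:
        return True

    site = headers.get("sec-fetch-site")
    if site is not None:
        # "same-origin": the request came from the site's own pages.
        # "none": a user-initiated request (typed URL, bookmark).
        return site.lower() in {"same-origin", "none"}

    origin = headers.get("origin")
    if origin is not None:
        # Assumed fallback for browsers that do not send Sec-Fetch-Site:
        # compare the Origin host with the Host header.
        return urlparse(origin).netloc == headers.get("host")

    # Neither header present: most likely a non-browser client such as curl,
    # which is not exposed to CSRF in the first place.
    return True
```

A cross-site form submission, for example `is_request_allowed("POST", {"sec-fetch-site": "cross-site"})`, is rejected, while the same POST from one of the site's own pages is accepted. No per-form token is needed because the browser itself reports where the request came from.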

2026-04-19

Simon Willison - 2026-04-19

Highlight

The most thought-provoking piece today examines the resurgence of APIs, driven by the rapid rise of personal AI agents that need programmable access to services. With industry giants pivoting to “headless” models, robust API access is quickly shifting from technical debt to the ultimate competitive advantage for software products.

Posts

Headless everything for personal AI

Simon highlights a trend identified by Matt Webb: headless services are poised for a comeback because AI agents operate far more efficiently via APIs than by clicking around a GUI with a bot-controlled mouse. This is not just a niche developer theory. Marc Benioff recently announced “Salesforce Headless 360,” which exposes the entire Salesforce platform via APIs so that agents can access workflows directly, without a browser. Simon points out the implications for traditional per-seat SaaS pricing models, which will be thrown into havoc as agents replace human seats. Drawing on a piece by Brandur Leach, he notes that we are entering the “Second Wave of the API-first Economy,” where offering an API has evolved from a liability into the deciding factor that lets a service win in a crowded and relatively undifferentiated market.

2026-05-17

Simon Willison - 2026-05-17

Highlight

The NHS recently decided to close its open-source repositories in response to AI-discovered vulnerabilities, but the UK Government Digital Service (GDS) is publicly pushing back. Simon highlights this rare public clash between UK civil service branches over the critical issue of AI security and open-source by-default policies.

Posts

GDS weighs in on the NHS’s decision to retreat from Open Source

Simon points to Terence Eden’s continued coverage of the NHS’s poorly considered decision to lock down access to open-source repositories following vulnerabilities flagged by Project Glasswing. The UK Government Digital Service (GDS) has stepped in with a new publication on AI and open code, strongly recommending that public sector code remain “open by default” because closing everything adds delivery costs and reduces both code reuse and scrutiny. Terence Eden observes that this public disagreement, described as a frosty “meeting without biscuits,” represents a major escalation within the civil service over how to handle open-source security in the age of AI.

Simon Willison

Simon Willison - Week of 2026-05-16 to 2026-05-22

Highlight of the Week

The most impactful milestone this week is the official announcement of Datasette Agent, merging Simon’s three years of work on his LLM library directly into Datasette. This conversational AI interface allows users to naturally interrogate their databases, boasting an extensible plugin architecture for charts, image generation, and secure code execution.

Key Posts

The last six months in LLMs in five minutes

Simon shared annotated slides from his PyCon US 2026 lightning talk capturing a major inflection point in AI developer tooling. He highlights how coding agents crossed the threshold to become reliable daily drivers, and points to the astonishing capabilities of massive local models running on consumer hardware like Mac Minis.
