2026-04-29

Simon Willison — 2026-04-29

Highlight

The standout update today is the alpha release of llm 0.32a0, which introduces a major architectural shift to match how current frontier models actually behave. By moving from a simple text-in/text-out abstraction to one based on message sequences and typed streaming parts, Simon is positioning the library to support reasoning tokens, server-side tool calls, and multi-modal inputs and outputs.

Posts

[LLM 0.32a0 is a major backwards-compatible refactor] · Source Simon has released an alpha version of his LLM Python library and CLI tool that significantly refactors how it represents prompts and responses. Modern LLMs can reason, execute tool calls, and return images or audio, so the original text-in/text-out abstraction was no longer sufficient. The library now models inputs as a sequence of conversational messages and outputs as a stream of typed message parts. Developers can use the new llm.user() and llm.assistant() builder functions to feed in previous conversation turns without relying on SQLite, while the updated streaming interface interleaves text, tool execution requests, and reasoning output. For CLI users, the only visible change is a new -R/--no-reasoning flag that suppresses thinking tokens. Python API users also gain a built-in serialization mechanism for building their own storage alternatives.
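
To make the shift concrete, here is a toy model of the two abstractions: input as a list of messages, output as a stream of typed parts. It is a sketch only and does not use the llm library. The Message and Part classes, the local user() and assistant() helpers, fake_model(), and render() are all hypothetical stand-ins invented for illustration, and the real API may look quite different.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Message:
    role: str  # "user" or "assistant"
    text: str


@dataclass(frozen=True)
class Part:
    kind: str  # "reasoning", "text", or "tool_call"
    text: str


# Hypothetical builders, named after the idea only (not the library's functions).
def user(text: str) -> Message:
    return Message("user", text)


def assistant(text: str) -> Message:
    return Message("assistant", text)


def fake_model(messages: list[Message]) -> Iterator[Part]:
    """Stand-in for a model: yields typed parts instead of one string."""
    last = messages[-1].text
    yield Part("reasoning", f"The user asked: {last!r}. ")
    yield Part("tool_call", "lookup(capital='France')")
    yield Part("text", "The capital of France ")
    yield Part("text", "is Paris.")


def render(parts: Iterable[Part], show_reasoning: bool = True) -> str:
    """Join parts into display text; show_reasoning=False hides thinking parts."""
    out = []
    for part in parts:
        if part.kind == "reasoning" and not show_reasoning:
            continue
        if part.kind == "tool_call":
            out.append(f"[tool: {part.text}]")
        else:
            out.append(part.text)
    return "".join(out)


# Earlier turns are passed in explicitly, so no database is needed.
history = [
    user("What is the capital of Germany?"),
    assistant("Berlin."),
    user("And France?"),
]
print(render(fake_model(history), show_reasoning=False))
```

Because each part carries its own type, a consumer can decide per part what to show, which is roughly what a flag for suppressing reasoning output would do.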

2026-04-30

Engineering Reads — 2026-04-30

The Big Idea

As AI models become capable of writing vast amounts of code, our core bottleneck is shifting from generating logic to verifying it. The future of software engineering requires us to aggressively enforce mechanical constraints, utilize correct-by-construction tools, and focus on the “left tail” of subtle system failures to safely orchestrate agentic workflows.

Deep Reads

Thoughts on WebAssembly as a stack machine · Eli Bendersky WebAssembly is a stack machine augmented by an effectively unlimited register file of local variables, which keeps its code readable. Unlike purist stack machines (e.g., Forth) that require mental gymnastics with dup and tuck-swap contortions to organize data, WASM uses locals to make data flow explicit. At runtime, this convenience doesn’t cost performance: optimizing runtimes like wasmtime perform redundant load elimination and map consecutive local accesses directly to native registers without aliasing issues. It is a useful reminder that virtual machine design can favor human readability when the compiler can cheaply bridge the gap to hardware efficiency. Read this if you care about virtual machine design or want a deeper intuition for how WASM bridges stack-based execution with register-based hardware.
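
The "stack plus locals" idea can be sketched with a tiny interpreter. This is a toy written for illustration, not code from the article: the run() function and the tuple encoding of instructions are assumptions, only a handful of opcodes are modeled, and 32-bit wraparound is ignored.

```python
def run(code, local_vars):
    """Execute (op, arg) instructions against a value stack and a list of locals."""
    stack = []
    for op, arg in code:
        if op == "local.get":
            stack.append(local_vars[arg])
        elif op == "local.set":
            local_vars[arg] = stack.pop()
        elif op == "i32.const":
            stack.append(arg)
        elif op == "i32.add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)  # toy: no 32-bit wraparound
        elif op == "i32.mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown op {op!r}")
    return stack


# (x + y) * (x + y): the sum is parked in local 2 and read back twice,
# where a pure stack machine would need a dup to reuse it.
program = [
    ("local.get", 0), ("local.get", 1), ("i32.add", None),
    ("local.set", 2),
    ("local.get", 2), ("local.get", 2), ("i32.mul", None),
]
print(run(program, [3, 4, 0]))  # [49]
```

The two consecutive reads of local 2 are the kind of redundant load that an optimizing compiler can presumably keep in a single machine register.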

2026-04-30

Simon Willison — 2026-04-30

Highlight

The most interesting discussion today centers on the cultural clash between AI-assisted programming and traditional open-source community building, specifically the Zig project’s strict ban on LLM-authored contributions. It articulates a growing divide: even when AI-generated code is good, it breaks the “contributor poker” investment model that maintainers rely on to grow trusted human collaborators over time.

Posts

[The Zig project’s rationale for their firm anti-AI contribution policy] Simon dives into Zig’s stringent anti-LLM policy for issues, PRs, and bug tracker comments. He highlights Loris Cro’s concept of “contributor poker,” which argues that open-source maintainers invest in people, not just their initial code contributions. Because reviewing an LLM-assisted PR doesn’t help the project cultivate a new, confident contributor, the review time is a poor investment for maintainers. One consequence is that Bun, an Anthropic-acquired JavaScript runtime built on a Zig fork, is keeping a 4x compile performance improvement un-upstreamed because of its heavy use of AI.

2026-05-01

Engineering Reads — 2026-05-01

The Big Idea

The friction of software subscriptions extends beyond billing, manifesting as frustrating user experiences when basic autonomy—like the ability to cancel a free service tier—is missing or broken. Even zero-cost services accumulate administrative cruft when system design completely neglects the offboarding path.

Deep Reads

I can’t cancel GitHub Copilot · Drew DeVault · Source Drew DeVault highlights a frustrating edge case in subscription state management: the inability to cleanly offboard from a complimentary service tier. After receiving free access to GitHub Copilot as a Free and Open Source Software (FOSS) community member, DeVault abandoned the tool after a brief fifteen-minute test. However, the platform seemingly lacks a self-service cancellation path for non-paying users, leaving them with recurring monthly renewal emails that they cannot avoid. DeVault notes that while there is no financial penalty, being unable to cancel through either UI settings or support channels is a basic usability failure. Product engineers and system architects should read this as a reminder that offboarding flows matter as much as onboarding, even when no revenue is attached to the account.

2026-05-01

Simon Willison — 2026-05-01

Highlight

Simon demonstrates the power of mobile AI-assisted development by building a complete, multi-component tracking application entirely on his phone, while camping, using Claude Code for web. It’s a good example of chaining small, sharp tools (Python CLIs, Git scraping, and AI-generated static frontends) into a practical personal utility.

Posts

[iNaturalist Sightings] · Source Simon wanted to consolidate and view his iNaturalist observations across multiple accounts, grouped by when and where they occurred. To solve this, he used Claude Code for web to write inaturalist-clumper, a Python CLI that groups sightings that fall within 2 hours and 5 km of each other. He then set up a Git scraping repository to regularly run the tool and generate a clumps.json file hosted via GitHub. Finally, he prompted an AI against his tools repository to build a static HTML frontend that fetches the CORS-friendly JSON and displays the sightings in a gallery with lazy-loaded thumbnails and full-size modal images.
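
One plausible reading of the grouping rule can be sketched as follows. This is not the actual inaturalist-clumper code: the clump() function, the dictionary field names, and the choice to compare each sighting against the previous one in time order are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

MAX_GAP = timedelta(hours=2)  # time threshold taken from the post
MAX_KM = 5.0                  # distance threshold taken from the post


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def clump(sightings):
    """Group time-sorted sightings; start a new clump when either threshold is exceeded."""
    clumps = []
    for s in sorted(sightings, key=lambda s: s["time"]):
        if clumps:
            prev = clumps[-1][-1]
            close_in_time = s["time"] - prev["time"] <= MAX_GAP
            close_in_space = haversine_km(prev["lat"], prev["lon"], s["lat"], s["lon"]) <= MAX_KM
            if close_in_time and close_in_space:
                clumps[-1].append(s)
                continue
        clumps.append([s])
    return clumps


# Made-up sample data: two morning sightings about 1 km apart, one in the afternoon.
sightings = [
    {"time": datetime(2026, 5, 1, 9, 0), "lat": 37.00, "lon": -122.0},
    {"time": datetime(2026, 5, 1, 9, 30), "lat": 37.01, "lon": -122.0},
    {"time": datetime(2026, 5, 1, 15, 0), "lat": 37.00, "lon": -122.0},
]
print([len(c) for c in clump(sightings)])  # [2, 1]
```

Comparing against the previous sighting lets a clump drift along a walk; comparing against the first sighting of the clump would be a stricter alternative.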

2026-05-02

Engineering Reads — 2026-05-02

The Big Idea

The most valuable technical insights often come from returning to raw browser primitives and bypassing heavy orchestration layers. Whether you are stripping away Node-based test runners to verify UI behavior directly, or relying on native HTML5 to build interactive mathematical concepts, stepping outside complex build pipelines yields faster feedback loops and a deeper understanding of underlying mechanics.

Deep Reads

Testing Vue components in the browser · Julia Evans · Source This article explores how to write end-to-end integration tests for Vue components without relying on Node, Deno, or unwieldy orchestration tools like Playwright. The technical approach involves mounting components to invisible, off-screen DOM elements and executing the QUnit testing framework directly within a browser tab, utilizing a server endpoint to reset SQL database fixtures to a known state. The author candidly details the complexities of this raw approach, particularly the architectural friction of polling the DOM for readiness rather than relying on flaky sleep commands, and the nuance required to manually dispatch events to simulate form inputs. Engineers suffering from frontend build-tool fatigue should read this for a refreshing, lightweight perspective on verifying UI behavior using native capabilities, including Chrome’s built-in code coverage tools.
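
The "poll for readiness instead of sleeping" idea is language-neutral, so here is a minimal sketch in Python rather than the browser JavaScript the article uses. The wait_until() helper and the page dictionary are hypothetical; in a real browser test the predicate would query the DOM.

```python
import threading
import time


def wait_until(predicate, timeout=2.0, interval=0.01):
    """Poll predicate until it returns a truthy value, or raise after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


# Stand-in for a component that finishes rendering a little later.
page = {"ready": False}
threading.Timer(0.05, lambda: page.update(ready=True)).start()

wait_until(lambda: page["ready"])  # returns as soon as the flag flips
print("ready")
```

A fixed sleep has to be long enough for the slowest run and wastes time on every fast one; polling returns as soon as the condition holds and fails with a clear timeout otherwise.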

2026-05-02

Simon Willison — 2026-05-02

Highlight

Simon integrated his iNaturalist wildlife photography into his personal blog, showing how practical Claude Code can be for rapid, on-the-go web development.

Posts

[Sightings] · Source Simon has added a new “sightings” feature to his blog to showcase his wildlife photos, a project prompted by his new Canon R6 Mark II camera. He built this integration directly from his phone using Claude Code for web, extending the existing “beats” system he uses for syndicating external content. He also back-populated over a decade of iNaturalist data, so older photos, like his 2019 lemur sightings in Madagascar, now appear on his homepage, archive pages, and site search.

2026-05-03

Engineering Reads — 2026-05-03

The Big Idea

Effective error reporting often demands a shift in perspective: instead of decorating errors at the point of failure, we should accumulate context implicitly along the happy path. This telescopic, block-scoped approach minimizes developer friction, though it surfaces new challenges when expected errors (like I/O cancellation) are caught and handled upstream rather than fatally reported.

Deep Reads

Minimal Viable Zig Error Contexts · Matklad · matklad.github.io Zig’s strongly-typed error codes solve error handling, but its idiomatic “Diagnostics sink” pattern for error reporting introduces too much friction for lightweight or script-like code. To avoid the poor debuggability of naked try statements or the sheer verbosity of custom error wrappers, Matklad proposes a “worse-is-better” pattern that logs key-value context via errdefer at the block level. This creates a telescopic context across the call stack without cluttering the happy path or requiring modifications to individual fallible operations. However, this technique has a severe tradeoff: it unconditionally logs context even if the error is later handled gracefully, which is problematic in Zig 0.16 where I/O cancellation is treated as a recoverable error. Systems engineers and language designers should read this for a practical exploration of how the ergonomics of context gathering shape the readability of our code.
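
A rough Python analog of the block-level pattern can show both the telescoping effect and the tradeoff. This is not Zig and not Matklad's code: error_context(), parse_line(), load(), and the LOG list are hypothetical names, and a context manager stands in for errdefer.

```python
import sys
from contextlib import contextmanager

LOG = []  # collected context lines, kept so the behaviour is easy to inspect


@contextmanager
def error_context(**fields):
    """Log key=value context only when an exception passes through the block."""
    try:
        yield
    except Exception:
        line = " ".join(f"{k}={v!r}" for k, v in fields.items())
        LOG.append(line)
        print(f"context: {line}", file=sys.stderr)
        raise  # propagate unchanged, like an error passing an errdefer


def parse_line(line_no, text):
    with error_context(line=line_no):
        return int(text)


def load(path, lines):
    with error_context(path=path):
        return [parse_line(i, t) for i, t in enumerate(lines, start=1)]


try:
    load("config.txt", ["1", "2", "oops"])
except ValueError:
    # The caller recovers here, yet both context lines were already logged.
    print("recovered")

print(LOG)  # ["line=3", "path='config.txt'"]
```

The happy path stays untouched and each enclosing block adds one layer of context as the error unwinds. The final lines illustrate the tradeoff described above: the log output appears even though the caller handled the error.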

2026-05-03

Simon Willison — 2026-05-03

Highlight

Today’s highlight is a quick but fascinating look into AI behavior evaluation, specifically how Anthropic measures “sycophancy” in Claude. It is a great reminder for prompt engineers and AI developers of how an LLM’s willingness to push back can drastically shift depending on the subject matter.

Posts

[Quoting Anthropic] · Source Simon highlights an interesting finding from Anthropic’s recent research on how users interact with Claude for personal guidance. Anthropic built an automatic classifier to measure sycophancy by evaluating whether the model is willing to push back, maintain its position, give proportional praise, and speak frankly. While Claude’s baseline sycophancy rate is a low 9%, the data showed sharp increases when users asked about deeply personal domains: 38% in spirituality and 25% in relationships. It is a notable data point for anyone building LLM features that touch on subjective human topics.

2026-05-04

Engineering Reads — 2026-05-04

The Big Idea

The defining leverage in modern software engineering is safely raising the ceiling of complexity you can manage as an individual. Whether offloading design constraints to curated color systems or using AI to validate aggressive C memory models, the goal is to reserve human cognitive load for system specifications and architectural correctness.

Deep Reads

Links to CSS colour palettes · Julia Evans · Source The author highlights a practical tradeoff of abandoning utility frameworks like Tailwind for vanilla CSS: the loss of carefully constrained, pre-baked design tokens. While dropping Tailwind reduces tooling overhead, engineers often lack the aesthetic expertise to build cohesive color systems from scratch. To bridge this gap, the post surfaces drop-in alternatives like uchū, flexoki, and reasonable colours, the last of which specifically optimizes for accessibility. The author also points to dynamically generated colors using the CSS oklch function, while noting that complex color generators often remain difficult for non-designers to use effectively. This is a quick but useful read for full-stack developers who want the simplicity of vanilla CSS without shipping visually hostile interfaces.
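
As a small illustration of generating colors with oklch, the sketch below emits a lightness ramp as CSS custom properties. It is not taken from the post: the oklch_ramp() function, the variable naming scheme, and the default chroma and lightness values are assumptions chosen for the example.

```python
def oklch_ramp(name, hue, chroma=0.12, steps=5, lightest=95, darkest=25):
    """Return a :root block of CSS variables stepping lightness at fixed hue and chroma."""
    lines = []
    for i in range(steps):
        # Interpolate lightness linearly from lightest to darkest.
        lightness = lightest + (darkest - lightest) * i / (steps - 1)
        lines.append(f"  --{name}-{i + 1}: oklch({lightness:g}% {chroma} {hue});")
    return ":root {\n" + "\n".join(lines) + "\n}"


print(oklch_ramp("blue", hue=250))
```

Because oklch separates lightness from hue and chroma, a ramp like this keeps perceived brightness steps fairly even. Holding chroma constant is a simplification; hand-tuned palettes often reduce it toward the lightest and darkest ends.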