2026-04-28

Ai [1-3], Python [2], Llms [1, 3], Vibe-Coding [2], Prompt-Engineering [3]

Simon Willison — 2026-04-28

Highlight

The most fascinating read today is the breakdown of talkie, a 13B vintage language model trained purely on pre-1931 text. It raises excellent questions about training data purity (“vegan models”) and the difficulty of preventing anachronistic contamination when fine-tuning with modern AI.

Posts

[Introducing talkie: a 13B vintage language model from 1930]

Nick Levine, David Duvenaud, and Alec Radford have released an Apache 2.0-licensed 13B model trained entirely on 260 billion tokens of pre-1931, out-of-copyright text. Simon digs into the concept of “vegan models” (LLMs trained solely on licensed or public domain data), noting that while talkie’s base model qualifies, its chat-finetuned version relies on Claude Sonnet and Opus for preference optimization and synthetic chats. This creates an anachronistic contamination problem, since the fine-tuning signal comes from modern models. The team ultimately hopes to use its vintage models as judges to bootstrap an era-appropriate post-training pipeline. When tested with a classic prompt for an SVG of a pelican riding a bicycle, the 1930 model produced an amusing, historically framed textual description instead of an image.

2026-04-29

Blogs

Software Engineering, Agentic Programming, Code Verification, Conceptual Modeling

Engineering Reads — 2026-04-29

The Big Idea

As AI tools accelerate code generation, the primary engineering bottleneck shifts from writing implementation logic to verifying it and providing structural intent. The high-leverage work of a senior engineer is evolving from writing instructions to building deterministic verification harnesses and formalizing clear conceptual boundaries.

Deep Reads

[On Agentic Programming and Verification] · Chris Parsons · Fragments: April 29

Chris Parsons argues that as AI throughput scales, verification can no longer rely purely on human reading. It must instead lean on tests, type checkers, and automated gates to handle the volume. The core bottleneck in software engineering is no longer how fast we can generate code, but how fast we can determine whether that generated code is correct. He contrasts “vibe coding” with rigorous “agentic engineering,” where shaping the inner harness is a distinct advantage. For senior engineers, reviewing endless AI diffs is a dead end; the compounding value lies in training the AI to get it right the first time and in shaping the review surfaces. Read this if you are a senior engineer trying to figure out how your role scales in an AI-heavy workflow.

2026-04-29

Blogs, AI, Tech

Python, Llm, Generative-Ai, Projects

Simon Willison — 2026-04-29

Highlight

The standout update today is the alpha release of llm 0.32a0, which reworks the library’s core abstraction to match what modern frontier models actually do. By moving from a simple text-in/text-out abstraction to one based on message sequences and typed streaming parts, Simon is preparing the library to support reasoning tokens, server-side tool calls, and multi-modal inputs and outputs.

Posts

[LLM 0.32a0 is a major backwards-compatible refactor]

Simon has released an alpha version of his LLM Python library and CLI tool that significantly refactors how the library represents prompts and responses. Modern LLMs can reason, execute tool calls, and return images or audio, so the original text-in/text-out abstraction was no longer sufficient. The library now models inputs as a sequence of conversational messages and outputs as a stream of typed message parts. Developers can use the new llm.user() and llm.assistant() builder functions to feed in previous conversation turns without relying on SQLite, while the updated streaming interface interleaves text, tool execution requests, and reasoning output. For CLI users, the only visible change is a new -R/--no-reasoning flag that suppresses thinking tokens. Python API users gain a built-in serialization mechanism for rolling their own storage alternatives.
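To make the "messages in, typed parts out" idea concrete, here is a minimal standalone sketch. It does not import or reproduce the llm package: `Message`, `Part`, `fake_stream`, `visible_text`, and the local `user()`/`assistant()` helpers are all hypothetical names invented for illustration, and the "model" is a stub that yields canned parts.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterator


@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str


@dataclass
class Part:
    kind: str      # "reasoning", "tool_call", or "text"
    text: str


def user(content: str) -> Message:
    # Hypothetical local helper, not the llm library's own builder.
    return Message("user", content)


def assistant(content: str) -> Message:
    # Hypothetical local helper, not the llm library's own builder.
    return Message("assistant", content)


def fake_stream(messages: list[Message]) -> Iterator[Part]:
    """Stub model: yields typed parts instead of one undifferentiated string."""
    last = messages[-1].content
    yield Part("reasoning", f"The user asked: {last!r}")
    yield Part("tool_call", "lookup(topic='pelicans')")
    yield Part("text", "Pelicans are large water birds.")


def visible_text(parts: Iterator[Part], show_reasoning: bool = True) -> str:
    """Render parts for display, optionally dropping reasoning parts."""
    kept = [p for p in parts if show_reasoning or p.kind != "reasoning"]
    return "\n".join(f"[{p.kind}] {p.text}" for p in kept)


# Earlier turns are passed in as plain data; no database is involved.
conversation = [user("Hi"), assistant("Hello!"), user("Tell me about pelicans")]
output = visible_text(fake_stream(conversation), show_reasoning=False)

# Because messages are plain data, storing them is a JSON round trip.
saved = json.dumps([asdict(m) for m in conversation])
restored = [Message(**d) for d in json.loads(saved)]
print(output)
```

Because each part carries its kind, hiding reasoning becomes a filter over the stream rather than string surgery on a finished response, which is roughly the effect a flag like --no-reasoning needs.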

2026-04-30

Blogs

Artificial Intelligence, Software Engineering, Rust, Webassembly, Static Analysis

Engineering Reads — 2026-04-30

The Big Idea

As AI models become capable of writing vast amounts of code, our core bottleneck is shifting from generating logic to verifying it. The future of software engineering requires us to aggressively enforce mechanical constraints, use correct-by-construction tools, and focus on the “left tail” of subtle system failures to safely orchestrate agentic workflows.

Deep Reads

Thoughts on WebAssembly as a stack machine · Eli Bendersky

WebAssembly functions as a highly readable stack machine augmented by an effectively unbounded register file of local variables. Unlike purist stack machines (e.g., Forth) that require mental gymnastics with dup and tuck-swap contortions to organize data, WASM uses locals to make data flow explicit. At runtime, this convenience doesn’t cost performance: optimizing runtimes like wasmtime easily perform redundant load elimination, mapping consecutive local accesses directly to native registers without aliasing issues. It is a good reminder that virtual machine design should favor human readability when the compiler can trivially bridge the gap to hardware efficiency. Read this if you care about virtual machine design or want a deeper intuition for how WASM bridges stack-based execution with register-based hardware.
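The readability point is easier to see side by side. The sketch below is a toy interpreter, not real WebAssembly: `local.get` borrows the name of the actual WASM instruction, but `add`, `sub`, `mul`, `over`, and `rot` are simplified stand-ins (with Forth-style stack effects) chosen for illustration. Both programs compute (x + y) * (x - y).

```python
import operator

BINARY_OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def run(program, stack=(), local_vars=()):
    """Run a toy stack program and return the final stack as a list."""
    stack = list(stack)
    for op, *args in program:
        if op == "local.get":      # push a copy of local number args[0]
            stack.append(local_vars[args[0]])
        elif op == "over":         # ( a b -- a b a )
            stack.append(stack[-2])
        elif op == "rot":          # ( a b c -- b c a )
            stack.append(stack.pop(-3))
        else:                      # binary arithmetic on the top two values
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[op](left, right))
    return stack


# With locals: each operand is fetched by index exactly where it is needed.
WITH_LOCALS = [
    ("local.get", 0), ("local.get", 1), ("add",),
    ("local.get", 0), ("local.get", 1), ("sub",),
    ("mul",),
]

# Stack only: x and y start on the stack and must be copied and shuffled.
STACK_ONLY = [
    ("over",), ("over",), ("add",),
    ("rot",), ("rot",), ("sub",),
    ("mul",),
]

with_locals = run(WITH_LOCALS, local_vars=(7, 3))[0]
stack_only = run(STACK_ONLY, stack=(7, 3))[0]
print(with_locals, stack_only)  # prints: 40 40
```

The two programs have the same length, but only the first can be read without tracking the stack layout in your head, which is the ergonomic gain the post attributes to locals.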

2026-04-30

Blogs, AI, Tech

Ai-Assisted Programming, Open-Source, Zig, Llms, Coding Agents

Simon Willison — 2026-04-30

Highlight

The most fascinating discussion today centers on the cultural clash between AI-assisted programming and traditional open-source community building, specifically the Zig project’s strict ban on LLM-authored contributions. It articulates a growing divide: even when AI-generated code is good, it breaks the “contributor poker” investment model that maintainers rely on to grow trusted human collaborators over time.

Posts

The Zig project’s rationale for their firm anti-AI contribution policy

Simon digs into Zig’s stringent anti-LLM policy for issues, PRs, and bug tracker comments. He highlights Loris Cro’s concept of “contributor poker,” which argues that open-source maintainers invest in people, not just their initial code contributions. Because reviewing an LLM-assisted PR doesn’t help the project cultivate a new, confident contributor, the maintainer’s time is wasted. One consequence of the policy is that Bun, an Anthropic-acquired JavaScript runtime built on a Zig fork, is keeping a 4x compile performance improvement un-upstreamed because of its heavy use of AI.

2026-05-01

Blogs

Github Copilot, Subscriptions, User Experience, Software Design

Engineering Reads — 2026-05-01

The Big Idea

The friction of software subscriptions extends beyond billing, manifesting as frustrating user experiences when basic autonomy—like the ability to cancel a free service tier—is missing or broken. Even zero-cost services accumulate administrative cruft when system design completely neglects the offboarding path.

Deep Reads

I can’t cancel GitHub Copilot · Drew DeVault

Drew DeVault highlights a frustrating edge case in subscription state management: the inability to cleanly offboard from a complimentary service tier. After securing free access as a Free and Open Source Software (FOSS) community member to evaluate GitHub Copilot, DeVault abandoned the tool after a brief fifteen-minute test. However, the platform seemingly lacks a self-service cancellation path for non-paying users, leaving them stuck with recurring monthly renewal emails. DeVault notes that while there is no financial penalty, being unable to opt out through either UI settings or support channels is a breakdown in basic usability. Product engineers and system architects should read this as a reminder that robust offboarding flows are as critical as onboarding, even when no revenue is attached to the user state.

2026-05-01

Blogs, AI, Tech

Inaturalist, Claude-Code, Generative-Ai, Tools

Simon Willison — 2026-05-01

Highlight

Simon demonstrates mobile AI-assisted development by building a complete, multi-component tracking application entirely on his phone while camping, using Claude Code for web. It is a good example of chaining small, sharp tools (Python CLIs, Git scraping, and AI-generated static frontends) into a practical personal utility.

Posts

[iNaturalist Sightings]

Simon wanted to consolidate and view his iNaturalist observations across multiple accounts, grouped by when and where they occurred. To solve this, he used Claude Code for web to write inaturalist-clumper, a Python CLI that groups sightings made within 2 hours and 5 km of each other. He then set up a Git scraping repository to run the tool regularly and generate a clumps.json file hosted via GitHub. Finally, he prompted an AI against his tools repository to build a static HTML frontend that fetches the CORS-friendly JSON and displays the sightings in a gallery with lazy-loaded thumbnails and full-size modal images.
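The grouping step can be sketched in a few lines. This is one plausible reading of "within 2 hours and 5 km", not the actual inaturalist-clumper algorithm: it assumes each sighting is compared against the previous sighting in the current clump, and the `clump` function, the dictionary fields, and the sample coordinates are all made up for illustration.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def clump(sightings, max_gap=timedelta(hours=2), max_km=5.0):
    """Group time-sorted sightings; a new clump starts when either limit is exceeded."""
    clumps = []
    for s in sorted(sightings, key=lambda item: item["time"]):
        if clumps:
            prev = clumps[-1][-1]  # assumption: compare against the previous sighting
            near_in_time = s["time"] - prev["time"] <= max_gap
            near_in_space = haversine_km(prev["lat"], prev["lon"], s["lat"], s["lon"]) <= max_km
            if near_in_time and near_in_space:
                clumps[-1].append(s)
                continue
        clumps.append([s])
    return clumps


# Invented sample data: two nearby morning sightings, one afternoon sighting
# at the same spot, and one shortly after but roughly 55 km away.
sightings = [
    {"name": "pelican", "time": datetime(2026, 5, 1, 10, 0), "lat": 37.460, "lon": -122.430},
    {"name": "heron", "time": datetime(2026, 5, 1, 10, 30), "lat": 37.461, "lon": -122.430},
    {"name": "egret", "time": datetime(2026, 5, 1, 15, 0), "lat": 37.461, "lon": -122.430},
    {"name": "otter", "time": datetime(2026, 5, 1, 15, 20), "lat": 37.961, "lon": -122.430},
]
clumps = clump(sightings)
print([[s["name"] for s in group] for group in clumps])
```

Comparing against the previous sighting lets a clump drift along a long walk; comparing against the first sighting in the clump would be a stricter alternative.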

2026-05-02

Blogs

Frontend Testing, Vue, Mathematics, Html5 Canvas

Engineering Reads — 2026-05-02

The Big Idea

The most valuable technical insights often come from returning to raw browser primitives and bypassing heavy orchestration layers. Whether you are stripping away Node-based test runners to verify UI behavior directly, or relying on native HTML5 to build interactive mathematical concepts, stepping outside complex build pipelines yields faster feedback loops and a deeper understanding of underlying mechanics.

Deep Reads

Testing Vue components in the browser · Julia Evans

This article explores how to write end-to-end integration tests for Vue components without relying on Node, Deno, or unwieldy orchestration tools like Playwright. The approach mounts components to invisible, off-screen DOM elements and runs the QUnit testing framework directly within a browser tab, using a server endpoint to reset SQL database fixtures to a known state. The author candidly details the complexities of this raw approach, particularly the need to poll the DOM for readiness instead of relying on flaky sleep calls, and the care required to manually dispatch events that simulate form input. Engineers suffering from frontend build-tool fatigue should read this for a lightweight perspective on verifying UI behavior using native capabilities, including Chrome’s built-in code coverage tools.
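The "poll for readiness, don’t sleep" idea is language-independent. The article’s tests are JavaScript running in a browser tab; the sketch below is a Python analogue, with `wait_for` and the fake `page` dictionary as hypothetical names standing in for a DOM query.

```python
import threading
import time


def wait_for(condition, timeout=2.0, interval=0.01):
    """Call condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a component that renders asynchronously: the "button"
# appears in the fake page about 50 ms after the test starts.
page = {}
threading.Timer(0.05, lambda: page.update(button="Save")).start()

# A fixed sleep would have to guess how long rendering takes. Polling
# returns as soon as the element exists and fails loudly if it never does.
label = wait_for(lambda: page.get("button"))
print(label)
```

The timeout turns a hang into a clear test failure, and the short interval keeps the happy path fast.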

2026-05-02

Blogs, AI, Tech

Blogging, Photography, Wildlife, Claude-Code

Simon Willison — 2026-05-02

Highlight

Simon integrated his iNaturalist wildlife photography into his personal blog, another demonstration of using Claude Code for rapid, on-the-go web development.

Posts

[Sightings]

Simon has added a new “sightings” feature to his blog to showcase his wildlife photos, a project prompted by his new Canon R6 Mark II camera. He built the integration directly from his phone using Claude Code for web, extending his existing “beats” system for syndicating external content. He also back-populated over a decade of iNaturalist data, so older photos, like his 2019 lemur sightings in Madagascar, now surface on his homepage, archive pages, and site search.

2026-05-03

Blogs

Zig, Error Handling, Error Reporting, Programming Languages

Engineering Reads — 2026-05-03

The Big Idea

Effective error reporting often demands a shift in perspective: instead of decorating errors at the point of failure, we should accumulate context implicitly along the happy path. This telescopic, block-scoped approach minimizes developer friction, though it surfaces new challenges when expected errors (like I/O cancellation) are caught and handled upstream rather than fatally reported.

Deep Reads

Minimal Viable Zig Error Contexts · Matklad · matklad.github.io

Zig’s strongly-typed error codes solve error handling, but its idiomatic “Diagnostics sink” pattern for error reporting introduces too much friction for lightweight or script-like code. To avoid both the poor debuggability of naked try statements and the verbosity of custom error wrappers, Matklad proposes a “worse-is-better” pattern that logs key-value context via errdefer at the block level. This creates a telescopic context across the call stack without cluttering the happy path or requiring modifications to individual fallible operations. The technique has a severe tradeoff, however: it logs context unconditionally, even if the error is later handled gracefully, which is problematic in Zig 0.16, where serendipitous IO cancellation is treated as a recoverable error. Systems engineers and language designers should read this for a practical exploration of how the ergonomics of context gathering shape the readability of our code.
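The shape of the pattern, and its tradeoff, can be shown outside Zig. The sketch below is a Python analogue rather than Matklad’s code: a context manager plays the role of a block-level errdefer, and `error_context`, `ListHandler`, `parse_port`, and `load` are hypothetical names invented for illustration.

```python
import logging
from contextlib import contextmanager

log = logging.getLogger("errctx")


class ListHandler(logging.Handler):
    """Collects log messages in a list so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


handler = ListHandler()
log.addHandler(handler)
log.propagate = False  # keep the demo output confined to our handler


@contextmanager
def error_context(**fields):
    """Log key-value context only if the enclosed block fails, then re-raise."""
    try:
        yield
    except Exception:
        log.error("context: %s", " ".join(f"{k}={v}" for k, v in fields.items()))
        raise


def parse_port(text):
    return int(text)  # raises ValueError on bad input; knows nothing about context


def load(path, raw):
    with error_context(path=path):
        with error_context(field="port", value=raw):
            return parse_port(raw)


ok = load("app.conf", "8080")   # happy path: nothing is logged

try:
    load("app.conf", "80a")
except ValueError:
    recovered = True            # handled upstream, yet context was already logged

print(handler.lines)
```

The log lines come out innermost first, giving the "telescopic" view from the failing field out to the file. The last call also shows the tradeoff the post describes: the caller recovered from the error, but the context lines were emitted anyway.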