Week 19 Summary

Blogs, AI, Tech

Prompt-Engineering [1, 2], System-Prompts [2, 3], Claude [2, 3], Coding-Agents [1], Generative-Ai [1-3], Apis, AI, Saas, Salesforce, Speech-to-Text, Microsoft, Openai, Translation, Mlx, Ai [1-3], Python [2], Llms [1, 3], Vibe-Coding [2], Prompt-Engineering [3], Python, Llm, Generative-Ai, Projects, Ai-Assisted-Programming, Open Source, Zig, Llms, Coding-Agents, Inaturalist, Claude-Code, Tools

Simon Willison — Week of 2026-04-18 to 2026-05-01#

Highlight of the Week#

The alpha release of llm 0.32a0 marks a foundational architectural pivot for Simon’s ecosystem of CLI tools. By moving away from a simple text-in/text-out abstraction to one that natively models complex message sequences and typed streams, the library is now future-proofed to handle the realities of modern frontier models. This opens the door for seamless integration of server-side tool calls, multi-modal inputs, and reasoning tokens.

2026-04-29

Blogs, AI, Tech

Python, Llm, Generative-Ai, Projects

Simon Willison — 2026-04-29#

Highlight#

The standout update today is the alpha release of llm 0.32a0, which introduces a major architectural shift to handle the complex realities of modern frontier models. By moving from a simple text-in/text-out abstraction to one based on message sequences and typed streaming parts, Simon is future-proofing the library to seamlessly support reasoning tokens, server-side tool calls, and multi-modal inputs and outputs.

Posts#

[LLM 0.32a0 is a major backwards-compatible refactor] · Source Simon has released an alpha version of his LLM Python library and CLI tool that significantly refactors how models process prompts and responses. Recognizing that modern LLMs possess complex capabilities like reasoning, executing tool calls, and returning images or audio, the original text-in/text-out abstraction was no longer sufficient. The library now models inputs as a sequence of conversational messages and outputs as a stream of typed message parts. Developers can use the new llm.user() and llm.assistant() builder functions to cleanly feed in previous conversation turns without relying on SQLite, while the updated streaming interface elegantly interleaves text, tool execution requests, and reasoning output. For CLI users, the only visible change is a new -R/--no-reasoning flag that suppresses thinking tokens, and Python API users gain a new built-in serialization mechanism to roll their own storage alternatives.