Week 17 Summary

Simon Willison — Week of 2026-04-11 to 2026-04-17#

Highlight of the Week#

This week’s most striking revelation came from Simon’s infamous “pelican riding a bicycle” SVG generation benchmark, where a 21GB quantized local model (Qwen3.6-35B-A3B) unexpectedly outperformed Anthropic’s brand-new Claude Opus 4.7 flagship. Running locally on a MacBook Pro via LM Studio, Qwen generated a better bicycle frame and even won a secret unicycle backup test, leading Simon to conclude that his joke benchmark’s long-standing correlation with general model utility has finally broken down.

Week 19 Summary

Simon Willison — Week of 2026-04-18 to 2026-05-01#

Highlight of the Week#

The alpha release of llm 0.32a0 marks a foundational architectural pivot for Simon’s ecosystem of CLI tools. By moving away from a simple text-in/text-out abstraction to one that natively models complex message sequences and typed streams, the library is now future-proofed to handle the realities of modern frontier models. This opens the door for seamless integration of server-side tool calls, multi-modal inputs, and reasoning tokens.

2026-04-12

Simon Willison — 2026-04-12#

Highlight#

Simon shares a highly practical, single-command recipe for running local speech-to-text transcription on macOS using the Gemma 4 model and Apple’s MLX framework. It is a prime example of his ongoing exploration into making local, multimodal LLMs frictionless and accessible using modern Python packaging tools like uv.

Posts#

[Gemma 4 audio with MLX] · Source Thanks to a tip from Rahim Nathwani, Simon demonstrates a quick uv run recipe to transcribe audio locally using the 10.28 GB Gemma 4 E2B model via mlx-vlm. He tested the pipeline on a 14-second voice memo, and while it slightly misinterpreted a couple of words (hearing “front” instead of “right”), Simon conceded that the errors were understandable given the audio itself. The post highlights how easy it has become to test heavyweight, local AI models on Apple Silicon without complex environment setup.

2026-04-27

Simon Willison — 2026-04-27#

Highlight#

The most substantive post for developers today is Simon’s hands-on experiment running Microsoft’s VibeVoice model locally via MLX. It’s a great example of his signature workflow: taking a newly accessible open-source AI model and immediately figuring out the most frictionless CLI one-liner to get it running on Apple Silicon.

Posts#

[microsoft/VibeVoice] · Source Simon explores Microsoft’s MIT-licensed VibeVoice, a Whisper-style speech-to-text model that notably includes built-in speaker diarization. He shares a practical one-liner using uv and mlx-audio to run a 4-bit quantized version locally on a Mac. Testing it against a one-hour podcast interview, it transcribed the audio in under 9 minutes and impressively distinguished between the host’s conversational voice and his “sponsor read” voice. You’ll need to manually split audio files longer than an hour to avoid token limits, but the resulting JSON drops nicely into Datasette Lite for browsing.