Simon Willison — 2026-04-16#
Highlight#
The most fascinating takeaway today is a surprising win for local AI: a 21GB quantized Qwen3.6 model running on a laptop beat Anthropic’s brand-new Claude Opus 4.7 at Simon’s “pelican riding a bicycle” SVG generation benchmark. This result leads Simon to conclude that his joke benchmark’s long-standing correlation with a model’s general utility has finally broken down.
Posts#
Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 · Source
Simon put the day’s two major model releases—Alibaba’s Qwen3.6-35B-A3B and Anthropic’s Claude Opus 4.7—through his infamous “pelican riding a bicycle” SVG generation benchmark. Running locally on a MacBook Pro via LM Studio, the quantized Qwen model produced a better bicycle frame than Opus, and even won a “secret backup test” generating a flamingo riding a unicycle. Simon admits this breaks the historical correlation between his SVG benchmark and a model’s general usefulness, noting he highly doubts the 21GB local model is actually more capable than Anthropic’s proprietary flagship.