2026-04-07

Simon Willison — 2026-04-07

Highlight

Anthropic’s decision to restrict access to their new Claude Mythos model underscores a massive, sudden shift in AI capabilities. Simon’s post is a fascinating look at an industry-wide reckoning, as open-source maintainers go from dealing with “AI slop” to facing a tsunami of highly accurate, sophisticated vulnerability reports.

Posts

[Anthropic’s Project Glasswing - restricting Claude Mythos to security researchers - sounds necessary to me] Anthropic has delayed the general release of Claude Mythos, a general-purpose model similar to Claude Opus 4.6, opting instead to limit access to trusted partners under “Project Glasswing” so that those partners can patch foundational internet systems. Simon digs into the context, tracking how credible security professionals are warning that frontier LLMs can chain multiple minor vulnerabilities into sophisticated exploits. He even uses git blame to independently verify a 27-year-old OpenBSD kernel bug discovered by the model. He concludes that delaying the release until new safeguards are built, while providing $100M in credits to defenders, is a highly reasonable trade-off.

2026-04-08

AI, Tech

Artificial Intelligence, Ai-Agents, Ai Safety, Large Language Models

Sources

Scaling Ceilings Shatter Alongside Emerging Agent Workflows — 2026-04-08

Highlights

The ecosystem is currently split between awe at scaling laws that continue unabated and deep anxiety over the societal implications of these systems. With Anthropic’s Mythos and Meta’s Muse Spark launching, capability ceilings keep shattering, giving rise to highly capable, production-ready agentic workflows. However, experts are urgently reminding us that we lack the regulatory frameworks to manage these increasingly powerful tools.

2026-04-08

AI, Tech

Mcp, Local-Llms, Claude, Muse Spark, Quantization

Sources

AI Reddit — 2026-04-08

The Buzz

The biggest narrative collision today is the launch of Meta’s Muse Spark from their Superintelligence Labs, which is posting serious ECI benchmark scores and washing away the bad taste of Llama 4. However, the shadow looming over the community is Anthropic’s Claude Mythos: security researchers are finding unprecedented zero-days with it, but Anthropic’s enterprise-only release strategy has users fearing a “permanent underclass” where only billion-dollar megacorps get frontier reasoning. Meanwhile, Sam Altman and OpenAI are taking heat from a New Yorker exposé alleging Altman lacks basic ML knowledge, alongside their bold “Industrial Policy” paper suggesting no income tax for those earning under $100k.

2026-04-08

Blogs, AI, Tech

Docker, Sqlite, Llms, Code-Interpreter, Meta

Simon Willison — 2026-04-08

Highlight

The most substantial piece today is a deep-dive into Meta’s new Muse Spark model and its chat harness, where Simon successfully extracts the platform’s system tool definitions via direct prompting. His exploration of Meta’s built-in Python Code Interpreter and visual_grounding capabilities highlights a powerful, sandbox-driven approach to combining generative AI with programmatic image analysis and exact object localization.

Posts

[Meta’s new model is Muse Spark, and meta.ai chat has some interesting tools] Meta has launched Muse Spark, a new hosted model currently accessible as a private API preview and directly via the meta.ai chat interface. By simply asking the chat harness to list its internal tools and their exact parameters, Simon documented 16 different built-in tools. Standouts include a Python Code Interpreter (container.python_execution) running Python 3.9 and SQLite 3.34.1, mechanisms for creating web artifacts, and a highly capable container.visual_grounding tool. He ran hands-on experiments generating images of a raccoon wearing trash, then used the platform’s Python sandbox and grounding tools to extract precise, nested bounding boxes and perform object counts (like counting whiskers or his classic pelicans). Although the model is closed for now, infrastructure scaling and comments from Alexandr Wang suggest future versions could be open-sourced.
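
Version details like the ones above are something a code sandbox can report about itself. A minimal sketch using only the Python standard library; `describe_runtime` is a hypothetical helper name, and the values it returns depend on wherever it runs rather than being the ones Simon recorded:

```python
import sqlite3
import sys


def describe_runtime():
    """Return the interpreter and bundled SQLite versions as strings."""
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    # sqlite3.sqlite_version is the version of the SQLite library itself,
    # not the version of the Python sqlite3 module.
    return {"python": python_version, "sqlite": sqlite3.sqlite_version}


if __name__ == "__main__":
    print(describe_runtime())
```

Asking a hosted interpreter to run something like this is one way to cross-check what a model says about its own environment.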

2026-04-09

AI, Tech

Ai-Agents, Cybersecurity, Open-Source Ai, Finance Ai

Sources

The Agentic Era Arrives: Capability Gaps, Financial AI, and the “Mythos” Controversy — 2026-04-09

Highlights

Today’s discussions reveal a stark divergence in AI perception: while the general public fixates on consumer chatbot fumbles, technical professionals are experiencing staggering productivity gains from state-of-the-art coding models. Concurrently, the “agentic era” is aggressively moving from theory to reality with autonomous background workflows and highly orchestrated financial assistants hitting the market, sparking urgent debates among leaders over safety and deployment timelines.

2026-04-09

AI, Tech

Large Language Models, Ai-Agents, Open-Source Ai, Cybersecurity, Video Generation

Sources

AI Reddit — 2026-04-09

The Buzz

Anthropic claimed their new Mythos Preview model is an unreleased cyber-nuke too dangerous for the public, but the community just used cheap open-weights models (as small as 3.6B) to successfully reproduce its exact zero-day exploits. It is sparking a massive debate over whether “safety” is just a cover story for astronomical compute costs and agentic harnessing.

2026-04-09

Blogs, AI, Tech

Python, Github, Datasette, Asgi, Cors

Simon Willison — 2026-04-09

Highlight

Today’s most substantive update is the release of asgi-gzip 0.3, which serves as a great practical reminder of the hidden risks in automated maintenance workflows. A silently failing GitHub Action caused his library to miss a crucial upstream Starlette fix for Server-Sent Events (SSE) compression, which ended up breaking a new Datasette feature in production.

Posts

[asgi-gzip 0.3] Simon released an update to asgi-gzip after a production deployment of a new Server-Sent Events (SSE) feature for Datasette ran into trouble. The root cause was datasette-gzip incorrectly compressing text/event-stream responses. The library relies on a scheduled GitHub Actions workflow to port updates from Starlette, but the action had stopped running and missed Starlette’s upstream fix for this exact issue. Now that the workflow has run and the fix is integrated, both datasette-gzip and asgi-gzip handle SSE responses correctly.
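
The failure described above comes down to one check: a gzip layer has to leave `text/event-stream` responses alone, since compression can buffer output and hold events back from the client. A minimal sketch of that check, which is not the actual Starlette or asgi-gzip code; `should_gzip` and its dict-based arguments are hypothetical names chosen for illustration:

```python
def should_gzip(request_headers, response_headers):
    """Decide whether a response is safe to gzip.

    Both arguments are plain dicts with lowercase header names
    (an assumption made to keep the sketch short).
    """
    # The client must advertise gzip support.
    if "gzip" not in request_headers.get("accept-encoding", ""):
        return False
    # Never re-encode a response that is already encoded.
    if "content-encoding" in response_headers:
        return False
    # Server-Sent Events must be flushed event by event, so skip them.
    content_type = response_headers.get("content-type", "")
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type != "text/event-stream"
```

For example, `should_gzip({"accept-encoding": "gzip"}, {"content-type": "text/event-stream"})` returns `False`, while the same request with an HTML response returns `True`.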

2026-04-10

AI, Tech

Artificial Intelligence, Agentic Ai, Ai Regulation, Enterprise Ai

Sources

The Tale of Two AIs: Frontier Capability vs. Public Perception — 2026-04-10

Highlights

Today’s discourse reveals a widening chasm between the staggering capabilities of state-of-the-art agentic models and the general public’s perception shaped by older, free-tier chatbots. Meanwhile, sweeping regulatory shifts in Europe threaten local AI innovation with strict copyright presumptions, even as enterprise deployments face severe worker backlash due to soaring technology friction.

2026-04-10

AI, Tech

Local Models, Ai-Agents, Ai Safety, Image Generation, Coding Assistants

Sources

AI Reddit — 2026-04-10

The Buzz

The biggest shockwave today isn’t a new benchmark—it’s a massive escalation in the AI safety narrative. Following a terrifying Molotov cocktail attack on OpenAI CEO Sam Altman’s home, the community is reeling from a breaking Bloomberg report that Treasury Secretary Bessent and Fed Chair Powell issued an urgent warning to bank CEOs about an “Anthropic model scare”. Anthropic’s unreleased Claude Mythos model reportedly demonstrated offensive cybersecurity capabilities so severe it could compromise global financial controls, sparking fierce debate over whether this is a genuine “black swan” systemic risk or just an elaborate pre-IPO marketing stunt.

2026-04-10

Blogs, AI, Tech

Chatgpt, Openai, Llms, Kakapo

Simon Willison — 2026-04-10

Highlight

Simon points out the non-obvious reality that ChatGPT’s Advanced Voice Mode is actually running on an older, weaker model than the ones behind OpenAI’s flagship developer tools. Drawing on insights from Andrej Karpathy, he highlights the widening capability gap between consumer-facing voice interfaces and B2B-focused reasoning models that benefit from verifiable reinforcement learning.

Posts

[ChatGPT voice mode is a weaker model] Simon reflects on the counterintuitive fact that OpenAI’s Advanced Voice Mode runs on a GPT-4o era model with an April 2024 knowledge cutoff. Prompted by a tweet from Andrej Karpathy, he contrasts this consumer feature with top-tier coding models capable of coherently restructuring entire codebases or finding system vulnerabilities. Karpathy notes this divergence in capabilities exists because coding tasks offer explicit, verifiable reward functions ideal for reinforcement learning and hold significantly more B2B value.
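
To make "verifiable reward" concrete: for a coding task, a score can be computed mechanically by running a candidate against test cases, with no human judgment involved. A toy sketch under that assumption; `reward`, `buggy_add`, and the test cases are hypothetical and do not describe any lab's actual training setup:

```python
def reward(candidate, test_cases):
    """Return the fraction of test cases the candidate function passes."""
    if not test_cases:
        return 0.0
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # A crash counts as a failed case rather than aborting scoring.
            pass
    return passed / len(test_cases)


def buggy_add(a, b):
    # Deliberately wrong for negative inputs, to show a partial score.
    return abs(a) + abs(b)


cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0), ((2, 2), 4)]
print(reward(buggy_add, cases))  # prints 0.75
```

A conversational voice reply has no equivalent of these test cases, which is one way to read the gap Karpathy describes.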