The Agentic Era Arrives: Capability Gaps, Financial AI, and the “Mythos” Controversy - 2026-04-09

Highlights

Today’s discussions reveal a sharp split in how AI is perceived: the general public fixates on consumer chatbot fumbles, while technical professionals report large productivity gains from state-of-the-art coding models. At the same time, the “agentic era” is moving from theory to practice as autonomous background workflows and orchestrated financial assistants reach the market, prompting urgent debate among industry leaders over safety and deployment timelines.

Top Stories

  • Perplexity Unveils ‘Computer’ as a Personal CFO: Orchestrating 19 specialized models simultaneously, Perplexity Computer now links directly to financial accounts via Plaid to track net worth and analyze spending. By cross-referencing personal data with live institutional sources like SEC filings, it offers a highly capable and accessible alternative to legacy platforms like the Bloomberg Terminal. (Source)
  • Karpathy Highlights the ‘AI Psychosis’ Capability Gap: Perceptions of AI capability are diverging, driven by the gulf between older free-tier models and the latest agentic models like OpenAI Codex and Claude Code. Technical professionals using these frontier models are seeing large, verifiable gains in domains like programming, an experience that mainstream users of older models never encounter. (Source)
  • Hassabis Warns of the Looming ‘Agentic Era’: Google DeepMind CEO Demis Hassabis argued that the commercial AI race forced a premature deployment of chatbots, diverting focus from root scientific problems like curing cancer. He warned that autonomous task-completing systems will arrive within two to four years, making AI alignment a difficult and critical technical challenge. (Source)
  • Silicon Valley’s Quiet Reliance on Chinese Open Source: Top-tier AI products in the US are increasingly built on Chinese open-source models, revealing a hidden supply chain. For instance, Cursor’s Composer 2 relies on Moonshot’s Kimi K2.5, while Shopify and Airbnb have shifted to Alibaba’s Qwen model for significant cost savings and performance. (Source)
  • OpenAI Launches $100 Pro Tier Amid GenAI Shifts: Capitalizing on the high demand and strong reception for its Codex model, OpenAI is introducing a new $100 ChatGPT Pro subscription. This comes as commentators note shifting dynamics in the market, including OpenAI’s decision to pause its UK Stargate infrastructure project over regulatory and energy concerns. (Source)

Articles Worth Reading

The “Mythos” Cybersecurity Debate and Model Evaluations (Source)

Anthropic’s newly announced Mythos model is generating controversy over its actual cybersecurity capabilities. Critics argue the announcement was overblown, noting that the testing lacked real-world sandboxing and that cheap, open-weight models could apparently reproduce similar exploit analysis. AI executives counter that these tests are fundamentally disingenuous: the cheap open-source models were spoon-fed just 20 lines of highly relevant code, whereas the real challenge and value of frontier vulnerability detection lies in a model’s ability to reason across entire, unstructured file systems.

The Paradigm Shift to Long-Running Background Agents (Source)

The dominant mental model of AI as a synchronous chat interface is giving way to always-on background agents embedded directly in enterprise workflows. According to Box CEO Aaron Levie, long-running agents that execute code securely, access compute sandboxes, and integrate data across systems are the architecture of the future. In support of this shift, tools like Claude’s new Monitor tool let background scripts wake the agent for tasks like following error logs or polling pull requests, so the agent does not have to check continuously itself, which reduces token usage and streamlines the agent loop. The biggest IT challenge of the next decade will be ensuring these agents have secure, governed access to the right enterprise context to make autonomous decisions.
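The wake-on-event idea described above can be illustrated with a minimal sketch. This is not the API of Claude's Monitor tool or of any real agent framework: `monitor_log`, `wake_agent`, and the error pattern are hypothetical names chosen for illustration. The assumption is simply that a cheap background script filters a log stream and calls the agent only when a line looks like an error, so the agent spends no tokens on routine lines.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical trigger: wake the agent only for lines at these log levels.
ERROR_PATTERN = re.compile(r"\b(ERROR|CRITICAL)\b")


def monitor_log(
    lines: Iterable[str],
    wake_agent: Callable[[str], None],
    pattern: re.Pattern = ERROR_PATTERN,
) -> int:
    """Scan log lines and call wake_agent only for matching lines.

    Returns the number of times the agent was woken.
    """
    wake_ups = 0
    for line in lines:
        if pattern.search(line):
            # Only matching lines ever reach the (expensive) agent.
            wake_agent(line.rstrip("\n"))
            wake_ups += 1
    return wake_ups


def demo() -> List[str]:
    """Run the monitor over a small in-memory log; return the lines the agent saw."""
    log = [
        "2026-04-09 10:00:01 INFO service started",
        "2026-04-09 10:00:05 ERROR database connection refused",
        "2026-04-09 10:00:06 INFO retrying",
        "2026-04-09 10:00:09 CRITICAL out of retries",
    ]
    handled: List[str] = []
    # Stand-in for a real agent call: just record what would be handed over.
    monitor_log(log, handled.append)
    return handled


if __name__ == "__main__":
    for seen in demo():
        print("agent woken for:", seen)
```

In this sketch the agent is invoked for two of the four log lines. A real deployment would replace `handled.append` with whatever mechanism the agent platform provides for resuming a session, which is outside what the source describes.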


Categories: AI, Tech