
Tech Videos — 2026-06-09

Watch First

RAG is dead, right?? — Kuba Rogut of Turbopuffer cuts through the “agentic file search” hype by showing how Cursor actually indexes codebases: it uses Merkle trees and Turbopuffer to implement a semantic search tool that improves model answer accuracy by nearly 24% over naïve grep loops.
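The summary above does not say how Cursor's Merkle trees are built, so the following is only a generic sketch of the idea, under stated assumptions: every file gets a content hash, every directory gets a hash of its children's hashes, and two snapshots are compared from the root down so that unchanged subtrees are skipped. The function names (`build_tree`, `changed_files`) and the in-memory `dict` of paths are hypothetical, chosen for illustration.

```python
import hashlib


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_tree(files: dict[str, bytes]):
    """Return (hashes, children) for a snapshot given as {path: content}.

    hashes maps every file and directory path to a hash; the root is "".
    children maps each directory path to the set of paths directly under it.
    """
    hashes = {path: sha(content) for path, content in files.items()}
    children: dict[str, set[str]] = {}
    for path in files:
        node = path
        while node:
            parent = node.rpartition("/")[0]  # "" for top-level entries
            children.setdefault(parent, set()).add(node)
            node = parent

    def depth(directory: str) -> int:
        return 0 if directory == "" else directory.count("/") + 1

    # Hash directories deepest first so child hashes exist before the parent's.
    for directory in sorted(children, key=depth, reverse=True):
        payload = "\n".join(f"{c}:{hashes[c]}" for c in sorted(children[directory]))
        hashes[directory] = sha(payload.encode())
    return hashes, children


def changed_files(old_hashes, new_hashes, new_children, node=""):
    """List files in the new snapshot that are new or modified.

    Subtrees whose hash matches the old snapshot are skipped entirely.
    Deleted files are not reported in this sketch.
    """
    if old_hashes.get(node) == new_hashes[node]:
        return []
    if node not in new_children:  # a file, not a directory
        return [node]
    result = []
    for child in sorted(new_children[node]):
        result.extend(changed_files(old_hashes, new_hashes, new_children, child))
    return result


v1 = {"README.md": b"hi", "src/a.py": b"1", "src/b.py": b"2"}
v2 = {"README.md": b"hi", "src/a.py": b"1", "src/b.py": b"3", "docs/x.md": b"new"}

old_hashes, old_children = build_tree(v1)
new_hashes, new_children = build_tree(v2)
print(changed_files(old_hashes, new_hashes, new_children))
```

In an indexing pipeline of this shape, only the files returned by `changed_files` would presumably need to be re-embedded and re-uploaded to the vector store; whether Cursor does exactly this is not stated in the summary.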

Highlights by Theme

Developer Tools & Platforms

Apple is fully embracing the open-source Model Context Protocol (MCP) in Xcode 27, as shown on Apple Developer, allowing developers to bring custom agents and MCP tools directly into their IDE plugins. In a practical look at agentic workflows, Visual Studio Code highlighted how engineers are using the GitHub Copilot app in “autopilot” mode with local SQLite databases and MCP servers to asynchronously scaffold applications. Meanwhile, The Pragmatic Engineer offered a healthy dose of skepticism, warning against the chaos of giving autonomous agents raw AWS console access instead of enforcing infrastructure-as-code tools like Terraform.

AI & Machine Learning

Anthropic released their highly autonomous Mythos-class model, showcased on Anthropic, noting that Claude Fable 5 can run complex tasks for days but enforces safety by redirecting high-risk cyber and biology requests to the older Opus 4.8 model. On the audio front, Google DeepMind detailed Gemini 3.1 Flash Live on AI Engineer, demonstrating a full-duplex, native audio-to-audio model that bakes reasoning directly into the audio stack to understand pacing, interruptions, and accents without cascading through a text layer. Finally, Google for Developers proved that small open-weight models are viable for local edge robotics, running Gemma 4 2B directly on Raspberry Pi 5 and Jetson Nano hardware for multimodal real-time inference.

Hardware & Infrastructure

To address the massive token generation demands of agentic loops, NVIDIA broke down their Enterprise Reference Architectures, detailing how they validate scalable units from RTX Pro nodes up to NVL72 gigascale clusters using Spectrum-X Ethernet. For developers who want instant GPU access without the infrastructure overhead, AI Engineer featured RunPod’s Flash Python SDK, which lets you decorate async Python functions so that they execute directly on cloud H100s, with hot-model reloading from a local IDE.

Everything Else

All-In Podcast featured Bill Maris explaining the harsh math behind why smaller VC funds (under $750M) persistently outperform multibillion-dollar mega-funds, which would need unachievably large exits to return 3x. Reviewing WWDC 2026, Marques Brownlee noted that while the Apple Intelligence upgrades are solid, the most advanced on-device Siri models are strictly hardware-gated to devices with 12GB of RAM, immediately outdating the iPhone 16 line.
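The fund-size argument attributed to Bill Maris can be made concrete with a back-of-the-envelope calculation. This is a rough sketch, not his actual model: it assumes the fund holds a fixed average ownership stake at exit (the 10% figure and the $5B comparison fund are hypothetical) and ignores fees, dilution, and timing.

```python
def required_exit_value(fund_size: float, target_multiple: float,
                        ownership_at_exit: float) -> float:
    """Combined exit value of portfolio companies needed to hit the target multiple."""
    return fund_size * target_multiple / ownership_at_exit


ASSUMED_OWNERSHIP = 0.10  # hypothetical average stake held at exit

for fund_size in (750e6, 5e9):  # the $750M threshold and a hypothetical $5B mega-fund
    needed = required_exit_value(fund_size, 3.0, ASSUMED_OWNERSHIP)
    print(f"${fund_size / 1e9:.2f}B fund -> ${needed / 1e9:.1f}B in total exit value")
```

Under these assumptions, a $750M fund needs its companies to exit for a combined $22.5B to return 3x, while a $5B fund needs $150B. The required exit value scales linearly with fund size, which is the core of the argument.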


Categories: YouTube, Tech