Sources
- AI Engineer
- All-In Podcast
- Andrej Karpathy
- Anthropic
- Apple
- Apple Developer
- AWS Events
- ByteByteGo
- Computerphile
- Cursor
- Dwarkesh Patel
- EO
- Fireship
- GitHub
- Google Cloud Tech
- Google DeepMind
- Google for Developers
- Hung-yi Lee
- Lenny's Podcast
- Lex Clips
- Lex Fridman
- Life at Google
- Marques Brownlee
- Microsoft
- No Priors: AI, Machine Learning, Tech, & Startups
- Numberphile
- NVIDIA
- OpenAI
- Perplexity
- Quanta Magazine
- Slack
- The Pragmatic Engineer
- Visual Studio Code
Tech Videos — 2026-04-11
Watch First
Reinforcement Learning at Scale: Engineering the Next Generation of Intelligence offers a deeply technical look at the systems-level nightmare of scaling RL, contrasting its unpredictable “guerrilla warfare” workload with the synchronized marching of standard pre-training.
Highlights by Theme
Developer Tools & Platforms
In Accelerate AI through Open Source Inference | NVIDIA GTC from the NVIDIA Developer channel, the panel highlights the launch of Hugging Face Transformers v5, focusing heavily on interoperability and cementing the library as the standard backend for production inference servers like vLLM and SGLang. On Lenny’s Podcast, How to future-proof your career offers pragmatic advice for engineers: double down on your “unfair advantage” and adopt AI coding tools aggressively, since a workflow that fails today may be seamlessly unblocked by the next model release.
AI & Machine Learning
The Reinforcement Learning at Scale: Engineering the Next Generation of Intelligence panel offers substantive insight into the algorithmic shift from static knowledge retrieval to dynamic reasoning, noting how RL optimization hits unpredictable failure modes when chained to external APIs or delayed physical-world rewards. On the inference side, Accelerate AI through Open Source Inference | NVIDIA GTC details why Mixture of Experts (MoE) architectures and highly compressed latent spaces are necessary to handle massively parallel serving within consumer-level VRAM constraints. Taking a market perspective, the All-In Podcast’s Why they are trying to KILL OpenClaw argues that frontier models are actively threatened by the rise of open-source small language models (SLMs) that can be verticalized and run locally on laptops.
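The serving-cost argument behind MoE can be made concrete with a toy top-k router. This is a minimal sketch, not the talk's implementation: the dimensions, random weights, and the 2-of-8 expert split are all illustrative assumptions, but they show why sparse routing touches only a fraction of the model's weights per token.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_token(hidden, gate_weights, k=2):
    """Score each expert with a linear gate and keep only the top-k.

    With k=2 of 8 experts active per token, only a quarter of the
    expert parameters participate in each forward pass, which is the
    core of the MoE serving-efficiency argument.
    """
    logits = [sum(h * w for h, w in zip(hidden, expert_gate))
              for expert_gate in gate_weights]
    topk = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in topk])
    return list(zip(topk, probs))  # (expert_id, mixing_weight) pairs

random.seed(0)
hidden = [random.uniform(-1, 1) for _ in range(4)]                     # toy hidden state
gate = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(8)]   # 8 toy experts
picks = route_token(hidden, gate, k=2)
print(picks)
```

In a real MoE layer the selected experts' outputs are combined with these mixing weights; here only the routing step is shown.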
Hardware & Infrastructure
Scaling RL is actively breaking standard GPU cluster paradigms because the compute load shifts dynamically between inference and training depending on the model’s self-directed tool usage, as detailed in Reinforcement Learning at Scale: Engineering the Next Generation of Intelligence. This paradigm requires highly flexible orchestration that can autotune and load-balance on the fly, a massive departure from handling static hardware failures during pre-training.
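The on-the-fly rebalancing idea can be sketched as a toy scheduler that shifts GPUs toward whichever phase is backlogged. The `Cluster` type, the queue-depth heuristic, and the thresholds here are illustrative assumptions, not anything described in the talk.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    rollout_gpus: int  # GPUs serving inference for RL rollouts
    train_gpus: int    # GPUs running gradient updates

def rebalance(cluster, pending_rollouts, pending_batches, step=1):
    """Move `step` GPUs toward the longer queue.

    In RL, tool calls and environment latency make rollout demand
    spiky, so a static inference/training split strands capacity.
    This toy heuristic rebalances per tick, never draining either
    pool below one GPU.
    """
    if pending_rollouts > pending_batches and cluster.train_gpus > 1:
        cluster.train_gpus -= step
        cluster.rollout_gpus += step
    elif pending_batches > pending_rollouts and cluster.rollout_gpus > 1:
        cluster.rollout_gpus -= step
        cluster.train_gpus += step
    return cluster

c = Cluster(rollout_gpus=4, train_gpus=4)
rebalance(c, pending_rollouts=100, pending_batches=2)  # rollout backlog: shift one GPU
print(c)
```

A production orchestrator would also have to checkpoint and reload model state when a GPU changes roles, which is exactly the kind of machinery pre-training clusters never needed.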
Everything Else
Lex Fridman’s channel released a slew of history clips outlining how the “creative destruction” of Viking raids laid the groundwork for modern Europe. These include discussions of the first Duke of Normandy in The Viking Warlord who built modern Europe, brutal execution methods in Brutal revenge by Viking army: The Blood Eagle, battle trances in Berserkers: The most terrifying Viking warriors, the tactical targeting of wealthy monasteries in Why Viking raids were so successful, and hospitality-driven belief systems in The role of religion in history of civilization. On the science side, Biotech Has to Change criticizes the unsustainable $2 billion, 10-year cost required to bring drugs to market, while Dwarkesh Patel’s Why Quantum Computing Was Delayed by 30 Years - Michael Nielsen points out that early quantum concepts lacked traction largely because basic computing wasn’t culturally salient in the 1950s. Lastly, The Perplexity Computer Stock Pitch Competition rounds out the source list.