Sources

Company@X - 2026-06-19

Signal of the Day

Google Cloud unveiled the TPU 8i, a new AI accelerator explicitly designed for post-training and high-concurrency reasoning workloads. The chip features a new serving-optimized “Boardfly” network topology, the largest on-chip SRAM capacity to date, and a Collectives Acceleration Engine. The design marks a major architectural pivot toward scaling inference efficiency.

Key Announcements

[Z.ai] · Source Z.ai officially launched GLM-5.2, positioning it as the most capable open-weights model to date. The release features a 1M context window, an MIT license, and two distinct reasoning levels (max and high), while keeping API pricing unchanged from the previous GLM-5.1 generation.

[Nebula Security] · Source Security agent VEGA disclosed a critical Remote Code Execution (RCE) vulnerability in Nginx, designated nginx-quicburst (CVE-2026-42530). It affects Nginx 1.31 environments with QUIC enabled and is only the third vulnerability since 2014 to receive a “major” severity rating. Immediate patching is required.

[Amazon Web Services] · Source AWS officially launched its new Local Zone in Hanoi, Vietnam, bringing latency-sensitive workloads closer to regional customers. The new infrastructure supports Amazon S3 and Amazon EBS Local Snapshots, providing single-digit millisecond response times and robust data sovereignty for the local market.

[ValgorithmicInc] · Source Valgorithmic successfully recreated and expanded Waymo’s “human baselines” for autonomous driving safety, extending the dataset to encompass new cities and trucking routes. They have open-sourced the underlying tool to provide the industry with a shared reference point and invite critical scrutiny of AV safety metrics.

[Hugging Face] · Source Continuous batching has officially landed in the Transformer Reinforcement Learning (TRL) library, specifically for Group Relative Policy Optimization (GRPO). With it, TRL natively processes 64 generations faster and with lower VRAM consumption than standard text generation, without requiring external tools like vLLM.
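
To illustrate why continuous batching can shorten generation, here is a minimal toy sketch in Python. It is not TRL's implementation: the function names, the slot model, and the sample lengths are all assumptions made for illustration, and it counts decode steps only (it does not model VRAM). Static batching holds every slot until the longest sequence in the batch finishes, while continuous batching refills a slot as soon as its sequence ends.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each batch runs until its longest sequence ends."""
    steps = 0
    for start in range(0, len(lengths), batch_size):
        steps += max(lengths[start:start + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a finished sequence frees its slot immediately."""
    queue = deque(lengths)   # sequences waiting for a slot
    active = []              # tokens still to generate per running sequence
    steps = 0
    while queue or active:
        # Refill any free slots before the next decode step.
        while queue and len(active) < batch_size:
            active.append(queue.popleft())
        steps += 1
        # Every active sequence emits one token; finished ones leave.
        active = [remaining - 1 for remaining in active if remaining > 1]
    return steps


if __name__ == "__main__":
    # Hypothetical generation lengths (in tokens) for eight completions.
    lengths = [2, 8, 3, 8, 2, 3, 8, 2]
    print("static:", static_batching_steps(lengths, batch_size=4))
    print("continuous:", continuous_batching_steps(lengths, batch_size=4))
```

The gap between the two counts grows when completion lengths vary widely, which is typical when sampling many generations per prompt.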

Also Noted

[UnslothAI] (Source): Released a highly compressed 2-bit local version of the new GLM-5.2 model, shrinking it from 1.51 TB to 238 GB while retaining approximately 82% accuracy.
[Ollama] (Source): Doubled its US-based cloud GPU capacity, which runs heavily on NVIDIA B300 Blackwell hardware, to accommodate the immediate surge in GLM-5.2 inference traffic.
[Magnitude] (Source): Launched a new command-line coding agent that runs strictly on open models, promising performance on par with Claude Code at 60% lower cost.
[Waymo] (Source): Deployed custom “Supergirl” wrapped robotaxis across San Francisco, Los Angeles, and Phoenix as part of a promotional campaign for an upcoming theatrical release.