Chinese Tech Daily — 2026-05-10
Top Story
ifanr (爱范儿) - MiniMax responds to its large model not recognizing Ma Jiaqi (马嘉祺)
MiniMax recently published a technical blog post explaining why its M2 series of large models suddenly "forgot" the Chinese celebrity Ma Jiaqi, and the explanation offers a useful look at how tokens can degrade during LLM post-training. The tokenizer had merged the given name "Jiaqi" (嘉祺) into a single token, but that token appeared fewer than five times in the post-training dialogue data, so its weight vector drifted as updates driven by high-frequency tokens crowded it out. After scanning the full vocabulary, MiniMax found that nearly 4.9% of tokens suffered from similar parameter drift, especially Japanese tokens (29.7%). The company fixed the issue by constructing synthetic data so that every token is practiced in simple repetition tasks.
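As a rough illustration of the idea, the sketch below flags under-represented tokens and builds repetition examples for them. It is not MiniMax's actual pipeline: their scan reportedly measured parameter drift, whereas this toy version uses raw occurrence counts as a stand-in. The function names `rare_tokens` and `repetition_examples`, the prompt wording, and the toy vocabulary are all hypothetical; only the "fewer than five" threshold comes from the story.

```python
from collections import Counter

# Threshold borrowed from the "fewer than five times" figure in the story.
MIN_COUNT = 5


def rare_tokens(vocab, tokenized_dialogues, min_count=MIN_COUNT):
    """Return vocabulary tokens seen fewer than min_count times in the data."""
    counts = Counter(tok for dialogue in tokenized_dialogues for tok in dialogue)
    return [tok for tok in vocab if counts[tok] < min_count]


def repetition_examples(tokens, min_count=MIN_COUNT):
    """Build simple 'repeat this' prompt/response pairs, min_count per token."""
    return [
        {"prompt": f"Repeat exactly: {tok}", "response": tok}
        for tok in tokens
        for _ in range(min_count)
    ]


# Toy data (hypothetical): "Jiaqi" is in the vocabulary but appears only once.
vocab = ["hello", "world", "Jiaqi"]
dialogues = [["hello", "world"]] * 6 + [["hello", "Jiaqi"]]

rare = rare_tokens(vocab, dialogues)
examples = repetition_examples(rare)
```

In a real setting the counts would come from the model's own tokenizer run over the post-training set, and the synthetic examples would be mixed back into that set so that no token goes unpracticed.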