张逸骅的博客 | Yihua's Blog

A Role Shift for AI Infra: From Foundational Support to a Core Engine of Innovation

AI Market Insights

Amidst the brilliance of today’s large language models (LLMs), the vast and intricate systems that underpin them...

Posted by Yihua Zhang on October 2, 2025

From GRPO to DAPO and GSPO: What, Why and How [En/中]

The Evolution of GRPO: From GRPO to DAPO and GSPO

Posted by Yihua Zhang on August 8, 2025

Re-understanding KL Approximation from an RL-for-LLM Lens: Notes on “Approximating KL Divergence” [En/中]

Revisiting KL Estimation from an RL-for-LLM Perspective: Notes on “Approximating KL Divergence”

Posted by Yihua Zhang on July 2, 2025

Decorators in Machine Learning Projects [En/中]

Decorators in Machine Learning

Posted by Yihua Zhang on April 10, 2025

A Musical Deconstruction of “Bauklötze”

Echoes of Fate as the Building Blocks Collapse: The Titans' Elegy Hiroyuki Sawano Built from Notes

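I have long wanted to write a music appreciation piece on the Attack on Titan OST. For one thing, Hiroyuki Sawano's score is genuinely moving; for another, Attack on Titan and Sawano's music combine in emotion and atmosphere so well that 1+1 > 2, and whenever the notes sound, the audience's heartstrings tremble with them. So I am starting a new series here, turning my favorite tracks into a set of music appreciation essays, which I count as my...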

Posted by Yihua Zhang on March 8, 2025

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Background [En/中]

DualPipe Made Simple: A Complete Walkthrough of DualPipe You Can Follow Without a Distributed Training Background

Posted by Yihua Zhang on February 27, 2025

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment [En/中]

The Full RLHF Pipeline for Large Language Models Revealed: A Practical Guide from Policy Gradients, PPO, and GAE to DPO

Posted by Yihua Zhang on February 11, 2025

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge [En/中]

DeepSeek-R1 Technical Dissection: PPO & GRPO You Can Understand Without a Reinforcement Learning Background

Posted by Yihua Zhang on February 7, 2025

Why Cache 32 Heads When One Latent Variable Suffices? A Theory-to-Code Guide to DeepSeek’s MLA for KV-Cache [En/中]

From Multi-Head Sharing to Latent Variables: How DeepSeek's MLA Redefines the KV-Cache Through Low-Rank Projection and On-Demand Decompression

Posted by Yihua Zhang on February 2, 2025

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning [En/中]

Long Awaited, Finally Here: How DeepSeek-R1 Achieves Complex Reasoning Through Reinforcement Learning

Posted by Yihua Zhang on January 20, 2025
