A Chinese research map built around one closed loop: a robot choosing actions with a learned imagination model. Fifty recent (2026 H1) world-model and long-horizon papers from top labs are each placed on one of seven parts of the loop: state representation, memory, dynamics, event verification, trust horizon, planner–model coupling, and the action/evaluation/data interface.
Writing
Technical essays and research notes on AI infrastructure, agent systems, and the mechanics behind modern models.
An overview of the long-horizon line of work in one frame: six failure modes crossed against five recurring moves in a single matrix, the robot–agent isomorphism, and the open trust-horizon question. It synthesizes the rollout-drift, interfaces, and map notes below.
A Chinese research note that splits long-horizon world-model work into five system interfaces: planner usage, rollout fidelity, event memory, action hierarchy, and evaluation/data infrastructure.
A Chinese research note on how to turn long-horizon world-model claims into tasks, metrics, baselines, ablations, and a verifier-MPC experiment.
A Chinese research matrix for long-horizon robot world models and policies: rollout drift, closed-loop planning, trust horizon, event verification, temporal abstraction, and object/state persistence.
A bilingual research map of the long-horizon problem across robot world models, robot policies, and language-model agents.
A Chinese research note on language world models, digital-agent simulators, next-observation prediction, and why a text model can still satisfy the world-model interface.
The phrase "long horizon" names a reliability problem in language-model agents and a fidelity problem in world models. A bilingual comparison across METR's time horizon, self-conditioning, MBPO compounding error, Dreamer/TD-MPC planning, and Genie-style drift.
A Chinese technical breakdown of Puppeteer, hierarchical world models, MoCap-trained motion priors, end-effector commands, and naturalness evaluation for humanoid control.
A bilingual mechanism note on diffusion as iterative denoising, score estimation, latent-space generation, guidance, video diffusion, and the connection to Cosmos Policy.
A bilingual mechanism note on latent slots, diffusion denoising, and how Cosmos Policy fine-tunes a video world model to output action, future state, and value.
A Chinese technical breakdown of PRISM-WM, TD-MPC-style latent dynamics, hybrid systems, orthogonalized MoE, and long-horizon rollout fidelity.
A Chinese paper note on TD-MPC, task-oriented latent dynamics, reconstruction quality, action quality, short-horizon planning, and terminal value estimation.
A note on KV cache, MQA/GQA, MLA, and why long-context models are becoming memory systems.
Dense notes on papers worth understanding deeply, not summaries for engagement.
Work in public, but keep the bar high.