A Chinese mechanism note on the transition operator inside learned world models. VLWM changes the temporal scale of the dynamics interface; PRISM-WM changes its physical regime structure. Both put pressure on the same assumption: that one smooth, monolithic transition function can serve every planning rollout.
Writing
Technical essays and research notes on AI infrastructure, agent systems, and the mechanics behind modern models.
A bilingual paper note on GRASP. In a standard rollout, intermediate states are computed by the world model. In GRASP, they become virtual states that are optimized jointly with the actions and checked for dynamics consistency.
A bilingual paper note on WEAVER, a multi-view latent world model for robotic manipulation. It targets fidelity, long-horizon consistency, and efficiency at once, so one model can serve policy evaluation, policy improvement, and test-time planning on real hardware.
A bilingual paper note on WAV. Action-conditioned prediction is split into two independently verifiable factors — state plausibility and action reachability — exploiting the forward-inverse asymmetry so a world model can find its own errors and choose which interactions to collect next.
A bilingual paper note on EV-WM (long-horizon robotic manipulation). A feature-space world model is scored not by how close its predicted features land to a goal, but by a verifier that decodes each imagined future into task predicates and checks whether the event actually happened — guiding CEM planning and gating candidate actions.
A bilingual paper note on NEUBAY (offline model-based RL): it removes explicit conservatism and relies on long-horizon rollouts to control value overestimation. A posterior over world models decides how far to roll out, and the discount factor down-weights the bootstrapped terminal value, the one term prone to overestimation.
A Chinese research map built around one closed loop — a robot choosing actions with a learned imagination model. Fifty recent (2026 H1) world-model and long-horizon papers from top labs, each placed on one of seven parts of the loop: state representation, memory, dynamics, event verification, trust horizon, planner–model coupling, and the action/evaluation/data interface.
An overview of the long-horizon line in one frame: six failure modes crossed against five recurring moves in a single matrix, the robot–agent isomorphism, and the open trust-horizon question. A synthesis of the rollout-drift, interfaces, and map notes below.
A Chinese research note that splits long-horizon world-model work into five system interfaces: planner usage, rollout fidelity, event memory, action hierarchy, and evaluation/data infrastructure.
A Chinese research note on how to turn long-horizon world-model claims into tasks, metrics, baselines, ablations, and a verifier-MPC experiment.
A Chinese research matrix for long-horizon robot world models and policies: rollout drift, closed-loop planning, trust horizon, event verification, temporal abstraction, and object/state persistence.
A Chinese reading-priority table for 152 long-horizon and world-model papers, scored by importance, world-model relevance, long-horizon relevance, organization signal, and source-check status.
A bilingual research map of the long-horizon problem across robot world models, robot policies, and language-model agents.
A Chinese research note on language world models, digital-agent simulators, next-observation prediction, and why a text model can still satisfy the world-model interface.
The phrase "long horizon" names a reliability problem in language-model agents and a fidelity problem in world models. A bilingual comparison across METR's time horizon, self-conditioning, MBPO compounding error, Dreamer/TD-MPC planning, and Genie-style drift.
A Chinese technical breakdown of Puppeteer, hierarchical world models, MoCap-trained motion priors, end-effector commands, and naturalness evaluation for humanoid control.
A bilingual mechanism note on diffusion as iterative denoising, score estimation, latent-space generation, guidance, video diffusion, and the connection to Cosmos Policy.
A bilingual mechanism note on latent slots, diffusion denoising, and how Cosmos Policy fine-tunes a video world model to output action, future state, and value.
A Chinese technical breakdown of PRISM-WM, TD-MPC-style latent dynamics, hybrid systems, orthogonalized MoE, and long-horizon rollout fidelity.
A Chinese paper note on TD-MPC, task-oriented latent dynamics, reconstruction quality, action quality, short-horizon planning, and terminal value estimation.
A note on KV cache, MQA/GQA, MLA, and why long-context models are becoming memory systems.
Dense notes on papers worth understanding deeply, not summaries for engagement.
Work in public, but keep the bar high.