Writing — Sense Wang

The world model's transition operator: VLWM and PRISM-WM change the same interface

June 2026

A Chinese mechanism note on the transition operator inside learned world models. VLWM changes the temporal scale of the dynamics interface; PRISM-WM changes its physical regime structure. Both challenge the same assumption: that one smooth, monolithic transition function can serve every planning rollout.

GRASP: planning by optimizing the states in between

June 2026 · bilingual

A bilingual paper note on GRASP. In a standard rollout, intermediate states are computed by the world model. In GRASP, they become virtual states that are optimized together with the actions and checked for dynamics consistency.

WEAVER: a world model that is faithful, consistent, and fast at once

June 2026 · bilingual

A bilingual paper note on WEAVER, a multi-view latent world model for robotic manipulation. It targets fidelity, long-horizon consistency, and efficiency at once, so one model can serve policy evaluation, policy improvement, and test-time planning on real hardware.

WAV: verifying actions through forward-inverse asymmetry

June 2026 · bilingual

A bilingual paper note on WAV. Action-conditioned prediction is split into two independently verifiable factors — state plausibility and action reachability — exploiting the forward-inverse asymmetry so a world model can find its own errors and choose which interactions to collect next.

EV-WM: verifying events, not features

June 2026 · bilingual

A bilingual paper note on EV-WM (long-horizon robotic manipulation). A feature-space world model is scored not by how close its predicted features land to a goal, but by a verifier that decodes each imagined future into task predicates and checks whether the event actually happened. The verifier both guides CEM planning and gates candidate actions.

Long-horizon rollouts as overestimation control

June 2026 · bilingual

A bilingual paper note on NEUBAY (offline model-based RL): it removes explicit conservatism and lets long-horizon rollouts control value overestimation. A posterior over world models decides how far to roll out, and the discount factor demotes the bootstrapped terminal value, the one term prone to overestimation.

A closed-loop base map of robot world models: which part each long-horizon paper changes

June 2026

A Chinese research map built around one closed loop: a robot choosing actions with a learned imagination model. It covers fifty recent (2026 H1) world-model and long-horizon papers from top labs, each placed on one of seven parts of the loop: state representation, memory, dynamics, event verification, trust horizon, planner–model coupling, and the action/evaluation/data interface.

Long Horizon Is Not One Problem

June 2026 · bilingual

An overview of the long-horizon line in one frame: six failure modes crossed against five recurring moves in a single matrix, the robot–agent isomorphism, and the open trust-horizon question. A synthesis of the rollout-drift, interfaces, and map notes below.

The five interfaces of long-horizon world models

June 2026

A Chinese research note that splits long-horizon world-model work into five system interfaces: planner usage, rollout fidelity, event memory, action hierarchy, and evaluation/data infrastructure.

Long Horizon World Models: from claims to experiments

June 2026

A Chinese research note on how to turn long-horizon world-model claims into tasks, metrics, baselines, ablations, and a verifier-MPC experiment.

Long Horizon: a research matrix for robot world models

June 2026

A Chinese research matrix for long-horizon robot world models and policies: rollout drift, closed-loop planning, trust horizon, event verification, temporal abstraction, and object/state persistence.

Long Horizon: reading tiers for 152 world-model papers

June 2026

A Chinese reading-priority table for 152 long-horizon and world-model papers, scored by importance, world-model relevance, long-horizon relevance, organization signal, and source-check status.

From Rollout to Context

June 2026

A bilingual research map of the long-horizon problem across robot world models, robot policies, and language-model agents.

Qwen-AgentWorld: the boundaries of text world models

June 2026

A Chinese research note on language world models, digital-agent simulators, next-observation prediction, and why a text model can still satisfy the world-model interface.

Two Long Horizons

June 2026

The phrase "long horizon" names a reliability problem in language-model agents and a fidelity problem in world models. A bilingual comparison across METR's time horizon, self-conditioning, MBPO compounding error, Dreamer/TD-MPC planning, and Genie-style drift.

Puppeteer: splitting humanoid control into a two-level world model

June 2026

A Chinese technical breakdown of Puppeteer, hierarchical world models, MoCap-trained motion priors, end-effector commands, and naturalness evaluation for humanoid control.

Diffusion: a path from noise to data

June 2026

A bilingual mechanism note on diffusion as iterative denoising, score estimation, latent-space generation, guidance, video diffusion, and the connection to Cosmos Policy.

Cosmos Policy: how a video model outputs robot actions

June 2026

A bilingual mechanism note on latent slots, diffusion denoising, and how Cosmos Policy fine-tunes a video world model to output action, future state, and value.

PRISM-WM: a world model cannot average away contact

June 2026

A Chinese technical breakdown of PRISM-WM, TD-MPC-style latent dynamics, hybrid systems, orthogonalized MoE, and long-horizon rollout fidelity.

TD-MPC: a world model need not reconstruct the world

June 2026

A Chinese paper note on TD-MPC, task-oriented latent dynamics, reconstruction quality, action quality, short-horizon planning, and terminal value estimation.

Attention Is Not Expensive. Remembering Attention Is.

April 2026

A note on KV cache, MQA/GQA, MLA, and why long-context models are becoming memory systems.