World Models
David Ha, Jürgen Schmidhuber
A foundational proposal for learning compressed internal environments that support imagination, rollout, and control.
A Research Direction
From prediction to simulation.
World Models learn an internal universe — a latent space where geometry, physics, causality, and future states can be simulated before action.
I. The Shift
Large language models learned to predict the next token.
World Models learn to predict the next state.
A token is a surface event.
A state is an internal structure.
That difference changes everything.
Token Prediction: sequence completion on the surface.
State Prediction: internal world evolution.
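To make the contrast concrete, here is a minimal sketch in Python. The token view maps a sequence of surface symbols to a distribution over the next symbol; the state view evolves an internal latent vector under an action. The dimensions, weights, and function names are illustrative assumptions, not an implementation from the paper.

```python
# A minimal sketch of the two prediction interfaces.
# All sizes and weights are illustrative assumptions, not a trained model.
import numpy as np

rng = np.random.default_rng(0)

def predict_next_token(token_ids: list[int], vocab_size: int = 1000) -> np.ndarray:
    """Surface-level prediction: a distribution over the next discrete symbol."""
    logits = rng.normal(size=vocab_size)          # stand-in for a trained LM head (ignores its input here)
    return np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary

def predict_next_state(z_t: np.ndarray, a_t: np.ndarray,
                       W_z: np.ndarray, W_a: np.ndarray) -> np.ndarray:
    """State-level prediction: evolve an internal latent state under an action."""
    return np.tanh(z_t @ W_z + a_t @ W_a)         # one step of a learned dynamics map

# Token view: the model only ever sees and emits surface symbols.
next_token_dist = predict_next_token([42, 7, 99])

# State view: the model carries an internal state z_t that the dynamics act on.
z_dim, a_dim = 32, 4
z_t, a_t = rng.normal(size=z_dim), rng.normal(size=a_dim)
W_z = rng.normal(size=(z_dim, z_dim)) * 0.1
W_a = rng.normal(size=(a_dim, z_dim)) * 0.1
z_next = predict_next_state(z_t, a_t, W_z, W_a)
print(next_token_dist.shape, z_next.shape)  # (1000,) (32,)
```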
II. The Internal Universe
In latent space, the model compresses reality into a form it can evolve.
Objects remain persistent.
Geometry remains coherent.
Physics remains implicit.
Causality remains traceable.
This is not raw memorization.
It is internal world construction.
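The compression step can be sketched very simply: a raw observation is mapped into a small latent vector that downstream dynamics operate on. The original paper uses a variational autoencoder for this role; the linear encoder and decoder below, and all dimensions, are simplified assumptions for illustration only.

```python
# A minimal sketch of compressing reality into a latent state the model can evolve.
# Linear maps stand in for a learned encoder/decoder (a VAE in the original paper).
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM = 64 * 64 * 3   # a flattened 64x64 RGB frame
LATENT_DIM = 32         # the compact internal state z

W_enc = rng.normal(size=(OBS_DIM, LATENT_DIM)) / np.sqrt(OBS_DIM)
W_dec = rng.normal(size=(LATENT_DIM, OBS_DIM)) / np.sqrt(LATENT_DIM)

def encode(obs: np.ndarray) -> np.ndarray:
    """Compress a raw observation into the latent state the model will evolve."""
    return obs @ W_enc

def decode(z: np.ndarray) -> np.ndarray:
    """Map a latent state back toward observation space (for training and inspection)."""
    return z @ W_dec

frame = rng.random(OBS_DIM)      # a stand-in camera frame
z = encode(frame)                # the model's internal description of the scene
reconstruction = decode(z)       # coarse and lossy, but structured
print(frame.shape, z.shape, reconstruction.shape)  # (12288,) (32,) (12288,)
```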
III. Temporal Rollout
A true world model does not stop at the next instant.
It moves forward.
One step.
Then another.
Then another.
Future states unfold inside the model before the world is touched.
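A rollout is just the dynamics applied repeatedly: starting from one latent state, the model steps itself forward so a whole trajectory of future states unfolds internally before anything happens in the environment. The dynamics function and dimensions below are illustrative assumptions, not a trained model.

```python
# A minimal latent rollout: imagine H steps into the future without touching the world.
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, A_DIM = 32, 4
W_z = rng.normal(size=(Z_DIM, Z_DIM)) * 0.1
W_a = rng.normal(size=(A_DIM, Z_DIM)) * 0.1

def dynamics(z: np.ndarray, a: np.ndarray) -> np.ndarray:
    """One imagined step: next latent state from current state and action."""
    return np.tanh(z @ W_z + a @ W_a)

def rollout(z0: np.ndarray, actions: np.ndarray) -> np.ndarray:
    """Unroll the model over a sequence of actions, collecting every imagined state."""
    trajectory = [z0]
    z = z0
    for a in actions:            # one step, then another, then another
        z = dynamics(z, a)
        trajectory.append(z)
    return np.stack(trajectory)  # shape: (len(actions) + 1, Z_DIM)

z0 = rng.normal(size=Z_DIM)
planned_actions = rng.normal(size=(10, A_DIM))  # a 10-step imagined action sequence
imagined = rollout(z0, planned_actions)
print(imagined.shape)  # (11, 32)
```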
IV. Structure
A useful internal model must preserve more than appearance.
It must understand:
Spatial structure
Object permanence
Motion continuity
Collision and force
Causal consequence
Intervention paths
The goal is not only to predict what happens.
The goal is to understand why.
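One way to read the list above is structural: instead of pixels, the internal state tracks persistent objects whose positions and velocities are carried forward by forces and collisions. The sketch below is a hand-written stand-in for such a structured state; its layout and constants are assumptions made for illustration, not a learned representation.

```python
# A minimal structured state: persistent objects, continuous motion, crude physics.
from dataclasses import dataclass

@dataclass
class ObjectState:
    """One persistent object: its identity survives across steps (object permanence)."""
    name: str
    x: float
    y: float
    vx: float
    vy: float

def step(objects: list[ObjectState], dt: float = 0.1, g: float = -9.8) -> list[ObjectState]:
    """Advance every object one step, preserving motion continuity and a collision rule."""
    advanced = []
    for o in objects:
        vy = o.vy + g * dt                  # force changes motion (causal consequence)
        x, y = o.x + o.vx * dt, o.y + vy * dt
        if y < 0.0:                         # collision with the ground
            y, vy = 0.0, -0.5 * vy          # crude bounce with energy loss
        advanced.append(ObjectState(o.name, x, y, o.vx, vy))
    return advanced

world = [ObjectState("ball", x=0.0, y=2.0, vx=1.0, vy=0.0)]
for _ in range(30):
    world = step(world)
print(world[0])  # the same ball, later in time, in a physically coherent place
```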
V. Causality
World Models enable a more powerful question:
What happens if I do this instead?
By simulating alternate futures internally, AI can reason through intervention, compare outcomes, and choose actions with foresight.
[Figure: divergent futures from a single decision.]
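Intervention inside the model can be sketched directly: the same starting state is rolled forward under two different first actions, and the imagined outcomes are compared before anything is executed. The dynamics, the value head, and all dimensions below are illustrative assumptions.

```python
# A minimal counterfactual comparison: "what happens if I do this instead?"
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, A_DIM, HORIZON = 32, 4, 10
W_z = rng.normal(size=(Z_DIM, Z_DIM)) * 0.1
W_a = rng.normal(size=(A_DIM, Z_DIM)) * 0.1
w_value = rng.normal(size=Z_DIM)          # stand-in for a learned value/reward head

def dynamics(z, a):
    return np.tanh(z @ W_z + a @ W_a)

def imagined_return(z0: np.ndarray, first_action: np.ndarray) -> float:
    """Intervene with `first_action`, then roll the model forward and score the future."""
    z, total = dynamics(z0, first_action), 0.0
    for _ in range(HORIZON - 1):
        z = dynamics(z, np.zeros(A_DIM))  # default behaviour after the intervention
        total += float(w_value @ z)
    return total

z0 = rng.normal(size=Z_DIM)
action_a, action_b = rng.normal(size=A_DIM), rng.normal(size=A_DIM)

# Both futures are simulated internally; only then is one action chosen.
outcome_a, outcome_b = imagined_return(z0, action_a), imagined_return(z0, action_b)
chosen = action_a if outcome_a >= outcome_b else action_b
print(outcome_a, outcome_b)
```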
VI. Embodiment
Embodied AI requires more than language.
It requires an internal engine for physical reality.
World Models make planning possible before execution.
They reduce trial-and-error in the real world.
They turn data into foresight.
This is one of the clearest paths toward robotics-native intelligence and physical-world AGI.
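Planning before execution can be sketched as a simple search inside the learned model: many candidate action sequences are imagined, scored, and only the first action of the best sequence is sent to the real robot. The random-shooting planner and reward head below are simplified assumptions; methods such as PlaNet and TD-MPC apply more refined optimizers to the same idea.

```python
# A minimal sketch of latent-space planning: trial-and-error happens in imagination.
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, A_DIM, HORIZON, N_CANDIDATES = 32, 4, 12, 256
W_z = rng.normal(size=(Z_DIM, Z_DIM)) * 0.1
W_a = rng.normal(size=(A_DIM, Z_DIM)) * 0.1
w_reward = rng.normal(size=Z_DIM)         # stand-in for a learned reward model

def dynamics(z, a):
    return np.tanh(z @ W_z + a @ W_a)

def plan(z0: np.ndarray) -> np.ndarray:
    """Imagine many candidate futures, keep the best, return only its first action."""
    best_score, best_first_action = -np.inf, np.zeros(A_DIM)
    for _ in range(N_CANDIDATES):
        actions = rng.normal(size=(HORIZON, A_DIM))  # one candidate action sequence
        z, score = z0, 0.0
        for a in actions:                            # imagined, not executed
            z = dynamics(z, a)
            score += float(w_reward @ z)
        if score > best_score:
            best_score, best_first_action = score, actions[0]
    return best_first_action

z_now = rng.normal(size=Z_DIM)
action_to_execute = plan(z_now)                      # only this reaches the real world
print(action_to_execute)
```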
AI will not become truly capable by predicting surfaces forever.
It must build internal structure.
It must model the world.
It must simulate consequences.
It must plan before action.
World Models mark the beginning of that transition.
Research Context
World Models do not emerge in isolation.
They sit at the intersection of model-based reinforcement learning, latent video dynamics, causal representation learning, embodied intelligence, and long-horizon planning.
What makes them important is not only their predictive capacity, but their ability to form internal structure: a compact universe where future consequences can be simulated before action.
Archive
A curated collection of foundational and emerging research shaping the development of world models and latent predictive systems.
David Ha, Jürgen Schmidhuber
A foundational proposal for learning compressed internal environments that support imagination, rollout, and control.
Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
Introduced planning directly inside learned latent dynamics, helping bridge raw perception and model-based control.
Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
Showed that agents can learn effective behavior by imagining future trajectories inside compact world models.
Danijar Hafner et al.
Demonstrated that learned world models can scale to complex decision-making environments with strong performance.
Danijar Hafner et al.
Expanded the world-model paradigm across multiple domains, reinforcing its role as a general framework for intelligent control.
Nicklas Hansen, Xiaolong Wang, Hao Su
A major step in combining latent dynamics, planning, and control into a practical framework for fast decision-making.
Nicklas Hansen et al.
Extended latent-space model predictive control toward broader, more scalable embodied intelligence settings.
Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas
A key move away from raw reconstruction toward semantic predictive representations, highly relevant to next-state learning.
Adrien Bardes, Quentin Garrido, Jean Ponce, Xinlei Chen, Michael Rabbat, Yann LeCun, Mido Assran, Nicolas Ballas
Advanced predictive representation learning in video, emphasizing latent anticipation over pixel-level generation.
Jake Bruce et al., Google DeepMind
A compelling direction toward interactive world simulation, where generative models begin to approximate controllable environments.
Alejandro Escontrela et al.
Explores using video prediction as a signal for learning, connecting generative world models to behavioral objectives.
Bernhard Schölkopf et al.
A crucial research thread focused on learning representations that preserve intervention, structure, and causal consequence.
Francesco Locatello et al.
Important for building latent worlds that maintain object permanence, geometry, and physically meaningful transitions.
Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
The broader paradigm in which agents use internal models of the environment to simulate outcomes before acting.
Yann LeCun
A visionary position paper outlining the architecture for autonomous agents built on world models and hierarchical planning.
This archive represents a partial view of a rapidly evolving research landscape. Inclusion reflects relevance to the world-model paradigm, not exhaustive coverage.
Research Threads
Themes
Learning compact state representations where temporal evolution can be modeled efficiently
Generating future states through learned forward models before taking action
Discovering and encoding causal structure to enable intervention and counterfactual reasoning
Grounding world models in physical interaction and sensorimotor prediction
Simulating alternative outcomes under different hypothetical conditions
Building models that understand object permanence, physics, and spatial structure
Key Concepts
Simulation precedes control.
Internal structure precedes intelligent action.
The next state matters more than the next token.