A Research Direction

WORLD MODELS


From prediction to simulation.

World Models learn an internal universe — a latent space where geometry, physics, causality, and future states can be simulated before action.

I. The Shift

Large language models learned to predict the next token.

World Models learn to predict the next state.

A token is a surface event.

A state is an internal structure.

That difference changes everything.

Token Prediction: w₁ w₂ w₃ → ? (sequence completion on the surface)

State Prediction: S₀ → S₁ → S₂ (internal world evolution)
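The contrast can be sketched as two interfaces. This is an illustrative toy, not any particular model's API: the token predictor completes a surface sequence, while the state predictor evolves an internal (position, velocity) state under an action. All function and variable names here are invented for the sketch.

```python
from typing import List, Tuple

def predict_next_token(prefix: List[str]) -> str:
    """Surface-level: complete the sequence with a next symbol."""
    # Degenerate "repeat the last token" rule, standing in for a
    # learned distribution over tokens.
    return prefix[-1]

def predict_next_state(state: Tuple[float, float],
                       action: float) -> Tuple[float, float]:
    """Internal: evolve a (position, velocity) state under an action."""
    pos, vel = state
    dt = 0.1
    vel = vel + action * dt   # the action (a force) changes velocity
    pos = pos + vel * dt      # velocity changes position
    return (pos, vel)

predict_next_token(["the", "world"])        # completes the surface sequence
predict_next_state((0.0, 0.0), action=1.0)  # evolves the internal state
```

The first function only ever sees symbols; the second carries a structured state forward in time, which is the shift the section describes.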

II. The Internal Universe

In latent space, the model compresses reality into a form it can evolve.

Objects remain persistent.

Geometry remains coherent.

Physics remains implicit.

Causality remains traceable.

This is not raw memorization.

It is internal world construction.

III. Temporal Rollout

A true world model does not stop at the next instant.

It moves forward.

One step.

Then another.

Then another.

t₀ → t₁ → t₂ → t₃ → t₄

Future states unfold inside the model before the world is touched.
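The rollout loop can be sketched in a few lines. A hand-written point-mass transition stands in for a learned latent dynamics function; the names `dynamics` and `rollout` are illustrative, not from any specific system.

```python
def dynamics(state, action, dt=0.1):
    """Toy stand-in for a learned latent transition f(s, a) -> s'."""
    pos, vel = state
    vel = vel + action * dt
    return (pos + vel * dt, vel)

def rollout(state, actions):
    """Unfold future states t0, t1, t2, ... entirely inside the model."""
    trajectory = [state]
    for a in actions:
        state = dynamics(state, a)
        trajectory.append(state)
    return trajectory

# Five imagined steps before the real world is ever touched.
traj = rollout((0.0, 0.0), actions=[1.0, 1.0, 0.0, 0.0, -1.0])
```

Each iteration feeds the model's own prediction back in as the next input, which is exactly the "one step, then another" unfolding described above.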

IV. Structure

A useful internal model must preserve more than appearance.

It must understand:

Spatial structure

Object permanence

Motion continuity

Collision and force

Causal consequence

Intervention paths

The goal is not only to predict what happens.

The goal is to understand why.

V. Causality

World Models enable a more powerful question:

What happens if I do this instead?

By simulating alternate futures internally, AI can reason through intervention, compare outcomes, and choose actions with foresight.

now → action a → f₁
now → action b → f₂

Divergent futures from a single decision.
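Choosing among divergent futures can be sketched as branching rollouts from a shared present state. The `dynamics` function below is a hypothetical stand-in for a learned model, and `simulate`, `choose`, the plans, and `goal_pos` are all invented for illustration.

```python
def dynamics(state, action, dt=0.1):
    """Hypothetical learned transition; here, simple point-mass physics."""
    pos, vel = state
    vel = vel + action * dt
    return (pos + vel * dt, vel)

def simulate(state, actions):
    """Roll a candidate plan forward internally; return the final state."""
    for a in actions:
        state = dynamics(state, a)
    return state

def choose(state, candidate_plans, goal_pos):
    """Compare divergent futures f1, f2, ... and pick the best plan."""
    def cost(plan):
        final_pos, _ = simulate(state, plan)
        return abs(final_pos - goal_pos)
    return min(candidate_plans, key=cost)

now = (0.0, 0.0)
plan_a = [1.0] * 5    # intervene one way: push forward
plan_b = [-1.0] * 5   # intervene the other way: push backward
best = choose(now, [plan_a, plan_b], goal_pos=0.1)
```

Both futures are simulated from the same `now`; only afterward is a single action committed, which is the foresight the section describes.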

VI. Embodiment

Embodied AI requires more than language.

It requires an internal engine for physical reality.

World Models make planning possible before execution.

They reduce trial-and-error in the real world.

They turn data into foresight.

This is one of the clearest paths toward robotics-native intelligence and physical-world AGI.

AI will not become truly capable by predicting surfaces forever.

It must build internal structure.

It must model the world.

It must simulate consequences.

It must plan before action.

World Models mark the beginning of that transition.

Research Context

World Models do not emerge in isolation.

They sit at the intersection of model-based reinforcement learning, latent video dynamics, causal representation learning, embodied intelligence, and long-horizon planning.

What makes them important is not only their predictive capacity, but their ability to form internal structure: a compact universe where future consequences can be simulated before action.

Intersecting Fields

Model-Based Reinforcement Learning
Latent Video Dynamics
Causal Representation Learning
Embodied Intelligence
Long-Horizon Planning
Counterfactual Simulation

Archive

Selected Literature

A curated collection of foundational and emerging research shaping the development of world models and latent predictive systems.

01 · 2018 · Foundational

World Models

David Ha, Jürgen Schmidhuber

A foundational proposal for learning compressed internal environments that support imagination, rollout, and control.

02 · 2019 · Foundational

Learning Latent Dynamics for Planning from Pixels

Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi

Introduced planning directly inside learned latent dynamics, helping bridge raw perception and model-based control.

03 · 2019 · Foundational

Dream to Control: Learning Behaviors by Latent Imagination

Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba

Showed that agents can learn effective behavior by imagining future trajectories inside compact world models.

04 · 2021 · Foundational

Mastering Atari with Discrete World Models

Danijar Hafner et al.

Demonstrated that learned world models can scale to complex decision-making environments with strong performance.

05 · 2023 · Active

Mastering Diverse Domains through World Models

Danijar Hafner et al.

Expanded the world-model paradigm across multiple domains, reinforcing its role as a general framework for intelligent control.

06 · 2022 · Active

TD-MPC: Temporal Difference Learning for Model Predictive Control

Nicklas Hansen, Xiaolong Wang, Hao Su

A major step in combining latent dynamics, planning, and control into a practical framework for fast decision-making.

07 · 2024 · Emerging

TD-MPC2: Scalable, Robust World Models for Continuous Control

Nicklas Hansen et al.

Extended latent-space model predictive control toward broader, more scalable embodied intelligence settings.

08 · 2023 · Active

I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas

A key move away from raw reconstruction toward semantic predictive representations, highly relevant to next-state learning.

09 · 2024 · Emerging

V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video

Adrien Bardes, Quentin Garrido, Jean Ponce, Xinlei Chen, Michael Rabbat, Yann LeCun, Mido Assran, Nicolas Ballas

Advanced predictive representation learning in video, emphasizing latent anticipation over pixel-level generation.

10 · 2024 · Emerging

Genie: Generative Interactive Environments

Jake Bruce et al., Google DeepMind

A compelling direction toward interactive world simulation, where generative models begin to approximate controllable environments.

11 · 2024 · Active

Video Prediction Models as Rewards for Reinforcement Learning

Alejandro Escontrela et al.

Explores using video prediction as a signal for learning, connecting generative world models to behavioral objectives.

12 · 2021 · Active

Towards Causal Representation Learning

Bernhard Schölkopf et al.

A crucial research thread focused on learning representations that preserve intervention, structure, and causal consequence.

13 · 2020 · Foundational

Object-Centric Learning with Slot Attention

Francesco Locatello et al.

Important for building latent worlds that maintain object permanence, geometry, and physically meaningful transitions.

14 · 2023 · Foundational

Model-Based Reinforcement Learning: A Survey

Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker

The broader paradigm in which agents use internal models of the environment to simulate outcomes before acting.

15 · 2022 · Emerging

A Path Towards Autonomous Machine Intelligence

Yann LeCun

A visionary position paper outlining the architecture for autonomous agents built on world models and hierarchical planning.


This archive represents a partial view of a rapidly evolving research landscape. Inclusion reflects relevance to the world-model paradigm, not exhaustive coverage.

Research Threads

Themes

01 · Latent Dynamics
Learning compact state representations where temporal evolution can be modeled efficiently.

02 · Predictive Simulation
Generating future states through learned forward models before taking action.

03 · Causal Modeling
Discovering and encoding causal structure to enable intervention and counterfactual reasoning.

04 · Embodied Planning
Grounding world models in physical interaction and sensorimotor prediction.

05 · Counterfactual Reasoning
Simulating alternative outcomes under different hypothetical conditions.

06 · Physical World Intelligence
Building models that understand object permanence, physics, and spatial structure.

Key Concepts

Vocabulary

Simulation precedes control.

Internal structure precedes intelligent action.

The next state matters more than the next token.
