World Models
David Ha, Jürgen Schmidhuber
A foundational proposal for learning compressed internal environments that support imagination, rollout, and control.
A Research Direction
From prediction to simulation.
World Models learn an internal universe — a latent space where geometry, physics, causality, and future states can be simulated before action.
I. The Shift
Large language models learned to predict the next token.
World Models learn to predict the next state.
A token is a surface event.
A state is an internal structure.
That difference changes everything.
Token Prediction: sequence completion on the surface.
State Prediction: internal world evolution.
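To make the contrast concrete, here is a minimal sketch in Python. The token view maps a sequence of surface symbols to a distribution over the next symbol; the state view evolves an internal latent vector under an action. The dimensions, weights, and function names are illustrative assumptions, not an implementation from the paper.

```python
# A minimal sketch of the two prediction interfaces.
# All sizes and weights are illustrative assumptions, not a trained model.
import numpy as np

rng = np.random.default_rng(0)

def predict_next_token(token_ids: list[int], vocab_size: int = 1000) -> np.ndarray:
    """Surface-level prediction: a distribution over the next discrete symbol."""
    logits = rng.normal(size=vocab_size)          # stand-in for a trained LM head (ignores its input here)
    return np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary

def predict_next_state(z_t: np.ndarray, a_t: np.ndarray,
                       W_z: np.ndarray, W_a: np.ndarray) -> np.ndarray:
    """State-level prediction: evolve an internal latent state under an action."""
    return np.tanh(z_t @ W_z + a_t @ W_a)         # one step of a learned dynamics map

# Token view: the model only ever sees and emits surface symbols.
next_token_dist = predict_next_token([42, 7, 99])

# State view: the model carries an internal state z_t that the dynamics act on.
z_dim, a_dim = 32, 4
z_t, a_t = rng.normal(size=z_dim), rng.normal(size=a_dim)
W_z = rng.normal(size=(z_dim, z_dim)) * 0.1
W_a = rng.normal(size=(a_dim, z_dim)) * 0.1
z_next = predict_next_state(z_t, a_t, W_z, W_a)
print(next_token_dist.shape, z_next.shape)  # (1000,) (32,)
```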
II. The Internal Universe
In latent space, the model compresses reality into a form it can evolve.
Objects remain persistent.
Geometry remains coherent.
Physics remains implicit.
Causality remains traceable.
This is not raw memorization.
It is internal world construction.
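The compression step can be sketched very simply: a raw observation is mapped into a small latent vector that downstream dynamics operate on. The original paper uses a variational autoencoder for this role; the linear encoder and decoder below, and all dimensions, are simplified assumptions for illustration only.

```python
# A minimal sketch of compressing reality into a latent state the model can evolve.
# Linear maps stand in for a learned encoder/decoder (a VAE in the original paper).
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM = 64 * 64 * 3   # a flattened 64x64 RGB frame
LATENT_DIM = 32         # the compact internal state z

W_enc = rng.normal(size=(OBS_DIM, LATENT_DIM)) / np.sqrt(OBS_DIM)
W_dec = rng.normal(size=(LATENT_DIM, OBS_DIM)) / np.sqrt(LATENT_DIM)

def encode(obs: np.ndarray) -> np.ndarray:
    """Compress a raw observation into the latent state the model will evolve."""
    return obs @ W_enc

def decode(z: np.ndarray) -> np.ndarray:
    """Map a latent state back toward observation space (for training and inspection)."""
    return z @ W_dec

frame = rng.random(OBS_DIM)      # a stand-in camera frame
z = encode(frame)                # the model's internal description of the scene
reconstruction = decode(z)       # coarse and lossy, but structured
print(frame.shape, z.shape, reconstruction.shape)  # (12288,) (32,) (12288,)
```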
III. Temporal Rollout
A true world model does not stop at the next instant.
It moves forward.
One step.
Then another.
Then another.
Future states unfold inside the model before the world is touched.
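A rollout is just the dynamics applied repeatedly: starting from one latent state, the model steps itself forward so a whole trajectory of future states unfolds internally before anything happens in the environment. The dynamics function and dimensions below are illustrative assumptions, not a trained model.

```python
# A minimal latent rollout: imagine H steps into the future without touching the world.
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, A_DIM = 32, 4
W_z = rng.normal(size=(Z_DIM, Z_DIM)) * 0.1
W_a = rng.normal(size=(A_DIM, Z_DIM)) * 0.1

def dynamics(z: np.ndarray, a: np.ndarray) -> np.ndarray:
    """One imagined step: next latent state from current state and action."""
    return np.tanh(z @ W_z + a @ W_a)

def rollout(z0: np.ndarray, actions: np.ndarray) -> np.ndarray:
    """Unroll the model over a sequence of actions, collecting every imagined state."""
    trajectory = [z0]
    z = z0
    for a in actions:            # one step, then another, then another
        z = dynamics(z, a)
        trajectory.append(z)
    return np.stack(trajectory)  # shape: (len(actions) + 1, Z_DIM)

z0 = rng.normal(size=Z_DIM)
planned_actions = rng.normal(size=(10, A_DIM))  # a 10-step imagined action sequence
imagined = rollout(z0, planned_actions)
print(imagined.shape)  # (11, 32)
```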
IV. Structure
A useful internal model must preserve more than appearance.
It must understand:
Spatial structure
Object permanence
Motion continuity
Collision and force
Causal consequence
Intervention paths
The goal is not only to predict what happens.
The goal is to understand why.
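One way to read the list above is structural: instead of pixels, the internal state tracks persistent objects whose positions and velocities are carried forward by forces and collisions. The sketch below is a hand-written stand-in for such a structured state; its layout and constants are assumptions made for illustration, not a learned representation.

```python
# A minimal structured state: persistent objects, continuous motion, crude physics.
from dataclasses import dataclass

@dataclass
class ObjectState:
    """One persistent object: its identity survives across steps (object permanence)."""
    name: str
    x: float
    y: float
    vx: float
    vy: float

def step(objects: list[ObjectState], dt: float = 0.1, g: float = -9.8) -> list[ObjectState]:
    """Advance every object one step, preserving motion continuity and a collision rule."""
    advanced = []
    for o in objects:
        vy = o.vy + g * dt                  # force changes motion (causal consequence)
        x, y = o.x + o.vx * dt, o.y + vy * dt
        if y < 0.0:                         # collision with the ground
            y, vy = 0.0, -0.5 * vy          # crude bounce with energy loss
        advanced.append(ObjectState(o.name, x, y, o.vx, vy))
    return advanced

world = [ObjectState("ball", x=0.0, y=2.0, vx=1.0, vy=0.0)]
for _ in range(30):
    world = step(world)
print(world[0])  # the same ball, later in time, in a physically coherent place
```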
V. Causality
World Models enable a more powerful question:
What happens if I do this instead?
By simulating alternate futures internally, AI can reason through intervention, compare outcomes, and choose actions with foresight.
[Figure: divergent futures from a single decision.]
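Intervention inside the model can be sketched directly: the same starting state is rolled forward under two different first actions, and the imagined outcomes are compared before anything is executed. The dynamics, the value head, and all dimensions below are illustrative assumptions.

```python
# A minimal counterfactual comparison: "what happens if I do this instead?"
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, A_DIM, HORIZON = 32, 4, 10
W_z = rng.normal(size=(Z_DIM, Z_DIM)) * 0.1
W_a = rng.normal(size=(A_DIM, Z_DIM)) * 0.1
w_value = rng.normal(size=Z_DIM)          # stand-in for a learned value/reward head

def dynamics(z, a):
    return np.tanh(z @ W_z + a @ W_a)

def imagined_return(z0: np.ndarray, first_action: np.ndarray) -> float:
    """Intervene with `first_action`, then roll the model forward and score the future."""
    z, total = dynamics(z0, first_action), 0.0
    for _ in range(HORIZON - 1):
        z = dynamics(z, np.zeros(A_DIM))  # default behaviour after the intervention
        total += float(w_value @ z)
    return total

z0 = rng.normal(size=Z_DIM)
action_a, action_b = rng.normal(size=A_DIM), rng.normal(size=A_DIM)

# Both futures are simulated internally; only then is one action chosen.
outcome_a, outcome_b = imagined_return(z0, action_a), imagined_return(z0, action_b)
chosen = action_a if outcome_a >= outcome_b else action_b
print(outcome_a, outcome_b)
```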
VI. Embodiment
Embodied AI requires more than language.
It requires an internal engine for physical reality.
World Models make planning possible before execution.
They reduce trial-and-error in the real world.
They turn data into foresight.
This is one of the clearest paths toward robotics-native intelligence and physical-world AGI.
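Planning before execution can be sketched as a simple search inside the learned model: many candidate action sequences are imagined, scored, and only the first action of the best sequence is sent to the real robot. The random-shooting planner and reward head below are simplified assumptions; methods such as PlaNet and TD-MPC apply more refined optimizers to the same idea.

```python
# A minimal sketch of latent-space planning: trial-and-error happens in imagination.
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, A_DIM, HORIZON, N_CANDIDATES = 32, 4, 12, 256
W_z = rng.normal(size=(Z_DIM, Z_DIM)) * 0.1
W_a = rng.normal(size=(A_DIM, Z_DIM)) * 0.1
w_reward = rng.normal(size=Z_DIM)         # stand-in for a learned reward model

def dynamics(z, a):
    return np.tanh(z @ W_z + a @ W_a)

def plan(z0: np.ndarray) -> np.ndarray:
    """Imagine many candidate futures, keep the best, return only its first action."""
    best_score, best_first_action = -np.inf, np.zeros(A_DIM)
    for _ in range(N_CANDIDATES):
        actions = rng.normal(size=(HORIZON, A_DIM))  # one candidate action sequence
        z, score = z0, 0.0
        for a in actions:                            # imagined, not executed
            z = dynamics(z, a)
            score += float(w_reward @ z)
        if score > best_score:
            best_score, best_first_action = score, actions[0]
    return best_first_action

z_now = rng.normal(size=Z_DIM)
action_to_execute = plan(z_now)                      # only this reaches the real world
print(action_to_execute)
```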
AI will not become truly capable by predicting surfaces forever.
It must build internal structure.
It must model the world.
It must simulate consequences.
It must plan before action.
World Models mark the beginning of that transition.
Research Context
World Models do not emerge in isolation.
They sit at the intersection of model-based reinforcement learning, latent video dynamics, causal representation learning, embodied intelligence, and long-horizon planning.
What makes them important is not only their predictive capacity, but their ability to form internal structure: a compact universe where future consequences can be simulated before action.
Archive
A curated collection of foundational and emerging research shaping the development of world models and latent predictive systems.
David Ha, Jürgen Schmidhuber
A foundational proposal for learning compressed internal environments that support imagination, rollout, and control.
Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
Introduced planning directly inside learned latent dynamics, helping bridge raw perception and model-based control.
Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
Showed that agents can learn effective behavior by imagining future trajectories inside compact world models.
Danijar Hafner et al.
Demonstrated that learned world models can scale to complex decision-making environments with strong performance.
Danijar Hafner et al.
Expanded the world-model paradigm across multiple domains, reinforcing its role as a general framework for intelligent control.
Nicklas Hansen, Xiaolong Wang, Hao Su
A major step in combining latent dynamics, planning, and control into a practical framework for fast decision-making.
Nicklas Hansen et al.
Extended latent-space model predictive control toward broader, more scalable embodied intelligence settings.
Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas
A key move away from raw reconstruction toward semantic predictive representations, highly relevant to next-state learning.
Adrien Bardes, Quentin Garrido, Jean Ponce, Xinlei Chen, Michael Rabbat, Yann LeCun, Mido Assran, Nicolas Ballas
Advanced predictive representation learning in video, emphasizing latent anticipation over pixel-level generation.
Jake Bruce et al., Google DeepMind
A compelling direction toward interactive world simulation, where generative models begin to approximate controllable environments.
Alejandro Escontrela et al.
Explores using video prediction as a signal for learning, connecting generative world models to behavioral objectives.
Bernhard Schölkopf et al.
A crucial research thread focused on learning representations that preserve intervention, structure, and causal consequence.
Francesco Locatello et al.
Important for building latent worlds that maintain object permanence, geometry, and physically meaningful transitions.
Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
The broader paradigm in which agents use internal models of the environment to simulate outcomes before acting.
Yann LeCun
A visionary position paper outlining the architecture for autonomous agents built on world models and hierarchical planning.
This archive represents a partial view of a rapidly evolving research landscape. Inclusion reflects relevance to the world-model paradigm, not exhaustive coverage.
Research Threads
Themes
Learning compact state representations where temporal evolution can be modeled efficiently
Generating future states through learned forward models before taking action
Discovering and encoding causal structure to enable intervention and counterfactual reasoning
Grounding world models in physical interaction and sensorimotor prediction
Simulating alternative outcomes under different hypothetical conditions
Building models that understand object permanence, physics, and spatial structure
Key Concepts
Simulation precedes control.
Internal structure precedes intelligent action.
The next state matters more than the next token.