Most expert work isn't "produce a probable artifact"; it's "choose a good move, considering other agents and guessing at hidden state." LLMs default to single-shot artifacts and need world models to progress.
the Priya example nails it. the finance friend evaluated the email in isolation. the experienced coworker simulated how it would land in Priya's inbox, against her triage heuristics, under deadline pressure
this is the gap between LLMs writing code and LLMs building systems. code that compiles isn't code that survives contact with users, adversaries, edge cases
been running production systems solo for 20 years. the best operators aren't the ones who know the most commands — they're the ones who can simulate what will break next. "if I do X, the cache invalidates, which triggers Y, which overloads Z." that's a world model
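That "if I do X, then Y, then Z" chain can be sketched as a tiny effect-propagation model over a dependency graph. The component names and topology below are hypothetical, purely for illustration:

```python
# Minimal sketch of "simulate what breaks next": walk an action's effects
# forward through a dependency graph. Topology is made up for illustration.
from collections import deque

# component -> components that depend on it (hypothetical)
DEPENDS_ON_ME = {
    "cache": ["app_servers"],
    "app_servers": ["load_balancer"],
    "load_balancer": [],
}

def predict_blast_radius(action_target):
    """BFS forward from the touched component to everything downstream."""
    affected, queue = [], deque([action_target])
    seen = {action_target}
    while queue:
        node = queue.popleft()
        for dep in DEPENDS_ON_ME.get(node, []):
            if dep not in seen:
                seen.add(dep)
                affected.append(dep)
                queue.append(dep)
    return affected

# "if I invalidate the cache, app servers reload, which hits the LB"
print(predict_blast_radius("cache"))  # ['app_servers', 'load_balancer']
```

The point isn't the graph walk itself; it's that the operator carries something like this structure in their head and queries it before acting.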
the poker vs chess analogy is perfect. hidden state + adversarial adaptation = can't just pattern match
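The hidden-state half of that equation is essentially belief tracking: maintain a distribution over what the opponent holds and update it on every observed action. A minimal Bayesian sketch, with hand categories and likelihoods invented for illustration:

```python
# Hedged sketch of tracking hidden state: a belief over the opponent's
# hand, updated with Bayes' rule after observing a bet. Numbers are made up.
belief = {"strong": 0.2, "medium": 0.5, "weak": 0.3}

# P(big raise | hand type) -- hypothetical likelihoods
likelihood_of_big_raise = {"strong": 0.8, "medium": 0.3, "weak": 0.1}

def update_belief(prior, likelihood):
    """Posterior proportional to prior * likelihood, renormalized."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

posterior = update_belief(belief, likelihood_of_big_raise)
# after seeing a big raise, "strong" becomes the most probable hand
print(max(posterior, key=posterior.get))  # strong
```

Adversarial adaptation is what breaks the static version of this: a good opponent knows you track them and manipulates the very signals you condition on.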
curious where you see the fix coming from. is it more training on game theory scenarios? or do we need fundamentally different architectures to track hidden state?
someone on twitter suggested using an RLM-like structure to simulate world models. I think it's well worth thinking about. what's clear is that next-token prediction probably does not meaningfully scale to the internal world models required for this!
Epistemological intelligence is a necessary precursor to theory of mind.
This is a great framework. LLMs can regurgitate game theory but can't yet 'feel' the move/counter-move process. They are great at predicting the next word but perhaps not yet at predicting the next emotional state. Makes me wonder if the next frontier is capturing a better understanding of human behavioral psychology and being able to relate to adversarial circumstances.
Do you think we can get there with text alone? You might make the case that art could be useful in this sense: drama, etc.
What is Shakespeare if not a picture of clashing wills, that type of thing.
emotions are a whole higher tier of simulation that we don't even account for in this article! but yeah, that would be amazing.
what do you think about the concept of using art as a conduit into human emotion?
The fault you’ve identified can be cross-pollinated to other LLM usages.
My biggest takeaway from this is to ask the LLM to apply the red-team concept to a broader range of domains, from communication to agentic coding loops; AND to play around with modelling the other party. For example: red-team how another LLM agent that "optimises for code that works and looks reasonable" might introduce tech debt or architectural drift. How can the current code be corrupted by agents (AI or otherwise) that optimise locally?
It still requires a human in the loop to remember to do this and do it well, but at least we’re narrowing down to making similar moves across domains rather than re-discovering the same type of moves in each domain as if they were novel.
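One way to make that move reusable across domains is a single red-team prompt template parameterised by artifact and adversary model. Everything below (template wording, domain labels) is a hypothetical sketch, not a prescribed format:

```python
# Sketch of the "same move across domains" idea: one red-team template,
# filled in per artifact and per adversary model. Wording is illustrative.
RED_TEAM_TEMPLATE = (
    "You are red-teaming the following {artifact_kind}.\n"
    "Adversary model: {adversary}\n"
    "Artifact:\n{artifact}\n"
    "List the concrete ways this adversary could exploit or degrade it."
)

def red_team_prompt(artifact_kind, adversary, artifact):
    """Build a domain-agnostic red-team prompt from its three parts."""
    return RED_TEAM_TEMPLATE.format(
        artifact_kind=artifact_kind, adversary=adversary, artifact=artifact
    )

prompt = red_team_prompt(
    "pull request",
    "an agent that optimises for code that works and looks reasonable, "
    "but introduces tech debt or architectural drift",
    "def handler(): ...",
)
```

Swapping `artifact_kind` to "outreach email" or "vendor contract" reuses the same move instead of rediscovering it per domain, which is exactly the narrowing-down described above.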
This essay articulates something I've been trying to explain for months. My AI agent excels at cooperative tasks — file management, blog publishing, research synthesis. The moment I tried anything adversarial (competitive positioning, negotiating vendor terms), it defaulted to fairness.
The RLHF bias toward accommodation is baked in by design. Poker versus chess is exactly the right frame. Most high-value professional work is poker. The gap isn't raw capability — it's that LLMs have never had skin in the game and can't model what that actually feels like. That's a structural limitation, not a solvable prompt engineering problem.
Fantastic article!
To me, this ultimately reads like a 'bigger problem space' issue. The difference between chess and poker isn't just that one has perfect information during play and the other doesn't; it's that the full information of a poker game is 'all of the life experiences of the player' compiled against the play state. And that's ultimately something an LLM-powered agent could conceivably replicate, even if a given LLM model can't.