From rewriting Google’s search stack in the early 2000s to reviving sparse trillion-parameter models and co-designing TPUs with frontier ML research, Jeff Dean has quietly shaped nearly every layer of the modern AI stack.
Jeff's framing of energy cost in picojoules as the real optimization target changed how I think about model selection entirely. I spent weeks obsessing over benchmark scores before realizing the actual cost driver in my AI agent was data movement, not compute. When I switched from using Opus for everything to routing tasks by complexity (Haiku for reads and searches, Opus for multi-step reasoning), costs dropped 80%.
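A minimal sketch of that routing idea, assuming a simple task-type lookup; the model identifiers and task categories here are illustrative assumptions, not the actual setup:

```python
# Hypothetical complexity-based model router. Model names and task
# categories are placeholders, not a real production config.

SIMPLE_TASKS = {"read", "search", "summarize"}

def pick_model(task_type: str) -> str:
    """Route cheap single-step tasks to a small model;
    reserve the large model for multi-step reasoning."""
    if task_type in SIMPLE_TASKS:
        return "claude-haiku"   # fast, low-cost tier
    return "claude-opus"        # multi-step reasoning tier

# Example: assign a model to each task in a batch.
tasks = ["search", "plan-refactor", "read"]
assignments = {t: pick_model(t) for t in tasks}
```

The point is that the routing decision is a one-line lookup, so the savings come almost entirely from classifying tasks correctly, not from the router itself.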
The AI engineering focus here fills a real gap in the ecosystem.
Great interview, I really enjoyed it! 💚 🥃
The hierarchy logic matches Jeff's Flash vs Pro model: distill capability downward, not just shrink the model. Documented the actual numbers here: https://thoughts.jock.pl/p/claude-model-optimization-opus-haiku-ai-agent-costs-2026