Publications

October 30, 2025
Jasmine: A Simple, Performant and Scalable JAX-based World Modeling Codebase [arXiv]
World Modeling
April 29, 2025
PPO Is An Off-Policy Algorithm [Blog]
Reinforcement Learning
March 26, 2025
Performance-degradation Free Value Assertions in JAX [Blog]
Infrastructure
February 12, 2025
PPO Is Secretly Using Monte Carlo Advantage Estimation In LLM Post-Training [Blog]
Reinforcement Learning
September 26, 2024
Neural Networks Do Not Generalize Out-of-Distribution [Blog]
Roadmap
June 8, 2024
Going Beyond the Causal Mask in Language Modeling [Blog]
Language Modeling
December 7, 2023
ACT: Adaptive Compute Transformer [Blog]
Language Modeling