Publications

(2026). VLA Foundry: A Unified Framework for Training Vision-Language-Action Models. Technical Report, Toyota Research Institute.
(2026). A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation. Science Robotics.
(2026). A Systematic Study of Data Modalities and Strategies for Co-training Large Behavior Models for Robot Manipulation. In RSS.
(2025). OpenThoughts: Data Recipes for Reasoning Models. arXiv preprint.
(2025). Should VLMs be Pre-trained with Image Data? In ICLR.
(2025). Improving Test-Time Search for LLMs with Backtracking Against In-Context Value Verifiers. Workshop on Reasoning and Planning for Large Language Models.
(2024). Espresso: High Compression for Rich Extraction from Videos for Your Vision-Language Model. arXiv preprint.
(2024). DataComp-LM: In Search of the Next Generation of Training Sets for Language Models.
(2024). Linearizing Large Language Models. In COLM.
(2024). DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset. In RSS.