Hi, I'm Rohan! I aim to promote welfare and reduce suffering as much as possible for all sentient beings, which has led me to work on AGI safety research. I am particularly interested in foundation model agents (FMAs): systems like Claude Code and AutoGPT that equip foundation models with memory, tool use, and other affordances so they can perform multi-step tasks autonomously.
I am the founder of Aether, an independent research lab focused on foundation model agent safety. In September 2025, I also began a PhD at the University of Toronto, supervised by Professor Zhijing Jin, while continuing to run Aether. Previously, I completed an undergrad in CS and Math at Columbia, where I helped run Columbia Effective Altruism and the Columbia AI Alignment Club (CAIAC). I have done research internships with AI Safety Hub Labs (now LASR Labs), UC Berkeley's Center for Human-Compatible AI (CHAI), and the ML Alignment & Theory Scholars (MATS) program.
I love playing tennis, listening to rock and indie pop music, playing social deduction games, reading fantasy books, watching a fairly varied set of TV shows and movies, and playing the saxophone, among other things.
R. Arike*, R.M. Moreno*, R. Subramani*, S. Biswas, F.R. Ward
Preprint, 2026
Studies how information access affects LLM monitor performance; finds that monitors often perform better with less of the agent's reasoning and actions (a less-is-more effect) and introduces extract-and-evaluate (EaE) monitoring, which improves sabotage detection across multiple environments.
K. Williams*, R. Subramani*, F.R. Ward*
Preprint, 2025
A safety mechanism for advanced AI systems using password-activated emergency shutdowns.
R. Subramani*, J. Foxabbott*, F.R. Ward
AAMAS, 2025
A framework for reasoning about higher-order beliefs in multi-agent influence diagrams with incomplete information.
A. Garber*, R. Subramani*, L. Luu*, M. Bedaywi, S. Russell, S. Emmons
AAAI, 2025
Extending the AI off-switch game to partially observable settings, analyzing optimal policies for both human and AI.
J. Clymer, G. Baker, R. Subramani, S. Wang
Preprint, 2023
Developing formal frameworks to test AI systems' ability to generalize oversight to novel domains.
R. Subramani*, M. Williams*, M. Heitmann*, H. Holm, C. Griffin, J. Skalse
ICLR, 2024
Analyzing the theoretical limitations of different frameworks for specifying objectives in RL.
Implemented a transformer-based language model from scratch to better understand the architecture.
Built several autonomous agents with the OpenAI API to explore their capabilities and limitations.
Completed technical exercises focused on AI alignment concepts and implementation.
Reimplemented and visualized some ideas from the Lottery Ticket Hypothesis paper.
Talk from October 2025 with Rauno Arike. An overview of chain-of-thought monitorability and AI control, and a discussion of recent work comparing the performance of monitors given varying amounts of information access.