Rohan Subramani

Hi, I'm Rohan! I aim to promote welfare and reduce suffering as much as possible for all sentient beings, which has led me to work on AGI safety research. I am particularly interested in foundation model agents (FMAs): systems like Claude Code and AutoGPT that equip foundation models with memory, tool use, and other affordances so they can perform multi-step tasks autonomously.

I am the founder of Aether, an independent research lab focused on foundation model agent safety. In September 2025 I also began a PhD at the University of Toronto, where I am supervised by Professor Zhijing Jin while continuing to run Aether. Previously, I completed an undergrad in CS and Math at Columbia, where I helped run Columbia Effective Altruism and the Columbia AI Alignment Club (CAIAC). I have done research internships with AI Safety Hub Labs (now LASR Labs), UC Berkeley's Center for Human-Compatible AI (CHAI), and the ML Alignment & Theory Scholars (MATS) program.

I love playing tennis, listening to rock and indie pop music, playing social deduction games, reading fantasy books, watching a fairly varied set of TV shows and movies, and playing the saxophone, among other things.

Papers

How does information access affect LLM monitors' ability to detect sabotage?

R. Arike*, R.M. Moreno*, R. Subramani*, S. Biswas, F.R. Ward
Preprint, 2026
Studies how information access affects LLM monitor performance; finds that monitors often perform better when shown less of the agent's reasoning and actions (a less-is-more effect), and introduces extract-and-evaluate (EaE) monitoring, which improves sabotage detection across multiple environments.

Password-Activated Shutdown Protocols for Misaligned Frontier Agents

K. Williams*, R. Subramani*, F.R. Ward*
Preprint, 2025
A safety mechanism for advanced AI systems using password-activated emergency shutdowns.

Higher-Order Beliefs in Incomplete Information MAIDs

R. Subramani*, J. Foxabbott*, F.R. Ward
AAMAS, 2025
A framework for reasoning about higher-order beliefs in multi-agent influence diagrams with incomplete information.

The Partially Observable Off-Switch Game

A. Garber*, R. Subramani*, L. Luu*, M. Bedaywi, S. Russell, S. Emmons
AAAI, 2025
Extending the AI off-switch game to partially observable settings, analyzing optimal policies for both human and AI.

Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains

J. Clymer, G. Baker, R. Subramani, S. Wang
Preprint, 2023
Developing formal frameworks to test AI systems' ability to generalize oversight to novel domains.

On The Expressivity of Objective-Specification Formalisms in Reinforcement Learning

R. Subramani*, M. Williams*, M. Heitmann*, H. Holm, C. Griffin, J. Skalse
ICLR, 2024
Analyzing the theoretical limitations of different frameworks for specifying objectives in RL.

Projects

Coding GPT-2 from scratch

Implemented a transformer-based language model from scratch to better understand the architecture.
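As a rough illustration of the kind of component this involves, here is a minimal single-head causal self-attention step in NumPy (my own sketch, not the project's code; the function and weight names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention: each position attends
    only to itself and earlier positions."""
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    # Mask out future positions so attention is causal
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[future] = -np.inf
    return softmax(scores) @ v

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Because of the causal mask, the first position attends only to itself, so its output is exactly its own value vector.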

Implementing various LLM agents

Built several autonomous agents with the OpenAI API to explore their capabilities and limitations.

Alignment Research Engineer Accelerator (ARENA) exercises

Completed technical exercises focused on AI alignment concepts and implementation.

Experimenting with neural network pruning

Reimplemented and visualized some ideas from the Lottery Ticket Hypothesis paper.
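A minimal sketch of the one-shot magnitude pruning step at the heart of that paper (illustrative NumPy code of my own, not the project's implementation):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights and return the
    pruned weights plus the binary mask of kept weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

rng = np.random.default_rng(1)
w = rng.standard_normal((10, 10))
pruned, mask = magnitude_prune(w, 0.8)
print(mask.mean())  # fraction of weights kept
```

In lottery-ticket experiments, the surviving mask is then reapplied to the network's original initialization before retraining.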

Misc

Trajectory Labs Talk: Chain-of-Thought Monitoring and AI Control

An October 2025 talk with Rauno Arike: an overview of chain-of-thought monitorability and AI control, and a discussion of recent work comparing the performance of monitors given varying amounts of information access.