I'm passionate about ensuring that advanced AI systems are safe, robust, and aligned with human values. I'm currently focused on building my knowledge and skills in this critical field.
- AI Alignment: Working to ensure AI systems reliably pursue intended goals
- Interpretability: Making AI systems more transparent and understandable
- Robustness: Building systems that perform reliably under distribution shifts
- Value Learning: Developing techniques to learn and respect human preferences
- AI Governance: Exploring policy frameworks for responsible AI development
- Reinforcement learning from human feedback (RLHF)
- Causal interpretability methods
- Mechanistic interpretability techniques
- Formal verification approaches for neural networks
- Bachelor of Science, Computer Science
- Summer 2024 Research Intern @ SprintML Lab, CISPA
- Research Fellow @ Fatima Fellowship
- Twitter/X
- Email: [email protected]
If you're interested in AI safety, check out these resources:
"The development of full artificial intelligence could spell the end of the human race... or it could be the best thing that ever happened to us. We just have to be sure it's the latter." — Stephen Hawking