Dedicated to improving our understanding of AI to mitigate its risks
About us
Apollo Research is an AI safety organisation focused on reducing risks from dangerous capabilities in advanced AI systems, especially deceptive behaviour. We design AI model evaluations and conduct interpretability research to better understand state-of-the-art AI models. Our governance team provides global policymakers with expert technical guidance.
Our mission
AI systems will soon be integrated into large parts of the economy and our personal lives. While this transformation may unlock substantial personal and societal benefits, it also carries vast risks. We think some of the greatest risks come from advanced AI systems that can evade standard safety evaluations through strategic deception. Our goal is to understand AI systems well enough to prevent the development and deployment of deceptive AIs.
What we do
Model Evaluations
We develop and run evaluations of frontier AI systems. Our expertise is in LM agent evaluations for strategic deception and designing model organisms for scheming. Our team also conducts fundamental research to further the science of evaluations.
Interpretability Research
Our applied interpretability research aims to enhance our model evaluations, while our fundamental interpretability research pursues novel methods for understanding neural network internals.
Governance & Policy
We support governments and international organisations by developing technical AI governance regimes. Our expertise includes building a robust third-party evaluation ecosystem, effectively regulating frontier AI systems, and establishing standards and best practices.
Consultancy
Additionally, we provide consultancy services for building responsible AI development frameworks, designing research programs, ecosystem mapping and literature reviews, and more.
Our Partners
Frontier labs, governments, multinational companies, and foundations partner with Apollo Research.