AI Safety Data Scientist – New Grad – Anthropic
Anthropic is an AI safety company and the developer of Claude, a large language model built to be both highly capable and safe. Founded by former OpenAI researchers including Dario Amodei and Daniela Amodei, Anthropic has a mission of developing AI responsibly for the long-term benefit of humanity. Claude powers hundreds of applications for millions of users, and Anthropic works at the forefront of Constitutional AI, interpretability research, and AI alignment. We are hiring New Grad AI Safety Data Scientists in San Francisco to analyze Claude's behavior, measure safety properties, and develop quantitative frameworks for evaluating AI alignment.
Responsibilities
- Design and run large-scale experiments evaluating Claude's performance on capability benchmarks, its safety properties, and its alignment across diverse task categories
- Develop statistical frameworks to measure and track AI safety metrics, including harmful output rates, refusal quality, and helpfulness-harmlessness tradeoffs
- Analyze human feedback datasets from Anthropic's RLHF training pipeline to identify patterns in human preferences and evaluate label quality
- Build data pipelines processing Claude's conversation logs to surface failure modes, capability regressions, and unexpected behaviors at scale
- Collaborate with Anthropic's interpretability and policy teams to translate quantitative safety findings into model improvements and deployment guidelines
- Analyze red-teaming data to measure the effectiveness of adversarial prompting techniques and the robustness of Constitutional AI guardrails
Requirements
- Bachelor's or Master's degree in Statistics, Computer Science, Machine Learning, or Cognitive Science
- Strong statistical and probabilistic reasoning skills for experimental design and hypothesis testing
- Proficiency in Python for data analysis (pandas, numpy, scipy, matplotlib/seaborn)
- Experience with ML model evaluation, benchmarking, or NLP data analysis
- Genuine commitment to AI safety and an understanding of LLM behavior, alignment, and risks
Benefits
- Compensation among the most competitive in AI, including Anthropic equity
- Work at the frontier of AI safety research, one of the most important challenges in technology
- Comprehensive medical, dental, and vision benefits with 100% premium coverage
- 401(k) with Anthropic matching
- San Francisco headquarters with Anthropic's mission-driven, research-first culture