[Remote] Research Program Manager - Model Evals and Safety

Remote Full-time
Note: This is a remote job open to candidates in the USA.

ReflectionAI is on a mission to build open superintelligence and make it accessible to all. They are seeking a Research Program Manager who will lead the development of model evaluations and safety infrastructure, ensuring that their work aligns with the broader safety ecosystem while driving operational excellence in model development.

Responsibilities
- Build the foundational infrastructure for model evals and safety at Reflection
- Define the evaluation frameworks, tooling requirements, and operational processes that will underpin how we assess model capabilities, risks, and readiness for release
- Stand up model safety operations as a function, including establishing the workflows, review cadences, and decision frameworks that connect safety evaluation to the model development and release lifecycle
- Partner with research and engineering leads across pre-training, mid-training, and post-training to embed safety and evaluation checkpoints into the development process in a way that is rigorous without being a bottleneck
- Drive the scoping and prioritization of eval science and eval infrastructure investments, working with technical leads to determine what to build in-house, what to adopt, and where to invest research effort
- Establish Reflection's engagement with the external safety ecosystem, including third-party assessments, academic partnerships, and industry safety frameworks
- Represent the company's safety posture to external stakeholders with credibility and clarity
- Create visibility and reporting structures that give leadership a clear, honest picture of model safety status, evaluation coverage, and open risks, so they can make informed decisions at the pace the business requires
- Champion a culture of blameless post-mortems and continuous learning, turning every safety-relevant finding into a concrete improvement to our systems and processes

Skills
- 7+ years of experience in technical program management, research operations, or ML engineering, with demonstrated experience standing up new functions, teams, or programs from scratch
- Familiar with the landscape of model evaluation and AI safety, including evaluation methodologies, red-teaming, alignment research, and the evolving regulatory and industry safety ecosystem
- Deep enough technically to engage with researchers and engineers on topics like model behavior, evaluation design, data pipelines, and safety-critical system architecture
- Proven ability to build structures where none exist. You've taken ambiguous mandates and turned them into functioning programs with clear ownership, measurable outcomes, and durable processes
- Strong stakeholder management skills spanning deeply technical ICs, research leadership, and external partners
- Excited to build from zero to one. We are a small, fast-moving team and this role will help define how model safety and evaluation work at Reflection
- Motivated by enabling researchers and engineers to build the world's most capable open-weight AI systems, responsibly

Benefits
- Comprehensive medical, dental, vision, life, and disability insurance.
- Fully paid parental leave for all new parents, including adoptive and surrogate journeys.
- Financial support for family planning.
- Paid time off when you need it, relocation support, and more perks that optimize your time.
- Lunch and dinner are provided daily.
- Regular off-sites and team celebrations.

Company Overview
Reflection is an AI lab building frontier open-weight models. Our team previously built frontier LLMs at labs like DeepMind, OpenAI, and Anthropic. It is headquartered in New York, NY, US, with a workforce of 51-200 employees. Its website is https://www.reflection.ai/.
