[Remote] Senior Research Scientist, Model Evaluation

Remote Full-time
Note: This is a remote job open to candidates in the USA.

Cohere is on a mission to scale intelligence to serve humanity by training and deploying frontier models for AI systems. In this role, you will be responsible for creating next-generation evaluation methods and infrastructure to measure LLM progress, pushing the limits of what models can accomplish and ensuring high data quality.

Responsibilities
• Create ambitious new evaluation benchmarks that push the limits of what our models can accomplish
• Work on highly cross-functional teams to translate model feedback into trustworthy, repeatable evaluations
• Conduct research to advance the state of the art in LLM evaluation methods, including training LLM judges, refining LLM-based data synthesis pipelines, and improving evaluation efficiency
• Build scalable and reusable tools for digging into model performance

Skills
• You enjoy rapidly building prototypes that demonstrate the boundaries of what LLMs are capable of, and you have developed resources to measure those capabilities
• You have spent dozens of hours reviewing complex data and LLM outputs to ensure high data quality
• You are obsessive about rigorously measuring AI capabilities, and also about making sure your measurements actually align with the capabilities you care about
• You have strong software engineering skills

Benefits
• An open and inclusive culture and work environment
• Work closely with a team on the cutting edge of AI research
• Weekly lunch stipend, in-office lunches & snacks
• Full health and dental benefits, including a separate budget to take care of your mental health
• 100% Parental Leave top-up for up to 6 months
• Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
• Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
• 6 weeks of vacation (30 working days!)

Company Overview
• Cohere is an enterprise AI firm developing secure and private AI technology to address real-world business challenges. It was founded in 2019 and is headquartered in Toronto, Ontario, Canada, with a workforce of 201-500 employees.

Company H1B Sponsorship
• Cohere has a track record of offering H1B sponsorships, with 11 in 2025, 14 in 2024, 13 in 2023, 5 in 2022, and 2 in 2021. Please note that this does not guarantee sponsorship for this specific role.

Apply to this job
