[Remote] QA Engineer, AI Products

Remote Full-time
Note: This is a remote position open to candidates in the USA. MDCalc is a leading medical reference tool used by clinicians worldwide, and it is seeking a QA Engineer to strengthen its AI product team. This role focuses on ensuring the quality and reliability of AI-powered features, particularly in testing LLM-based systems, while collaborating with cross-functional teams to define quality metrics and testing strategies.

Responsibilities
- Design and execute test strategies for LLM-powered features, including prompt regression testing, output evaluation, and hallucination detection
- Build and maintain automated evaluation pipelines (eval sets, golden datasets, LLM-as-judge frameworks) to catch quality regressions in non-deterministic outputs
- Perform black-box and exploratory testing of MDCalc's AI features across web and mobile, with particular attention to clinical accuracy, safety, and edge cases
- Define quality metrics for AI outputs (accuracy, faithfulness, relevance, safety, latency, cost) and establish thresholds for release readiness
- Collaborate cross-functionally with engineers, product managers, ML/AI engineers, and clinical reviewers to define what 'good' looks like for AI responses
- Investigate and triage AI failure modes, distinguishing model issues, prompt issues, retrieval issues, and integration bugs
- Participate in team discussions, offering feedback on testability, risks, prompt design, and guardrails
- Help develop QA strategies to expand future testing capacity, automation, and evaluation coverage as the AI product surface grows

Skills
- 5+ years of experience in software QA, with at least 1 year of hands-on testing of LLM-based or AI/ML-powered features
- Strong understanding of QA principles, test case creation/documentation, and best practices for both deterministic and non-deterministic systems
- Hands-on experience with LLM tooling and concepts: prompt engineering, RAG systems, evaluation frameworks (e.g., Promptfoo, Braintrust, LangSmith, DeepEval, Ragas, OpenAI Evals), and LLM APIs (OpenAI, Anthropic, etc.)
- Experience designing automated qualitative evaluation approaches, including LLM-as-judge, rubric-based scoring, semantic similarity checks, and golden dataset regression testing
- Proficiency with test automation tools, with a focus on Playwright
- Strong SQL skills for data validation, test data creation, and verifying data integrity across systems
- Familiarity with token usage, latency profiling, and cost monitoring as quality signals
- Eagerness to learn quickly and a positive, solutions-oriented attitude
- Clear and concise communicator, able to surface issues, blockers, and risks effectively, including when the failures involved are ambiguous or probabilistic
- Self-motivated, proactive, and able to manage time and priorities independently

Benefits
- Medical, dental, and vision coverage, with the option to extend to your dependents
- Company-sponsored short-term insurance
- Fully paid 8-week parental leave after 6 months of employment
- Company-sponsored 401k after 3 months of employment
- Unlimited vacation for salaried roles: we trust you to take the time you need
- Bi-annual company offsites to connect, reflect, and plan together
- Monthly work-from-home stipend
- A culture of fun and motivated team members who believe in a greater mission here at MDCalc

Company Overview
MDCalc is used by over two-thirds of US physicians and provides free access to 800+ medical scores, calculations, and algorithms. It was founded in 2005 and is headquartered in New York, New York, USA, with a workforce of 11-50 employees. Its website is https://www.mdcalc.com.


Similar Jobs

Experienced Registered Behavior Technician for In-Home ABA Therapy - Atlanta, GA

Remote

Immediate Hiring: Experienced Registered Behavioral Technician (RBT) for Clinic-Based ABA Therapy Services

Remote

Experienced Registered Behavioral Technician (RBT) - ABA Therapy for Children with Autism Spectrum Disorder

Remote

Experienced Registered Nurse - Telehealth: Providing Remote Care Coordination and Patient Support

Remote

Experienced Substitute Teacher for Riverside County Schools - Join Scoot Education's Innovative Team

Remote

Experienced Substitute Teacher for San Bernardino County - Flexible Schedules & Competitive Pay

Remote

Experienced School Year Instructional Coach for High-Dosage Tutoring Programs in Edgewater Park, NJ

Remote

Experienced School Year Tutor for K-8 Students in Math and Literacy - Mickleton, NJ

Remote

Experienced Secondary Social Studies Teacher for Kansas - Flexible Hybrid Remote Arrangement

Remote

USPS Office Helper

Remote

Veterinary Receptionist job at Mission Pet Health in Greenville, NC

Remote

ES Scoring Assistant Music - MA

Remote

Sr. Software Engineering Manager- AI Development

Remote

Immediate Hiring: Payroll Data Entry Specialist

Remote

Experienced or Aspiring Data Entry Specialist - Join Apple's Remote Team for a Rewarding Career in Data Management

Remote

Compliance Risk Management Specialist - Privacy & Data Protection at Doctor on Demand

Remote

Experienced Remote Data Entry Specialist – Flexible Part-Time Opportunity with arenaflex

Remote

Experienced Full Stack Business Data Scientist – Impact Measurement & Analytics

Remote

Experienced Customer Support Representative – Remote Healthcare Solutions

Remote

Case Manager Registered Nurse - Field Wayne and Macomb 5

Remote