AI Evaluation Manager

Remote Full-time
About the Role

Luma is pushing the boundaries of generative AI, building tools that redefine how visual content is created. We’re seeking a candidate to help shape and scale the way we understand, measure, and improve model performance. In this role, you’ll partner with researchers, engineers, and technical artists to evaluate our models against real-world creative use cases, design frameworks that capture qualitative nuance, and identify actionable insights that guide development.

This is not a checkbox metrics role: it's about building evaluative systems that match the complexity of human perception, creativity, and intention.

Responsibilities

- Evaluate generative model performance across diverse tasks, prompts, and modalities.
- Identify key failure modes, regression patterns, and edge cases that impact product quality.
- Develop and maintain qualitative evaluation frameworks that are scalable and reusable.
- Collaborate closely with technical artists and engineers to align evaluations with model capabilities and target use cases.
- Translate high-level product goals into concrete evaluative criteria.
- Lead qualitative studies, side-by-side comparisons, and human-in-the-loop evaluation efforts.
- Provide detailed feedback that informs model fine-tuning, dataset curation, and product UX.
- Stay informed about emerging evaluation standards in generative AI and creative tools.

Qualifications

- Master’s degree or higher in Cognitive Science, Human-Computer Interaction (HCI), Design Research, Psychology, Media Studies, or a related field.
- 5+ years of experience in product evaluation, UX research, model testing, or similar roles that involve structured qualitative assessment.
- Deep familiarity with creative workflows and real-world use cases for generative models (e.g., animation, filmmaking, digital art, VFX).
- Strong systems thinking and the ability to define abstract qualities (like believability, identity retention, or scene coherence) in clear evaluative terms.
- Experience working cross-functionally with engineers, researchers, and creatives.
- Excellent written communication skills and the ability to synthesize nuanced judgments into clear, actionable insights.

Nice to Have

- Background in motion, visual effects, or storytelling pipelines
- Experience evaluating AI-generated media (video, images, 3D)
- Prior work on building internal tools for qualitative data collection or scoring
- Familiarity with prompt engineering and reference-based input methods

Originally posted on Himalayas

