[Remote] Software Engineer – AI Coding Evaluation

Remote Full-time
Note: This is a remote position open to candidates in the USA. MillionLogics is a global leader in IT solutions specializing in Data & AI, Cloud Solutions, and IT Consulting. They are seeking experienced Software Engineers to evaluate and improve the coding capabilities of frontier AI models by assessing AI-generated code and developing high-quality evaluation datasets and benchmarks.

Responsibilities
- Review and evaluate AI-generated code for correctness, efficiency, maintainability, and adherence to requirements
- Analyze software engineering tasks and validate whether proposed solutions meet expected outcomes
- Debug code, reproduce issues, and verify fixes across different programming environments
- Assess model-generated explanations, reasoning, and implementation approaches for technical accuracy
- Create, refine, and maintain evaluation datasets, benchmarks, and grading rubrics for coding tasks
- Identify edge cases, failure modes, and areas where AI systems struggle with software engineering problems
- Document findings clearly and provide structured feedback to improve evaluation quality and consistency
- Collaborate with project teams to establish quality standards and evaluation methodologies

Skills
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related technical field
- 3+ years of professional software engineering experience
- Strong proficiency in one or more of the following languages: Python, Java, C/C++, Go, Swift, Objective-C, PHP, or SQL
- Strong understanding of data structures, algorithms, software design principles, and debugging methodologies
- Experience performing code reviews and evaluating code quality in production or large-scale codebases
- Ability to analyze complex technical problems and assess solution correctness with minimal supervision
- Familiarity with version control systems (e.g., Git) and modern software development workflows
- Strong written communication skills and attention to detail
- Experience with AI/ML data annotation, NLP, prompt engineering, model evaluation, or LLM-related projects
- Experience evaluating AI-generated code, benchmark creation, or software quality assessment

Benefits
- Mode of Work: Remote
- Contract: 12 months
- Commitments Required: At least 4 hours per day and a minimum of 20 hours per week, with 4 hours of overlap with PST
- Engagement type: Contractor assignment (no medical/paid leave)

Company Overview
As a trusted Oracle Partner, MillionLogics is more than just an IT solutions provider - it's a global powerhouse blending innovation, expertise, and strategic vision. It was founded in 2020 and is headquartered in London, United Kingdom, with a workforce of 51-200 employees. Its website is https://www.millionlogics.com.

