Automation QA Engineer

Remote Full-time
Ciklum is looking for an Automation QA Engineer to join our team full-time in Poland.
We are a custom product engineering company that helps both multinational organizations and scaling startups solve their most complex business challenges. With a global team of over 4,000 highly skilled developers, consultants, analysts and product owners, we engineer technology that redefines industries and shapes the way people live.
About the role:
As an Automation QA Engineer, you will become part of a cross-functional development team engineering the experiences of tomorrow.
The Project: We are partnering with B&R Industrial Automation to significantly upgrade the Retrieval-Augmented Generation (RAG) architecture of their AS client's system. Our goal is to drastically reduce AI hallucinations in code generation and optimize retrieval latency without re-architecting their existing platform.
Responsibilities:
Own the evaluation lifecycle, offline acceptance testing, and KPI measurement for the AS client's RAG pipeline
Lead the co-creation and management of the project's "golden dataset" to consistently benchmark AI performance
Implement and manage the RAGAS evaluation harness and automated CI/CD regression testing
Track, classify, and build root-cause taxonomies for LLM hallucinations, with a specialized focus on code-generation correctness
Golden Dataset & Baselines: Collaborate with client domain experts and technical leads to build a robust synthetic test set (~90+ queries across multiple categories) and establish baseline metrics for Faithfulness, Context Precision, and Answer Relevance
Evaluation Harness: Build and automate evaluation pipelines using RAGAS and custom Python scripts, enabling A/B comparisons between the baseline, MVP, and full implementation
Regression & CI/CD Guardrails: Implement automated CI/CD regression checks within Azure DevOps, ensuring that a >5% drop in core metrics automatically blocks pipeline deployments
Hallucination Tracking: Develop a root-cause taxonomy for hallucinations and track code-generation queries separately to ensure the AI generates functionally correct and compilable output
Performance Benchmarking: Measure and monitor pipeline latency, rigorously validating P95 latency targets (sub-4.5s) under representative concurrent load
Requirements:
Background: Mid-to-Senior level experience in Data Science, Machine Learning Evaluation, AI Quality Assurance, or Data Engineering
Evaluation Frameworks: Deep, hands-on experience with LLM evaluation frameworks (e.g., RAGAS, DeepEval, TruLens) and establishing human-anchored or synthetic benchmarks
Technical Stack: Strong proficiency in Python. Solid experience with CI/CD tools (especially Azure DevOps) and integrating complex test suites into automated deployment pipelines
Data & Observability: Experience working with databases (PostgreSQL) and integrating custom telemetry or observability data (e.g., Azure App Insights) into evaluation reports
Analytical Mindset: Strong attention to detail with the ability to perform rigorous error analysis, build structured taxonomies for failures, and identify embedding drift
Personal skills:
Highly collaborative and data-driven; comfortable working directly with client SMEs to validate queries and presenting evaluation scorecards to guide engineering decisions
What's in it for you?
Strong community: Work alongside top professionals in a friendly, open-door environment
Growth focus: Take on large-scale projects with a global impact and expand your expertise
Tailored learning: Boost your skills with internal events (meetups, conferences, workshops), Udemy access, language courses, and company-paid certifications
Endless opportunities: Explore diverse domains through internal mobility, finding the best fit to gain hands-on experience with cutting-edge technologies
Flexibility: Fully remote work is available
Care: We’ve got you covered with company-paid medical insurance, mental health support, and financial & legal consultations
About us:
At Ciklum, we are always exploring innovations, empowering each other to achieve more, and engineering solutions that matter. With us, you’ll work with cutting-edge technologies, contribute to impactful projects, and be part of a One Team culture that values collaboration and progress.
With delivery centers in Wrocław and Gdańsk, our 300+ professionals in Poland drive forward-thinking solutions for global clients. Join a community where collaboration sparks innovation—and your impact reaches millions.
Want to learn more about us? Follow us on Instagram, Facebook, and LinkedIn.
Explore, empower, engineer with Ciklum!
Interested already? We would love to get to know you! Submit your application. We can’t wait to see you at Ciklum.
