[Remote] Lead AI Engineer
Note: This is a remote position open to candidates in the USA.

IXOPAY is an enterprise-grade global payment infrastructure platform that utilizes AI-driven intelligence for payment solutions. The Lead AI Engineer will design and implement AI agent systems, manage performance and cost, and lead the team in delivering AI-powered products and strategies.

Responsibilities
- Architect AI agent systems: Design, build, and deploy production single-agent and multi-agent systems using LangGraph on cloud AI platforms (Azure AI Foundry, AWS Bedrock, etc.), with Claude as the primary foundation model. Architect agent memory, state management, and context persistence for long-horizon reasoning tasks.
- Own evaluation and quality: Build and maintain evaluation harnesses that measure faithfulness, hallucination rate, retrieval quality, latency, and instruction-following accuracy across model versions and prompt changes. Define golden test sets, regression suites, and automated eval pipelines that gate every release.
- Build AI safety and guardrails: Implement safety guardrails, output filtering, and prompt injection defenses across all agent systems. Lead red-teaming exercises to identify failure modes before they reach production. Ensure responsible AI practices are embedded in every deployment.
- Ship RAG-powered intelligence: Build agentic RAG pipelines with retrieval, generation, validation, and ReAct-style reasoning loops that ground outputs in real payment data. Leverage the Claude SDK, Anthropic's tool-use APIs, and the MCP protocol to build agents that interact with internal systems, payment gateways, and external data sources.
- Manage cost and performance: Own token cost modeling and inference optimization across agent workflows. Understand the cost profile of every agent loop at production scale and make architecture decisions that balance capability with spend.
- Design human-in-the-loop patterns: Design human-in-the-loop review checkpoints and escalation paths for high-stakes workflows.
Define where agents operate autonomously and where human oversight is required, especially in payment-critical operations.
- Lead delivery streams: Own the AI delivery streams (Trust Score, StormTrooper, RFC Agent) from ideation through production deployment. Define technical direction, set quality standards, and make build-vs-buy decisions.
- Instrument observability: Set up and maintain LLM observability and tracing infrastructure (LangSmith, LangFuse, or equivalent) to monitor agent behavior, debug failures, and track quality metrics in production.
- Drive AI strategy: Partner with product, engineering, and payments teams to identify high-impact AI use cases: fraud scoring, intelligent routing, chargeback prediction, automated reconciliation, and beyond.
- Build the team: Mentor contractors and future hires. Establish coding standards, review patterns, and engineering culture for the AI function.

Skills
- Extensive software development experience, with many years of your career focused on data science, machine learning, and/or agentic AI development (or a combination)
- Demonstrated track record of building and deploying production-ready AI agents: not proofs of concept, not notebooks, but systems running in production with real users and real data.
You can describe a specific production failure you caught, what the failure mode was, and what you built to prevent it.
- Experience designing and running evaluation frameworks for LLM-powered systems: faithfulness scoring, hallucination detection, retrieval quality metrics, and regression testing across model and prompt changes
- Hands-on experience building agents with LangGraph (state machines, conditional edges, tool nodes, memory management, human-in-the-loop patterns)
- Production experience with the Claude SDK / Anthropic API: tool use, MCP protocol, structured outputs, and prompt engineering at scale
- Understanding of AI safety practices: output filtering, guardrail implementation, prompt injection defense, and red-teaming methodologies
- Strong Python skills. You write clean, testable, well-documented code.
- Experience with RAG architectures: vector stores (FAISS, Pinecone, OpenSearch), embedding models, chunking strategies, and retrieval evaluation
- Practical understanding of token cost modeling, inference optimization, and the trade-offs between fine-tuning, retrieval-based approaches, and prompt engineering at production scale
- Familiarity with cloud AI platforms (AWS Bedrock, Azure AI Foundry, GCP Vertex AI) and supporting cloud infrastructure services
- Experience with LLM observability and tracing tools (LangSmith, LangFuse, Arize, or equivalent) for monitoring agent behavior and quality metrics in production
- Ability to lead technical decisions and communicate trade-offs clearly to both engineering and business stakeholders
- Experience in payments, fintech, or financial services
- Background in NLP, information extraction, or document understanding
- Experience with fine-tuning approaches (LoRA, SFT, DPO/RLHF) and the judgment to know when fine-tuning is the right call versus retrieval or better prompting
- Prior experience scaling an AI/ML function from 0 to 1
- Contributions to open-source AI/agent frameworks

Benefits
- Competitive salary and benefits
- Opportunities for growth and development
- A collaborative and supportive team environment
- Medical, Dental & Vision Insurance
- Flexible Spending Account (FSA) & Health Savings Account (HSA)
- Employer-paid Life, AD&D, STD & LTD Insurance
- Unlimited PTO & Paid Holidays
- 401(k) Plan with Employer Match

Company Overview
IXOPAY provides digital payment processing solutions enabling independent, flexible, and global payment processing for enterprise merchants. It was founded in 2014 and is headquartered in Vienna, Austria, with a workforce of 51-200 employees. Its website is https://www.ixopay.com.