[Remote] Lead AI Engineer
Note: This is a remote position open to candidates in the USA.

EPAM Systems is seeking a Lead AI Engineer to design, build, and scale cutting-edge AI applications powered by large language models. In this role, you will partner with clients to deliver tailored LLM-driven solutions, architect agentic systems, and drive the adoption of emerging AI technologies across enterprise environments.

Responsibilities
Design, implement and maintain end-to-end AI applications, including chatbots, Q&A platforms, agent workflows and other LLM-driven solutions
Collaborate directly with clients to understand their needs, identify opportunities and recommend tailored AI/LLM solutions that drive business value
Architect and optimize robust data pipelines, prompt strategies and datasets to ensure effective, accurate and scalable AI models
Evaluate, monitor and refine AI system performance, ensuring outputs are accurate, secure, scalable and compliant with industry regulations and best practices
Conduct research, design experiments and perform rapid prototyping to validate technical feasibility and demonstrate the business value of AI solutions
Stay current with evolving LLM technologies, frameworks, protocols (such as MCP, A2A, ACP) and methodologies, and continuously improve solution quality and client outcomes
Design and implement agentic systems with frameworks such as LangChain, LangGraph and Semantic Kernel, integrating with vector databases and advanced memory architectures
Develop and maintain APIs and system integrations for production-grade AI applications, including enterprise system integration (CRM, ERP, databases)
Deploy AI solutions at scale, considering performance, cost-efficiency, maintainability, observability and security (including guardrails and prompt injection prevention)
Implement and monitor retrieval systems (keyword search, vector search, embeddings), ranking algorithms and agent evaluation frameworks
Use MLOps/AIOps practices for agentic systems and ensure robust observability and monitoring of deployed solutions
Clearly communicate complex technical concepts and AI strategies to both technical and non-technical stakeholders, and iterate on models based on user feedback

Skills
Strong proficiency in at least one modern programming language (such as Python, Java, C# or Go); experience with web frameworks like FastAPI or similar is a plus
Deep understanding of the AI application development lifecycle, including production deployment, system integration and rapid UI prototyping (Streamlit, Gradio or similar)
Familiarity with major LLM platforms and APIs (OpenAI, Anthropic, Amazon Bedrock, Gemini) and related frameworks (LangChain, LangGraph, LlamaIndex, Strands Agents, etc.)
Knowledge of advanced AI integration patterns (e.g., RAG, agent orchestration, tool calling), retrieval systems (keyword/vector search, embeddings) and ranking algorithms
Experience deploying AI solutions at scale, with a focus on performance, cost-efficiency, maintainability, observability and security (including guardrails and prompt injection prevention)
Proven ability to evaluate generative AI quality with retrieval/classification scores, LLM-based evaluation, agent evaluation metrics and A/B testing
Experience with vector databases (Pinecone, Weaviate, ChromaDB, FAISS) and semantic/hybrid search
Experience designing experiments, conducting A/B tests and iterating on models based on user feedback
Experience with enterprise system integration (CRM, ERP, databases) and deployment to cloud AI platforms or on-premise solutions
Experience with observability and monitoring tools/frameworks, and application of MLOps/AIOps practices for agentic systems
Familiarity with emerging protocols (MCP, A2A, ACP) and advanced memory architectures
Proven experience in AI engineering and delivery of ML-based solutions in production environments
Strong problem-solving skills, attention to detail and ability to work independently and collaboratively
Excellent communication, collaboration and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders

Benefits
Career plan and real growth opportunities
Unlimited access to LinkedIn Learning solutions
Constant training, mentoring, online corporate courses, eLearning and more
English classes with a certified teacher
Support for employee initiatives (Algorithms club, Toastmasters, agile club and more)
Enjoyable working environment (gaming room, napping area, amenities, events, sports teams and more)
Flexible work schedule and dress code
Collaborate in a multicultural environment and share best practices from around the globe
Hired directly by EPAM and 100% under payroll
Statutory benefits (IMSS, INFONAVIT, 25% vacation bonus)
Insurance: life and major medical expenses with dental and vision coverage (for the employee and direct family members)
13% employee savings fund, capped at the legal limit
Grocery coupons
30-day December bonus
Employee Stock Purchase Plan
12 vacation days
Official Mexican holidays, plus 5 extra holidays (Maundy Thursday and Good Friday, November 2nd, December 24th and 31st)
Monthly non-taxable amount for electricity and internet bills

Company Overview
EPAM leverages its core engineering expertise as a leading global product development and digital platform engineering services company. It was founded in 1993 and is headquartered in Newtown, Pennsylvania, USA, with a workforce of 10,001+ employees. Its website is https://www.epam.com.