LLM / GenAI Engineer

Remote · Full-time
About The Role

The role is focused on architecting and scaling production-grade generative AI features, moving beyond basic API wrappers to build robust, deterministic systems powered by large language models. The engineer will design orchestration layers, optimize retrieval-augmented generation (RAG) workflows, and implement strict evaluation and guardrail systems to ensure safety, accuracy, and low latency at scale.

The team works at the intersection of modern software engineering and applied AI. This role involves collaborating with backend engineers and product owners to integrate intelligence into core platform workflows, ensuring LLM applications are observable, cost-effective, and highly performant.

Key Responsibilities
• Design and optimize advanced RAG pipelines, utilizing hybrid search, query rewriting, and reranking strategies to maximize retrieval quality.
• Implement systematic LLM evaluation pipelines using frameworks like Ragas, TruLens, or custom LLM-as-a-judge architectures to measure hallucination and accuracy.
• Integrate and manage enterprise-grade vector databases such as Pinecone, Milvus, or pgvector, including indexing strategies and metadata filtering.
• Develop agentic workflows and multi-agent systems using frameworks like LangGraph, AutoGen, or custom state machines.
• Deploy, fine-tune, and optimize open-source models (e.g., Llama, Mistral) using LoRA, QLoRA, and quantization techniques for specialized tasks.
• Build robust guardrails and alignment layers using tools like NeMo Guardrails or Llama Guard to ensure safe and deterministic model behavior.
• Monitor LLM latency, cost, and token usage in production using tracing tools such as LangSmith, Phoenix, or Arize.

What We Are Looking For
• 3-6 years of professional software engineering experience, with at least 1.5 years dedicated to building and deploying LLM applications in production.
• Deep proficiency in Python and familiarity with asynchronous programming, FastAPI, and containerization via Docker.
• Hands-on experience with LLM orchestration frameworks like LangChain, LlamaIndex, or DSPy.
• Strong understanding of modern NLP techniques, embedding models, vector spaces, and semantic search.
• Experience deploying production applications on AWS, GCP, or Azure, utilizing managed Kubernetes or serverless containers.
• Bachelor's or Master's degree in Computer Science, Data Science, or a related quantitative technical field.
• Bonus: Experience with vLLM, TensorRT-LLM, custom model hosting, or contribution to open-source GenAI frameworks.
