Applied AI Data Engineer (Vector Databases, Data Management)

Remote Full-time
About the position

As an Applied AI Data Engineer, you will be responsible for building data pipelines, vector embeddings, and retrieval mechanisms that power AI reasoning systems. Your work ensures that LLMs remain grounded in fact, efficiently retrieving high-quality, contextually relevant data without noise or hallucinations. You will design and implement features that harness vector search, retrieval-augmented generation (RAG), and domain-specific embeddings, directly influencing how AI models store, retrieve, and apply knowledge at scale.

Responsibilities

• Build and optimize data pipelines that transform incoming documents into high-quality embeddings for AI retrieval.
• Design and implement vector search strategies using Pinecone, Weaviate, FAISS, or Vespa to improve AI response relevance.
• Develop retrieval-augmented generation (RAG) workflows, ensuring models access up-to-date and high-quality context.
• Fine-tune chunking strategies and indexing frequencies to enhance information recall and factual accuracy.
• Integrate hybrid search approaches (semantic + keyword) to improve precision and efficiency in knowledge retrieval.
• Monitor retrieval logs and LLM interaction patterns, adjusting embedding configurations for maximum relevance.
• Compare model performance (GPT-4, Claude, Llama 2) across different embedding structures and refine tuning strategies.
• Experiment with metadata filtering techniques to dynamically surface the most relevant data for AI reasoning agents.
• Collaborate with ML engineers and AI researchers to ensure data pipelines align with evolving AI capabilities.

Requirements

• 5-8+ years of experience in Data Engineering, AI Systems, or Machine Learning Infrastructure.
• 3+ years of hands-on experience working with vector databases, embeddings, and retrieval-augmented generation (RAG).
• Strong understanding of vector search algorithms, indexing strategies, and hybrid search techniques.
• Expertise in building and scaling data pipelines for AI-driven applications.
• Proficiency in Python, along with experience using libraries such as Hugging Face, LangChain, and OpenAI SDKs.
• Hands-on experience with vector database platforms (Pinecone, Weaviate, FAISS, ChromaDB, or Vespa).
• Deep knowledge of LLM retrieval strategies, chunking methodologies, and context optimization.
• Familiarity with semantic search, keyword search, and metadata filtering techniques.
• Strong grasp of data governance, security, and optimization for AI-driven knowledge retrieval.
• Experience integrating retrieval mechanisms with multi-agent AI systems.

Nice-to-haves

• Experience in fine-tuning transformer models for domain-specific retrieval tasks.
• Familiarity with real-time indexing and adaptive embedding refresh strategies.
• Understanding of LLM hallucination mitigation and factual consistency techniques.
• Experience building scalable knowledge graphs and structured AI databases.
• Background in AI-powered document processing and knowledge extraction.

Benefits

• Medical, dental, vision insurance.
• Short and long-term disability insurance.
• Life insurance.
• 401k available on the first day of the month after start date.
• Flexible PTO.
