[Remote] Data Scientist, AI Data Foundations

Remote Full-time
Note: This is a remote position open to candidates in the USA. MeridianLink is a company focused on data engineering and AI applications. The Data Scientist in AI Data Foundations will design and build curated data structures for AI and ML applications, ensuring high-quality data for model training and inference while leading data discovery efforts to uncover trends in lending and account-opening data.

Responsibilities

- Build and maintain vector stores for RAG: Design embedding pipelines, chunking strategies, indexing approaches, and refresh patterns for the vector stores powering retrieval-augmented generation across MeridianLink products
- Own the feature store: Design, build, and operate feature store assets used for model training and online/offline inference, including feature definitions, freshness SLAs, lineage, point-in-time correctness, and reuse across teams
- Design graph data structures: Build graph databases that model relationships between applicants, applications, products, lenders, decisions, and outcomes, and make them queryable for both AI use cases and analytical investigations
- Lead data discovery: Profile our lending, deposit, and behavioral datasets to identify hidden trends, segments, anomalies, and potential model drivers; turn findings into actionable hypotheses for product, risk, and growth teams
- Engineer for AI consumption: Build the curated, AI-ready datasets that downstream model builders, application engineers, and analysts rely on, with appropriate quality, documentation, and governance baked in
- Evaluate retrieval and feature quality: Define and run evaluation frameworks for RAG retrieval quality, feature drift, embedding quality, and graph completeness; iterate based on what the metrics tell you
- Partner with model builders: Work closely with ML engineers and applied scientists to make sure the data structures you build accelerate their work rather than slow it down
- Champion responsible data use: Partner with governance, security, and compliance to ensure that AI-facing data assets respect data classification, customer consent, and regulatory boundaries from day one
- Communicate findings: Translate discovery work into clear narratives (write-ups, notebooks, dashboards, and short presentations) that help non-technical stakeholders act on what the data is showing

Skills

- 4–7 years of experience in a data science, ML engineering, or applied data role, with a meaningful portion of that time spent building data assets that other people's models or applications consumed
- Hands-on experience designing and operating vector stores for RAG or semantic search, including embedding generation, chunking, indexing, and retrieval evaluation
- Experience building or operating a feature store (e.g., Databricks Feature Store, Feast, or a custom internal platform), including offline training and online serving patterns and point-in-time correctness
- Experience modeling and building graph data structures using Neo4j, TigerGraph, Azure Cosmos DB Gremlin, or similar graph databases, and writing graph queries to answer real questions
- Strong proficiency in Python (pandas, NumPy, scikit-learn, PySpark) and SQL; comfortable working day-to-day in Databricks notebooks and jobs
- Practical experience with embedding models and LLM tooling (e.g., Hugging Face transformers, OpenAI / Azure OpenAI APIs, LangChain or similar) in a production or near-production context
- Demonstrated data discovery skills: profiling messy real-world datasets, surfacing non-obvious patterns, validating findings statistically, and explaining them clearly
- Solid grounding in classical ML concepts (supervised vs. unsupervised learning, train/test discipline, leakage, evaluation metrics), even though you will not own model training day-to-day
- Strong written and verbal communication skills; able to write up findings for both technical and business audiences
- Experience working in a SaaS or FinTech environment, particularly with lending, deposit, credit, fraud, or KYC/AML data
- Experience with Databricks-native AI/ML tooling: Databricks Vector Search, Databricks Feature Store, MLflow, and Unity Catalog
- Familiarity with open-source or managed vector stores such as pgvector, Pinecone, Weaviate, Chroma, or FAISS, and a clear point of view on when to use which
- Experience with Microsoft Azure data and AI services (Azure OpenAI, Azure AI Search, ADLS Gen2)
- Experience evaluating RAG systems end-to-end (recall@k, faithfulness, answer quality, hallucination measurement)
- Exposure to graph algorithms (community detection, link prediction, centrality) applied to real business problems
- Bachelor's or Master's degree in Computer Science, Statistics, Mathematics, Engineering, or a related quantitative field, or equivalent professional experience

Company Overview

MeridianLink provides a configurable digital lending platform for financial institutions. It was founded in 1998 and is headquartered in Costa Mesa, California, USA, with a workforce of 501-1000 employees. Its website is https://www.meridianlink.com.

Company H1B Sponsorship

MeridianLink has a track record of offering H1B sponsorships, with 14 in 2025, 5 in 2024, 1 in 2023, 12 in 2022, 11 in 2021, and 1 in 2020. Please note that this does not guarantee sponsorship for this specific role.
