[Remote] Principal Data Scientist (AI)
Note: This is a remote job open to candidates in the USA.

Octave is a company that provides mission-critical software for organizations to make informed decisions across the asset lifecycle. They are seeking a Principal Data Scientist to build predictive models and implement Generative AI features for their compliance management platform. The role requires expertise in developing and maintaining ML systems in production environments.

Responsibilities
- Build and deploy Generative AI features using foundation models (AWS Bedrock, OpenAI, Anthropic Claude) and RAG architectures with vector databases for compliance document understanding
- Design agentic AI systems that autonomously handle compliance workflows, document review, regulatory mapping, and multi-step reasoning tasks
- Implement comprehensive LLM evaluation frameworks with automated pipelines, custom metrics, benchmark datasets, and safety guardrails ensuring regulatory compliance
- Build end-to-end MLOps pipelines for model training, deployment, monitoring, versioning, and automated retraining with drift detection
- Develop predictive models for compliance risk scoring, regulatory change impact, anomaly detection, and time-series forecasting
- Write production-quality Python code for data processing, feature engineering, API development (FastAPI/Flask), and ETL/ELT workflows
- Lead A/B experiments and product analytics to measure AI feature impact and drive data-driven decision-making
- Create explainability frameworks (SHAP/LIME) and monitoring dashboards ensuring transparency and regulatory adherence
- Collaborate with cross-functional teams to translate business needs into ML solutions and communicate insights to stakeholders

Skills
- 7+ years in data science, ML engineering, or related roles
- 3+ years building NLP/generative AI applications and implementing MLOps in production
- Bachelor's or Master's degree in Data Science, Computer Science, Statistics, or related field (PhD preferred)
- Track record of deploying ML systems processing large-scale datasets with proper monitoring and governance
- Python (5+ years): Production-level experience with Pandas, NumPy, scikit-learn, XGBoost, TensorFlow/PyTorch, Hugging Face Transformers, FastAPI/Flask, MLflow, and pytest
- SQL: Advanced proficiency with complex queries, window functions, and optimization
- Machine Learning & NLP: Strong foundation in supervised/unsupervised learning, deep learning, document understanding, text classification, and semantic analysis
- Generative AI & LLMs: Hands-on experience with foundation models (GPT, Claude, Llama), prompt engineering, RAG architectures, and vector databases (Pinecone, Weaviate, Chroma)
- MLOps & ModelOps: End-to-end experience with ML pipelines, experiment tracking (MLflow, W&B), model versioning, feature stores, drift detection, CI/CD for ML, and Docker containerization
- LLM Evaluation: Experience with evaluation frameworks (RAGAS, DeepEval), custom metrics, benchmark datasets, and human-in-the-loop validation
- Cloud & AWS: Experience with AWS services including SageMaker, Bedrock, S3, Lambda, EC2, and CloudWatch
- Statistics & Experimentation: Strong foundation in statistics, A/B testing, causal inference, and experimental design
- Visualization: Proficiency with Tableau, Power BI, or Python visualization libraries
- Experience with agentic AI frameworks (LangGraph, LangChain, AutoGen, CrewAI)
- Knowledge of Life Sciences/regulated industries (FDA, EMA, ISO, GxP) and compliance management systems
- Familiarity with big data tools (Spark, Databricks, Snowflake), orchestration (Airflow, Kubeflow), and monitoring tools (Datadog, Prometheus)
- Experience with LLM fine-tuning, document processing libraries, multi-modal AI, or distributed training
- Understanding of ML governance, bias detection, model risk management, and data privacy regulations (GDPR, CCPA, HIPAA)
- Experience working in agile environments with Jira
- AWS ML certifications or similar credentials

Company Overview
Octave provides mission-critical software that empowers organizations to make informed decisions across every stage of the asset lifecycle. It was founded in 1985 and is headquartered in Madison, Alabama, USA, with a workforce of 5,001-10,000 employees. Its website is https://www.octave.com/.