MLOps Lead, Central Technology

Remote Full-time
About the position

Responsibilities
• Provide technical MLOps leadership for a team of MLOps Engineers, managing and leading the team in operating AI training and inference systems.
• Drive the application of MLOps and DevOps principles across multiple platforms, ensuring peak operational efficiency.
• Define an end-to-end metrics program, including full proactive monitoring and alerting systems for the MLOps team.
• Facilitate model training through collaboration with AI Researchers to ensure best practices in machine learning and deep learning.
• Optimize the Kubernetes-based AI Lifecycle platform through IaC practices and integrate it with on-prem HPC systems.
• Collaborate on data systems for AI model training with the Data Infrastructure Engineering team and Science data teams.
• Lead the MLOps team supporting the on-call rotation, with a focus on automation and proactive alerting.

Requirements
• BS, MS, or PhD degree in Computer Science or a related technical discipline, or equivalent experience.
• 7+ years of relevant coding and systems experience.
• 5+ years of systems architecture and design experience, with a broad range of MLOps experience.
• Proven technical leadership in SRE and MLOps.
• Strong experience scaling containerized applications on Kubernetes or Mesos.
• Cloud platform proficiency with AWS, GCP, or Microsoft Azure.
• MLOps experience working with medium- to large-scale GPU clusters in Kubernetes.
• Working knowledge of NVIDIA CUDA and custom AI/ML libraries.
• Knowledge of Linux systems optimization and administration.
• Solid coding experience with a systems language such as Rust, C/C++, C#, Go, Java, or Scala.
• Expertise with a scripting language such as Python, PHP, or Ruby.
• Experience integrating data with the AI Lifecycle.
• AI/ML platform operations experience in an environment with demanding data and systems platform challenges.
• Experience integrating large-scale streaming data systems.
• Experience with Hadoop, Spark, and/or Kafka deployments.
• Experience with workflow scheduling tools such as Apache Airflow, Dagster, or Apache Beam.
• Understanding of Data Engineering, Data Governance, Data Infrastructure, and AI/ML execution platforms.

Nice-to-haves
• Experience with PyTorch, Keras, or TensorFlow.
• Experience with HPC and Slurm.

Benefits
• Generous employer match on employee 401(k) contributions.
• Annual benefit for employees that can be used for housing, student loan repayment, childcare, commuter costs, or other life needs.
• CZI Life of Service Gifts awarded to employees to support causes closest to them.
• Paid time off to volunteer at an organization of your choice.
• Funding for select family-forming benefits.
• Relocation support for employees moving to the Bay Area.
