[Remote] DevOps Engineer - Senior Vice President
Note: This is a remote job open to candidates in the USA. iCapital is a company focused on ensuring that production and development environments operate smoothly and securely. They are seeking a Senior Vice President DevOps Engineer to leverage advanced cloud capabilities, support MLOps pipelines, and partner with various teams to deliver automated platforms for AI and machine learning workloads.

Responsibilities
- Design, build, and operate MLOps pipelines supporting the full ML lifecycle (training, validation, deployment, monitoring)
- Enable production workloads for AI/ML and Generative AI systems, including LLM-based services
- Develop and maintain CI/CD pipelines for AI/ML services and supporting infrastructure
- Build and manage cloud-native infrastructure on AWS, with heavy use of Kubernetes and containerized workloads
- Automate infrastructure provisioning and configuration using Infrastructure as Code (Terraform)
- Implement model versioning, experiment tracking, and artifact management across environments
- Ensure reliability, scalability, observability, and cost efficiency of AI platforms
- Partner with AI/ML engineers to operationalize models and standardize deployment patterns
- Implement monitoring and alerting for system health, model performance, and drift
- Enforce security, compliance, and governance requirements for AI workloads
- Participate in incident response, root cause analysis, and continuous improvement initiatives
- Document standards, best practices, and reference architectures for MLOps and AI infrastructure

Skills
- 15+ years of experience in DevOps, SRE, or Platform Engineering, with AWS as a primary cloud
- Experience supporting machine learning systems in production, including deployment and monitoring concerns
- Hands-on experience with machine learning platforms, particularly AWS SageMaker (required)
- Strong hands-on experience with Kubernetes, containerized workloads, and cloud networking
- Proven experience building and operating CI/CD pipelines (e.g., GitLab CI, ArgoCD)
- Strong proficiency with Terraform and scripting/programming in Python or similar languages
- Solid Linux, systems, and troubleshooting fundamentals
- Excellent communication skills and ability to work across teams
- Direct experience with MLOps platforms and tooling (model registries, experiment tracking, feature stores)
- Exposure to Generative AI / LLM workloads in production environments
- Familiarity with data stores commonly used in ML systems (e.g., Postgres, DynamoDB, object storage)
- Experience operating in regulated or fintech environments
- Background in cost optimization for compute-intensive workloads
- Strong written and verbal communication skills
- AWS certifications are a plus

Benefits
- Equity for all full-time employees
- An annual performance bonus
- A comprehensive benefits package that includes an employer-matched retirement plan
- Generously subsidized healthcare with 100% employer-paid dental, vision, telemedicine, and virtual mental health counseling
- Parental leave
- Unlimited paid time off (PTO)
- Employees in this role will work in the office Monday-Thursday, with the flexibility to work remotely on Friday

Company Overview
Dice is a job-searching platform for technology professionals. It is a sub-organization of DHI Group. It was founded in 1990 and is headquartered in Santa Clara, California, USA, with a workforce of 201-500 employees. Its website is http://www.dice.com.

Company H1B Sponsorship
Dice has a track record of offering H1B sponsorships, with 2 in 2022, 4 in 2021, and 5 in 2020. Please note that this does not guarantee sponsorship for this specific role.