[Remote] ML Ops Engineer (AI)
Note: This is a remote position open to candidates in the USA.

SewerAI is transforming underground infrastructure management through AI-powered inspection and risk analysis. They are seeking an MLOps Engineer to own the Machine Learning Operations infrastructure that powers their AI products, focusing on designing and scaling systems for machine learning models in sewer line analysis.

Responsibilities
- Architectural Hardening: Audit, secure, and optimize our existing cloud infrastructure (AWS) to ensure high availability, fault tolerance, and security for both training and production workloads
- Model Deployment & Inference: Design and maintain scalable architectures for serving deep learning models (PyTorch/TensorFlow), optimizing for low latency and high throughput in handling complex infrastructure data
- CI/CD for Machine Learning: Build and maintain automated pipelines for model testing, validation, deployment, and rollback
- Training Infrastructure: Architect efficient, scalable compute environments for training complex computer vision and time-series models on large datasets
- Monitoring & Observability: Implement comprehensive monitoring for model drift, data quality, and system health, ensuring rapid response to performance degradation

Skills
- Deep expertise in AWS (e.g., EC2, S3, EKS, SageMaker, Lambda) and cloud security best practices
- Strong experience with Docker and Kubernetes for packaging and scaling ML applications
- Proficiency with tools like Terraform or AWS CloudFormation
- Experience building robust automated pipelines using GitHub Actions, GitLab CI, or Jenkins
- Strong Python skills with a focus on writing clean, production-grade, and well-tested code
- Familiarity with model registry and tracking tools (e.g., MLflow, Weights & Biases)
- 4-6+ years of experience in MLOps, DevOps, or Data Engineering, with a strong emphasis on machine learning workloads
- A security-first and stability-first mindset: you think about edge cases, failure modes, and system hardening by default
- Strong collaborative instincts to work closely with Data Scientists, ensuring smooth handoffs from experimentation to production
- Clear communication skills to articulate architectural decisions and tradeoffs to the broader technical team
- Experience with our specific data stack (Hex, dbt, ClickHouse, Anyscale, Ray, Deeplake)
- Familiarity with deep learning frameworks (PyTorch preferred) and optimization techniques like TensorRT or ONNX
- Knowledge of edge computing or deploying models to IoT devices
- Experience in the infrastructure, utility, or geospatial domains

Benefits
- Equity opportunities available
- Medical, Dental, Vision, Basic Life, 401(k), and more
- Unlimited PTO
- Tools and resources to support success
- Competitive compensation with high-growth potential

Company Overview
SewerAI is transforming the way utilities maintain & rehab their underground infrastructure. It was founded in 2019 and is headquartered in Walnut Creek, California, USA, with a workforce of 51-200 employees. Its website is http://www.sewerai.com.

Company H1B Sponsorship
SewerAI has a track record of offering H1B sponsorships, with 1 in 2023 and 1 in 2021. Please note that this does not guarantee sponsorship for this specific role.