[Remote] Machine Learning Platform Lead Engineer, Training and Inference

Remote Full-time
Note: This is a remote job open to candidates in the USA.

Paramount is a company on a mission to unleash the power of content, and they are seeking a Senior Lead ML Platform Engineer to architect and own the technical direction for their Training and Inference infrastructure. This role involves leading the adoption and optimization of distributed training and managing a high-performance inference environment to ensure efficient model training and serving.

Responsibilities

Technical Roadmap & Strategy: Own the long-term architectural direction for the Training and Inference domains, ensuring the platform scales 10x over a 1–3 year horizon
Distributed Training Leadership: Lead the implementation and optimization of Ray/AnyScale, providing a unified compute layer for batch processing, model training, and reinforcement learning
High-Performance Inference: Design and maintain K8s-based inference servers (e.g., Triton, TorchServe, or vLLM) optimized for GPU memory management and high throughput
Hardware & Cost Optimization: Navigate the trade-offs between different GPU instances (A100s, H100s, T4s), optimizing for cost, availability, and performance
Cross-Team Standardization: Solve high-leverage problems that affect multiple pods (e.g., Entry, Session, Presentation), establishing reusable patterns for CI/CD, model versioning, and canary deployments
Reliability Engineering: Define and enforce SLIs/SLOs for the platform, ensuring that infrastructure failures never interrupt the user-facing personalization experience
Mentorship & Coaching: Act as a technical mentor to senior engineers across the ML Platform and Applied ML pods, raising the bar for system design and operational rigor

Skills

6-8+ years of experience in ML Infrastructure, Platform Engineering, or high-scale Backend Engineering
Extensive experience with Kubernetes (K8s) and serving frameworks for large-scale ML models
Strong knowledge of GPU architecture, CUDA, and optimizing ML workloads for hardware acceleration
Proven track record of owning the technical direction for a major domain and driving impact across multiple teams
Experience with Infra-as-Code (Terraform/Pulumi) and building automated MLOps pipelines
Deep expertise with Ray (AnyScale) or similar distributed compute frameworks
Familiarity with ML observability tools (Prometheus, Grafana, Weights & Biases, or MLFlow)
Experience managing multi-cloud or hybrid-cloud ML environments
Deep knowledge of Python and C++ for performance-critical systems

Benefits

Medical
Dental
Vision
401(k) plan
Life insurance coverage
Disability benefits
Tuition assistance program
PTO
This position is bonus eligible.
Generous paid time off.
Opportunities for both on-site and virtual engagement events.
Unique opportunities to make meaningful connections and build a vibrant community, both inside and outside the workplace.

Company Overview

Paramount is a leading media and entertainment company that creates premium content and experiences for audiences worldwide. It was founded in 1914 and is headquartered in New York, New York, USA, with a workforce of 10,001+ employees. Its website is https://www.paramount.com.

Company H1B Sponsorship

Paramount has a track record of offering H1B sponsorships, with 2 in 2024. Please note that this does not guarantee sponsorship for this specific role.

