Member of Technical Staff, Training (Bay Area, Remote)

Remote Full-time
What You’ll Do
Drive down wall-clock time to convergence by profiling and eliminating bottlenecks across the foundation model training stack, from data pipelines to GPU kernels

Design, build, and optimize distributed training systems (PyTorch) for multi-node GPU clusters, ensuring scalability, robustness, and high utilization

Implement efficient low-level code (CUDA, cuDNN, Triton, custom kernels) and integrate it seamlessly into high-level training frameworks

Optimize workloads for hardware efficiency: CPU/GPU compute balance, memory management, data throughput, and networking

Develop monitoring and debugging tools for large-scale runs, enabling rapid diagnosis of performance regressions and failures

What You’ll Bring
Deep experience in distributed systems, ML infrastructure, or high-performance computing (8+ years)

Production-grade expertise in Python

Low-level performance mastery: CUDA/cuDNN/Triton, CPU–GPU interactions, data movement, and kernel optimization

Scaling at the frontier: experience with PyTorch and training jobs using data, context, pipeline, and model parallelism

System-level mindset with a track record of tuning hardware–software interactions for maximum utilization
