[Remote] Software Engineer - GPU Kernels
Note: This is a remote position open to candidates in the USA.

Baseten is an innovative company powering AI solutions for leading firms like Notion and OpenEvidence, and it is seeking a GPU Kernel Engineer to enhance AI model performance. This role focuses on designing high-performance GPU kernels and optimizing computation for machine learning operations, directly impacting production systems for millions of users.

Responsibilities
- Design and implement high-performance GPU kernels for key ML operations, including matrix multiplications, attention mechanisms, and mixture-of-experts routing
- Write and optimize code using CUDA, PTX assembly, and architecture-specific techniques
- Apply advanced performance optimization methods such as memory coalescing, warp-level programming, tensor core acceleration, and compute/memory overlap
- Implement cutting-edge features like quantization (FP8/FP4), sparsity, and compute/communication overlap
- Identify and resolve performance bottlenecks using tools like Nsight Systems, Nsight Compute, and Torch Profiler
- Collaborate with research teams to productionize theoretical advancements
- Contribute to internal and open-source GPU libraries
- Present technical contributions at industry conferences (e.g., NVIDIA GTC, AWS re:Invent)

Skills
- Strong understanding of GPU architecture and programming paradigms:
  - Memory hierarchy (global, shared, registers, L1/L2 cache)
  - Thread/block/grid organization
  - Synchronization techniques and race condition mitigation
- Proficiency in C++ and GPU performance profiling tools
- Knowledge of:
  - CUDA C++ API
  - Memory access patterns and bandwidth optimization
  - Numerical precision and quantization strategies
  - Modern GPU features (e.g., tensor cores, async operations)
- Experience with Transformer models and attention optimization (e.g., Flash Attention)
- Familiarity with GPU kernel libraries: CUTLASS, Triton, Thrust, CUB
- Background in GEMM tuning and distributed/multi-GPU compute
- Contributions to open-source GPU projects
- Research publications or conference presentations on GPU performance

Benefits
- Competitive compensation, including meaningful equity
- 100% coverage of medical, dental, and vision insurance for employees and dependents
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
- Paid parental leave
- Fertility and family-building stipend through Carrot
- Company-facilitated 401(k)
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities

Company Overview
Baseten is an AI infrastructure company that integrates machine learning into business operations, production, and processes. It was founded in 2019 and is headquartered in San Francisco, California, USA, with a workforce of 201-500 employees. Its website is https://www.baseten.co.

Company H1B Sponsorship
Baseten has a track record of offering H1B sponsorships, with 1 in 2026, 6 in 2025, 8 in 2024, 1 in 2023, and 1 in 2020. Please note that this does not guarantee sponsorship for this specific role.