Member of Technical Staff, Inference

Remote Full-time
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of models and hardware—a position that took years to build.

About the Role
We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. Models grow larger. Architectures shift: mixture-of-experts, multimodal, agentic. Every breakthrough demands innovation in the inference engine itself. You'll work at the core of vLLM, optimizing how models execute across diverse hardware and architectures. Your work will directly impact how the world runs AI inference.

Skills and Qualifications
Minimum qualifications:
Bachelor's degree or equivalent experience in computer science, engineering, or a related field.

Deep understanding of transformer architectures and their variants.

Strong programming skills in Python with experience in PyTorch internals.

Experience with LLM inference systems (vLLM, TensorRT-LLM, SGLang, TGI).

Ability to read and implement model architectures and inference techniques from research papers.

Demonstrated ability to contribute performant, maintainable code and to debug complex ML codebases.

Preferred qualifications:
Deep understanding of KV-cache memory management, prefix caching, and hybrid model serving.

Familiarity with RL frameworks and algorithms for LLMs.

Experience with multimodal inference (audio/image/video/text).

Contributions to open-source ML or system infrastructure projects.

Bonus points if you have:
Implemented core features in vLLM or other inference engine projects.

Contributed to vLLM integrations (verl, OpenRLHF, Unsloth, LlamaFactory, etc.).

Written widely shared technical blogs or side projects on vLLM or LLM inference.

Logistics
Location: This role is based in San Francisco, California. We will consider remote work within the US for exceptional candidates.

Compensation: Depending on background, skills, and experience, the expected annual salary range for this position is $200,000 - $400,000 USD + equity.

Visa sponsorship: We sponsor visas on a case-by-case basis.

Benefits: Inferact offers generous health, dental, and vision benefits, as well as a 401(k) company match.