[Remote] Senior Platform Telemetry Engineer

Remote, Full-time
Note: This is a remote position open to candidates in the USA.

NVIDIA is a leading technology company known for its innovations in GPU technology and AI computing. They are seeking a Senior Platform Telemetry Engineer to design and implement fleet management solutions for scaling AI infrastructure, while collaborating with customers and teams to ensure effective product development and delivery.

Responsibilities
- Drive next-generation fleet management solutions for scaling AI infrastructure using NVIDIA GPUs and the Grace solution. Work with customers, product management, and other architects to narrow down implementation requirements and ensure speed-of-light product development.
- Bring clarity to the architecture for a fleet health monitoring and fault-remediation solution at scale. Work with customers and other architects to understand their health monitoring requirements, making the best use of available in-band and out-of-band capabilities. Detail the architecture and run POCs to validate it.
- Educate customers about the product architecture and take feedback to make necessary changes. Write architecture specs and design documents, and own end-to-end delivery of the product by working across teams.
- Review the code produced from the architecture specs.
- Ensure the product is properly tested by working with the development team to enhance unit testing and put a proper test plan in place.
- Drive product life cycles with QA teams to productize the code, and be responsible as the product owner.
- Articulate requirements in Jira and bug management tools, and work out an end-to-end execution plan in collaboration with other managers.
- Contribute to all phases of product development, from product definition, architecture, and design through implementation, debugging, testing, and early customer support.

Skills
- BS, MS, or PhD in EE/CS or a related field (or equivalent experience)
- 5+ years of hands-on coding experience
- Strong knowledge of time series databases such as InfluxDB and Prometheus
- Strong knowledge of building and consuming REST APIs (Redfish is a big plus)
- Strong knowledge of telemetry visualization solutions such as Grafana and Influx
- Strong knowledge of firmware architecture, including optimizing firmware for low-latency APIs
- Strong knowledge of analyzing algorithms for time and space complexity and projecting system resource requirements
- Proven record of delivering scalable solutions
- Strong and demonstrable skill in C/C++ and Python
- Programming and debugging experience on server platforms
- Experience with SCM (e.g., Git, Perforce) and project management tools such as Jira
- Excellent written and oral communication skills
- Excellent work ethic
- Great sense of teamwork
- Love of producing quality work and commitment to finishing your tasks every single day
- Self-starter who loves to find creative solutions to complicated problems and is hands-on with coding
- Experience building telemetry collection and analysis engines
- Experience with Redfish
- Experience with notification systems such as PagerDuty
- Active Open Compute (OCP) and DMTF contributor in relevant areas
- Hands-on experience with x86 or ARM system architecture
- Familiarity with Confidential Compute
- Experience with ML and multi-variable optimization techniques

Education Requirements
- BS, MS, or PhD in EE/CS or a related field (or equivalent experience)

Benefits
- Equity
- Benefits

Company Overview
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. It was founded in 1993 and is headquartered in Santa Clara, California, USA, with a workforce of 10,001+ employees. Its website is https://www.nvidia.com.

Company H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships, with 448 in 2026, 1872 in 2025, 1354 in 2024, 976 in 2023, 835 in 2022, 601 in 2021, and 529 in 2020. Please note that this does not guarantee sponsorship for this specific role.

