[Remote] Senior Site Reliability Engineer, Infrastructure

Remote Full-time
Note: This is a remote position open to candidates in the USA.

Vultr is on a mission to make high-performance cloud infrastructure easy to use, affordable, and locally accessible for enterprises and AI innovators around the world. They are seeking a highly skilled and experienced Senior Site Reliability Engineer to build and own the observability pipeline for their global datacenter infrastructure.

Responsibilities
Design and build the observability pipeline for datacenter infrastructure including CDUs, PDUs, bare metal servers, and provisioning workflows, collecting telemetry via Redfish, IPMI, SNMP, and OpenTelemetry
Own the full stack from data collection through to visualization and alerting in Grafana, Loki, and Mimir
Build dashboards and alerting that are actionable and meaningful for stakeholder teams including Datacenter Ops, SysAdmin, Network, and Provisioning
Establish standards and patterns for how datacenter infrastructure telemetry is collected, stored, and visualized across Vultr's global footprint
Partner closely with stakeholder teams to understand their operational needs and translate them into observable, measurable signals
Drive infrastructure-as-code practices across the observability pipeline to ensure consistency, repeatability, and maintainability

Skills
5+ years of experience in site reliability, platform, or infrastructure engineering in a production environment
Hands-on experience building and operating observability pipelines including metrics, logs, and alerting using Grafana, Loki, Mimir, or equivalent tooling
Working knowledge of datacenter hardware telemetry protocols including Redfish, IPMI, and/or SNMP
Strong Linux fundamentals and operational experience in production infrastructure environments
Demonstrated experience with infrastructure-as-code and configuration management tooling (Terraform, Ansible, Chef, or similar)
Strong cross-functional communication skills and experience delivering tooling for operational stakeholder teams

Benefits
100% company-paid insurance premiums for employee medical, dental, and vision plans
401(k) plan that matches 100% up to 4%, with immediate vesting
Professional Development Reimbursement of $2,500 each year
11 Holidays + Paid Time Off Accrual + Rollover Plan
Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year
$500 stipend for remote office setup in first year + $400 each following year
Internet reimbursement up to $75 per month
Gym membership reimbursement up to $50 per month
Company-paid Wellable subscription

Company Overview
Vultr is an AI cloud infrastructure platform offering the latest-generation NVIDIA GPUs and AMD CPUs and GPUs across 32 worldwide regions. It was founded in 2014 and is headquartered in West Palm Beach, Florida, USA, with a workforce of 201-500 employees. Its website is https://www.vultr.com.

Company H1B Sponsorship
Vultr has a track record of offering H1B sponsorships, with 1 in 2024. Please note that this does not guarantee sponsorship for this specific role.
