[Remote] Senior Site Reliability Engineer (Remote Build)
Note: This is a remote position open to candidates in the USA.

Remote is solving modern organizations' biggest challenge: navigating global employment compliantly with ease. As a Senior Site Reliability Engineer for Remote Build, you'll own the operational excellence and infrastructure strategy that makes Build's platform reliable, performant, and safe for customers.

Responsibilities
- Design, implement, and maintain infrastructure-as-code patterns using Terraform and Kubernetes that support both standard connectors and custom builds
- Build and maintain comprehensive monitoring, logging, and alerting systems
- Lead incident response efforts, conduct post-mortems, and drive continuous improvement in system reliability
- Work with our Security team to embed security into every layer of Build infrastructure
- Ensure we meet compliance requirements across 100+ jurisdictions without creating friction for developers or customers
- Continuously optimize system performance, resource utilization, and cloud costs
- Make recommendations that improve both reliability and unit economics
- Identify manual operational toil and systematically eliminate it
- Build tools and processes that let teams operate efficiently without scaling headcount
- Partner with platform teams to ensure APIs, MCP, and CLI are resilient and observable
- Give infrastructure feedback that shapes how the platform evolves

Skills
- Senior-level SRE experience: demonstrated experience in a Site Reliability Engineering, DevOps Engineering, or SysOps role. You have stood up and operated production systems at scale.
- Kubernetes and AWS: deep, hands-on experience running Kubernetes in production. Solid AWS fundamentals across compute, networking, storage, and managed services.
- Infrastructure-as-code: proficiency with Terraform or similar IaC tools. You write code to define infrastructure; you don't click buttons in the console.
- CI/CD and deployment automation: real experience setting up and operating GitLab, GitHub Actions, Jenkins, or similar. You understand deployment strategies, rollback mechanisms, and safety nets.
- Scripting and systems knowledge: strong bash scripting. Comfortable debugging system-level issues, reading logs, and understanding Linux kernel basics.
- Great communication: you explain complex infrastructure decisions clearly to both engineers and non-technical stakeholders. You write clear runbooks and documentation.
- Experience with one or more backend programming languages (Elixir, Python, Go, Java, Node.js, etc.)
- Experience in consultancy settings
- Container registry and artifact management (ECR, Docker Hub, etc.)
- Observability stack depth (Datadog, Prometheus, ELK, Grafana, or similar)
- Experience working with or scaling multi-tenant platforms

Benefits
- Work from anywhere
- Flexible paid time off
- Flexible working hours (we are async)
- 16 weeks paid parental leave
- Mental health support services
- Stock options
- Learning budget
- Home office budget & IT equipment
- Budget for local in-person social events or co-working spaces

Company Overview
Remote is an employment, payroll, and HR platform for modern, distributed businesses. It was founded in 2019 and is headquartered in San Francisco, California, USA, with a workforce of 1001-5000 employees. Its website is https://remote.com.

Company H1B Sponsorship
Remote has a track record of offering H1B sponsorships, with 1 in 2026, 2 in 2024, and 2 in 2023. Please note that this does not guarantee sponsorship for this specific role.