[Remote] Lead Site Reliability Engineer
Note: This is a remote position open to candidates in the USA.

Gifthealth is revolutionizing healthcare by simplifying prescription and health service management. The Lead Site Reliability Engineer is responsible for building reliable, scalable software systems and implementing DevOps practices to enhance application performance and resilience.

Responsibilities
- Designs, builds, and maintains reliable, scalable software systems supporting Ruby on Rails applications
- Embeds reliability, performance, and operational best practices into application code and development workflows
- Owns DevOps practices including CI/CD reliability, deployment strategies, and release safety
- Leads incident response, debugging, and root cause analysis across application and platform layers
- Implements and evolves observability (logging, metrics, tracing) within application and service code
- Partners with engineering teams on architecture, capacity planning, and technical standards

Skills
- Bachelor's degree in computer science, engineering, or a related field OR equivalent professional experience in software engineering, SRE, or DevOps roles
- 5+ years of experience in software engineering, SRE, or DevOps roles
- Hands-on experience building and operating Ruby on Rails applications in production
- Experience owning production incidents and application-level reliability
- Knowledge of Ruby on Rails application architecture and production operations; software reliability engineering principles (SLOs, SLIs, error budgets); and modern DevOps and CI/CD practices
- Strong software engineering skills (Ruby and/or comparable backend languages)
- Skills in debugging and performance optimization of production applications
- Skills in CI/CD pipelines, deployment automation, and release tooling
- Skills in monitoring and observability tooling (Datadog, New Relic, Prometheus, etc.)
- Ability to write production-quality code that improves system reliability
- Ability to collaborate with product and engineering teams to influence design decisions
- Ability to troubleshoot complex, cross-system failures
- Cloud platform certifications (AWS, GCP, Azure)
- SRE or DevOps-focused certifications
- Experience in high-growth or scaling engineering organizations
- Experience working in regulated or customer-impact–sensitive environments
- Knowledge of security and compliance considerations in production systems
- Skills in Infrastructure as Code (Terraform or similar)
- Skills in containerization and orchestration (Docker)
- Ability to mentor engineers on operational ownership and reliability practices
- Ability to balance speed of delivery with long-term system health

Company Overview
GiftHealth is a healthcare tech startup that streamlines the pharmacy experience with free delivery and competitive medication pricing. It was founded in 2020 and is headquartered in Columbus, Ohio, USA, with a workforce of 501-1000 employees. Its website is https://www.gifthealth.com.