[Remote] Principal Site Reliability Engineer (SRE)
Note: This is a remote position open to candidates in the USA.

Symmetrio is a rapidly growing healthcare technology organization focused on advanced healthcare technology solutions. They are seeking a Principal Site Reliability Engineer (SRE) to ensure the reliability, scalability, security, and performance of a mission-critical SaaS platform supporting healthcare providers across the United States.

Responsibilities
- Serve as the primary technical owner for production reliability across U.S. customer environments
- Investigate and resolve complex issues spanning web applications, APIs, backend services, data pipelines, cloud infrastructure, and customer integrations
- Lead production incident response efforts, coordinating cross-functional teams to restore service and minimize customer impact
- Perform root cause analysis and drive corrective actions that improve long-term system stability and resilience
- Partner with software engineering and platform teams to identify recurring reliability risks and implement sustainable solutions
- Design, configure, and validate secure customer connectivity solutions including Site-to-Site VPNs, Transit Gateway integrations, routing configurations, and secure network paths
- Support customer onboarding initiatives by troubleshooting connectivity challenges and ensuring consistent implementation processes
- Enhance platform observability through improvements in monitoring, logging, alerting, tracing, and operational dashboards
- Contribute to CI/CD, infrastructure automation, and deployment processes that improve release safety and operational consistency
- Develop operational tooling that supports incident response, troubleshooting, onboarding, and system monitoring activities
- Collaborate with engineering leadership to improve cloud architecture, scalability, security, and operational readiness
- Partner with customer-facing teams to communicate technical issues, remediation plans, and reliability improvements clearly and effectively
- Support compliance, security, and risk management initiatives within highly regulated healthcare environments

Skills
- 6+ years of hands-on experience supporting and managing AWS-based production environments
- 4+ years of experience supporting web applications and backend services (Python/Django experience strongly preferred)
- Experience with AWS networking technologies including VPCs, Site-to-Site VPNs, Transit Gateways, routing, NAT gateways, and security groups
- Strong experience with Terraform and infrastructure-as-code deployment practices
- Experience with containerized environments including ECS, Fargate, Kubernetes, or similar technologies
- Experience building and supporting CI/CD pipelines and release automation processes
- Familiarity with monitoring and observability platforms such as Datadog, CloudWatch, Sentry, Grafana, or similar tools
- Experience leading production incidents, outage management, and root cause analysis initiatives
- Exposure to Windows Server environments, Active Directory, Kerberos, and enterprise infrastructure concepts
- Healthcare technology, healthcare SaaS, clinical software, or other regulated industry experience
- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related technical field

Benefits
- Health Care Plan (Medical, Dental & Vision)
- Retirement Plan (401k, IRA)
- Paid Time Off (Vacation, Sick & Public Holidays)

Company Overview
Symmetrio prides itself on its unwavering dedication to understanding clients' unique needs, organizational design, culture, and operational challenges. It was founded in 2012 and is headquartered in Philadelphia, Pennsylvania, USA, with a workforce of 51-200 employees. Its website is https://www.solustaff.com.

Company H1B Sponsorship
Symmetrio has a track record of offering H1B sponsorships, with 1 in 2024. Please note that this does not guarantee sponsorship for this specific role.