[Remote] Expert Site Reliability Engineer
Note: This is a remote position open to candidates in the USA.

Altera Digital Health is a company focused on enhancing healthcare platforms. It is seeking an Expert Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of its services. The role involves leading incident management, automating operations, and collaborating with engineering teams to improve overall service availability and customer experience.

Responsibilities
- Maintain and improve the reliability, availability, and performance of our production environments
- Lead the investigation and resolution of complex application, database, and infrastructure issues
- Participate in incident management, conduct root cause analysis (RCA), and contribute to post-incident reviews to prevent future occurrences
- Define and measure Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to meet our service commitments
- Develop proactive monitoring and alerting strategies to identify and resolve issues before they impact customers
- Automate operational tasks using scripting and Infrastructure-as-Code (IaC) to improve efficiency
- Partner with engineering and cloud teams to refine deployment, monitoring, and support processes
- Provide technical leadership during major incidents and act as a key escalation point for critical issues

Skills
- 7+ years of experience supporting enterprise applications, infrastructure, or cloud environments
- Strong experience with APM tools such as LogicMonitor, AppDynamics, Azure Monitor, SentryOne, Dynatrace, Datadog, or New Relic
- Deep knowledge of Windows Server administration, IIS, .NET applications, Windows Clustering, MSMQ, Event Logs, and PerfMon
- Strong SQL Server experience, including performance tuning, query optimization, blocking analysis, and Always On Availability Groups
- Experience with Azure cloud environments and a solid understanding of networking fundamentals (DNS, TCP/IP, load balancing, firewalls)
- Familiarity with ServiceNow (or other ITSM platforms) and ITIL principles
- Scripting with PowerShell, Python, or similar languages
- Infrastructure as Code (Terraform, ARM Templates, Bicep)
- CI/CD pipelines and deployment automation (Azure DevOps, GitHub Actions)
- Experience with Kubernetes and containerized workloads
- Experience implementing SLOs, SLIs, and Error Budgets
- Experience in a healthcare technology or patient care environment
- Bachelor's Degree in Computer Science, Information Technology, or Engineering is preferred; equivalent professional experience will be considered

Benefits
- Competitive compensation and benefits package
- Remote position open to candidates within the United States
- Participate in an on-call rotation to support our 24x7 healthcare environment
- Occasional after-hours work is required for activations, upgrades, and major incidents

Company Overview
Healthcare IT should work for clinicians, not against them. Altera Digital Health has a workforce of 5001-10000 employees. Its website is http://www.alterahealth.com.