[Remote] Senior Site Reliability Engineer - Remote
Note: This is a remote position open to candidates in the USA.

Optum is a company that develops cutting-edge solutions to help people live healthier lives and improve the health system. The Site Reliability Engineer will architect, develop, and maintain Optum Serve's cloud environment, collaborating with various teams to ensure a secure and high-performance cloud infrastructure.

Responsibilities
- Build, maintain, and operate IaaS and PaaS infrastructure in Azure commercial and government clouds
- Work closely with dev teams to identify and measure SLOs, SLAs, and SLIs
- Act as a strong contributor to the development of platform services, including architecture, provisioning, configuration, deployment, and support
- Perform integrations with central logging, metrics dashboards, instrumentation, and incident monitoring and management
- Build, integrate, and administer systems and tools that enable engineering teams to observe their applications in production with autonomy (dashboards, APMs)
- Support software and/or cloud infrastructure on an on-call rotation basis
- Assist with identifying and remediating technical problems at the root cause by continuously implementing automation, self-healing, and real-time monitoring in production systems
- Maintain and improve operational tooling and frameworks
- Build frameworks that test the performance and resiliency of our platform services and tools
- Automate alerts for metrics on performance, cost, vulnerabilities, risk, and compliance violations
- Improve processes and champion automation of any manual support tasks

Skills
- 6+ years of experience working in a cloud engineer/SRE role
- Experience with infrastructure as code (IaC) tools like Terraform and Pulumi
- Experience with Kubernetes deployment tools like Helm, ArgoCD, and Flux
- Experience supporting infrastructure in production cloud environments
- Some experience with monitoring tools (Azure Monitor, Splunk, Dynatrace, Grafana, Prometheus)
- Experience working with RESTful services
- Expert knowledge of a cloud service provider
- Expert knowledge and hands-on production experience in Kubernetes (bare metal or managed) cluster setup and management
- Knowledge of encryption and Public Key Infrastructure (PKI), and an understanding of OWASP
- Understanding of identity and access management (IAM)
- Familiarity with IDEs and source control tools like Visual Studio Code and Git
- Proven solid awareness of networking and internet protocols
- Available and willing to join a 24/7 on-call rotation
- United States citizenship is required for this position
- Bachelor's Degree in Computer Science, Information Technology, Software Engineering, Math, or Physics
- Master's Degree with coursework focused on advanced algorithms, mathematics in computing, data structures, or a related field
- Expert knowledge of Azure
- Demonstrated passion for infrastructure automation
- Proven ability to prioritize work in a fast-paced environment
- All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy

Benefits
- A comprehensive benefits package
- Incentive and recognition programs
- Equity stock purchase
- 401k contribution
(All benefits are subject to eligibility requirements.)

Company Overview
Optum provides primary care and optical care services. It was founded in 1985 and is headquartered in Albuquerque, New Mexico, USA, with a workforce of 10,001+ employees. Its website is https://nm.optum.com.