[Remote] Site Reliability Engineer
Note: This is a remote position open to candidates in the USA.

OneStream Software is a company that empowers finance teams with a unified enterprise finance platform. The Site Reliability Engineer will focus on ensuring the reliability, performance, and availability of the platform and services while collaborating with internal teams and customers to implement and maintain operations.

Responsibilities
- Implement application and infrastructure observability solutions to ensure the desired application availability, reliability, and performance
- Participate in regular on-call rotations and share details of incidents and their resolution through post-mortem reports and regular review meetings
- Proactively partner with Product and Engineering teams to identify, develop, deploy, and maintain reliable systems and services
- Influence and create new designs, architectures, standards, and methods for large-scale systems
- Sustain a high level of reliability for key services and automated systems
- Automate processes to improve reliability, performance, and availability
- Update technical documentation, workflows, and knowledge base articles
- Provide feedback in pull requests and peer code reviews
- Implement codified, automated solutions that build integrations between Dynatrace, Azure DevOps, and Jira
- Solid knowledge in focused areas of OneStream Software
- Ability to mentor others in several technical areas
- Understanding of the practical use of SOC/FedRAMP controls to assist Compliance and Security teams

Skills
- BS/BA in computer science, engineering, or a technology-related field (or equivalent work experience)
- Proven work experience as a Site Reliability Engineer or in a similar role
- 6+ years of cloud infrastructure and software development experience
- 2+ years of hands-on experience with Azure Kubernetes Service (AKS) and container-based deployments, or with other platforms such as OpenShift, GKE, or EKS
- Advanced understanding of APM and observability tools such as Dynatrace, AppInsights, DataDog, Log Analytics, New Relic, Prometheus, and Grafana
- Advanced understanding of Infrastructure-as-Code (IaC) concepts and tooling (Terraform, CloudFormation templates, Bicep, or ARM templates) on Microsoft Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP)
- Deep knowledge of configuration management and orchestration utilities such as Ansible, PowerShell DSC, Chef, and Puppet
- Advanced understanding of cloud concepts including elasticity, security, and identity management
- Familiarity with Agile development methodologies using Jira or Azure DevOps Boards
- 6+ years of hands-on experience with the following technologies, tools, and concepts:
  - Automating processes using PowerShell, Bash, CLI, REST APIs, Python, ARM templates, or other scripting languages
  - Comfortable leveraging source control tools such as Git, Azure DevOps, or GitHub
  - Knowledge of container orchestration platforms and tooling such as Kubernetes, OpenShift, AKS, GKE, or Helm
  - Microsoft Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP)
  - Experience working for a cloud service provider (CSP), managed service provider (MSP), or SaaS provider
  - 6+ years of relevant Azure experience with deployment and management using Infrastructure-as-Code (IaC) concepts
  - Experience with Microsoft and .NET technologies (.NET, C#, SQL)
  - Experience writing efficient and reliable code in a development environment
  - Debian, Ubuntu, Alpine, or other distributions of the Linux operating system
  - Deep knowledge and understanding of containerized applications, with special attention to their reliability and monitoring

Benefits
- Excellent Medical Plan
- Dental & Vision Insurance
- Life Insurance
- Short & Long Term Disability
- Vacation Time
- Paid Holidays
- Professional Development
- Retirement Plan
- 401K

Company Overview
OneStream Software is an independent software company that develops financial planning and analysis software. It was founded in 2012 and is headquartered in Rochester, Michigan, USA, with a workforce of 1001-5000 employees. Its website is https://www.onestream.com.