[Remote] Senior Manager – Site Reliability Engineering (SRE)
Note: This is a remote position open to candidates in the USA.

Dice is seeking a Senior Manager of Site Reliability Engineering (SRE) to lead the activation and scaling of SRE practices within the Financial Services & Innovation (FS&I) organization. This role involves establishing operational discipline, driving the adoption of SRE standards, and aligning various teams to a consistent reliability model.

Responsibilities
- Drive adoption of the SRE operating model across application teams
- Establish clarity in roles between:
  - SRE
  - Production Support Engineering (PSE)
  - Application teams
- Ensure SRE practices are embedded into the development lifecycle, not treated as post-production activities
- Define and enforce:
  - SLIs, SLOs, and Error Budgets
  - Production readiness criteria
  - Reliability best practices
- Lead SLO adoption and compliance reviews across the organization
- Establish governance frameworks to ensure consistent application of standards
- Partner with:
  - Application product teams
  - Production Support Engineering (MG team)
  - Platform / Infrastructure / Observability teams
- Drive alignment and reduce friction between engineering and operations
- Ensure clear handoffs, escalation models, and operational ownership
- Lead adoption of centralized observability standards across:
  - Metrics
  - Logging
  - Tracing
- Align tooling (AppDynamics, Splunk, Prometheus, etc.)
- Ensure monitoring and alerting are SLO-driven and actionable, not noise-based
- Partner with PSE to strengthen:
  - Incident management processes
  - RCA (Root Cause Analysis) standards
- Drive identification of patterns and systemic issues
- Ensure learnings translate into engineering improvements and automation
- Identify opportunities to:
  - Reduce manual operational work
  - Improve system resilience
  - Enable self-healing capabilities
- Promote a culture of engineering over reaction
- Define and track reliability metrics across FS&I
- Build reporting that provides visibility into:
  - System health
  - Incident trends
  - SLO performance
- Translate technical data into actionable business insights

Skills
- 10+ years in engineering, operations, or SRE roles
- 5+ years leading SRE, platform, or reliability-focused teams
- Proven experience implementing SRE practices at scale (SLIs, SLOs, error budgets)
- Strong background in cloud environments (AWS, Azure, Google Cloud Platform)
- Hands-on experience with observability tools (Splunk, AppDynamics, Prometheus, etc.)
- Experience in incident management and production operations at scale
- Ability to operate effectively in high-pressure and complex enterprise environments
- Experience driving organizational transformation (not just technical implementation)
- Strong understanding of CI/CD, DevOps, and automation practices
- Experience working in regulated or large enterprise environments
- Familiarity with AIOps or advanced automation strategies

Company Overview
Dice is the go-to career marketplace for tech professionals. It was founded in 2010, and is headquartered in Drachten, Friesland, NLD, with a workforce of 201-500 employees. Its website is https://www.or-quest.nl/.

Company H1B Sponsorship
Dice has a track record of offering H1B sponsorships, with 2 in 2022, 4 in 2021, and 5 in 2020. Please note that this does not guarantee sponsorship for this specific role.