[Remote] Manager Site Reliability Operations

Remote Full-time

Note: This is a remote position open to candidates in the USA.

Mercury Insurance is a well-recognized company known for its achievements and culture, recently named one of America's Best Midsize Employers for 2026. The Site Reliability Operations Manager will lead a team responsible for observability, real-time monitoring, and incident management across production platforms, ensuring operational excellence and service reliability.

Responsibilities

- Lead the Site Reliability Operations team, including the Network Operations Center (NOC), responsible for observability, real-time monitoring, incident response, and operational excellence for key enterprise services; set direction, priorities, and success metrics for the team
- Partner with Product Management, Engineering, SRE, and the rest of the infrastructure team to embed CI/CD and release best practices into operations, including automated build/test/deploy, health checks, rollbacks, release monitoring via the NOC, and change-management guardrails
- Oversee service reliability monitoring and incident management: ensure appropriate observability (metrics, logs, traces, dashboards), well-tuned alerting thresholds, escalation paths, and effective communications to stakeholders and leadership during incidents
- Own and mature the Problem Management function for the team: drive root cause analysis (RCA) of recurring or high-severity incidents, standardize post-incident reviews, and ensure corrective actions and follow-ups are implemented and verified
- Define, track, and report operational and reliability metrics (e.g., availability, MTTR, incident volume, change failure rate, deployment frequency, problem resolution time); provide regular insights and recommendations to Technology Operations leadership
- Champion automation and "operations as code" (infrastructure as code, configuration as code, automated runbooks), working with engineering teams to reduce manual toil and improve consistency, speed, and safety of operations and releases
- Recruit, develop, coach, and evaluate team members; provide performance feedback, make salary and promotion recommendations, and foster a high-performing, collaborative culture aligned with Mercury's core values
- Provide leadership coverage for 24x7 mission-critical support through the NOC and on-call rotations; ensure sustainable on-call practices, high-quality runbooks, and continuous improvement of tooling and processes

Skills

- Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field, or an equivalent combination of education and work experience
- 7+ years of experience in IT operations, SRE, DevOps, or related roles supporting mission-critical systems
- 3+ years of experience in a lead or management role overseeing technical teams in a 24x7 environment
- Strong understanding of CI/CD pipelines (build, test, security scanning, deployment, rollback) and how they support reliable operations
- Solid knowledge of observability practices and tools (metrics, logs, traces, dashboards, alerts) and how to design actionable monitoring and alerting for production systems
- Deep familiarity with incident and problem management processes, including root cause analysis methods and post-incident review facilitation
- Working knowledge of DevOps/SRE concepts such as SLOs/SLIs, error budgets, resilience patterns, automation to reduce toil, and blameless culture
- Demonstrated ability to lead and influence cross-functional teams, build relationships, and collaborate effectively with engineering, InfoSec, infrastructure, and business stakeholders
- Excellent written and verbal communication skills; able to clearly communicate technical issues, risks, and recommendations to technical and non-technical audiences, including senior leadership
- Strong analytical and problem-solving skills; able to analyze operational data and trends to identify risks, drive decisions, and prioritize improvements
- Self-motivated, adaptable, and able to operate with minimal supervision in a fast-changing environment
- Ability to work extended hours, nights, or weekends as needed to support critical releases or resolve major incidents
- Advanced coursework, certifications, or experience in Site Reliability Engineering, DevOps, cloud platforms, or ITIL
- Experience leading teams that support services deployed via modern CI/CD pipelines and running on cloud and/or container platforms (e.g., Kubernetes/OpenShift, AWS)
- Experience integrating operations functions with DevOps/SRE teams, including shared ownership of reliability goals and metrics

Benefits

- Competitive compensation
- Flexibility to work from anywhere in the United States for most positions
- Paid time off (vacation time, sick time, 9 paid Company holidays, volunteer hours)
- Incentive bonus programs (potential for holiday bonus, referral bonus, and performance-based bonus)
- Medical, dental, vision, life, and pet insurance
- 401(k) retirement savings plan with company match
- Engaging work environment
- Promotional opportunities
- Education assistance
- Professional and personal development opportunities
- Company recognition program
- Health and wellbeing resources, including free mental wellbeing therapy/coaching sessions, child and eldercare resources, and more

Company Overview

Mercury Insurance offers insurance products ranging from personal auto and homeowners insurance to mechanical breakdown protection. It was founded in 1962 and is headquartered in Los Angeles, California, USA, with a workforce of 5,001 to 10,000 employees. Its website is http://www.mercuryinsurance.com.

Company H1B Sponsorship

Mercury Insurance has a track record of offering H1B sponsorships, with 7 in 2026, 22 in 2025, 23 in 2024, 14 in 2023, 15 in 2022, 8 in 2021, and 13 in 2020. Please note that this does not guarantee sponsorship for this specific role.

