[Remote] Manager, Site Reliability Engineering

Remote Full-time

Note: This is a remote position open to candidates in the USA. Paradigm is a software company transforming the residential, construction & building product industries. They are seeking a Manager of Site Reliability Engineering to lead a high-performing team, promote modern SRE practices, and enhance reliability across their Azure-based platform.

Responsibilities

Lead and grow a team of site reliability engineers. Provide guidance, mentorship, and career development.
Contribute to and mature SRE practices across production services: SLOs, SLIs, error budgets, toil reduction, and blameless post-mortems that turn incidents into lasting improvements.
Oversee the incident management lifecycle end-to-end, including detection, response, resolution, post-incident review, and systemic improvement.
Design on-call rotations, runbooks, and escalation procedures that balance service reliability with engineer well-being and sustainable work practices.
Drive measurable reductions in MTTR and MTTD through improved observability, intelligent automation, and predictive monitoring.
Build automation to eliminate manual operational work, including provisioning, deployment, scaling, self-healing, and reporting.
Implement chaos engineering practices to validate system resilience and surface weaknesses before they cause outages.
Partner with engineering and product teams to embed reliability requirements into the development lifecycle, from design through deployment.
Collaborate with the observability team to ensure comprehensive instrumentation, smart alerting, and actionable dashboards across all critical services.
Measure, report, and advocate for reliability improvements with both technical and executive stakeholders, using data to drive investment decisions.

Skills

Bachelor's degree in Engineering or a related field, or equivalent experience.
7+ years in site reliability engineering, DevOps, or infrastructure engineering, with at least 1 year in people management (or demonstrated tech lead experience with direct influence over team processes and career growth).
Hands-on experience running production systems on Azure (including proficiency with key services such as AKS, App Services, Service Bus, Event Grid, and Azure Monitor) or comparable cloud platforms.
Proven track record implementing SRE practices with measurable reliability improvements, and familiarity with modern observability platforms (Datadog, Prometheus/Grafana, or equivalent).
Experience leading incident response for high-severity production issues and running effective post-mortems.
Strong background in automation, infrastructure as code (Terraform, Bicep, or similar), and systematically eliminating manual operational work.
Production-grade operational experience with Kubernetes container orchestration.
Ability to automate workflows and build scripts using Python, Bash, PowerShell, or Go.
Strong communication, with the ability to make complex technical issues clear for both engineers and executives.
Data-driven approach. You use metrics and telemetry to guide decisions, not gut feel.
You are collaborative cross-functionally and build trust and alignment naturally.
AI-enhanced observability experience is preferred.
Experience with AI coding assistants and CI/CD systems (GitHub Actions, Azure DevOps, ArgoCD) with automation capabilities is preferred.
Knowledge of distributed systems patterns is preferred.
Exposure to AIOps platforms or using LLMs for operational automation is preferred.

Company Overview

Paradigm provides a software platform that focuses on the building products industry. It was founded in 1999 and is headquartered in Middleton, Wisconsin, USA, with a workforce of 501-1000 employees. Its website is http://myparadigm.com/.

Company H1B Sponsorship

Paradigm has a track record of offering H1B sponsorships, with 1 in 2026, 1 in 2025, 4 in 2024, 1 in 2023, 1 in 2022, 4 in 2021, and 1 in 2020. Please note that this does not guarantee sponsorship for this specific role.

