[Remote] Senior Site Reliability Engineer

Remote Full-time
Note: This is a remote job open to candidates in the USA. Docusign is a company that brings agreements to life and serves over 1.5 million customers worldwide. The company is seeking a Senior Site Reliability Engineer to lead reliability initiatives for high-impact services, ensuring the reliability, scalability, and performance of critical systems while driving improvements in observability and incident response.

Responsibilities

- Design, implement, and operate highly available, scalable services in cloud environments (primarily Azure, with some multi-cloud scenarios)
- Define and evolve SLOs/SLIs, error budgets, and capacity strategies for owned services; use them to guide engineering trade-offs and release decisions
- Analyze patterns in incidents and outages; own long-term reliability improvements for your domain and contribute to reliability strategy across services
- Write high-quality code that is easy to maintain and test
- Ensure design and architecture are extensible across projects, and participate in technical design and code reviews
- Identify operational toil and lead automation efforts to eliminate it: deployment, runbook, and remediation workflows that make incidents rarer and faster to resolve
- Develop robust, well-tested tooling and shared libraries that are adopted across multiple teams
- Improve CI/CD pipelines and guardrails to reduce change failure rate while increasing deployment velocity
- Design and implement logging, metrics, tracing, and alerting for complex distributed systems; ensure signals are actionable and aligned to business impact
- Build and automate tools and solutions for incident impact analysis and effective mitigation
- Participate in and often lead incident response for Sev0–Sev2 events: triage, mitigation, coordination, and clear communication
- Perform and contribute to blameless post-incident reviews, root-cause analysis, and follow-through on corrective actions
- Work with Operations and Incident Command teams during and after incidents to drive excellence in the incident management process
- Compose and analyze dashboards to highlight areas of the business that need attention and help drive organizational KPIs
- Create and respond to system-generated alerts to maintain system health
- Work with Operations and Engineers to fill any gaps in alerting and telemetry
- Act as the primary SRE partner for one or more engineering teams, shaping architecture, reviewing designs, and embedding reliability best practices
- Mentor and coach other SREs and software engineers on topics such as debugging, observability, incident management, and performance optimization
- Contribute to and help standardize SRE practices, runbooks, and production readiness criteria across CPE and product teams
- Work with Product Management, collaborators, and other developers to understand design requirements and provide estimates for development
- Learn and grow in all key technologies at Docusign and be a partner to Engineering and Operations teams

Skills

- 8+ years of experience in Site Reliability Engineering, DevOps, or Software Engineering roles with ownership of production systems at scale (or equivalent experience)
- Experience coding in at least one modern language (e.g., Go, Python, C#, Java), with the ability to design, implement, test, and debug production-grade automation and services
- Practical experience operating large-scale services in public cloud (Azure preferred; AWS/GCP acceptable with willingness to learn Azure)
- Experience with Linux, networking fundamentals, and common infrastructure components (load balancers, DNS, certificates, queues, caches, databases)
- Experience with observability stacks (e.g., Prometheus/Grafana, OpenTelemetry/Chronicle, centralized logging)
- Experience with CI/CD systems and deployment strategies (blue/green, canary, rolling updates)
- Experience with incident management and on-call operations for 24x7 services
- Experience in building dashboards and metrics analysis
- Strong analytical and problem-solving skills
- Experience in high-availability, regulated, or customer-facing SaaS environments
- Background in reliability practices such as chaos testing, capacity modeling, and performance tuning
- Exposure to release management/unified release practices and safe rollout strategies (feature flags, staged rollouts, configuration-driven changes)
- Demonstrated leadership driving cross-team initiatives: reliability programs, migrations, or major refactors
- Strong written and verbal communication skills; ability to explain complex technical topics to both engineers and non-technical stakeholders

Benefits

- Bonus: Sales personnel are eligible for variable incentive pay dependent on their achievement of pre-established sales goals. Non-Sales roles are eligible for a company bonus plan, which is calculated as a percentage of eligible wages and dependent on company performance.
- Stock: This role is eligible to receive Restricted Stock Units (RSUs).
- Paid Time Off: earned time off, as well as paid company holidays based on region
- Paid Parental Leave: take up to six months off with your child after birth, adoption, or foster care placement
- Full Health Benefits Plans: options for 100% employer-paid and minimum employee contribution health plans from day one of employment
- Retirement Plans: select retirement and pension programs with potential for employer contributions
- Learning and Development: options for coaching, online courses, and education reimbursements
- Compassionate Care Leave: paid time off following the loss of a loved one and other life-changing events

Company Overview

Docusign provides software to prepare, send, sign, store, and manage documents through electronic signature and workflow tools. It was founded in 2003 and is headquartered in San Francisco, California, USA, with a workforce of 5001-10000 employees. Its website is http://www.docusign.com.

Company H1B Sponsorship

Docusign has a track record of offering H1B sponsorships, with 73 in 2026, 361 in 2025, 337 in 2024, 236 in 2023, 249 in 2022, 236 in 2021, and 115 in 2020. Please note that this does not guarantee sponsorship for this specific role.

