Senior Site Reliability Engineer

Remote, Full-time

This description summarizes our understanding of the job posting. See the original listing for full details.




Role Description


This role involves joining an Identity Security Cloud software development team as a Senior Site Reliability Engineer (SRE). You will work closely with software engineers, infrastructure platform services, engineering managers, and other stakeholders to ensure the reliability, scalability, and performance of the team's services.



Work with development teams and service owners to solve performance issues and ensure system scalability.


Design, develop, and implement solutions to improve reliability, availability, performance, and scalability of systems.


Develop alerts and dashboards in collaboration with technical leaders and infrastructure platform services.


Own and improve key operational metrics (SLIs, SLOs, Error Budgets, monitoring and alerting).


Drive continuous improvement through post-incident reviews and blameless postmortems of non-functional issues.


Develop and maintain comprehensive monitoring and alerting to proactively identify and resolve issues.


Create and maintain dashboards, reviewing them regularly to identify and close gaps.


Collaborate with technical leads, DevOps/SRE, and infra teams for capacity planning.


Identify and address production performance bottlenecks through profiling, tuning, and optimization.


Automate repetitive tasks and processes to improve efficiency.


Work closely with Software, Performance, and Test Engineers to influence system design and architecture.


Review and contribute to documentation for systems, processes, runbooks, and procedures.


Participate in a 24/7 on-call rotation to gain subject matter expertise.


Lead incident postmortem efforts, ensuring timely compilation of reports.


Utilize excellent diagnostic and problem-solving skills to analyze complex systems and data.



Qualifications



Bachelor’s degree in computer science, a related field, or equivalent practical experience.


5+ years of proven SRE experience.


Strong understanding of SRE principles and practices.


Experience with cloud platforms (AWS, GCP, or Azure).


Proficiency in at least one scripting language (e.g., Python, Bash, Go).


Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Honeycomb, OpenSearch).


Coding experience beyond simple scripts in a programming language such as Go, Java, or Python.


Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).


Understanding of network protocols and security best practices.


Familiarity with DevOps culture and practices, and experience with CI/CD toolchains (Jenkins, ArgoCD, Spacelift).


Experience with Incident Response tools and processes (PagerDuty).


Experience with Infrastructure as Code (Terraform, Helm).


Strong problem-solving and troubleshooting skills.


Excellent communication and collaboration skills.


Ability to work independently and as part of a team.



Preferred Qualifications



Technology experience: Kafka, relational databases, performance tuning (JVM, Go).


Experience with Grafana k6 for continuous performance testing.



Onboarding Timeline



In the first 30 days you will:



Meet the team and understand its mission and vision.


Gain clarity on various roles and expectations.


Complete development environment setup.


Read guides and documentation, and complete mandatory training.


Learn company processes and benefits.



By 6 months you should:



Understand team goals and OKRs for the quarter and beyond.


Complete initial analysis and implementation of SRE team assignments.


Be comfortable with tools, systems, and processes used on a day-to-day basis.


Complete project work, both supervised and unsupervised.





