Site Reliability Engineer

Remote Full-time
This role is for one of Weekday's clients.

Min Experience: 5 years
Job Type: full-time

We are looking for a skilled and proactive Site Reliability Engineer to help build and maintain highly reliable, scalable, and secure infrastructure and applications. This role will focus on automating operations, improving system performance, and ensuring overall service health by applying modern SRE practices.

Requirements

Key Responsibilities:
- Design, implement, and manage Kubernetes-based infrastructure.
- Utilize AWS services such as IAM, EC2, EKS, S3, and CloudWatch to build and support scalable cloud environments.
- Develop and maintain automation scripts and tools using Shell scripting or Python.
- Proactively identify, analyze, and troubleshoot complex application, network, and system-level issues.
- Optimize system performance and reliability, with deep expertise in Linux debugging and performance tuning.
- Build automation for system self-healing and recovery mechanisms.
- Develop monitoring and alerting solutions for high-performance and low-latency applications.
- Collaborate with development and operations teams to implement effective CI/CD pipelines.
- Apply SRE principles including service monitoring, alerting, error budget tracking, capacity planning, fault tolerance, automation, and toil reduction.
- Continuously seek opportunities to improve system reliability and engineering processes.

Qualifications:
- Proven experience working with Kubernetes in production environments.
- Strong command of AWS cloud services with hands-on experience in infrastructure provisioning and management.
- Proficiency in scripting or programming (Shell or Python preferred).
- In-depth Linux knowledge including tools for diagnostics and performance optimization.
- Familiarity with modern observability tools for monitoring, logging, and alerting.
- Strong troubleshooting and problem-solving skills.
- Understanding and application of SRE concepts and best practices.

Key Skills: Kubernetes · AWS (IAM, EC2, EKS, S3, CloudWatch) · Linux Debugging · Shell/Python Scripting · Monitoring & Alerting · Automation · CI/CD · Docker · Site Reliability Engineering (SRE) · Performance Tuning

Originally posted on Himalayas

