[Remote] Senior Site Reliability Engineer - AWS
Note: This is a remote position open to candidates in the USA.

Filevine is a Legal AI company delivering Legal Operating Intelligence for the future of legal work. As a Site Reliability Engineer, you will lead the reliability engineering efforts, designing and maintaining autonomous systems while collaborating with cross-functional teams to ensure system performance and security.

Responsibilities
- Provide strong leadership, mentoring, and sound judgment as the Reliability Engineering lead on your team
- Design and maintain autonomous systems for building, deploying, testing, and operating all Filevine products
- Act as the authoritative voice of reliability across the full software development lifecycle (SDLC)
- Monitor, aggregate, dashboard, and alert on software/infrastructure events to ensure visibility and fast response
- Continuously enhance CI/CD pipelines, automation scripts, playbooks, and tools to streamline processes and reduce resolution time
- Proactively identify and resolve gaps in system availability, performance, and security while defending overall security posture
- Document processes, architecture, procedures, and best practices; research, adopt, or build reliable tools to boost engineer productivity
- Collaborate within your team (or independently), mentor junior engineers, participate in a 24/7 on-call rotation for production support and emergency response, and communicate clearly with technical and management stakeholders

Skills
- 8+ years of hands-on technical experience in software engineering, infrastructure, or operations roles, including a minimum of 7 years dedicated to Site Reliability Engineering (SRE)
- Demonstrated curiosity, self-motivation, continuous learning mindset, passion for improvement, and proactive enthusiasm to enhance systems and processes daily without needing direction
- Strong proficiency in Python, Bash, PowerShell, and other common SRE tooling and scripting technologies
- Expert-level experience designing, building, and maintaining autonomous systems that handle software build, deployment, testing, monitoring, and operations with minimal human intervention
- Proficient hands-on experience with AWS (e.g., EC2, Kubernetes/EKS, CloudWatch, Lambda, S3, IAM)
- Proficiency in all core skills expected of an SRE II, including monitoring/alerting, incident response, capacity planning, performance optimization, CI/CD pipeline enhancement, and reliability engineering best practices
- Bachelor's degree in Computer Science, Information Systems, or a related field; equivalent certifications (e.g., Google Cloud Professional certifications, AWS certifications); or substantial comparable direct work experience
- Proven track record of independently driving reliability improvements, reducing toil through automation, and contributing to high-availability, scalable production systems in a fast-paced environment

Benefits
- Paid time off policy
- Comprehensive benefits package
- Medical, Dental, & Vision Insurance (for full-time employees)
- Competitive & Fair Pay
- Maternity & paternity leave (for full-time employees)
- Short & long-term disability
- Opportunity to learn from a dedicated leadership team
- Top-of-the-line company swag

Company Overview
Dice is a job-searching platform for technology professionals. It is a sub-organization of DHI Group. It was founded in 1990 and is headquartered in Santa Clara, California, USA, with a workforce of 201-500 employees. Its website is http://www.dice.com.

Company H1B Sponsorship
Dice has a track record of offering H1B sponsorships, with 2 in 2022, 4 in 2021, and 5 in 2020. Please note that this does not guarantee sponsorship for this specific role.