[Remote] Senior Site Reliability Engineer
Note: This is a remote position open to candidates in the USA.

Autodesk is a company that helps innovators turn their ideas into reality through software. They are seeking a Senior Site Reliability Engineer to build and operate reliable, secure, and scalable cloud services for Autodesk GovCloud products, focusing on improving production services and establishing operational excellence practices.

Responsibilities
- Serve as a primary owner for the reliability, availability, performance, operability, and capacity of one or more production services
- Deploy, operate, maintain, and continuously improve production services running in Autodesk GovCloud environments
- Partner with engineering teams to ensure services are designed with reliability, scalability, security, and operability in mind
- Define and operate reliability practices such as SLOs/SLIs, error budgets, production readiness reviews, service reviews, and operational health reviews
- Build automation to improve deployment safety, operational efficiency, incident response, and service recovery
- Design, develop, and maintain software, automation, and tooling that improve the reliability, scalability, and efficiency of production systems
- Implement and improve monitoring, alerting, logging, tracing, and observability capabilities across supported services
- Lead and participate in incident response, troubleshooting, and post-incident reviews focused on learning and continuous improvement
- Develop and maintain operational documentation, runbooks, and recovery procedures
- Scale and enhance resilience testing and Gameday practices to validate system behavior, recovery capabilities, and operational readiness
- Continuously identify and eliminate operational toil through software engineering, automation, and process improvement
- Ensure supported services remain compliant with Autodesk security, privacy, and regulatory requirements, including FedRAMP and related controls where applicable
- Participate in a 24x7 on-call rotation for production services
- Function effectively in a fast-paced environment while helping establish and mature operational excellence practices for Autodesk GovCloud

Skills
- B.S. or higher in Computer Science, Engineering, or a related technical discipline, or equivalent practical experience
- 7+ years of experience in Site Reliability Engineering, Software Engineering, Platform Engineering, Cloud Infrastructure, or Production Operations
- Experience operating and supporting customer-facing production services in large-scale cloud environments
- Strong understanding of reliability engineering principles, including SLOs/SLIs, observability, incident management, capacity planning, production readiness, and automation
- Experience with AWS, Azure, or other public cloud platforms
- Experience developing automation using languages such as Python, Go, Java, PowerShell, Bash, or similar
- Experience with Infrastructure as Code, CI/CD pipelines, deployment automation, and modern cloud operations practices
- Understanding of security, compliance, and operational risk management in production environments
- Strong written and verbal communication skills
- 10+ years of experience operating highly available, customer-facing production systems
- Experience with AWS GovCloud, FedRAMP, IL4/IL5, or other regulated cloud environments
- Experience supporting services with stringent availability, reliability, and security requirements
- Experience with containers, Kubernetes, cloud-native architectures, APIs, load balancing, networking, DNS, and distributed systems
- Experience with observability platforms such as Splunk, Dynatrace, Datadog, CloudWatch, or similar technologies
- Experience operating databases, storage platforms, messaging systems, and caching technologies
- Experience designing and implementing operational automation at scale
- Experience leading or participating in Gamedays, disaster recovery exercises, resilience testing, or operational readiness reviews
- Strong incident management experience, including technical leadership during major incidents and stakeholder communication
- Strong collaboration skills and ability to work effectively across engineering, security, compliance, and operations teams
- Passion for building reliable, secure, and scalable systems that customers can trust

Benefits
- Annual cash bonuses
- Commissions for sales roles
- Stock grants
- A comprehensive benefits package

Company Overview
Autodesk develops 3D design software for use in the architecture, engineering, construction, and media industries. It was founded in 1982 and is headquartered in San Francisco, California, USA, with a workforce of more than 10,000 employees. Its website is http://www.autodesk.com.