[Remote] Site Reliability Engineer
Note: This is a remote position open to candidates in the USA.

Arctiq is a global, intelligence-driven technology services company delivering professional and managed services across various domains. The Site Reliability Engineer will focus on executing and maintaining reliability engineering practices for mission-critical government systems, bridging the gap between development and operations through automation and monitoring.

Responsibilities
Implement and maintain dashboards and alerting rules using Prometheus, Grafana, or ELK Stack. Support the identification of Service Level Indicators (SLIs).
Develop and maintain Infrastructure as Code (IaC) scripts using Terraform and Ansible to ensure repeatable, error-free deployments.
Maintain automated deployment pipelines, ensuring security scans and automated tests are integrated into the workflow.
Participate in on-call rotations and assist in troubleshooting system outages. Contribute to blameless post-mortem reports to drive continuous improvement.
Identify repetitive manual tasks and develop automation to reduce "toil," allowing the team to focus on high-value engineering.

Skills
3–5 years of experience in SRE, DevOps, or Systems Engineering roles.
Proficiency in scripting languages (Python, Go, or Bash).
Hands-on experience with containerization (Docker, Kubernetes) and cloud platforms (AWS, Azure, or GCP).
Familiarity with NIST SP 800-53 security controls.
Education: Bachelor's degree in Computer Science or a related technical field.

Company Overview
Architect Intelligence. Engineering transformative infrastructure, security and platform solutions. Arctiq was founded in 2024 and is headquartered in Irvine, California, USA, with a workforce of 501-1000 employees. Its website is https://www.arctiq.com.