[Remote] DevOps/Observability Engineer
Note: This is a remote position open to candidates in the USA.

Quantiphi is an award-winning, AI-First global digital engineering company that helps Fortune 1000 organizations transform ideas into measurable business impact. They are seeking a highly experienced Senior DevOps/Observability Engineer to lead the design and implementation of a next-generation observability platform, focusing on architecting an observability pipeline using modern, open-source technologies on AWS.

Responsibilities
- Lead the design and implementation of our next-generation, unified observability platform
- Architect a sophisticated observability pipeline from the ground up, leveraging a modern, open-source-centric stack on Amazon Web Services (AWS)
- Deploy, configure, and integrate a suite of tools including Prometheus, Grafana, and Splunk to provide comprehensive insights into our complex, distributed systems

Skills
- Unified Pipeline Architecture: Proven ability to design and implement end-to-end observability pipelines using OpenTelemetry, Prometheus, and Grafana on centralized infrastructure
- Cross-Account AWS Observability: Deep expertise in centralizing AWS telemetry, including multi-account CloudTrail organization trails, cross-account CloudWatch metrics/logs, and VPC Flow Logs
- Log Aggregation & Routing: Strong experience designing log aggregation strategies, implementing noise reduction/filtering at the collector level, and configuring Splunk HTTP Event Collector (HEC) integrations
- Advanced Alerting & Dashboarding: Hands-on experience building comprehensive alerting frameworks using Alertmanager and CloudWatch Alarms, coupled with advanced dashboard engineering in Grafana (using PromQL)
- Infrastructure as Code (IaC): Advanced proficiency in writing Terraform modules specifically for deploying and managing observability stacks and EC2 infrastructure
- Enterprise Scale Log Management: Demonstrated experience managing, routing, and optimizing log pipelines at massive scale (TB/day)
- Kubernetes/Container Observability: Experience deploying Prometheus and OTel within Kubernetes (EKS) or containerized (ECS) environments
- Cost Optimization: Proven track record of reducing observability spend through strategic metric dropping, log filtering, and efficient storage tiering

Benefits
- Join one of the world's fastest-growing AI-first digital engineering companies and make a real impact at scale.
- Lead and collaborate with a high-energy team of talented, driven individuals solving complex, meaningful challenges.
- Work with Fortune 500 companies and disruptive innovators in a research-driven environment with 60+ patents.
- Stay ahead of the curve by gaining hands-on experience with cutting-edge AI, ML, data, and cloud technologies while continuously upskilling.

Company Overview
Quantiphi is a digital engineering company that provides data science and machine learning software and services. It was founded in 2013 and is headquartered in Marlborough, Massachusetts, USA, with a workforce of 1001-5000 employees. Its website is http://www.quantiphi.com.

Company H1B Sponsorship
Quantiphi has a track record of offering H1B sponsorships, with 2 in 2026, 45 in 2025, 65 in 2024, 45 in 2023, 94 in 2022, 71 in 2021, and 46 in 2020. Please note that this does not guarantee sponsorship for this specific role.