[Remote] Data Reliability Engineer
Note: This is a remote position open to candidates in the USA.

PanAgora Asset Management is focused on transforming financial lives and fostering a flexible work environment. They are seeking a hands-on Data Reliability Engineer to ensure the reliability, stability, and operational excellence of their AWS-based data platform, while working closely with data and platform engineering teams to improve data systems.

Responsibilities
- Own the reliability and stability of production data pipelines and data platform services
- Diagnose and resolve data pipeline failures, delays, and data quality issues in production environments
- Investigate issues across distributed data systems (e.g., Spark/EMR workloads, ingestion pipelines, warehouse performance)
- Lead or support incident response, including triage, mitigation, and long-term resolution
- Perform root cause analysis (RCA) and implement durable fixes to prevent recurrence
- Define and improve data SLAs (freshness, latency, completeness) and ensure adherence
- Design and enhance monitoring, alerting, and observability for data systems
- Develop automation and tooling to reduce operational toil and improve system resilience
- Contribute to disaster recovery (DR) and resiliency planning, including backup validation and recovery workflows
- Partner with engineering teams to improve pipeline design, reliability, and operational readiness
- Create and maintain runbooks, SOPs, and operational documentation
- Participate in occasional off-hours support for production data systems when required

Skills
- Minimum 5 years of experience working with production data platforms in AWS environments
- Prior experience building data pipelines and seeing them through production, including exposure to real-world failures and operational challenges
- Strong experience with Python and SQL in real data systems
- Hands-on experience troubleshooting distributed data processing systems (e.g., Spark/EMR, Redshift, streaming systems)
- Proven ability to debug and resolve production issues in data pipelines and data platforms
- Experience with AWS data services (such as EMR, Redshift, DynamoDB, S3, or similar)
- Experience handling production incidents and performing root cause analysis
- Strong problem-solving mindset and ability to work through ambiguous production issues
- Experience handling real-world data issues such as pipeline delays or failures
- Experience with backfills and reprocessing
- Experience with late-arriving or incomplete data
- Experience improving observability and alerting specifically for data systems
- Experience influencing or guiding data pipeline reliability and operational practices
- Exposure to streaming/event-driven systems (Kafka, Kinesis, CDC patterns)
- Experience with disaster recovery, backup validation, and resiliency testing
- Strong communication during incidents with both technical and non-technical stakeholders

Benefits
- Medical, dental, vision and life insurance
- Retirement savings – 401(k) plan with generous company matching contributions (up to 6%), financial advisory services, potential company discretionary contribution, and a broad investment lineup
- Tuition reimbursement up to $5,250/year
- Business-casual environment that includes the option to wear jeans
- Generous paid time off upon hire – including a paid time off program plus ten paid company holidays and three floating holidays each calendar year
- Paid volunteer time – 16 hours per calendar year
- Leave of absence programs – including paid parental leave, paid short- and long-term disability, and Family and Medical Leave (FMLA)
- Business Resource Groups (BRGs) – BRGs facilitate inclusion and collaboration across our business internally and throughout the communities where we live, work and play. BRGs are open to all.

Company Overview
PanAgora Asset Management is a quantitative investment manager whose proprietary approach is designed to capitalize on inefficiencies across market cycles and to deliver relative and absolute returns through distinct and innovative equity, multi-asset and risk premia strategies. It was founded in 1989 and is headquartered in Boston, Massachusetts, USA, with a workforce of 51-200 employees. Its website is http://www.panagora.com.