[Remote] Site Reliability Engineer (APAC)
Note: This is a remote position open to candidates in the USA.

Pod Network is building a next-generation decentralized exchange focused on fairness, performance, and user experience. They are seeking a Site Reliability Engineer to help operate, improve, and scale the reliability of their platform, working closely with the engineering team to manage production systems and enhance operational practices.

Responsibilities
- Respond to and resolve incidents
- Monitor the health and performance of the platform
- Respond to production incidents and drive them through to resolution
- Investigate failures, identify root causes, and coordinate fixes
- Ensure issues are detected, understood, and addressed quickly
- Identify recurring operational pain points and eliminate them
- Improve software, deployment processes, and operational workflows
- Participate in incident reviews and help drive preventative improvements
- Contribute reliability-focused changes directly to production systems
- Design and maintain dashboards, metrics, alerting, and monitoring systems
- Improve signal quality while reducing alert fatigue
- Build automation and internal tools that make the platform easier to operate
- Help establish reliability best practices across the engineering organization

Skills
- Strong experience with Linux and cloud infrastructure
- Experience operating and supporting production systems
- Experience with Docker and containerized environments
- Experience with observability and incident-management tools such as Grafana, Prometheus, PagerDuty, or similar
- Ability to automate workflows using Rust, Python, Bash, or similar languages
- Strong troubleshooting and debugging skills
- A high degree of ownership and the ability to make sound decisions independently
- Experience with distributed systems
- Experience operating high-availability, low-latency services
- Experience with CI/CD systems and deployment automation
- Experience designing secure operational workflows and access controls

Company Overview
Pod Network is an L1 primitive that is blockless and leaderless, and that relaxes the notion of total transaction ordering. It was founded in 2024 and is headquartered in New York, New York, USA, with a workforce of 11-50 employees. Its website is https://pod.network/.