[Remote] Senior DevOps Engineer / Site Reliability Engineer - East Coast
Note: This is a remote position open to candidates in the USA.

Stellar Cyber is a fast-growing global leader in cybersecurity, trusted by major enterprises and government agencies. They are seeking a highly skilled Senior DevOps / Site Reliability Engineer to build, operate, and scale reliable cloud-native infrastructure and distributed data platforms while ensuring operational excellence and reliability best practices.

Responsibilities
- Administer and maintain Kubernetes clusters and containerized workloads
- Manage cloud infrastructure across OCI, AWS, GCP, or Azure environments
- Develop and maintain CI/CD pipelines for reliable application deployments
- Implement and manage Infrastructure as Code (IaC) using Terraform and Helm
- Build automation tooling and operational workflows using Python, Go, or Bash
- Drive observability initiatives including monitoring, logging, tracing, and alerting improvements
- Monitor, troubleshoot, and resolve production incidents while participating in on-call rotations
- Support and optimize distributed data platforms including Kafka, Elasticsearch, Spark, Redis, and MongoDB
- Improve platform reliability, scalability, and operational efficiency using SRE best practices
- Collaborate with cross-functional teams across multiple time zones
- Perform Linux system administration and networking troubleshooting
- Contribute to incident response processes, postmortems, and reliability improvements
- Support GitOps and deployment workflows using tools such as ArgoCD and GitHub Actions
- Evaluate and implement AI-assisted operational tooling for auto-remediation, alert correlation, and operational intelligence

Skills
- 5+ years of experience in DevOps, SRE, or Platform Engineering roles
- Strong expertise with Kubernetes, Docker, and container orchestration
- Hands-on experience managing production cloud environments
- Strong Infrastructure as Code experience with Terraform and Helm
- Experience with CI/CD tools and deployment automation
- Advanced troubleshooting skills in Linux systems, networking, and distributed systems
- Experience with observability platforms including Prometheus, Grafana, Loki, Alertmanager, and Elastic Stack
- Strong programming and scripting skills in Python, Bash, or Go
- Experience supporting high-availability production systems and on-call operations
- Knowledge of incident management and reliability engineering practices
- Familiarity with data platform technologies such as Kafka, Spark, Elasticsearch, Redis, or MongoDB
- Understanding of AI-driven operational tooling and automated remediation concepts
- Excellent communication, collaboration, and problem-solving skills
- Resides on the East Coast

Benefits
- Pre-IPO Stock Options
- Medical, Dental & Vision care
- 401(k)
- Employee Assistance Program
- Employee Discount Program
- Life Insurance
- Paid time off
- Referral Program
- Rewards and Recognition Program

Company Overview
Stellar Cyber is an open XDR platform that offers comprehensive security while streamlining operations for efficiency. It was founded in 2015 and is headquartered in Santa Clara, California, USA, with a workforce of 51-200 employees. Its website is https://www.stellarcyber.ai.

Company H1B Sponsorship
Stellar Cyber has a track record of offering H1B sponsorships, with 2 in 2026, 6 in 2025, 9 in 2024, 7 in 2023, 7 in 2022, 5 in 2021, and 2 in 2020. Please note that this does not guarantee sponsorship for this specific role.