[Remote] DevOps Engineer Principal
Note: This is a remote position open to candidates in the USA.

Sagent is transforming the mortgage servicing industry by bringing the modern experience customers now expect from loan originations to loan servicing. The Cloud Engineering team builds the foundation that every engineering team at Sagent runs on, and they are seeking a Cloud Infrastructure Engineer to support infrastructure and application development teams on a large-scale microservices platform.

Responsibilities
- Operate and improve multi-region GKE clusters hosting hundreds of microservices across multiple environments, from development through production
- Manage the Kubernetes platform layer: Istio service mesh, cert-manager, external-dns, RBAC, HPA/KEDA autoscaling, HashiCorp Vault secret injection, and Helm-based deployments
- Develop and maintain Terraform modules across multiple IaC repositories covering GKE, networking (Shared VPC, Cloud NAT, Private Service Connect), Cloud SQL, Cloud Storage, Dataproc, Cloud Composer, Vault, and web hosting
- Maintain and extend Azure DevOps CI/CD pipelines using shared Terraform templates with multi-environment deployment workflows
- Support Confluent Kafka infrastructure, including Connect workers with JDBC source connectors, consumer group health monitoring, and Kafka-lag-based autoscaling with KEDA
- Manage Redis Enterprise clusters on Kubernetes with operator-managed lifecycle and replication
- Operate the observability stack: Grafana Cloud (Alloy, Loki, Mimir, Tempo, Pyroscope via Private Service Connect), kube-prometheus-stack, Google Managed Prometheus, OpenTelemetry Operator/Collector, Beyla, and Kubecost
- Harden cluster security posture: NetworkPolicies, Pod Security Standards, admission policy enforcement, CrowdStrike Falcon, Lacework, kube-bench, and cert-manager with Let's Encrypt ACME
- Support data infrastructure, including Cloud SQL (PostgreSQL), Dataproc (Spark), Cloud Composer (Airflow), Matillion CDC pipelines, Snowflake, and BigQuery
- Manage DNS across multiple providers (Azure DNS, Cloudflare, GCP Cloud DNS) via external-dns, and support Azure APIM and Cloudflare CDN/WAF
- Partner directly with application development teams to troubleshoot deployment failures, tune resource limits and autoscaling, and resolve Kafka consumer lag and connectivity issues
- Contribute to the Internal Developer Portal (Backstage) and internal CLI tooling that enables self-service for product engineers

Skills
- 7+ years of cloud or infrastructure engineering experience, including 3+ years of hands-on Azure or GCP experience
- Strong production experience with GKE, VPC networking, IAM, Cloud SQL, Cloud Storage, and Artifact Registry
- Advanced Terraform experience, including reusable module design, state management, and multi-environment patterns
- Production Kubernetes expertise: Helm chart development and management, RBAC, resource tuning, and troubleshooting workloads at scale
- Hands-on experience with Istio service mesh: sidecar injection, mTLS, VirtualServices, AuthorizationPolicies, and traffic management
- Understanding of CNI fundamentals (Cilium/Dataplane V2), east-west traffic flows, and network segmentation
- Experience with CI/CD pipeline development (Azure DevOps YAML pipelines or equivalent) and trunk-based development workflows
- Hands-on experience with secrets management, including HashiCorp Vault (Kubernetes auth, agent injection) and GCP Secret Manager
- Proficiency in scripting (Bash, Python, or Go) with the ability to write production-quality automation and tooling
- Strong security mindset with experience implementing least-privilege IAM, certificate management, and policy-driven controls
- Clear and effective communicator able to work across infrastructure and application development teams
- Experience with event-driven architectures and Apache Kafka (Confluent Platform, Connect, consumer group management, KEDA-based scaling)
- Experience with Redis Enterprise on Kubernetes (operator-managed clusters, Active-Active replication)
- Familiarity with the Grafana Cloud observability stack (Alloy, Loki, Mimir, Tempo, Pyroscope) and OpenTelemetry
- Experience with GCP data services: Dataproc (Spark), Cloud Composer (Airflow), BigQuery, Pub/Sub
- Familiarity with OPA/Rego or Kyverno for policy enforcement
- Experience with Azure APIM, Cloudflare, or multi-cloud DNS management
- Familiarity with Matillion or similar ETL/CDC tooling and the Snowflake data warehouse
- Exposure to the financial services or mortgage/loan servicing domain and associated compliance requirements
- Experience with Kubecost or similar FinOps tooling for cloud cost optimization
- Experience building or contributing to an Internal Developer Portal (Backstage)

Benefits
- Remote/Hybrid workplace options
- Health Benefits
- Unlimited Flexible Time Off
- Family Planning Services
- Tuition Reimbursement
- Paid Family Leave
- 401(k) Matching
- Pet Insurance
- In-person and Virtual Social Experiences
- Career Pathing
- Focus Time Fridays

Company Overview
Sagent powers banks and lenders to make loans and homeownership simpler and safer for millions of consumers. It was founded in 2018 and is headquartered in King of Prussia, Pennsylvania, USA, with a workforce of 501-1000 employees. Its website is https://sagent.com.

Company H1B Sponsorship
Sagent has a track record of offering H1B sponsorships, with 1 in 2020. Please note that this does not guarantee sponsorship for this specific role.