[Remote] Senior Cloud DevOps & Infrastructure Engineer

Remote, Full-time
Note: This is a remote position open to candidates in the USA.

Diverse Lynx is seeking a Senior Cloud DevOps & Infrastructure Engineer with a focus on GCP and AI. The role involves designing, deploying, and maintaining secure and scalable cloud infrastructure, primarily on a multi-cloud platform, while implementing GitOps best practices and supporting AI/ML workloads.

Responsibilities

- Infrastructure as Code (IaC): Architect and provision production-grade infrastructure using Terraform. Manage state files and modules, and ensure infrastructure immutability.
- AI/ML: Experience with LLMs in a multi-cloud environment.
- Kubernetes & Containerization: Design and manage clusters. Create and optimize Dockerfiles (multi-stage builds, distroless/hardened images). Manage complex deployments using Helm charts.
- CI/CD & GitOps: Build end-to-end CI/CD pipelines using GitLab CI. Implement GitOps workflows to synchronize infrastructure and application state.
- Design, configure, and manage scalable and secure cloud infrastructure for MLOps.
- AI Infrastructure Support: Configure and maintain environments suitable for AI/ML workloads (GPU node pools, LLM integration, large model serving, high-performance storage).
- Production Support & Troubleshooting: Act as the primary escalation point for deployment failures and for network and infrastructure issues. Perform Root Cause Analysis (RCA).
- Security & Compliance: Implement 'Secure by Design' principles.
- Good knowledge of network security, identity and privileged access management, and landing zone concepts for cloud platforms (Azure, AWS).
- Multi-Cloud Strategy: While GCP is primary, maintain and support secondary environments in AWS (and potentially Azure) to ensure business continuity.

Skills

- 6-8 years of experience in Cloud Infrastructure & DevOps Engineering
- Expert in Kubernetes, Terraform, and GitLab CI/CD
- Experience supporting AI/ML workloads
- Architect and provision production-grade infrastructure using Terraform
- Experience with LLMs in a multi-cloud environment
- Design and manage Kubernetes clusters
- Create and optimize Dockerfiles (multi-stage builds, distroless/hardened images)
- Manage complex deployments using Helm charts
- Build end-to-end CI/CD pipelines using GitLab CI
- Implement GitOps workflows to synchronize infrastructure and application state
- Design, configure, and manage scalable and secure cloud infrastructure for MLOps
- Configure and maintain environments suitable for AI/ML workloads (GPU node pools, LLM integration, large model serving, high-performance storage)
- Act as the primary escalation point for deployment failures and for network and infrastructure issues
- Perform Root Cause Analysis (RCA)
- Implement 'Secure by Design' principles
- Good knowledge of network security, identity and privileged access management, and landing zone concepts for cloud platforms (Azure, AWS)
- Maintain and support secondary environments in AWS (and potentially Azure)
- Deep expertise in GCP (Compute Engine, GKE, Cloud Storage, IAM)
- Strong working knowledge of AWS (EC2, EKS, S3, IAM)
- Knowledge of various programming languages (Python required; Java, C#, or JavaScript is a plus)
- Advanced proficiency in Kubernetes
- Ability to write and manage custom Helm charts
- Experience with Ingress Controllers (Nginx), Service Mesh, and Autoscaling (HPA/VPA/Cluster Autoscaler)
- Expert-level knowledge of GitLab CI/CD (writing .gitlab-ci.yml, runners, artifacts, caching)
- Understanding of GitOps principles
- Strong hands-on experience with Terraform for provisioning cloud resources across multiple environments (Dev/Stage/Prod)
- Proficiency in Bash/Shell scripting and Python
- Strong Linux administration skills
- Experience setting up monitoring with cloud-native tools, Prometheus, and Grafana
- Experience with Azure cloud infrastructure
- Knowledge of Identity Providers (Keycloak, Azure AD/Entra ID) and OIDC integration
- Experience with Service Mesh
- Understanding of ITIL processes (Incident/Change Management) and tools like ServiceNow and JIRA
- Basic understanding of Python/Flask/FastAPI applications to assist developers in troubleshooting

Company Overview

Diverse Lynx is a WBENC- and NMSDC-certified partner, helping organizations turn diversity goals into measurable impact through staffing and contingent workforce solutions. It was founded in 2002 and is headquartered in Princeton, New Jersey, US, with a workforce of 1001-5000 employees. Its website is http://www.diverselynx.com.

Company H1B Sponsorship

Diverse Lynx has a track record of offering H1B sponsorships, with 1 in 2024 and 1 in 2021. Please note that this does not guarantee sponsorship for this specific role.
