[Remote] Architect - Platform Engineer
Note: This is a remote job open to candidates in the USA.

Quantiphi is an award-winning, AI-First global digital engineering company that helps leading Fortune 1000 organizations transform bold ideas into measurable business impact. They are seeking a highly skilled Architect - Platform Engineer to design, optimize, and scale infrastructure for GenAI and LLM workloads, collaborating with cross-functional teams to bring cutting-edge AI solutions to life.

Responsibilities
- Design and implement scalable infrastructure for LLM and GenAI workloads across multi-GPU environments
- Perform GPU profiling, benchmarking, and performance optimization for distributed training workloads
- Manage and schedule compute-intensive jobs using Slurm-based clusters and OpenShift/Kubernetes environments
- Enable and optimize the NVIDIA GPU stack (CUDA, cuDNN, NCCL, Triton, RAPIDS, etc.)
- Collaborate with cross-functional teams to deploy models in research and production environments
- Build and support GenAI pipelines (fine-tuning, RAG, multi-modal inferencing, LLMOps)
- Develop reusable infrastructure templates using tools like Terraform and Helm
- Contribute to internal innovation (PoCs, workshops) and support client-facing delivery engagements
- Develop and deliver automation software for building and improving the functionality, reliability, availability, and manageability of applications and cloud platforms
- Champion and drive the adoption of Infrastructure as Code (IaC) practices and mindset
- Design, architect, and build self-service, self-healing, synthetic monitoring and alerting platforms and tools
- Automate development and test processes through CI/CD pipelines (Git, Jenkins, SonarQube, Artifactory, Docker containers)
- Build a container hosting platform using Kubernetes
- Introduce new cloud technologies, tools, and processes to keep innovating in the commerce area and drive greater business value
- Lead technical discussions on architecture design and troubleshooting with clients, and proactively provide solutions as required

Skills
- Strong experience with Slurm and distributed training environments
- Hands-on expertise with Red Hat OpenShift and/or Kubernetes
- Deep knowledge of the NVIDIA GPU ecosystem (CUDA, cuDNN, NCCL, Nsight, Triton/TensorRT)
- Strong foundation in Linux systems, performance tuning, and multi-GPU optimization
- Experience deploying GenAI workloads (LLM fine-tuning, RAG pipelines, multi-modal systems)
- Familiarity with Infrastructure-as-Code tools (Terraform, Ansible)
- Experience with cloud GPU environments (GCP, Azure, AWS, OCI) and/or on-prem GPU clusters
- Serve as a mentor or guide for senior resources / team leads
- Lead the technical discussion regarding architecture design
- Experience with NVIDIA NIMs, DGX systems, or GPU-accelerated containers
- Knowledge of LLMOps frameworks and MLOps integration
- Familiarity with vector databases and retrieval systems for RAG architectures
- Comfortable working in client-facing environments and collaborating with AI solution teams
- Experience working with FHIR R4, HL7 v2, or SMART on FHIR
- Integration with EHR systems (e.g., Epic)
- Understanding of HIPAA compliance and healthcare data privacy
- Exposure to clinical workflows, CDS Hooks, or patient-facing applications
- Experience building clinical decision support systems or healthcare interoperability solutions

Company Overview
Quantiphi is a digital engineering company that provides data science and machine learning software and services. It was founded in 2013 and is headquartered in Marlborough, Massachusetts, USA, with a workforce of 1001-5000 employees. Its website is http://www.quantiphi.com.

Company H1B Sponsorship
Quantiphi has a track record of offering H1B sponsorships, with 2 in 2026, 45 in 2025, 65 in 2024, 45 in 2023, 94 in 2022, 71 in 2021, and 46 in 2020. Please note that this does not guarantee sponsorship for this specific role.