[Remote] Senior AI Systems Engineer
Note: This is a remote position open to candidates in the USA.

ARA is seeking a Senior AI Systems Engineer to lead the deployment and operational support of AI platforms and services. The role involves designing and optimizing AI infrastructure, operationalizing machine learning workflows, and ensuring compliance with security and governance requirements.

Responsibilities
- Lead the deployment, integration, and operational support of AI platforms, tools, and services, ensuring compatibility with existing systems and enterprise processes
- Design, implement, monitor, and optimize AI infrastructure, working with server, cloud, and platform engineering teams
- Operationalize machine learning workflows and support AI-enabled applications from development through production deployment and sustainment
- Build and maintain CI/CD and MLOps pipelines for model packaging, testing, deployment, rollback, and lifecycle management
- Implement infrastructure automation using scripting, Infrastructure as Code, and configuration management practices
- Provide ongoing technical support, troubleshooting, root cause analysis, and documentation for AI platforms and user-facing AI services
- Maintain observability across AI systems through logging, metrics, performance monitoring, alerting, and incident response practices
- Ensure security, compliance, and governance requirements are met, including participation in audits, vulnerability management, and secure architecture reviews
- Assess and implement system enhancements to improve performance, scalability, reliability, and cost efficiency
- Collaborate across divisions to support diverse AI initiatives and align technical implementations with mission and business objectives
- Evaluate emerging AI tools, frameworks, and infrastructure approaches for operational fit, supportability, and long-term value
- Develop and maintain technical documentation, runbooks, architecture diagrams, and operational procedures

Skills
- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related STEM field with 8-10 years of engineering experience
- 2+ years of experience supporting AI/ML platforms, MLOps workflows, model deployment, or AI-enabled infrastructure
- Strong coding and automation skills in Python, Bash, or similar scripting languages
- Experience with AI/ML frameworks and tooling such as PyTorch, Hugging Face, or similar ecosystems
- Proficiency with DevOps and MLOps practices, including CI/CD pipelines, Git-based workflows, containerization, and Kubernetes
- Experience deploying AI/ML models or AI services into operational environments, including containerized, cloud, or high-performance computing environments
- Familiarity with security frameworks and compliance standards such as NIST and CMMC
- Familiarity with AI security functionality in enterprise environments, including OAuth
- Strong communication skills and the ability to collaborate effectively across technical and non-technical teams
- Advanced degree or certifications related to AI or machine learning
- Experience integrating AI models into scientific workflows
- Familiarity with large language model (LLM) APIs and orchestration frameworks such as OpenAI, Hugging Face, LangGraph, or LangChain
- Experience with model serving, inference optimization, or AI platform tools such as MLflow, Kubeflow, vLLM, or similar
- Experience with simulations for scientific or engineering projects, particularly physical systems simulations
- Experience with GPU-based systems or running AI models in HPC environments
- Experience writing and deploying MCP Servers on Kubernetes
- DoD experience
- Secret Security Clearance (Active or Inactive)

Company Overview
ARA provides research, engineering, and technical support services. Founded in 1979, it is headquartered in Albuquerque, New Mexico, USA, and has a workforce of 1001-5000 employees. Its website is https://www.ara.com.