[Remote] Lead DevOps Engineer
Note: This is a remote position open to candidates in the USA. Tria Federal delivers digital services and technology solutions that support the health and safety of veterans, service members, and civilians. They are looking for a highly skilled Lead DevOps Engineer to join a collaborative team that supports and delivers virtual infrastructure for mission-critical health IT solutions.

Responsibilities
- Use the updates function in our company's cloud-based performance management software to get weekly, bi-weekly, semi-monthly, or monthly updates from each direct report
- Manage team availability on your program
- Enforce the Cloud Practice SOP (standard operating procedures)
- Be the example by modeling what you want from your staff
- Work Jira/Kanban board tickets; at least 75% of your job is to be technically hands-on
- Help your staff grow; pick someone on the team to mentor and assign staff to mentor other staff
- Be available for all customer calls
- Follow up with your staff to make sure everything is going well and tasks are completed in a timely manner; address injections as they arise
- Make sure all work has a Jira ticket assigned to it
- Make sure all work you and your staff do is well documented, and PRs/commits are attached to Jira tickets
- Uphold the customer SLA; remember that as people managers we are responsible for the work and performance of those who report to us
- Respond to emails and Slack messages quickly; if you're in a meeting, have Slack reflect that and check your messages as soon as it's over
- Hold a recurring, productive, and meaningful standup aligned with the Kanban workflow; record it and make sure everyone attends
- Represent your team to the customer
- Report performance issues to the practice leadership as soon as they happen
- Make sure all staff are on any all-hands calls
- Report all good news and wins back to the practice
- Report program issues and failures immediately to practice leadership
- Contribute to and help other teams, and encourage your staff to do the same
- Share your work with the practice
- Help grow and mature the practice

Skills
- Ability to obtain a U.S. Federal Position of Trust clearance designation
- Must reside in and be able to perform work in the United States
- Must have lived in the United States for 3 of the last 5 years
- Bachelor's degree is required
- Lead-level experience with at least 8 years of overall experience in all the required skills
- 6 or more years of experience working with Kanban and agile tools like Jira, Git, and Confluence
- 8 or more years of hands-on experience with orchestration tools such as Terraform
- Strong hands-on experience with Windows Server administration (2019/2022), including IIS configuration, Windows services, Active Directory integration, and Group Policy management
- Demonstrated experience building and maintaining CI/CD pipelines for .NET applications (ASP.NET, .NET 4.x, 8+) and Node.js applications deployed to AWS ECS and Lambda
- Proven experience with AWS ECS (Fargate and EC2 launch types), including service definitions, task definitions, load balancer integration, auto-scaling, and blue/green deployments
- Hands-on experience with AWS Lambda functions using Node.js and/or Go, including event-driven architectures, API Gateway integration, and Step Functions orchestration
- Experience deploying, configuring, and maintaining Apache NiFi data flow pipelines for healthcare data integration, including dataflow versioning with NiFi Registry
- Proficiency with Terraform for infrastructure-as-code across AWS services including ECS, Lambda, API Gateway, CloudFront, RDS, VPC networking, IAM, etc.
- Strong PowerShell and Bash scripting skills for automation of Windows and Linux administration tasks, deployment workflows, and operational runbooks
- Demonstrated experience leading a DevOps or platform engineering team of at least 2 direct reports, with accountability for team deliverables, performance, and professional growth
- Proven ability to manage work using Kanban methodology, including maintaining Kanban boards, enforcing WIP limits, tracking cycle time and throughput metrics, and driving continuous improvement
- Experience mentoring and growing engineering staff, including establishing technical standards, conducting code reviews, and building a culture of accountability and continuous learning
- Strong customer-facing communication skills with experience representing technical teams to federal program stakeholders, managing expectations, and upholding SLAs
- Track record of balancing hands-on technical work (at least 75% of role) with people management responsibilities, including performance management and corrective action when needed
- Experience with tools such as Jenkins to enable CI/CD
- Recent hands-on experience with AWS
- Strong experience with Windows Server environments (2019/2022) and solid understanding of Linux systems (CentOS, RedHat, Amazon Linux), hosts, networks, security, applications, and proficiency in shell scripting and PowerShell
- Strong experience with containerization, particularly with ECS (Fargate and EC2 launch types), including deploying .NET and Node.js workloads to ECS
- Strong understanding of Zero Trust Architecture
- Solid understanding of and proven experience with configuration management tools like Ansible, Jenkins, Terraform, containerization, and data integration tools such as Apache NiFi
- Believes in automation for consistent, scalable, and foolproof delivery of infrastructure and applications
- Support production issues/high-severity issues on weekends or off hours as required
- Experience managing and mentoring staff, managing expectations and prioritization of work
- Ability to balance many priorities and directions at once and keep the team focused and on schedule
- Experience handling day-to-day tasks while also innovating, so that the team not only supports the program but actively makes it better
- Experience working as a DevOps leader focusing on CI/CD and CM tools and modern frameworks including .NET, Node.js, and serverless architectures in the ecosystem
- Solid hands-on experience working on AWS
- Strong experience with serverless resources, particularly AWS Lambda (Node.js and .NET runtimes), API Gateway, and CloudFront
- Solid expertise troubleshooting and managing both Windows Server and Linux systems
- Hands-on experience using Ansible, PowerShell, or Python
- Experience with Splunk is a plus
- Proven certifications and socially accessible profiles that demonstrate a body of work and participation in modern collaboration hubs
- A great team player who genuinely believes in solving challenges as a team
- Willing to learn new technologies and methodologies quickly
- Explores alternatives and prototypes quickly to validate hypothetical architectures or solutions
- Strong understanding of Kanban methodology and the core tenets of agile both in letter and spirit, with demonstrated experience running Kanban boards and managing WIP limits
- Experience communicating with customers, specifically in the public sector
- Desire to continuously learn and invest time in expanding credentials relevant to the role and customers
- Demonstrated ability to come up with system designs, architecture, process flows, and Concept of Operations for large, complex systems
- Should have an AWS Professional certification or be ready to obtain a certification within 60 days from the date of joining
- Knowledge of healthcare, Medicare, and Medicaid systems and data is a plus

Benefits
- Top-tier benefits package
- Invest in your physical, mental, and financial health and wellness

Company Overview
Tria Federal builds, modernizes, and operates mission-critical federal health platforms and programs. It was founded in 2021 and is headquartered in Arlington, Virginia, USA, with a workforce of 1001-5000 employees. Its website is https://triafed.com.