[Remote] Mid-Level Data Scientist
Note: This is a remote position open to candidates in the USA.

Simple Technology Solutions is committed to prioritizing its people while delivering exceptional solutions to Federal Government clients. They are seeking a Mid-Level Data Scientist to join their federal data engineering team, where the role involves building AI/ML capabilities and delivering analytical products that support critical government decision-making.

Responsibilities
- Build and maintain knowledge bases, vector stores, and Retrieval Augmented Generation (RAG) pipelines using Amazon Bedrock and Amazon OpenSearch Service to make financial and regulatory datasets AI-ready for advanced analytics and machine learning consumption
- Support the development, validation, and operationalization of statistical outputs and derived data products; coordinate with the agency data science team and SME data scientists to implement Airflow DAGs and AWS Glue jobs that ensure automated, recurring updates
- Support transition of data science outputs into production by validating accuracy, completeness, and reporting readiness; ensure all production data products are incorporated into the agency's ETL load and gap reporting infrastructure
- Develop and validate machine learning models and analytical pipelines using large-scale financial and regulatory datasets in the data lake
- Leverage AI-assisted development tools for code generation, debugging, and performance tuning; adhere to agency security standards and applicable federal AI governance requirements
- Write Python 3.10 code conforming to PEP 8; integrate analytical pipelines with the agency's ETL metadata infrastructure and produce required load and gap reporting outputs
- Support entity resolution work to ensure consistent identification and linkage of records across high-volume financial datasets
- Produce required documentation for all analytical models and pipelines: methodology, data lineage, model assumptions, refresh schedules, and IV&V Questionnaires
- Write automated tests achieving the 90% minimum code coverage threshold; complete security scans at least once per sprint as part of the Definition of Done per OWASP ASVS Level 2
- Participate in 2-week sprint ceremonies, quarterly PI planning, backlog refinement, and agile delivery using JIRA and GitHub

Skills
- US Citizenship is required
- Bachelor's Degree is required
- Minimum of 3-5 years' position-related experience is required
- Bachelor's degree or higher in Data Science, Statistics, Computer Science, Mathematics, or a related quantitative field
- 3-5 years of experience in data science, machine learning engineering, or quantitative analytics
- Proficiency in Python 3.10 (PEP 8) including pandas, NumPy, scikit-learn, and related libraries
- Hands-on experience with Amazon Bedrock, knowledge bases, vector stores, and RAG pipeline design on AWS
- Experience with Amazon OpenSearch Service or equivalent vector/search infrastructure
- Experience with Apache Airflow (MWAA) for DAG-based pipeline orchestration
- Familiarity with AWS Glue, S3, and Apache Spark for large-scale data processing
- Experience with SQL and query tools such as Trino, Athena, or Redshift
- Must be able to work 8am-5pm Eastern Time regardless of home location
- Active federal public trust suitability determination or ability to obtain one required
- Experience working with large-scale financial or regulatory datasets is strongly preferred
- Knowledge of federal AI governance requirements and responsible AI practices in a government setting
- Experience with agile development, CI/CD pipelines, GitHub, and sprint-based delivery
- Familiarity with FISMA, NIST 800-53, and Zero Trust principles

Benefits
- Flexibility to help team members thrive personally and professionally
- Special incentives for team members living in qualified HUBZones

Company Overview
Simple Technology Solutions is a federal-focused digital strategy consultancy. It was founded in 2013 and is headquartered in Washington, District of Columbia, USA, with a workforce of 51-200 employees.
Its website is https://www.simpletechnology.io.