[Remote] Mid-Level Data Engineer
Note: This is a remote position open to candidates in the USA.

Simple Technology Solutions is a company that prioritizes its team members and offers flexibility for personal and professional growth. They are seeking a Mid-Level Data Engineer to join their federal data engineering team, where the role involves building and maintaining ETL pipelines on a cloud-based Enterprise Data Platform using AWS.

Responsibilities
- Develop new ETL pipelines and data ingestion processes alongside senior engineers using AWS Glue (Spark-based, PySpark), MWAA (Airflow), Lambda, and SNS, fully conforming to the agency's Enterprise ETL Standards, ETL Common Library, and PEP 8 Python coding standards
- Integrate the agency's ETL Common Library into Glue jobs for standardized orchestration, error handling, metadata recording, and SNS notifications for all success and error job events
- Ingest structured and semi-structured datasets (CSV, XML, JSON, Avro, pipe-delimited) into S3 landing, raw, and curated zones using Apache Iceberg tables with Parquet as the default format; enforce transactional loading and prevent duplicate loads per dataset reporting period
- Configure static ETL metadata in the centralized PostgreSQL metadata store; ensure dynamic metadata records job status and timestamps for all key execution steps
- Monitor assigned production jobs and participate in operations support rotations; identify and escalate failed jobs and performance issues promptly to maintain data availability within contractually required ingestion timelines
- Ensure ETL Load Reports are populated in real time and ETL Gap Reports are updated weekly, covering all gaps from the inception of the initial ingest process
- Build and maintain materialized views and semantic layer objects in Trino and Athena to ensure optimized query performance and consistent business logic
- Produce and maintain required documentation for each assigned dataset: Business Requirements, ETL Design Documents, Data Models (Mermaid format), Data Dictionaries, Mapping Documents, Deployment Documents, O&M Guides, and ETL Test Plans
- Write unit and integration tests achieving the 90% minimum code coverage threshold; complete security scans at least once per sprint as part of the Definition of Done
- Deploy ETL resources using CloudFormation templates through the agency CI/CD pipeline; submit Change Requests to the Change Control Board within required timelines
- Support transition of ETL jobs from other agency teams by verifying standards conformance, performing deployments, and validating data loads
- Support disaster recovery exercises, pre-production deployments, and ad hoc data requests as assigned
- Participate in 2-week sprint ceremonies, quarterly PI planning, backlog refinement, and agile delivery using JIRA and GitHub

Skills
- US Citizenship is required
- Bachelor's degree is required
- Minimum of 3-5 years of position-related experience is required
- Bachelor's degree or higher in Computer Science, Information Systems, Data Engineering, or a related field
- 3-5 years of experience in data engineering or a closely related technical role
- Hands-on experience with Python (PEP 8), PySpark, and SQL for ETL pipeline development
- Experience with AWS services including Glue, S3, MWAA (Airflow), Lambda, SNS, and SQS
- Familiarity with Apache Iceberg, Parquet, and ORC file formats and S3 data lake zone concepts
- Experience with PostgreSQL and basic familiarity with Redshift or Oracle
- Familiarity with Trino or Athena for query and semantic layer development
- Experience with CloudFormation, GitHub branching workflows, and CI/CD-integrated deployments
- Ability to produce clear ETL documentation including data models (Mermaid format) and data dictionaries
- Understanding of ETL metadata concepts including static and dynamic metadata, load reports, and gap reports
- Experience in agile development environments with sprint-based delivery
- Experience supporting IV&V and/or User Acceptance Testing (UAT) processes in a federal or technical program environment
- Experience with automated testing frameworks; ability to write unit and integration tests achieving defined code coverage thresholds
- Must be able to work 8am-5pm Eastern Time regardless of home location
- Active federal public trust suitability determination, or the ability to obtain one, is required
- Familiarity with FISMA, NIST 800-53, and OWASP ASVS Level 2 is a plus

Benefits
- Flexibility to help team members thrive personally and professionally
- Special incentives for team members living in qualified HUBZones

Company Overview
Simple Technology Solutions is a federal-focused digital strategy consultancy. It was founded in 2013 and is headquartered in Washington, District of Columbia, USA, with a workforce of 51-200 employees. Its website is https://www.simpletechnology.io.