[Remote] Data Engineer
Note: This is a remote position open to candidates in the USA.

MSR Technology Group is seeking a Data Engineer to design and build robust data pipelines and optimize data models. The role involves working with Azure technologies to implement data ingestion, transformation, and storage solutions while ensuring efficient data workflows and monitoring.

Responsibilities
- Design and build robust, reusable, parameter-driven ingestion and transformation pipelines using Azure Data Factory, Synapse Pipelines, Databricks, and/or Microsoft Fabric Data Factory
- Implement medallion architecture (Bronze / Silver / Gold) on Azure Data Lake Storage Gen2 using Delta Lake, Parquet, and structured streaming patterns
- Build performant ELT workflows that leverage pushdown to source systems (Synapse Dedicated SQL Pool, Azure SQL, Teradata) where appropriate
- Develop and optimize PySpark notebooks and jobs on Azure Databricks or Synapse Spark
- Design dimensional models (Kimball star/snowflake) and data vault patterns for analytics consumption
- Implement Slowly Changing Dimensions (Type 1/2/3), Change Data Capture, and late-arriving data patterns
- Tune distributed SQL workloads in Synapse Dedicated SQL Pool / Fabric Warehouse, including distribution keys, partitioning, and clustered columnstore indexes
- Implement CI/CD for data pipelines using Azure DevOps (YAML pipelines, ARM/Bicep/Terraform) across Dev / SIT / UAT / Prod environments
- Instrument pipelines with robust logging, auditing, and monitoring using Azure Monitor, Log Analytics, and KQL
- Define and enforce coding standards, code review practices, branching strategies, and release management
- Lead or contribute to legacy-to-cloud migrations, e.g., Informatica PowerCenter to Azure Data Factory, or on-premises Teradata / Oracle / SQL Server to Synapse or Fabric
- Perform workload assessment, capacity planning, and cost modeling for target-state architectures
- Provide production incident response for critical pipelines

Skills
- Deep hands-on expertise with Azure Data Factory: pipelines, datasets, linked services, triggers, parameterization, mapping data flows, and all three Integration Runtime types (Azure, Self-hosted, SSIS)
- Strong experience with Databricks and PySpark
- Production experience with one or more of: Azure Synapse Analytics (Dedicated and Serverless SQL Pools, Spark Pools), Azure Databricks (Delta Lake, Unity Catalog), or Microsoft Fabric (Warehouse, Lakehouse, OneLake)
- Strong working knowledge of Azure Data Lake Storage Gen2 (hierarchical namespace, RBAC + ACLs, lifecycle management, security)
- Experience with Azure Key Vault, Azure AD / Entra ID (including managed identities and service principals), and private networking (VNet integration, private endpoints)
- Monitoring and troubleshooting with Azure Monitor, Log Analytics, and KQL
- Advanced SQL: window functions, CTEs, query optimization, execution plan analysis, performance tuning
- Strong Python for data engineering: pandas, PySpark, REST API integration, unit testing (pytest)
- Proficient in T-SQL; familiarity with Spark SQL, KQL, PowerShell, and Bash shell scripting
- 5+ years of data warehouse development experience
- 5+ years of data modeling experience using ERWIN or similar tools
- 2+ years of experience with Azure Data Factory and Snowflake
- Medicaid domain knowledge is a plus

Company Overview
As a leader in professional services, MSR Technology Group ignites growth for businesses across diverse industries. It is headquartered in Wilmington, Delaware, US, with a workforce of 1001-5000 employees. Its website is https://www.msrtechnologies.com.

Company H1B Sponsorship
MSR Technology Group has a track record of offering H1B sponsorships, with 86 in 2026, 233 in 2025, 175 in 2024, and 2 in 2022. Please note that this does not guarantee sponsorship for this specific role.