Lead Data Engineer - Scalable Data Pipelines - Contract to Hire

Remote Full-time
Lead Data Engineer (PySpark, Airflow, Azure) – Scalable Data Pipelines

We’re looking for an experienced Senior Data Engineer to design, build, and optimize large-scale data pipelines powering analytics and machine learning workloads. This role is ideal for someone who is hands-on, performance-oriented, and comfortable leading other engineers while owning end-to-end data workflows.

You’ll work on both batch and real-time processing, take ownership of Spark performance tuning, and help enforce best practices around data quality, governance, and reliability.

⸻

Responsibilities
• Design, develop, and optimize scalable data pipelines using Python, PySpark, Apache Spark, and Airflow
• Build and maintain batch and streaming data processing systems on Spark
• Design and manage Airflow DAGs to orchestrate complex, dependency-heavy workflows
• Implement data partitioning, caching, and Spark performance tuning to handle large datasets efficiently
• Ensure data quality, governance, security, and reliability across the data lifecycle
• Monitor, troubleshoot, and optimize data jobs, SLAs, and pipeline dependencies
• Manage cloud infrastructure (Azure) for data workloads, including cost optimization
• Implement CI/CD pipelines for data workflows using Git, Docker, and Infrastructure-as-Code tools
• Support analytics and ML use cases by working with structured and unstructured data
• Lead and mentor other data engineers, providing architectural guidance and code reviews
• Promote best practices in coding standards, documentation, and version control
• Collaborate effectively with distributed, remote teams in an Agile environment

⸻

✅ Requirements
• 8+ years of hands-on experience in Data Engineering
• Strong expertise with Apache Spark / PySpark, including internals such as:
  • RDDs, DataFrames, DAG execution, partitioning, shuffles, and caching
• Proven experience building and operating Airflow DAGs (scheduling, dependencies, retries, SLAs)
• Advanced Python and SQL skills with a focus on performance and maintainability
• Solid experience with Azure data and compute infrastructure
• Working knowledge of Docker, Kubernetes, Terraform, and CI/CD best practices
• Strong problem-solving skills and ability to optimize large-scale data processing systems
• Prior experience leading or mentoring engineers
• Comfortable working in Agile/Scrum environments
• Excellent communication skills and ability to collaborate with remote teams

⸻

⭐ Nice to Have
• Experience with streaming frameworks (Spark Structured Streaming, Kafka, Event Hubs)
• Familiarity with data governance, lineage, and observability tools
• Experience supporting ML or advanced analytics pipelines
• Background in cost-efficient Spark optimization at scale