Data Engineer - Databricks Specialist

Remote Full-time
We are seeking an experienced Data Engineer with deep expertise in Databricks to design, build, and maintain scalable data pipelines and analytics solutions. This role requires 5+ years of hands-on experience in data engineering with a strong focus on the Databricks platform.

Key Responsibilities:

Data Pipeline Development & Management:
- Design and implement robust, scalable ETL/ELT pipelines using Databricks and Apache Spark
- Process large volumes of structured and unstructured data
- Develop and maintain data workflows using Databricks Workflows, Apache Airflow, or similar orchestration tools
- Optimize data processing jobs for performance, cost efficiency, and reliability
- Implement incremental data processing patterns and change data capture (CDC) mechanisms

Databricks Platform Engineering:
- Build and maintain Delta Lake tables and implement medallion architecture (bronze, silver, gold layers)
- Develop streaming data pipelines using Structured Streaming and Delta Live Tables
- Manage and optimize Databricks clusters for various workloads
- Implement Unity Catalog for data governance, security, and metadata management
- Configure and maintain Databricks workspace environments across development, staging, and production

Data Architecture & Modeling:
- Design and implement data models optimized for analytical workloads
- Create and maintain data warehouses and data lakes on cloud platforms (Azure, AWS, or GCP)
- Implement data partitioning, indexing, and caching strategies for optimal query performance
- Collaborate with data architects to establish best practices for data storage and retrieval patterns

Performance Optimization & Monitoring:
- Monitor and troubleshoot data pipeline performance issues
- Optimize Spark jobs through proper partitioning, caching, and broadcast strategies
- Implement data quality checks and automated testing frameworks
- Manage cost optimization through efficient resource utilization and cluster management
- Establish monitoring and alerting systems for data pipeline health and performance

Collaboration & Best Practices:
- Work closely with data scientists, analysts, and business stakeholders to understand data requirements
- Implement version control using Git and follow CI/CD best practices for code deployment
- Document data pipelines, data flows, and technical specifications
- Mentor junior engineers on Databricks and data engineering best practices
- Participate in code reviews and contribute to establishing team standards

Required Qualifications

Experience & Skills:
- 5+ years of experience in data engineering with hands-on Databricks experience
- Strong proficiency in Python and/or Scala for Spark application development
- Expert-level knowledge of Apache Spark, including Spark SQL, DataFrames, and RDDs
- Deep understanding of Delta Lake and Lakehouse architecture concepts
- Experience with SQL and database optimization techniques
- Solid understanding of distributed computing concepts and data processing frameworks
- Proficiency with cloud platforms (Azure, AWS, or GCP) and their data services
- Experience with data orchestration tools (Databricks Workflows, Apache Airflow, Azure Data Factory)
- Knowledge of data modeling concepts for both OLTP and OLAP systems
- Familiarity with data governance principles and tools like Unity Catalog
- Understanding of streaming data processing and real-time analytics
- Experience with version control systems (Git) and CI/CD pipelines

Preferred Qualifications
- Databricks Certified Data Engineer certification (Associate or Professional)
- Experience with machine learning pipelines and MLOps on Databricks
- Knowledge of data visualization tools (Power BI, Tableau, Looker)
- Experience with infrastructure as code (Terraform, CloudFormation)
- Familiarity with containerization technologies (Docker, Kubernetes)
