[Remote] Data Engineer - GCP
Note: This is a remote position open to candidates in the USA.

The Data Sherpas is a dynamic team focused on building innovative and scalable data solutions on Google Cloud Platform (GCP). They are seeking an experienced Google Cloud Data Engineer to design, develop, and manage scalable data pipelines and data infrastructure, ensuring data availability, accuracy, and performance for business insights and machine learning models.

Responsibilities
- Design, build, and maintain scalable and reliable data pipelines using Cloud Dataflow, Cloud Pub/Sub, and Cloud Composer
- Develop ETL/ELT processes to process and transform large volumes of structured and unstructured data
- Optimize data pipeline performance, scalability, and reliability
- Ensure data processing and ingestion workflows are monitored and meet performance SLAs
- Design and implement data storage solutions using BigQuery, Cloud Storage, and Firestore
- Optimize data structures and partitioning for performance and cost efficiency
- Ensure data security, integrity, and availability in all storage solutions
- Manage data lifecycle policies and archiving processes
- Develop data transformation processes using BigQuery, Apache Beam, and Cloud Functions
- Implement data quality checks, validation rules, and monitoring solutions
- Support real-time and batch data processing needs
- Integrate data from multiple sources, including APIs, databases, and third-party applications
- Automate data ingestion, transformation, and export using tools like Cloud Composer and Cloud Functions
- Ensure data consistency across different environments and systems
- Work closely with data scientists and analysts to understand data needs and business goals
- Provide technical guidance and best practices to the data engineering and business teams
- Collaborate with security and compliance teams to ensure data governance standards are met
- Monitor data pipeline performance and troubleshoot issues in real time
- Analyze data pipeline failures and implement fixes to prevent recurrence
- Set up logging and monitoring using Stackdriver and Cloud Monitoring

Skills
- Bachelor's degree in Computer Science, Data Engineering, or a related field; Master's degree is a plus
- 3+ years of experience in data engineering, including at least 2 years working with Google Cloud Platform
- Google Professional Data Engineer certification is required
- Strong proficiency with GCP services such as BigQuery, Cloud Dataflow, Cloud Composer, Cloud Pub/Sub, Firestore, and Cloud Functions
- Hands-on experience with big data tools and frameworks such as Apache Beam, Hadoop, Spark, or Flink
- Proficiency in programming languages such as Python, Java, or Scala
- Strong knowledge of SQL, data modeling, and query optimization
- Experience with CI/CD tools and version control (e.g., Git, Cloud Build)
- Strong understanding of data governance, security, and compliance requirements
- Ability to manage large-scale data processing and real-time data pipelines
- Excellent problem-solving, analytical, and communication skills
- Experience with machine learning pipelines and AI/ML model deployment
- Familiarity with Terraform and Infrastructure as Code (IaC) principles
- Experience with NoSQL databases and key-value stores on GCP
- Knowledge of containerization and orchestration using Google Kubernetes Engine (GKE)

Benefits
- Competitive salary and performance-based incentives
- Comprehensive health, dental, and vision coverage
- Professional development and training opportunities (including GCP certification)
- Flexible work environment and remote work options

Company Overview
The Data Sherpas emerge as a beacon of expertise and innovation in the rapidly evolving digital era. Founded in 2010, the company is headquartered in San Francisco, California, USA, and has a workforce of 11-50 employees. Its website is https://www.thedatasherpas.com.