[Remote] Senior Data Engineer – AI-Driven Data Pipeline Automation
Note: This is a remote position open to candidates in the USA.

Yahoo serves as a trusted guide for hundreds of millions of people globally, helping them achieve their goals online through our portfolio of iconic products. In this role, you will design, build, and optimize scalable data pipelines and infrastructure to power advanced analytic solutions, collaborating closely with software engineers and business stakeholders to ensure robust and secure data flows.

Responsibilities
- Design, build, and maintain scalable data pipelines and ETL processes to support machine learning and AI initiatives on Google Cloud Platform (GCP)
- Implement and optimize data storage solutions using GCP services such as BigQuery, Cloud Storage, and Dataflow
- Ensure data quality, integrity, and security throughout the data lifecycle
- Collaborate with analysts and business stakeholders to understand data requirements and deliver actionable insights
- Monitor, troubleshoot, and maintain the health and performance of cloud-based data infrastructure
- Automate manual processes and repetitive tasks to improve efficiency and reduce errors
- Apply data governance and compliance best practices to protect sensitive information and meet regulatory standards
- Stay current with new GCP features, tools, and best practices to continuously enhance data management capabilities
- Document solutions, processes, and architectural decisions to facilitate knowledge sharing and maintainability

Skills
- BS or MS in Computer Science or a related major, or equivalent experience
- 7+ years of software engineering experience, with a strong emphasis on system design and backend development
- 2+ years of hands-on experience with the Google Cloud Platform ecosystem (BigQuery, Dataproc, Composer, Dataflow, Data Catalog, Observability) or the AWS equivalents
- Proven ability to design, build, and maintain data pipelines that support machine learning and AI model development, training, and deployment
- Fluency in at least one object-oriented programming language (Java, Python, or Scala) is highly desirable, as these skills are critical for developing robust applications and managing data workflows effectively
- SQL proficiency is also valued for database operations
- Familiarity with data security, compliance, and governance best practices
- Strong problem-solving skills, attention to detail, and ability to work collaboratively with cross-functional teams
- Excellent communication skills, including the ability to tell insightful stories with data and to manage communication with internal teams and stakeholders
- Exposure to AI-assisted development tools such as Claude, GitHub Copilot, Cursor, or similar is highly desirable
- Experience with Google Analytics 360 is a plus

Benefits
- Incentive compensation opportunities in the form of discretionary annual bonus or commissions
- Healthcare
- A great 401k
- Backup childcare
- Education stipends

Company Overview
Yahoo is a technology and media company that serves users through its portfolio of digital platforms, products, and services. It is a sub-organization of Verizon Media. It was founded in 1994 and is headquartered in Sunnyvale, California, USA, with a workforce of 5001-10000 employees. Its website is http://www.yahoo.com.

Company H1B Sponsorship
Yahoo has a track record of offering H1B sponsorships, with 197 in 2023, 646 in 2022, 381 in 2021, and 463 in 2020. Please note that this does not guarantee sponsorship for this specific role.