[Remote] Data Engineer
Note: This is a remote position open to candidates in the USA.

You.com is building the AI Search Infrastructure that powers modern AI systems. They are seeking a hands-on Data Engineer to help build and scale their modern data platform, focusing on developing reliable, high-performance data pipelines and systems.

Responsibilities
- Build and maintain scalable data pipelines (batch and streaming) using tools like Databricks, Spark, Kafka, and AWS services
- Design, develop, and optimize ETL/ELT workflows using DBT, PySpark, SQL, and tools like Fivetran
- Partner closely with marketing and growth teams to enable data use cases such as segmentation, campaign targeting, and lifecycle analytics
- Develop and maintain reverse ETL pipelines to sync data from the warehouse to tools like Salesforce, HubSpot, Braze, and other downstream systems
- Create and manage curated datasets to support analytics, reporting, and go-to-market initiatives
- Build and maintain dashboards and reporting layers to support marketing and business performance tracking
- Support AI/ML and agent-based applications by preparing and serving high-quality datasets for RAG pipelines and MCP (Model Context Protocol) integrations
- Monitor pipeline performance, troubleshoot issues, and ensure high data reliability and quality
- Implement data quality checks, validations, and alerting mechanisms across both ingestion and activation layers
- Collaborate with cross-functional teams to define data contracts and ensure consistency across systems

Skills
- 6+ years of experience in data engineering or a related field
- Strong hands-on experience with Databricks, AWS (S3, Glue, Athena, EMR, etc.), and Kafka
- Proficiency in Python (PySpark) and SQL for large-scale data processing
- Experience building and maintaining ETL/ELT pipelines (DBT/Airflow or similar experience preferred)
- Experience with data ingestion tools such as Fivetran (or similar)
- Familiarity with reverse ETL / data activation workflows and syncing data to tools like Salesforce, HubSpot, Braze
- Exposure to or experience with AI/ML data pipelines, including RAG architectures, vector databases, or embeddings workflows
- Familiarity with agent-based systems, MCP integrations, or LLM-powered applications is a strong plus
- Experience working with marketing, product, or growth teams on data use cases (segmentation, attribution, campaign analytics, etc.)
- Understanding of data modeling and working with large-scale datasets (batch and streaming)
- Experience creating dashboards and supporting reporting workflows (BI tools) for both internal and external audiences
- Strong problem-solving skills and ability to debug production data issues
- Strong communication skills and ability to work collaboratively across teams

Benefits
- Hubs in San Francisco and New York City offering regular in-person gatherings and co-working sessions
- Flexible PTO with U.S. holidays observed and a week-long shutdown in December to rest and recharge
- A competitive health insurance plan that covers 100% for the policyholder and 75% for dependents
- 12 weeks of paid parental leave in the US
- 401k program, 3% match - vested immediately!
- $500 work-from-home stipend to be used within a year of your start date
- $600 technology stipend to support a portion of our hybrid/remote team's cell phone and internet expenses
- $1,200 per year Health & Wellness Allowance to support your personal goals
*Certain perks and benefits are limited to full-time employees only

Company Overview
You.com is a personalized AI search engine that delivers customized recommendations and allows natural conversation with its AI chatbot. It was founded in 2020 and is headquartered in Palo Alto, California, USA, with a workforce of 51-200 employees. Its website is https://you.com.

Company H1B Sponsorship
You.com has a track record of offering H1B sponsorships, with 1 in 2026. Please note that this does not guarantee sponsorship for this specific role.