[Remote] Data Engineer, Web Scraping
Note: This is a remote position open to candidates in the USA.

10a Labs is a company focused on safety and threat intelligence for AI systems, collaborating with prominent technology platforms and companies. The Data Engineer role involves designing and optimizing data pipelines, conducting web scraping, and collaborating with teams to develop actionable insights and tools.

Responsibilities
- Design, implement, and optimize end-to-end data pipelines for scraping and processing structured and unstructured data using Google Cloud Platform (or similar) and best practices
- Conduct ad hoc web scraping and data collection to support research and intelligence initiatives
- Prepare data for further analysis, including data cleaning, transformation, anonymization, and masking
- Contribute to the development of internal and external APIs, following best practices
- Collaborate with ML engineers, other data engineers, and software developers to deliver actionable insights and functional tools, including internal and external dashboards, APIs, and data dumps
- Drive other critical initiatives

Skills
- Degree (or equivalent work experience) in Computer Science, Engineering, Information Science, Data Science, or a related field (graduate degree preferred)
- 2+ years of professional experience in data engineering or a closely related field
- Ability to communicate complex technical ideas clearly to non-technical audiences
- Proficiency in Python and SQL
- Experience with web scraping/crawling (e.g., Beautiful Soup, Selenium, Scrapy)
- Experience with Google Cloud Platform (or similar), including storage and database services (e.g., Cloud Storage, CloudSQL, Cloud Spanner) and workflow orchestration (e.g., Cloud Composer/Airflow, Cloud Run, Pub/Sub)
- Experience building and managing data pipelines, especially for text data
- Comfort working in fast-moving, high-impact environments, such as startups, AI research labs, or security-focused teams

Benefits
- Performance-based annual bonus
- Support for conferences, continuing education, or leadership training
- Fully remote, U.S.-based
- Comprehensive health, dental, and vision coverage
- Generous PTO and paid holiday schedule
- 401(k) plan

Company Overview
10a Labs is the safety and threat-intelligence layer trusted by frontier AI labs, AI unicorns, Fortune 10 companies, and leading global technology platforms. It was founded in 2021 and is headquartered in , with a workforce of 11-50 employees. Its website is https://10alabs.com/.