[Remote] AI Data Engineer
Note: This is a remote job open to candidates in the USA.

DeepRec.ai is partnering with an AI-focused HealthTech company centered around early-stage cancer detection. This remote Data Engineering role focuses on building and maintaining scalable pipelines for large healthcare datasets, ensuring data quality and compliance with healthcare regulations.

Responsibilities
- Work with Data Scientists and ML Engineers to define data needs for LLM and ML models
- Build and maintain scalable data pipelines for large healthcare datasets
- Ensure data quality through cleaning, validation, and monitoring
- Design efficient data structures and schemas for model training and use
- Source new data while ensuring compliance with healthcare regulations (e.g., HIPAA)

Skills
- Bachelor's degree in Computer Science, Engineering, or a related field
- Experience as a Data Engineer working with large-scale or big data systems such as Apache Spark
- Strong programming skills in Python, Scala, or Java
- Experience with ETL pipelines, data warehousing, and data modelling
- Familiarity with cloud platforms (AWS, GCP, or Azure) and tools like Apache Spark
- Strong problem-solving skills
- Master's degree in Computer Science, Engineering, Data Science, or a related field
- Experience working with healthcare data and standards such as FHIR or HL7
- Familiarity with machine learning concepts and LLM fine-tuning workflows
- Experience using data orchestration tools such as Apache Airflow

Benefits
- Competitive salary, benefits, and flexible remote/hybrid working options
- Continuous learning with exposure to cutting-edge AI, ML, and healthcare technologies

Company Overview
We are your AI and Deep Tech recruitment specialists, driven by a mission to power progress in the world’s most exciting industries. DeepRec.ai was founded in 2023 and is headquartered in Bishop's Stortford, Hertfordshire, GB, with a workforce of 51-200 employees. Its website is https://www.deeprec.ai/.