[Remote] Data Operations Lead
Note: This is a remote position open to candidates in the USA.

Bioptimus is building the first universal AI foundation model for biology to fuel breakthrough discoveries and accelerate innovation in biomedicine. They are seeking a Data Operations Lead to manage the operational lifecycle of biomedical data partnerships and to coordinate between external partners and internal teams on data management and quality oversight.

Responsibilities
Own the operational lifecycle of external data partnerships following contract signature
Act as the primary operational and technical point of contact for hospitals, biobanks, CROs, and research laboratories
Coordinate onboarding, data delivery timelines, and stakeholder communication to ensure successful execution of partnership milestones
Manage secure biomedical data transfers using cloud infrastructure and standardized transfer protocols
Coordinate access management, encryption, and ingestion workflows across cloud storage and transfer channels (AWS S3, SFTP, APIs, direct upload pipelines)
Ensure incoming datasets are delivered, validated, and tracked according to internal governance standards
Collaborate with internal technical and product teams to define and maintain harmonized data models and metadata standards across complex clinical and multi-modal datasets
Organize and maintain relationships between clinical metadata and associated omics or imaging assets, including genomics, transcriptomics, spatial biology, and pathology data
Work closely with engineering and data teams to configure and maintain lightweight ingestion and QC pipelines
Identify operational bottlenecks and repetitive workflows and convert them into scalable systems, scripts, templates, dashboards, or automation tools that improve operational efficiency and visibility
Coordinate automated and manual quality control checks across incoming datasets
Identify missing data, inconsistencies, corruption, or metadata mismatches and work directly with external partners to resolve issues
Ensure data integrity, traceability, and version control throughout the ingestion process
Maintain a centralized 'single source of truth' for all incoming datasets, including ingestion status, completeness, QC status, and milestone tracking
Build and maintain reporting dashboards and operational tools to provide visibility into project progress, ingestion velocity, and operational risks
Partner closely with Data Science, Engineering, Legal, and Partnership teams to align operational execution with business and scientific priorities
Communicate technical issues clearly to both scientific collaborators and non-technical stakeholders
Provide regular updates on operational risks, blockers, and delivery progress
Conduct periodic visits to partner hospitals, biobanks, and laboratories to support onboarding, troubleshoot technical or operational bottlenecks, and strengthen long-term collaborations

Skills
Strong understanding of clinical and biomedical data structures, including real-world data, clinical trial datasets, and multi-omics data modalities
Familiarity with oncology, immunology, or related therapeutic areas is highly desirable
Proven experience managing data lifecycles in cloud environments, particularly AWS (S3, CLI, access management)
Familiarity with secure data transfer protocols and large-scale biomedical data handling workflows
Proficiency in Python or R, along with SQL for querying and transforming datasets
Ability to write lightweight scripts, automate workflows, and interact with APIs or cloud-based systems
Demonstrated ability to manage multiple external collaborations and operational workstreams simultaneously
Excellent communication skills, with the ability to translate technical issues into clear guidance for both scientific and non-technical stakeholders
Comfortable working independently in ambiguous environments
Strong analytical and organizational skills with the ability to identify bottlenecks, improve processes, and drive operational efficiency
Bachelor's or Master's degree in Life Sciences, Bioinformatics, Health Informatics, Computer Science, or a related quantitative field
Experience working directly with hospitals, biobanks, laboratories, or clinical research organizations
Familiarity with biomedical data standards, anonymization, and compliance frameworks (GDPR, HIPAA)
Experience managing large-scale biomedical datasets in cloud environments, particularly AWS
Knowledge of digital pathology and/or multi-omics data workflows
Experience handling genomics, transcriptomics, and imaging file formats (e.g. FASTQ, BAM, VCF, TIFF)
Experience building operational tracking tools, dashboards, or reporting systems
Experience automating operational workflows using scripts, APIs, or lightweight pipelines
Proven ability to manage cross-functional and external stakeholder relationships in complex data projects

Benefits
A collaborative and mission-driven work environment.
Competitive salary and equity package.
Flexible work arrangements, including remote options.
Opportunities for professional growth and leadership development.
The opportunity to shape the future of biology and AI through groundbreaking work.

Company Overview
Bioptimus is an AI company that offers a foundation model for biology to fuel breakthrough discoveries and accelerate innovation in biomedicine. It was founded in 2024 and is headquartered in Paris, Ile-de-France, France, with a workforce of 11-50 employees. Its website is https://www.bioptimus.com.