[Remote] Senior Software Engineer, Data Platform
Note: This is a remote position open to candidates in the USA.

Medecision is seeking a Senior Software Engineer for their Data Platform to provide innovative solutions for managing health and care. The role involves designing and building a cloud-native data platform that supports clinical analytics and reporting, while ensuring the reliability and scalability of data pipelines.

Responsibilities

- Design, develop, and maintain production-grade Python and Java data services and pipelines deployed on Google Cloud Platform, following established architectural conventions, coding standards, and data platform patterns
- Build and evolve Google Cloud Dataflow batch and streaming pipelines for data ingestion, standardization, curation, and analytics load, handling deduplication, validation, member-reference integrity, and incremental/full-reload modes
- Implement and maintain event-driven data workflows using Google Cloud Pub/Sub, including file-complete notification topics, FHIR ingest topics, and Pub/Sub-to-BigQuery export subscriptions
- Design and manage BigQuery datasets, table schemas, partitioning strategies (range-bucket by member partition), clustering, and reporting views across the standard analytics, custom analytics, and curated storage layers
- Build and maintain Cloud Composer (Apache Airflow) DAGs for workflow orchestration, including file ingestion DAGs, test execution DAGs, and third-party processing DAGs (e.g., MEG)
- Develop and maintain Cloud Run microservices (e.g., Ingestion Event Service, Custom Dataset Management Service) and Cloud Functions (inbound GCS bucket triggers)
- Participate in design and code reviews; mentor junior engineers and contribute to shared coding standards, patterns, and the team knowledge base
- Collaborate with on-shore and off-shore teams, architects, and tech leads to ensure on-time delivery and best engineering practices
- Contribute to CI/CD pipeline improvements using GitLab CI/CD, including build, test, containerization (Docker), and deployment automation to GCP environments
- Engage proactively in the triage and resolution of escalated production issues: diagnosing failures, investigating root causes, and driving durable fixes with a sense of urgency, clear communication to stakeholders, and a commitment to preventing recurrence
- Follow and comply with all security policies and procedures established by the organization, including adherence to HIPAA and HITRUST regulations
- Own and evolve the end-to-end data ingestion pipeline, from SFTP/GCS file receipt through ingestion, standardization, curation, and load to BigQuery analytics datasets
- Design and implement custom dataset ingestion capabilities, including dynamic schema mapping, configuration-driven pipeline execution, and automatic BigQuery table/view provisioning on dataset activation
- Maintain and improve the Ingestion Event Service (IES), the platform's central tracking service for data ingestion progress, including Pub/Sub async processing and Firestore document management
- Implement FHIR R4 data ingestion workflows via Pub/Sub and GCP Healthcare Datasets, including XSD/schema validation and HL7 resource mapping
- Support population matching data flows, ensuring custom dataset filters integrate correctly with population builder services and BigQuery analytics queries
- Build and maintain reporting dataset views (BI-compatible BigQuery views) for tenant data exposure
- Integrate with third-party clinical analytics engines via GCE VM instance templates, Airflow DAGs, and GCS-based data exchange
- Ensure all data services meet HIPAA requirements: PHI handling, tenant data isolation, audit logging, and data classification
- Demonstrate a solid understanding of AI concepts, capabilities, and limitations as they apply to software engineering workflows, including code generation, test scaffolding, and documentation
- Use Claude Code as a primary productivity tool for code drafting, refactoring, test generation, and technical documentation, applying it with judgment, rigor, and accountability
- Leverage AI-assisted workflows to accelerate implementation, surface edge cases, generate structured artifacts, and conduct at-scale analysis of service dependencies and API contracts
- Contribute to building and exposing MCP-wrapped APIs that enable AI agents to safely interact with platform services
- Maintain strict HIPAA discipline in all AI-assisted work: no real PHI in prompts or AI-generated artifacts; adhere to managed-settings policies and complete mandatory HIPAA + AI training
- Contribute to the team's shared AI knowledge base (validated prompts, skills, and workflows) and participate in the AI Champions community of practice

Skills

- Bachelor's degree in Computer Science, Software Engineering, Data Engineering, or equivalent practical experience
- 5+ years of data engineering or backend software engineering experience building production data pipelines and platform services
- Proven hands-on experience with Google Cloud Platform data services: BigQuery (schema design, partitioning, clustering, query optimization), Cloud Storage, Cloud Pub/Sub, Cloud Dataflow (Apache Beam), Cloud Composer (Airflow), Cloud Run, Cloud Functions, Firestore, Cloud SQL (PostgreSQL), and Secret Manager
- Strong proficiency in Java for data pipeline development
- Proficiency in Python for Airflow DAG authoring and automation scripting
- Experience designing and implementing batch and streaming data pipelines, including file-based ingestion, event-driven processing, deduplication, validation, incremental load, and full-reload patterns
- Proficiency with BigQuery data modeling: partitioned and clustered tables, dataset organization (standardized, curated, analytics, custom analytics, reporting layers), and SQL query optimization
- Experience with Apache Airflow / Cloud Composer: authoring, deploying, and maintaining production DAGs with parameterized configurations and robust error handling
- Experience with containerization (Docker) and deploying services to cloud-native environments
- Proficiency with GitLab CI/CD for pipeline automation and multi-environment deployment
- Excellent communication skills: able to articulate technical decisions, participate in design reviews, and collaborate effectively with cross-functional teams
- Familiarity with Datadog for service monitoring, alerting, and observability in a cloud-native data platform
- Familiarity with Sisense or equivalent BI/reporting platforms and BigQuery view-based reporting patterns
- Solid understanding of AI concepts, capabilities, and limitations as they apply to software engineering and product delivery workflows
- Hands-on experience with Claude Code or equivalent AI-assisted tools, used as a primary productivity tool for code generation, refactoring, test scaffolding, and documentation, not just experimentally
- Ability to evaluate AI-generated code critically: identifying hallucinations, logic errors, security gaps, and missing edge cases before they reach production
- Practical understanding of MCP (Model Context Protocol), or strong willingness to learn it, for building tool wrappers that expose platform APIs to AI agents safely and with appropriate guardrails
- Commitment to responsible AI use: applying AI with judgment, rigor, and personal accountability, consistent with the principle that humans own decisions, agents own toil
- HIPAA discipline in AI-assisted work: understanding of PHI boundaries in AI workflows and commitment to managed-settings policies and mandatory HIPAA + AI training
- Openness to contributing to and learning from a shared AI knowledge base (validated prompts, skills, and workflows) and active participation in the AI Champions community of practice
- Knowledge of HIPAA and experience working in HIPAA-regulated product environments, including PHI handling, data classification, and audit requirements
- Hands-on experience with HAPI FHIR R4 and healthcare interoperability standards (HL7, FHIR resource mapping, validation workflows)
- Understanding of multi-tenant SaaS architecture patterns: tenant context propagation, per-tenant feature flags, and data isolation

Company Overview

MEDecision provides collaborative health care management solutions for clients. It was founded in 1988 and is headquartered in Wayne, Pennsylvania, USA, with a workforce of 201-500 employees. Its website is http://www.medecision.com.