Mid-Level Data Engineer – Healthcare Domain

Remote

Full-time

About The Role

We are seeking a motivated and detail-oriented Mid-Level Data Engineer with 2–3 years of experience designing, developing, and optimizing data pipelines within the healthcare domain. The ideal candidate will have hands-on experience with Databricks, strong SQL skills, and a solid understanding of healthcare data standards (e.g., HL7, EDI X12 837/835, HCC, and CPT/ICD codes).

Key Responsibilities
  • Design, develop, and maintain scalable ETL/ELT pipelines using Databricks, PySpark, and Delta Lake for large-scale healthcare datasets.
  • Collaborate with data scientists, analysts, and product managers to understand data requirements and deliver clean, reliable data.
  • Ingest, process, and transform healthcare-related data such as claims (837/835), EHR/EMR, provider/member, and clinical datasets.
  • Implement data quality checks, validations, and transformations to ensure high data integrity and compliance with healthcare regulations.
  • Optimize data pipeline performance, reliability, and cost in cloud environments (preferably Azure or AWS).
  • Maintain documentation of data sources, data models, and transformations.
  • Support analytics and reporting teams with curated datasets and data marts.
  • Adhere to HIPAA and organizational standards for handling PHI and sensitive data.
  • Assist in troubleshooting data issues and root cause analysis across systems.

Required Qualifications
  • 2–3 years of experience in a data engineering role, preferably in the healthcare or healthtech sector.
  • Hands-on experience with Databricks, Apache Spark (PySpark), and SQL.
  • Familiarity with Delta Lake, data lakes, and modern data architectures.
  • Solid understanding of healthcare data standards: EDI 837/835, CPT, ICD-10, DRG, or HCC.
  • Experience with version control (e.g., Git), CI/CD workflows, and task orchestration tools (e.g., Airflow, Azure Data Factory, dbt).
  • Ability to work with both structured and semi-structured data (JSON, Parquet, Avro, etc.).
  • Strong communication skills and ability to collaborate in cross-functional teams.

Education
  • Bachelor’s degree in Business Administration, Healthcare Informatics, Information Systems, or a related field.