AIS Research Insights

← Back to Library Language:

Using Large Language Models for Healthcare Data Interoperability: A Data Mediation Pipeline to Integrate Heterogeneous Patient-Generated Health Data and FHIR

Torben Ukena, Robin Wagler, and Rainer Alt

This study explores the use of Large Language Models (LLMs) to streamline the integration of diverse patient-generated health data (PGHD) from sources like wearables. The researchers propose and evaluate a data mediation pipeline that combines an LLM with a validation mechanism to automatically transform various data formats into the standardized Fast Healthcare Interoperability Resources (FHIR) format.

Problem Integrating patient-generated health data from various devices into clinical systems is a major challenge due to a lack of interoperability between different data formats and hospital information systems. This data fragmentation hinders clinicians' ability to get a complete view of a patient's health, potentially leading to misinformed decisions and obstacles to patient-centered care.

Outcome - LLMs can effectively translate heterogeneous patient-generated health data into the valid, standardized FHIR format, significantly improving healthcare data interoperability.
- Providing the LLM with a few examples (few-shot prompting) was more effective than providing it with abstract rules and guidelines (reasoning prompting).
- The inclusion of a validation and self-correction loop in the pipeline is crucial for ensuring the LLM produces accurate and standard-compliant output.
- While successful with text-based data, the LLM struggled to accurately aggregate values from complex structured data formats like JSON and CSV, leading to lower semantic accuracy in those cases.

FHIR, semantic interoperability, large language models, hospital information system, patient-generated health data