Commit 0007823

Feature/fhir api gateway (#128)
* Refactor sandbox to separate module
* Fix tests
* Gateway module WIP
* Update poetry.lock
* Add fhirclient, bump python min to 3.9, update poetry
* Update module structure
* Use consistent pydantic models to silence serialization warning
* Fix bugs in configs
* Fix code snippet in docs
* Migrating service module to gateway/services
* Implement cdshooks and notereader services plus clean up on base and events
* Remove async from cdshooks
* Update relatesTo access
* Add fields to CdsRequest
* Update sandbox usage and tests
* Update tests for gateway module
* Remove scrap
* Add HealthChainAPI class and FhirRouter placeholder
* Update poetry.lock
* Update CI python version
* Fix typo
* Pass test
* Fix namespace conflict
* Fix patching issue in tests for python 3.10
* Fix pydantic to <2.11
* Tidy up structure
* Added event dispatch and unified everything to gateways
* Update dependencies
* Added dependency injection and protocol with tests
* Deprecate Service module
* Fix tests
* Chaotic WIP
* Added connection pool management
* Add validation of response
* Add handler validation and read only method WIP
* poetry.lock
* Only run CI on non-draft PRs
* Simpler FHIR client implementation
* Update poetry - remove fhirclient
* Update dependencies
* Added oauth2.0 flow and dynamic jwt token management
* Added tests for client and auth
* Update connection pool impl and fhir methods
* Update tests for client and pool
* Refactor cdshooks and notereader as services instead of gateways
* Add validation for client secret path in auth and add tests
* Add FHIR error handling to methods
* Refactor FHIRGateway to /core and clean up base class + separate connection and error handling
* Remove result validation and use strict type[Resource] inputs
* Move connection string methods to auth config and removed unused methods
* Remove debug log
* Refactor event dispatch
* Clean up tests
* Minor refactor
* Add crud operations to fhirgateway
* Add metadata and status endpoints in fhirgateway
* Clean up protocols and event dispatcher
* Refactor event emission
* Update tests
* Refactor HealthChainAPI lifespan and route management
* Refactor EventDispatcher and tidy some tests
* Fix pass kwargs to client
* Potential fix for code scanning alert no. 9: Incomplete URL substring sanitization
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Test fix for python3.9
* 3.9 Event loop fix
* Update .gitignore
* Use asyncio in test
* Use asyncio instead of anyio
* Remove unused dependencies
* Add gateway reference to docs
* Update README.md
* Update docs
* Remove security and monitoring modules from gateway (moved to separate branch)
* Clean up sandbox docs

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
1 parent 6687e32 commit 0007823

69 files changed: +6807 -2293 lines changed

.github/workflows/ci.yml

Lines changed: 2 additions & 0 deletions
@@ -8,9 +8,11 @@ on:
88
branches: [ "main" ]
99
pull_request:
1010
branches: [ "main" ]
11+
types: [opened, synchronize, reopened, ready_for_review]
1112

1213
jobs:
1314
test:
15+
if: github.event.pull_request.draft == false
1416
strategy:
1517
matrix:
1618
python-version: ["3.9", "3.10", "3.11"]

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -165,3 +165,5 @@ scrap/
165165
.vscode/
166166
.ruff_cache/
167167
.python-version
168+
.cursor/
169+
scripts/

README.md

Lines changed: 110 additions & 103 deletions
@@ -10,32 +10,115 @@
1010

1111
</div>
1212

13-
Build simple, portable, and scalable AI and NLP applications in a healthcare context 💫 🏥.
13+
Connect your AI models to any healthcare system with a few lines of Python 💫 🏥.
14+
15+
Integrating AI with electronic health records (EHRs) is complex, manual, and time-consuming. Let's try to change that.
1416

15-
Integrating electronic health record systems (EHRs) data is complex, and so is designing reliable, reactive algorithms involving unstructured healthcare data. Let's try to change that.
1617

1718
```bash
1819
pip install healthchain
1920
```
2021
First time here? Check out our [Docs](https://dotimplement.github.io/HealthChain/) page!
2122

22-
Came here from NHS RPySOC 2024 ✨?
23-
[CDS sandbox walkthrough](https://dotimplement.github.io/HealthChain/cookbook/cds_sandbox/)
24-
[Slides](https://speakerdeck.com/jenniferjiangkells/building-healthcare-context-aware-applications-with-healthchain)
2523

2624
## Features
27-
- [x] 🔥 Build FHIR-native pipelines or use [pre-built ones](https://dotimplement.github.io/HealthChain/reference/pipeline/pipeline/#prebuilt) for your healthcare NLP and ML tasks
28-
- [x] 🔌 Connect pipelines to any EHR system with built-in [CDA and FHIR Connectors](https://dotimplement.github.io/HealthChain/reference/pipeline/connectors/connectors/)
29-
- [x] 🔄 Convert between FHIR, CDA, and HL7v2 with the [InteropEngine](https://dotimplement.github.io/HealthChain/reference/interop/interop/)
30-
- [x] 🧪 Test your pipelines in full healthcare-context aware [sandbox](https://dotimplement.github.io/HealthChain/reference/sandbox/sandbox/) environments
31-
- [x] 🗃️ Generate [synthetic healthcare data](https://dotimplement.github.io/HealthChain/reference/utilities/data_generator/) for testing and development
32-
- [x] 🚀 Deploy sandbox servers locally with [FastAPI](https://fastapi.tiangolo.com/)
25+
- [x] 🔌 **Gateway**: Connect to multiple EHR systems with [unified API](https://dotimplement.github.io/HealthChain/reference/gateway/gateway/) supporting FHIR, CDS Hooks, and SOAP/CDA protocols
26+
- [x] 🔥 **Pipelines**: Build FHIR-native ML workflows or use [pre-built ones](https://dotimplement.github.io/HealthChain/reference/pipeline/pipeline/#prebuilt) for your healthcare NLP and AI tasks
27+
- [x] 🔄 **InteropEngine**: Convert between FHIR, CDA, and HL7v2 with a [template-based engine](https://dotimplement.github.io/HealthChain/reference/interop/interop/)
28+
- [x] 🔒 Type-safe healthcare data with full type hints and Pydantic validation for [FHIR resources](https://dotimplement.github.io/HealthChain/reference/utilities/fhir_helpers/)
29+
- [x] ⚡ Event-driven architecture with real-time event handling and [audit trails](https://dotimplement.github.io/HealthChain/reference/gateway/events/) built-in
30+
- [x] 🚀 Deploy production-ready applications with [HealthChainAPI](https://dotimplement.github.io/HealthChain/reference/gateway/api/) and FastAPI integration
31+
- [x] 🧪 Generate [synthetic healthcare data](https://dotimplement.github.io/HealthChain/reference/utilities/data_generator/) and [sandbox testing](https://dotimplement.github.io/HealthChain/reference/sandbox/sandbox/) utilities
3332

3433
## Why use HealthChain?
35-
- **EHR integrations are manual and time-consuming** - HealthChain abstracts away complexities so you can focus on AI development, not EHR configurations.
36-
- **It's difficult to track and evaluate multiple integration instances** - HealthChain provides a framework to test the real-world resilience of your whole system, not just your models.
37-
- [**Most healthcare data is unstructured**](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372467/) - HealthChain is optimized for real-time AI and NLP applications that deal with realistic healthcare data.
38-
- **Built by health tech developers, for health tech developers** - HealthChain is tech stack agnostic, modular, and easily extensible.
34+
- **EHR integrations are manual and time-consuming** - **HealthChainAPI** abstracts away complexities so you can focus on AI development, not learning FHIR APIs, CDS Hooks, and authentication schemes.
35+
- **Healthcare data is fragmented and complex** - **InteropEngine** handles the conversion between FHIR, CDA, and HL7v2 so you don't have to become an expert in healthcare data standards.
36+
- [**Most healthcare data is unstructured**](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372467/) - HealthChain **Pipelines** are optimized for real-time AI and NLP applications that deal with realistic healthcare data.
37+
- **Built by health tech developers, for health tech developers** - HealthChain is tech stack agnostic, modular, and easily extensible with built-in compliance and audit features.
38+
39+
## HealthChainAPI
40+
41+
The HealthChainAPI provides a secure, asynchronous integration layer that coordinates multiple healthcare systems in a single application.
42+
43+
### Multi-Protocol Support
44+
45+
Connect to multiple healthcare data sources and protocols:
46+
47+
```python
48+
from healthchain.gateway import (
49+
HealthChainAPI, FHIRGateway,
50+
CDSHooksService, NoteReaderService
51+
)
52+
53+
# Create your healthcare application
54+
app = HealthChainAPI(
55+
title="My Healthcare AI App",
56+
description="AI-powered patient care platform"
57+
)
58+
59+
# FHIR for patient data from multiple EHRs
60+
fhir = FHIRGateway()
61+
fhir.add_source("epic", "fhir://fhir.epic.com/r4?client_id=...")
62+
fhir.add_source("medplum", "fhir://api.medplum.com/fhir/R4/?client_id=...")
63+
64+
# CDS Hooks for real-time clinical decision support
65+
cds = CDSHooksService()
66+
67+
@cds.hook("patient-view", id="allergy-alerts")
68+
def check_allergies(request):
69+
# Your AI logic here
70+
return {"cards": [...]}
71+
72+
# SOAP for clinical document processing
73+
notes = NoteReaderService()
74+
75+
@notes.method("ProcessDocument")
76+
def process_note(request):
77+
# Your NLP pipeline here
78+
return processed_document
79+
80+
# Register everything
81+
app.register_gateway(fhir)
82+
app.register_service(cds)
83+
app.register_service(notes)
84+
85+
# Your API now handles:
86+
# /fhir/* - Patient data, observations, etc.
87+
# /cds/* - Real-time clinical alerts
88+
# /soap/* - Clinical document processing
89+
```
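The `@cds.hook("patient-view", id="allergy-alerts")` decorator in the README snippet above registers a handler under a service id. The following is a minimal, standalone sketch of that registration pattern, not HealthChain's implementation: `HookRegistry`, its `dispatch` method, and the request and card shapes are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


class HookRegistry:
    """Toy stand-in for a CDS Hooks service: maps service ids to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], dict]] = {}
        self._hook_types: Dict[str, str] = {}

    def hook(self, hook_type: str, id: str) -> Callable:
        """Return a decorator that registers the function under `id`."""

        def decorator(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
            self._handlers[id] = func
            self._hook_types[id] = hook_type
            return func  # the function itself is left unchanged

        return decorator

    def dispatch(self, id: str, request: dict) -> dict:
        """Look up the handler for a service id and call it."""
        handler = self._handlers.get(id)
        if handler is None:
            raise KeyError(f"no handler registered for service id {id!r}")
        return handler(request)


registry = HookRegistry()


@registry.hook("patient-view", id="allergy-alerts")
def check_allergies(request: dict) -> dict:
    # Hypothetical request shape: a flat list of allergy names.
    allergies = request.get("allergies", [])
    cards = [
        {"summary": f"Allergy on record: {name}", "indicator": "warning"}
        for name in allergies
    ]
    return {"cards": cards}


response = registry.dispatch("allergy-alerts", {"allergies": ["penicillin"]})
```

Returning the original function from the decorator keeps the handler directly callable in unit tests, independent of the registry.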
90+
91+
### FHIR Operations with AI Enhancement
92+
93+
```python
94+
from healthchain.gateway import FHIRGateway
95+
from fhir.resources.patient import Patient
96+
97+
gateway = FHIRGateway()
98+
gateway.add_source("epic", "fhir://fhir.epic.com/r4?...")
99+
100+
# Add AI transformations to FHIR data
101+
@gateway.transform(Patient)
102+
async def enhance_patient(id: str, source: str = None) -> Patient:
103+
async with gateway.modify(Patient, id, source) as patient:
104+
# Get lab results and process with AI
105+
lab_results = await gateway.search(
106+
Observation,
107+
{"patient": id, "category": "laboratory"},
108+
source
109+
)
110+
insights = nlp_pipeline.process(patient, lab_results)
111+
112+
# Add AI summary to patient record
113+
patient.extension = patient.extension or []
114+
patient.extension.append({
115+
"url": "http://healthchain.org/fhir/summary",
116+
"valueString": insights.summary
117+
})
118+
return patient
119+
120+
# Automatically available at: GET /fhir/transform/Patient/123?source=epic
121+
```
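In the snippet above, `gateway.modify(...)` is used as an async context manager that yields a resource and, presumably, writes it back when the block exits. A minimal sketch of that read-modify-write pattern follows, assuming an in-memory store; `InMemoryStore` and its methods are hypothetical, and resources are plain dicts rather than `fhir.resources` models.

```python
import asyncio
import copy
from contextlib import asynccontextmanager
from typing import Dict, Tuple


class InMemoryStore:
    """Hypothetical stand-in for a FHIR server, keyed by (type, id)."""

    def __init__(self) -> None:
        self._resources: Dict[Tuple[str, str], dict] = {}
        self.writes = 0

    async def read(self, resource_type: str, resource_id: str) -> dict:
        # Return a copy so callers cannot mutate stored state directly.
        return copy.deepcopy(self._resources[(resource_type, resource_id)])

    async def update(self, resource_type: str, resource_id: str, resource: dict) -> None:
        self._resources[(resource_type, resource_id)] = copy.deepcopy(resource)
        self.writes += 1

    @asynccontextmanager
    async def modify(self, resource_type: str, resource_id: str):
        resource = await self.read(resource_type, resource_id)
        yield resource
        # Reached only if the body of the `async with` block did not raise,
        # so a failed modification is never written back.
        await self.update(resource_type, resource_id, resource)


async def main():
    store = InMemoryStore()
    await store.update("Patient", "123", {"id": "123", "extension": []})

    async with store.modify("Patient", "123") as patient:
        patient["extension"].append(
            {"url": "http://example.org/fhir/summary", "valueString": "stable"}
        )

    return store, await store.read("Patient", "123")


store, saved = asyncio.run(main())
```

Skipping the write when the block raises is one possible design choice; whether the real gateway behaves this way is not stated in this commit.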
39122

40123
## Pipeline
41124
Pipelines provide a flexible way to build and manage processing pipelines for NLP and ML tasks that can easily integrate with complex healthcare systems.
@@ -139,116 +222,40 @@ cda_data = engine.from_fhir(fhir_resources, dest_format=FormatType.CDA)
139222

140223
## Sandbox
141224

142-
Sandboxes provide a staging environment for testing and validating your pipeline in a realistic healthcare context.
143-
144-
### Clinical Decision Support (CDS)
145-
[CDS Hooks](https://cds-hooks.org/) is an [HL7](https://cds-hooks.hl7.org) published specification for clinical decision support.
146-
147-
**When is this used?** CDS hooks are triggered at certain events during a clinician's workflow in an electronic health record (EHR), e.g. when a patient record is opened, when an order is elected.
148-
149-
**What information is sent**: the context of the event and [FHIR](https://hl7.org/fhir/) resources that are requested by your service, for example, the patient ID and information on the encounter and conditions they are being seen for.
150-
151-
**What information is returned**: “cards” displaying text, actionable suggestions, or links to launch a [SMART](https://smarthealthit.org/) app from within the workflow.
152-
225+
Test your AI applications in realistic healthcare contexts with [CDS Hooks](https://cds-hooks.org/) sandbox environments.
153226

154227
```python
155228
import healthchain as hc
156-
157-
from healthchain.pipeline import SummarizationPipeline
158229
from healthchain.sandbox.use_cases import ClinicalDecisionSupport
159-
from healthchain.models import Card, Prefetch, CDSRequest
160-
from healthchain.data_generator import CdsDataGenerator
161-
from typing import List
162230

163231
@hc.sandbox
164232
class MyCDS(ClinicalDecisionSupport):
165-
def __init__(self) -> None:
166-
self.pipeline = SummarizationPipeline.from_model_id(
167-
"facebook/bart-large-cnn", source="huggingface"
168-
)
169-
self.data_generator = CdsDataGenerator()
233+
def __init__(self):
234+
self.pipeline = SummarizationPipeline.from_model_id("facebook/bart-large-cnn")
170235

171-
# Sets up an instance of a mock EHR client of the specified workflow
172236
@hc.ehr(workflow="encounter-discharge")
173-
def ehr_database_client(self) -> Prefetch:
237+
def ehr_database_client(self):
174238
return self.data_generator.generate_prefetch()
175239

176-
# Define your application logic here
177-
@hc.api
178-
def my_service(self, data: CDSRequest) -> CDSRequest:
179-
result = self.pipeline(data)
180-
return result
181-
```
182-
183-
### Clinical Documentation
184-
185-
The `ClinicalDocumentation` use case implements a real-time Clinical Documentation Improvement (CDI) service. It helps convert free-text medical documentation into coded information that can be used for billing, quality reporting, and clinical decision support.
186-
187-
**When is this used?** Triggered when a clinician opts in to a CDI functionality (e.g. Epic NoteReader) and signs or pends a note after writing it.
188-
189-
**What information is sent**: A [CDA (Clinical Document Architecture)](https://www.hl7.org.uk/standards/hl7-standards/cda-clinical-document-architecture/) document which contains continuity of care data and free-text data, e.g. a patient's problem list and the progress note that the clinician has entered in the EHR.
190-
191-
```python
192-
import healthchain as hc
193-
194-
from healthchain.pipeline import MedicalCodingPipeline
195-
from healthchain.sandbox.use_cases import ClinicalDocumentation
196-
from healthchain.models import CdaRequest, CdaResponse
197-
from fhir.resources.documentreference import DocumentReference
198-
199-
@hc.sandbox
200-
class NotereaderSandbox(ClinicalDocumentation):
201-
def __init__(self):
202-
self.pipeline = MedicalCodingPipeline.from_model_id(
203-
"en_core_sci_md", source="spacy"
204-
)
205-
206-
# Load an existing CDA file
207-
@hc.ehr(workflow="sign-note-inpatient")
208-
def load_data_in_client(self) -> DocumentReference:
209-
with open("/path/to/cda/data.xml", "r") as file:
210-
xml_string = file.read()
211-
212-
cda_document_reference = create_document_reference(
213-
data=xml_string,
214-
content_type="text/xml",
215-
description="Original CDA Document loaded from my sandbox",
216-
)
217-
return cda_document_reference
218-
219-
@hc.api
220-
def my_service(self, data: CdaRequest) -> CdaResponse:
221-
annotated_ccd = self.pipeline(data)
222-
return annotated_ccd
223-
```
224-
### Running a sandbox
225-
226-
Ensure you run the following commands in your `mycds.py` file:
227-
228-
```python
229240
cds = MyCDS()
230241
cds.start_sandbox()
231-
```
232-
This will populate your EHR client with the data generation method you have defined, send requests to your server for processing, and save the data in the `./output` directory.
233242

234-
Then run:
235-
```bash
236-
healthchain run mycds.py
243+
# Run with: healthchain run mycds.py
237244
```
238-
By default, the server runs at `http://127.0.0.1:8000`, and you can interact with the exposed endpoints at `/docs`.
239245

240246
## Road Map
241-
- [x] 🔄 Transform and validate healthcare HL7v2, CDA to FHIR with template-based interop engine
242-
- [ ] 🏥 Runtime connection health and EHR integration management - connect to FHIR APIs and legacy systems
247+
- [ ] 🔒 Built-in HIPAA compliance validation and PHI detection
243248
- [ ] 📊 Track configurations, data provenance, and monitor model performance with MLFlow integration
244249
- [ ] 🚀 Compliance monitoring, auditing at deployment as a sidecar service
245-
- [ ] 🔒 Built-in HIPAA compliance validation and PHI detection
246-
- [ ] 🧠 Multi-modal pipelines that that have built-in NLP to utilize unstructured data
250+
- [ ] 🔄 HL7v2 parsing and FHIR profile conversion support
251+
- [ ] 🧠 Multi-modal pipelines
252+
247253

248254
## Contribute
249255
We are always eager to hear feedback and suggestions, especially if you are a developer or researcher working with healthcare systems!
250256
- 💡 Let's chat! [Discord](https://discord.gg/UQC6uAepUz)
251257
- 🛠️ [Contribution Guidelines](CONTRIBUTING.md)
252258

253-
## Acknowledgement
254-
This repository makes use of [fhir.resources](https://github.com/nazrulworld/fhir.resources), and [CDS Hooks](https://cds-hooks.org/) developed by [HL7](https://www.hl7.org/) and [Boston Children’s Hospital](https://www.childrenshospital.org/).
259+
260+
## Acknowledgements 🤗
261+
This project builds on [fhir.resources](https://github.com/nazrulworld/fhir.resources) and [CDS Hooks](https://cds-hooks.org/) standards developed by [HL7](https://www.hl7.org/) and [Boston Children's Hospital](https://www.childrenshospital.org/).
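The README examples in this diff pass `fhir://` connection strings such as `fhir://fhir.epic.com/r4?client_id=...` to `add_source`. How HealthChain parses them is not shown in this commit view, so the following is only a sketch of one plausible approach using the standard library. The function name `parse_fhir_connection_string`, the returned dict shape, and the assumption that the scheme maps to HTTPS are all hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_fhir_connection_string(conn: str) -> dict:
    """Split a fhir:// connection string into a base URL and parameters."""
    parts = urlsplit(conn)
    if parts.scheme != "fhir":
        raise ValueError(f"expected a fhir:// connection string, got {conn!r}")
    if not parts.hostname:
        raise ValueError("connection string has no host")

    # parse_qs returns a list per key; keep the last value for each.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}

    # Assumption: the server is reached over HTTPS at the same host and path.
    base_url = f"https://{parts.netloc}{parts.path.rstrip('/')}"
    return {"base_url": base_url, "params": params}


config = parse_fhir_connection_string(
    "fhir://fhir.example.org/r4?client_id=my-app&scope=system/*.read"
)
```

Here `config["base_url"]` is `https://fhir.example.org/r4`, and the query parameters end up in `config["params"]`, where an auth layer could pick up values such as `client_id`.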

docs/index.md

Lines changed: 15 additions & 14 deletions
@@ -1,6 +1,6 @@
1-
# Welcome to HealthChain
1+
# Welcome to HealthChain 💫 🏥
22

3-
HealthChain 💫🏥 is an open-source Python framework designed to streamline the development, testing, and validation of AI, Natural Language Processing, and Machine Learning applications in a healthcare context.
3+
HealthChain is an open-source Python framework for building real-time AI applications in a healthcare context.
44

55
[ :fontawesome-brands-discord: Join our Discord](https://discord.gg/UQC6uAepUz){ .md-button .md-button--primary }
66
&nbsp;&nbsp;&nbsp;&nbsp;
@@ -19,19 +19,19 @@ HealthChain 💫🏥 is an open-source Python framework designed to streamline t
1919

2020
[:octicons-arrow-right-24: Pipeline](reference/pipeline/pipeline.md)
2121

22-
- :octicons-beaker-24:{ .lg .middle } __Test in a sandbox__
22+
- :material-connection:{ .lg .middle } __Connect to multiple data sources__
2323

2424
---
2525

26-
Test your models in a full health-context aware environment from day 1
26+
Connect to multiple healthcare data sources and protocols with **HealthChainAPI**.
2727

28-
[:octicons-arrow-right-24: Sandbox](reference/sandbox/sandbox.md)
28+
[:octicons-arrow-right-24: Gateway](reference/gateway/gateway.md)
2929

3030
- :material-database:{ .lg .middle } __Interoperability__
3131

3232
---
3333

34-
Configuration-driven InteropEngine to convert between FHIR, CDA, and HL7v2
34+
Configuration-driven **InteropEngine** to convert between FHIR, CDA, and HL7v2
3535

3636
[:octicons-arrow-right-24: Interoperability](reference/interop/interop.md)
3737

@@ -49,16 +49,17 @@ HealthChain 💫🏥 is an open-source Python framework designed to streamline t
4949

5050
## Why HealthChain?
5151

52-
You've probably heard every *AI will revolutionize healthcare* pitch by now, but if you're one of the people who think: wait, can we go beyond just vibe-checking and *actually* build products that are reliable, reactive, and easy to scale in complex healthcare systems? Then HealthChain is probably for you.
52+
Healthcare AI development has a **missing middleware layer**. Traditional enterprise integration engines move data around, EHR platforms serve end users, but there's nothing in between for developers building AI applications that need to talk to multiple healthcare systems. Few solutions are open-source, and even fewer are built in modern Python where most ML/AI libraries thrive.
5353

54-
Specifically, HealthChain addresses two challenges:
54+
HealthChain fills that gap with:
5555

56-
1. **Scaling Electronic Health Record system (EHRs) integrations of real-time AI, NLP, and ML applications is a manual and time-consuming process.**
56+
- **🔥 FHIR-native ML pipelines** - Pre-built NLP/ML pipelines optimized for structured / unstructured healthcare data, or build your own with familiar Python libraries such as 🤗 Hugging Face, 🤖 LangChain, and 📚 spaCy
57+
- **🔒 Type-safe healthcare data** - Full type hints and Pydantic validation for FHIR resources with automatic data validation and error handling
58+
- **🔌 Multi-protocol connectivity** - Handle FHIR, CDS Hooks, and SOAP/CDA in the same codebase with OAuth2 authentication and connection pooling
59+
- **⚡ Event-driven architecture** - Real-time event handling with audit trails and workflow automation built-in
60+
- **🔄 Built-in interoperability** - Convert between FHIR, CDA, and HL7v2 using a template-based engine
61+
- **🚀 Production-ready deployment** - FastAPI integration for scalable, real-time applications
5762

58-
2. **Testing and evaluating unstructured data in complex, outcome focused systems is a challenging and labour-intensive task.**
59-
60-
We believe more efficient end-to-end pipeline and integration testing at an early stage in development will give you back time to focus on what actually matters: developing safer, more effective and more explainable models that scale to real-world *adoption*. Building products for healthcare in a process that is *human*-centric.
61-
62-
HealthChain is made by a (very) small team with experience in software engineering, machine learning, and healthcare NLP. We understand that good data science is about more than just building models, and that good engineering is about more than just building systems. This rings especially true in healthcare, where people, processes, and technology all play a role in making an impact.
63+
HealthChain is made by a small team with experience in software engineering, machine learning, and healthcare NLP. We understand that good data science is about more than just building models, and that good engineering is about more than just building systems. This rings especially true in healthcare, where people, processes, and technology all play a role in making an impact.
6364

6465
For inquiries and collaborations, please get [in touch](mailto:[email protected])!
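The docs above mention OAuth2 authentication, and the commit message refers to "dynamic jwt token management". The implementation itself is not visible in this view, so the following is only a sketch of the general idea: cache an access token and refresh it shortly before it expires. `TokenCache`, `fetch_token`, and `refresh_margin` are hypothetical names, and the token endpoint is replaced by an injected callable so the example runs offline.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and refetches it shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_token returns (token, lifetime_in_seconds); in a real client
        # it would call the OAuth2 token endpoint.
        self._fetch_token = fetch_token
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._refresh_margin:
            token, expires_in = self._fetch_token()
            self._token = token
            self._expires_at = now + expires_in
        return self._token


# Demonstration with a fake clock and a fake token endpoint.
fake_now = [0.0]
calls = []


def fake_fetch() -> Tuple[str, float]:
    calls.append(fake_now[0])
    return f"token-{len(calls)}", 300.0


cache = TokenCache(fake_fetch, refresh_margin=60.0, clock=lambda: fake_now[0])
first = cache.get()    # no token yet, so this fetches
fake_now[0] = 100.0
second = cache.get()   # still valid, served from the cache
fake_now[0] = 250.0
third = cache.get()    # inside the 60 s refresh margin, so this refetches
```

Injecting the clock and the fetch function keeps the refresh logic testable without network access or real waiting.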
