Learn how to standardize data using the pipe!
make install # creates a virtual environment and installs the dependencies
make build # builds the Docker image
make publish # publishes the Docker image to Docker Hub
But first, you need some data! We will use the MIMIC-III Clinical Database Demo, for which you need credentials. It's quite straightforward: register on the PhysioNet website to get access to the demo data. It takes 30 seconds, and you will get a username and a password. Then, access this page to sign the agreement needed to download the data (you don't need to download the data yourself) or click on this button:
Then copy .env.example to .env and edit the new file to add your PhysioNet credentials.
cp .env.example .env
vi .env
source .env
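Before building the images, it can help to confirm that the credentials were actually filled in. The following is a minimal sketch, not part of fhirpipe: the key names `PHYSIONET_USER` and `PHYSIONET_PASSWORD` are hypothetical, so replace them with the names listed in .env.example.

```python
from pathlib import Path

# Hypothetical key names; replace them with the ones listed in .env.example.
REQUIRED_KEYS = ("PHYSIONET_USER", "PHYSIONET_PASSWORD")


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blanks, comments and a leading 'export '."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding quotes so 'alice' and alice are treated the same.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; run `cp .env.example .env` first")
    else:
        missing = missing_keys(parse_env(env_path.read_text()))
        print("Missing values:", missing if missing else "none")
```

Run it from the directory that contains .env; it only reads the file and never prints the credential values.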
You can use Docker to quickly start playing with the demos. Alternatively, the 2.B Manual Setup section explains how to install the pipeline step by step.
Install and start Docker, then build the Docker images.
docker-compose build
Then start the services:
docker-compose up
Then, switch to another tab and connect to the pipeline container:
docker exec -ti fhir-pipe /bin/bash
You can now go directly to 3 Launch the pipe.
If you're not experienced with the project, we recommend that you first go through the 2.A Docker Setup. We still use Docker to get the MIMIC database set up along with the database where the FHIR data will be stored, but the ETL will be run from your local computer.
Open docker-compose.yml to check that the local ports it uses are not already taken by other applications.
vi docker-compose.yml
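To see whether a local port is already taken, you can try to connect to it. The sketch below is an illustration rather than part of the project: 27017 is the MongoDB port used later in this guide, while 5432 is only the PostgreSQL default and is an assumption here. Use the host ports actually declared in docker-compose.yml.

```python
import socket

# Assumed port list; use the host ports declared in docker-compose.yml.
PORTS_TO_CHECK = (5432, 27017)


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is in use.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for port in PORTS_TO_CHECK:
        state = "free" if port_is_free(port) else "already in use"
        print(f"port {port}: {state}")
```

If a port is reported as already in use, change the host side of the mapping in docker-compose.yml and keep config.yml in sync.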
Run the two database containers we depend on, each in its own terminal tab:
docker-compose up --build mimic
docker-compose up mongo
Check that the container ports defined in config.yml match those specified in docker-compose.yml.
You'll also need to provide a token in config.yml for the GraphQL access. Contact us at [email protected] to get one.
vi config.yml
Run the Python setup to make our commands available in the terminal:
# you might need to install packages manually:
pip install -r requirements.txt
python setup.py install
The preferred way of using fhirpipe is through the API. See API.md for the API documentation.
fhirpipe also has a CLI:
fhirpipe-run --resource_ids <resource-id-1> ...
You can use the option --mapping=path/to/response.json to fetch the mapping rules directly from a static file instead of the pyrog API.
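Before passing a static file to --mapping, it can be useful to confirm that it is valid JSON. The sketch below makes no assumption about the mapping schema expected by fhirpipe; the helper `summarize_json` is hypothetical and only reports the top-level structure of the file.

```python
import json
import sys
from pathlib import Path


def summarize_json(text):
    """Describe the top-level structure of a JSON document."""
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        return f"array with {len(data)} item(s)"
    return f"scalar of type {type(data).__name__}"


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # Usage: python check_mapping.py path/to/response.json
        print(summarize_json(Path(sys.argv[1]).read_text()))
    else:
        print("usage: python check_mapping.py path/to/response.json")
```

A file that fails to parse here will not work as a mapping file either, so this catches truncated or hand-edited files early.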
Et voilà!
You can check mongo to see if the data was correctly loaded (make sure the port is correct):
mongo --port 27017 --host localhost fhirstore
> db.Patient.find({})
To check if the data was correctly loaded in the mimic container, you can execute this in a new terminal:
$ docker exec -ti mimic psql -d mimic -U mimicuser -c 'SELECT count(subject_id) FROM patients'
count
-------
100
(1 row)
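The same check can be scripted. This is a sketch under assumptions: it reuses the container, database and user names from the command above, assumes the psql output keeps the layout shown, and drops the -ti flags because a script has no interactive terminal to attach.

```python
import subprocess

# Command taken from the example above, without -ti (no TTY in a script).
COMMAND = [
    "docker", "exec", "mimic", "psql", "-d", "mimic", "-U", "mimicuser",
    "-c", "SELECT count(subject_id) FROM patients",
]


def parse_count(psql_output):
    """Return the first line of the psql output that is a plain integer."""
    for line in psql_output.splitlines():
        stripped = line.strip()
        if stripped.isdigit():
            return int(stripped)
    raise ValueError("no row count found in psql output")


if __name__ == "__main__":
    try:
        result = subprocess.run(COMMAND, capture_output=True, text=True, check=True)
        print("patients loaded:", parse_count(result.stdout))
    except (FileNotFoundError, subprocess.CalledProcessError) as error:
        # Docker is not installed, or the mimic container is not running.
        print("could not query the mimic container:", error)
```

With the demo data loaded as in the output above, the script would report 100 patients.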
You might want to edit files directly in the container. For example, to install nano:
apt update
apt install nano
You can then shut down the database with Ctrl-C or with docker-compose down mimic
If you run into trouble, note that the setup scripts are executed only if the PostgreSQL database is empty. To clear the previously built containers and databases and restart from a clean state, run:
docker-compose down mimic
docker volume rm mimic-db
After that, if you have to rebuild the image (for example, to take into account the values set in the .env file), run:
docker-compose build mimic
⚠️ You may experience trouble if the postgres port is already taken. You can change it in docker-compose.yml.
You should install fhirpipe in an isolated virtual environment, for example with virtualenv or Pipenv.
pip install -r requirements.txt
pip install -e .
Make sure you already have the docker containers with mimic and mongo running.
Copy config_local.yml (from the fhirpipe directory) into config.yml and add your credentials there. (Don't forget to change the postgres ports if needed.)
cp config_local.yml config.yml
Finish the installation and run the tests to check that everything works:
cd ..
python setup.py install
python setup.py test
Let's now run the pipeline locally!
You are all set! Run:
fhirpipe-run --project=Mimic --resource=Patient --main-table=Patients --mock-pyrog-mapping=test/integration/fixtures/graphql_mimic.json
Remove --mock-pyrog-mapping to get the latest mapping rules from the pyrog API.