This changelog documents changes to both the AmsterdamUMCdb database and the `amsterdamumcdb` Python package, which contains tools for working with the database.
- improved dictionary.csv generation
- re-added some missing functions for OMOP version
- added better documentation for OMOP tables
- moved changelog to separate CHANGELOG.md file
- updated requirements.txt
- Support for using AmsterdamUMCdb version 1.5.0 for the 6th ESICM Datathon 2024
- The legacy AmsterdamUMCdb format can be selected with the `legacy=true` flag in the package functions.
- The code for generating ICU scores and the sepsis cohort based on these scores is currently not implemented for version 1.5.0.
- Changed the data model to the OMOP Common Data Model version 5.4 using the AMSTEL repository
- Improved mapping to standardized vocabularies to >95% coverage for both attributes and categorical values
- Minor quasi-identifiers, identified during mapping review, were removed from the dataset.
- Support for the legacy versions (<= v1.0.2) will be dropped after release of AmsterdamUMCdb v2.0.0
- unified PostgreSQL and BigQuery processing
- Added unit tests for the cohorts. Forces `pd.NA` to be considered `False` during comparisons.
- Fixed filtering based on boolean indexing, where `np.NaN` (pandas default) and `pd.NA` (used by pandas-gbq since the default is `Int64`) led to different results, particularly when finding the complement using negation.
- By default, uses only validated data for `amsterdamumcdb` functions for consistent PostgreSQL and BigQuery results.
- Removed aggregation in SQL code of Glasgow Coma Scale (GCS) in line with other functions.
- Update SOFA GCS processing for more accurate scoring
- Added rounding to 1 decimal for conversions (from `kPa` to `mmHg`) in SQL statements
- Updated deprecated matplotlib style (`seaborn-darkgrid`)
- Update engine to SQLAlchemy for official pandas support instead of direct psycopg2 connection
- Moved code from Jupyter `reason_for_admission.ipynb` notebook into `amsterdamumcdb.cohorts` module.
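
The boolean-indexing fix noted above is easier to follow with a small example. The sketch below is not the package's implementation; it only illustrates, on made-up data, why a mask built from a float column (missing values stored as `np.nan`) and a mask built from a nullable `Int64` column (missing values stored as `pd.NA`) can select different rows once negated. The helper `na_as_false` is a hypothetical name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data: the same three values, stored two ways.
float_col = pd.Series([1.0, np.nan, 0.0])                 # missing value is np.nan
nullable_col = pd.Series([1, pd.NA, 0], dtype="Int64")    # missing value is pd.NA

# Comparing against np.nan yields False; comparing against pd.NA yields <NA>.
float_mask = float_col == 1        # True, False, False (dtype: bool)
nullable_mask = nullable_col == 1  # True, <NA>, False  (dtype: boolean)

# Negation keeps <NA> as <NA>, so the complement loses the missing row.
rows_float = int((~float_mask).sum())        # missing row counted in complement
rows_nullable = int((~nullable_mask).sum())  # missing row not counted


def na_as_false(mask: pd.Series) -> pd.Series:
    """Hypothetical helper: treat missing comparison results as False."""
    return mask.fillna(False).astype(bool)


# After forcing <NA> to False, both representations agree.
rows_fixed = int((~na_as_false(nullable_mask)).sum())

print(rows_float, rows_nullable, rows_fixed)
```

Normalizing the mask before negating it is one way to make both backends agree; whether the package applies the conversion at this exact point is an assumption.
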
- sample cohorts (sepsis-3, mechanical ventilation, shock) in `cohorts.py`
- SOFA scoring to `scores.py`
- util functions for PostgreSQL to BigQuery translation
- moved SQL code from notebooks to separate SQL files
- dictionary: as part of the `ICUnity` project, in preparation for the Dutch ICU Data Warehouse, a number of commonly used items have been mapped to LOINC, SNOMED CT or ATC, with data in the `vocabulary_id`, `vocabulary_concept_code`, `vocabulary_concept_name` columns.
- dictionary: UCUM units added
- Fixes binning in `outliers_histogram()` when all values < 1
- Improves binning in `outliers_histogram()`
- Allows negative dates in `get_fluidbalance()`
- Fixes Google BigQuery `Project must be a string` error.
Initial release for ESICM Datathon 2021
- Reduces re-identification risk by removing additional quasi-identifiers.
- `admissions`: Fixes incorrect labeling of ICU (`IC`) patients as MCU (`MC`) patients in the `location` column of the `admissions` table.
- `numericitems`: Fixes an error in temperature values due to a conversion error from degrees Fahrenheit to Celsius.
- Reduces re-identification risk for additional adversary attacks by outlier suppression based on presumed background knowledge.
- `admissions`: Uses the more common 'height' for tallness (human height or stature) instead of 'length' in the admissions table (columns: `heightgroup`, `heightsource`).
First publicly released version of AmsterdamUMCdb.