crosswalk

HCP implementation of a suite of in-house and pip-installed Python tools to facilitate open, debatable, and reproducible harmonization of local REDCap databases and Box sources with the NDA data dictionary. Use this repo to identify HCP-Lifespan-specific variable names in either database system, and/or to develop a similar Crosswalk for your own data (e.g. to talk to your REDCap, Box, and NDA data dictionary APIs using the libraries developed and/or curated here). Take what you like, leave the rest. This repository is as general as we can make it without breaking the historical record of decisions made by the HCP, every one of which matters when it comes to debating the definition of 'Harmonized.'

Directory structure

Crosswalk/ - The object-oriented library used for mapping
definitions/ - The REDCap database definitions
nda/ - The structure definitions from nda.nih.gov
maps/hca - The mapping files for HCA
maps/hcd - The mapping files for HCD

Usage

Step 0: If this is your first time cloning a Jupyter notebook from a GitHub repository, or you think we might have forgotten to mention something, follow the instructions in helpMeSetUpThisRepoInMyLocalEnvironment.md.

Step 1: Install vtcmd.py from the NDA, following the instructions at https://github.com/NDAR/nda-tools. It will be called on to validate any structures you create.
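
To confirm the install worked, a minimal Python sketch like the one below can check whether the validation tool is on your PATH and run it against one or more structure files. It assumes that installing nda-tools with pip exposes a console command named vtcmd that accepts file paths as positional arguments; the function names and the file name asr01_upload.csv are hypothetical and not part of this repo.

```python
import shutil
import subprocess


def build_validation_command(csv_paths):
    """Return the argument list for validating structure files with vtcmd."""
    if not csv_paths:
        raise ValueError("at least one structure file is required")
    # Assumption: the nda-tools console command is named "vtcmd".
    return ["vtcmd", *[str(path) for path in csv_paths]]


def validate_structures(csv_paths):
    """Run vtcmd on the files; return its exit code, or None if it is not installed."""
    if shutil.which("vtcmd") is None:
        print("vtcmd not found on PATH; install nda-tools first")
        return None
    completed = subprocess.run(build_validation_command(csv_paths), check=False)
    return completed.returncode


if __name__ == "__main__":
    # "asr01_upload.csv" is a hypothetical file name used only for illustration.
    print(build_validation_command(["asr01_upload.csv"]))
```

Building the command as a list, rather than a single shell string, avoids quoting problems when file names contain spaces.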

Step 2: Run the Setup_Definitions notebook first. This will download the latest data dictionaries from your REDCap databases and the NDA. If this is your first time cloning a Jupyter notebook from a GitHub repository, follow the instructions in helpMeSetUpThisRepoInMyLocalEnvironment.md to get here.
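
For orientation, the kind of request involved in pulling a REDCap data dictionary can be sketched with the standard library alone. This is not the code in the Setup_Definitions notebook; it is an illustration that assumes your REDCap instance exposes the standard API endpoint and that you hold an API token. The URL and token below are placeholders, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_metadata_request(api_url, token):
    """Build a POST request asking REDCap for the project's data dictionary."""
    payload = {"token": token, "content": "metadata", "format": "json"}
    data = urllib.parse.urlencode(payload).encode("utf-8")
    return urllib.request.Request(api_url, data=data, method="POST")


def fetch_data_dictionary(api_url, token, timeout=30):
    """Send the request and return the data dictionary as a list of dicts."""
    request = build_metadata_request(api_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values: substitute your own REDCap API URL and token.
    example = build_metadata_request("https://redcap.example.org/api/", "YOUR_TOKEN")
    print(example.get_method(), example.full_url)
```

Keep the token out of version control; the dummycredentials folder in this repo suggests credentials are meant to be supplied locally.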

Step 3: Modify and run the HCD and HCA notebooks according to the names/variables in your own local REDCap and Box systems. They will download the data, transform it, and validate it against the NDA servers. Your most time-consuming task will be modifying the map between variables in your REDCap databases and their destinations at the NDA.

For example, every variable that you would like to send to the NDA has to be in one of the maps, such as https://github.com/humanconnectome/NDA_submissions/blob/master/maps/hcd/asr01.yaml.
Note any actionable errors reported when you run the notebooks.
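
The idea that every outgoing variable must appear in a map can be illustrated with a small sketch. It does not reproduce the format of the YAML maps in this repo or the Crosswalk library's behavior; it assumes a map reduced to a plain dictionary from local variable name to NDA element name, and all variable names shown are made up.

```python
def apply_variable_map(records, variable_map):
    """Rename mapped fields in each record and collect fields with no NDA destination."""
    renamed = []
    unmapped = set()
    for record in records:
        out = {}
        for name, value in record.items():
            if name in variable_map:
                out[variable_map[name]] = value
            else:
                # Not in the map, so it would not be sent to the NDA.
                unmapped.add(name)
        renamed.append(out)
    return renamed, sorted(unmapped)


if __name__ == "__main__":
    # Hypothetical local and NDA names, for illustration only.
    variable_map = {"local_item_1": "nda_item_1", "local_item_2": "nda_item_2"}
    records = [{"local_item_1": 2, "local_item_2": 0, "staff_notes": "n/a"}]
    rows, missing = apply_variable_map(records, variable_map)
    print(rows)
    print("Unmapped variables:", missing)
```

Reporting the unmapped variables makes it easy to see which local fields still need a decision: map them or leave them out deliberately.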

Step 4: After all of your structures are validated (you'll know this if you get no error messages in any of the notebooks herein), you can upload them to the NDA under your collection ID using vtcmd.py on the command line, or via the NDA's GUI tool: https://nda.nih.gov/vt/. At this point you'll need an admin to grant you permission to submit data under your NDA user name.

Step 5: Make sure everything went according to plan by using the Make_Crosswalk_Table_4_Documentation.ipynb notebook to combine the maps you created with the local and remote annotation sources for every variable within the complete set of uploaded or downloaded structures into a single table called a 'Crosswalk'. Crosswalk_Lifespan_Behavioral_2.0_01_12_2021.csv in this repo is the most up-to-date output of this notebook as of January 11, 2020. Note that while hcp_variable_upload, nda_element, and nda_structure won't change, the annotation at the NDA CAN AND WILL change. Moreover, by placing our data into NDA structures, we have been forced to remove the natural groupings of variables that provided context for understanding their relationships, so the HCP labels grabbed from our local sources may make less sense. We are actively looking into programmatic ways of addressing this consequence of 'harmonizing' with the NDA. Note that the HCP Crosswalk also pulls information from the structures created by the NIH Toolbox pipeline: https://github.com/humanconnectome/NIHToolbox2NDA.
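
Conceptually, the Crosswalk table is a join of the map rows with two annotation lookups. The sketch below shows that join with the standard library only; it is not the notebook's implementation. The columns hcp_variable_upload, nda_element, and nda_structure come from the text above, while hcp_label, nda_description, the function names, and the sample values are assumptions made for illustration.

```python
import csv
import io

COLUMNS = ["hcp_variable_upload", "nda_structure", "nda_element",
           "hcp_label", "nda_description"]


def make_crosswalk(map_rows, local_labels, nda_descriptions):
    """Attach the local label and the NDA annotation to every mapped variable."""
    rows = []
    for row in map_rows:
        key = (row["nda_structure"], row["nda_element"])
        rows.append({
            "hcp_variable_upload": row["hcp_variable_upload"],
            "nda_structure": row["nda_structure"],
            "nda_element": row["nda_element"],
            # Missing annotations are left blank rather than guessed.
            "hcp_label": local_labels.get(row["hcp_variable_upload"], ""),
            "nda_description": nda_descriptions.get(key, ""),
        })
    return rows


def to_csv(rows):
    """Serialize the crosswalk rows as CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # All names and labels below are made up for illustration.
    map_rows = [{"hcp_variable_upload": "local_item_1",
                 "nda_structure": "struct01", "nda_element": "nda_item_1"}]
    labels = {"local_item_1": "Example local label"}
    descriptions = {("struct01", "nda_item_1"): "Example NDA description"}
    print(to_csv(make_crosswalk(map_rows, labels, descriptions)))
```

Because the NDA annotation can change while the three key columns do not, regenerating the table from fresh lookups is safer than editing an old CSV by hand.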

Step 6: Browse the documentation headings on https://pypi.org/project/nda-tools/ so you know more about the vtcmd.py tool and can discuss its use cases. In particular, note that this tool is also documented as capable of DOWNLOADING data from the NDA.
