Demographic data validation #99

Open
adriansteffan opened this issue Feb 8, 2024 · 2 comments
Assignees
Labels
step: validation Important, so earlier stages of the pipeline can fix upcoming issues it

Comments

@adriansteffan
Collaborator

The MB Validator spared us a lot of trouble, but we should still consider whether it makes sense to validate the values entered into the demographic data before analysis.

@adriansteffan adriansteffan added the step: validation Important, so earlier stages of the pipeline can fix upcoming issues it label Feb 8, 2024
@mzettersten mzettersten self-assigned this Feb 15, 2024
@mzettersten
Collaborator

It looks like there are widespread parsing issues in the demographic .csv files across many labs. These almost always take the same form: a "," inside a cell is apparently treated as a separator, which spreads a single cell entry across one or more additional columns. Could this be due not to problems in the original lab data per se, but to how we processed that data into a common .csv format (either in the instructions given to labs or in how we exported the original data)? If so, identifying the processing/exporting issue could save a lot of effort compared with tracking down individual issues by hand.
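
A minimal sketch of how the stray-comma problem described above could be detected, assuming the files are plain comma-separated text with a header row. The function name `find_misaligned_rows` and the sample columns (`lab`, `subid`, `notes`) are hypothetical and not taken from the project's pipeline. The idea is simply to flag every row whose field count differs from the header's, which is what an unquoted "," inside a cell produces.

```python
import csv
import io


def find_misaligned_rows(csv_text):
    """Return (line_number, field_count) for each row whose number of
    fields differs from the header's. Blank rows are skipped."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    expected = len(header)
    misaligned = []
    for row in reader:
        if row and len(row) != expected:
            # reader.line_num is the 1-based line number in the source text
            misaligned.append((reader.line_num, len(row)))
    return misaligned


# Hypothetical sample data: the same cell, exported without and with quoting.
unquoted = "lab,subid,notes\nlabA,s01,fussy, cried\nlabA,s02,ok\n"
quoted = 'lab,subid,notes\nlabA,s01,"fussy, cried"\nlabA,s02,ok\n'

print(find_misaligned_rows(unquoted))  # [(2, 4)]
print(find_misaligned_rows(quoted))    # []
```

If the flagged rows cluster in free-text columns across many labs, that would point toward an export step that did not quote cells containing commas rather than toward errors in the individual lab files.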

@mzettersten
Collaborator

mostly solved, possibly???
