Data Model

The Varify Data Warehouse (VDW) data model can be broken up into two primary groups: variant-related and sample-related data.

VDW ER Diagram

Dump Variants and Annotations

pg_dump --data-only --no-privileges --dbname=DATABASE_NAME \
        -t chromosome \
        -t variant -t variant_type \
        -t variant_effect -t effect -t functional_class -t effect_impact -t effect_region \
        -t evs -t '"1000g"' -t polyphen2 -t sift \
        -t phenotype -t variant_phenotype \
        -t gene -t gene_family -t synonym -t gene_phenotype -t exon -t transcript \
| bzip2 --compress --stdout > vdw_variants.sql.bz2

Omit --data-only to include the schema definitions, e.g. CREATE TABLE.
--jobs=N is not used here because pg_dump supports parallel dumps only with the directory output format (--format=directory), which cannot be piped to bzip2.
Remember to specify the database name and any other connection parameters, such as --host.
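
pg_dump accepts --jobs only together with the directory output format, so a parallel dump has to write to a directory rather than to a pipe. The following is a minimal sketch under that constraint, not a tested VDW procedure. The names DB_NAME, DUMP_DIR, DRY_RUN, run, dump_parallel and restore_parallel, and the vdw_variants_dir directory, are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only. DB_NAME, DUMP_DIR, DRY_RUN and run() are illustrative names,
# not part of VDW. With DRY_RUN=1 (the default) commands are printed, not run.
DB_NAME="${DB_NAME:-DATABASE_NAME}"
DUMP_DIR="${DUMP_DIR:-vdw_variants_dir}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Parallel data-only dump of the variant and annotation tables into DUMP_DIR.
dump_parallel() {
    run pg_dump --format=directory --jobs=4 --data-only --no-privileges \
        --dbname="$DB_NAME" --file="$DUMP_DIR" \
        -t chromosome \
        -t variant -t variant_type \
        -t variant_effect -t effect -t functional_class -t effect_impact -t effect_region \
        -t evs -t '"1000g"' -t polyphen2 -t sift \
        -t phenotype -t variant_phenotype \
        -t gene -t gene_family -t synonym -t gene_phenotype -t exon -t transcript
}

# Parallel data-only restore of DUMP_DIR; point DB_NAME at the target database.
restore_parallel() {
    run pg_restore --jobs=4 --data-only --dbname="$DB_NAME" "$DUMP_DIR"
}

# In dry-run mode this only prints the pg_dump command line.
dump_parallel
```

Set DRY_RUN=0 to execute the commands. The directory format compresses its files by default, so no separate bzip2 step is involved in this variant.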

Load Variants and Annotations

Given the output from above, decompress the file and pipe it to the psql command:

bzip2 --decompress --stdout vdw_variants.sql.bz2 | psql --dbname DATABASE_NAME
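
A slightly more defensive variant of the load step can be sketched as follows. It checks the archive before touching the database and asks psql to stop at the first error and roll back. The helper name load_variants is hypothetical and not part of VDW; the psql options used are ON_ERROR_STOP, --single-transaction and --file.

```shell
#!/bin/sh
# Sketch only. load_variants is a hypothetical helper, not part of VDW.
# Usage: load_variants vdw_variants.sql.bz2 DATABASE_NAME
load_variants() {
    archive="$1"
    db_name="$2"

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # Verify the compressed file before touching the database.
    bzip2 --test "$archive" || return 1

    # ON_ERROR_STOP makes psql exit on the first failed statement, and
    # --single-transaction rolls the whole load back in that case.
    # "--file -" tells psql to read the SQL from standard input.
    bzip2 --decompress --stdout "$archive" \
        | psql --dbname "$db_name" --set ON_ERROR_STOP=1 --single-transaction --file -
}
```

Loading in a single transaction leaves the target tables unchanged if any statement fails, at the cost of holding one long transaction open for the whole load.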
