
Commit

Handle bgzipped inputs to validation, using non-indexed retrieval to get headers from VCF file. Remove need for ref-file when retrieving VCF iterators.
chapmanb committed Mar 3, 2014
1 parent 7c8ed9b commit 6b95e16
Showing 2 changed files with 23 additions and 12 deletions.
7 changes: 7 additions & 0 deletions HISTORY.md
@@ -1,3 +1,10 @@
+## 0.1.4 (in progress)
+
+- Allow bgzipped/tabix inputs to validation.
+- Improve representation of ensemble variants ensuring standard expected keys
+  are present when available in an individual genotype call. Thanks to Shalabh
+  Suman.
+
 ## 0.1.3 (15 January 2014)

 - Update dependencies to the GATK 2.8.1 MIT licensed framework and associated
28 changes: 16 additions & 12 deletions src/bcbio/variation/variantcontext.clj
@@ -85,16 +85,20 @@
   "Create a Tribble FeatureSource for VCF file.
    Handles indexing and parsing of VCF into VariantContexts.
    We treat gzipped files as tabix indexed VCFs."
-  [in-file ref-file]
-  (if (.endsWith in-file ".gz")
-    (AbstractFeatureReader/getFeatureReader in-file (VCFCodec.) false)
-    (AbstractFeatureReader/getFeatureReader (.getAbsolutePath (file in-file)) (VCFCodec.)
-                                            (create-vcf-index in-file))))
+  ([in-file]
+   (if (.endsWith in-file ".gz")
+     (AbstractFeatureReader/getFeatureReader in-file (VCFCodec.) false)
+     (AbstractFeatureReader/getFeatureReader (.getAbsolutePath (file in-file)) (VCFCodec.)
+                                             (create-vcf-index in-file))))
+  ([in-file ref-file]
+   (get-vcf-source in-file)))

 (defn get-vcf-iterator
   "Create an iterator over VCF VariantContexts."
-  [in-file ref-file]
-  (.iterator (get-vcf-source in-file ref-file)))
+  ([in-file]
+   (.iterator (get-vcf-source in-file)))
+  ([in-file ref-file]
+   (get-vcf-iterator in-file)))

 (defn variants-in-region
   "Retrieve variants located in potentially multiple variant files"
@@ -136,8 +140,8 @@
 (defn get-vcf-header
   "Retrieve header from input VCF file."
   [vcf-file]
-  (with-open [vcf-reader (.makeSourceFromStream (VCFCodec.) (input-stream vcf-file))]
-    (.readActualHeader (VCFCodec.) vcf-reader)))
+  (with-open [vcf-reader (AbstractFeatureReader/getFeatureReader vcf-file (VCFCodec.) false)]
+    (.getHeader vcf-reader)))

 ;; ## Writing VCF files

@@ -202,7 +206,7 @@
   [vcf ref out-part fname fdesc passes?]
   (let [out-file (fsp/add-file-part vcf out-part)]
     (when (itx/needs-run? out-file)
-      (with-open [vcf-iter (get-vcf-iterator vcf ref)]
+      (with-open [vcf-iter (get-vcf-iterator vcf)]
         (write-vcf-w-template vcf {:out out-file}
                               (map (partial maybe-add-filter fname passes?) (parse-vcf vcf-iter))
                               ref
@@ -214,7 +218,7 @@
   [in-file passes? file-out-part ref-file & {:keys [out-dir]}]
   (let [out-file (fsp/add-file-part in-file file-out-part out-dir)]
     (when (itx/needs-run? out-file)
-      (with-open [in-iter (get-vcf-iterator in-file ref-file)]
+      (with-open [in-iter (get-vcf-iterator in-file)]
         (write-vcf-w-template in-file {:out out-file}
                               (map :vc (filter passes? (parse-vcf in-iter)))
                               ref-file)))
@@ -261,7 +265,7 @@
         .make)))

 (defn -main [vcf ref approach]
-  (with-open [vcf-iter (get-vcf-iterator vcf ref)]
+  (with-open [vcf-iter (get-vcf-iterator vcf)]
     (letfn [(item-iter []
               (case approach
                 "gatk" (iterator-seq (.iterator vcf-iter))
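
Note on the arity change: `get-vcf-source` and `get-vcf-iterator` keep a two-argument form that ignores `ref-file` and delegates to the new one-argument form, so callers that still pass a reference file continue to work. A minimal, self-contained sketch of that pattern follows. `open-source` is a hypothetical stand-in that returns a plain map; it is not part of bcbio.variation and does not touch Tribble.

```clojure
;; Hypothetical stand-in for get-vcf-source: returns a map describing the
;; input instead of a real Tribble FeatureSource.
(defn open-source
  "Describe `in-file`; inputs ending in .gz are treated as tabix indexed."
  ([in-file]
   {:file in-file
    :tabix? (.endsWith ^String in-file ".gz")})
  ;; Legacy arity: the reference file is accepted but ignored, so existing
  ;; two-argument call sites keep working unchanged.
  ([in-file _ref-file]
   (open-source in-file)))

(open-source "calls.vcf.gz")
;; => {:file "calls.vcf.gz", :tabix? true}
(open-source "calls.vcf" "genome.fa")
;; => {:file "calls.vcf", :tabix? false}
```

Keeping the old arity as a thin wrapper lets call sites be migrated one at a time, as this commit does for the filter functions and `-main`.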
