Commit 61698ab

author
jhpoelen
committed
add documentation for validation; include recent usage text
1 parent 15c35c3 commit 61698ab

1 file changed: +42 -1 lines changed

README.md

+42-1
@@ -72,6 +72,27 @@ Usage: <main class> [command] [command options]
 point to properties file to override defaults.
 Default: <empty string>

+append-json embeds term matches into json
+Usage: append-json [options] [matcher]
+Options:
+--properties, -p
+point to properties file to override defaults.
+Default: <empty string>
+
+validate-term Validate terms
+Usage: validate-term [options]
+Options:
+--properties, -p
+point to properties file to override defaults.
+Default: <empty string>
+
+validate-term-link Validate term links
+Usage: validate-term-link [options]
+Options:
+--properties, -p
+point to properties file to override defaults.
+Default: <empty string>
+
 matchers Lists all or selected matcher configuration(s)
 Usage: matchers

@@ -83,7 +104,7 @@ Usage: <main class> [command] [command options]
 Default: <empty string>

 properties Lists properties.
-Usage: properties [options] [matcher]
+Usage: properties [options]
 Options:
 --properties, -p
 point to properties file to override defaults.
@@ -143,6 +164,26 @@ expected output includes tab separated lines like, where the first two columns a
 Canis lupus SAME_AS GBIF:5219173 Canis lupus species Animalia | Chordata | Mammalia | Carnivora | Canidae | Canis | Canis lupus GBIF:1 | GBIF:44 | GBIF:359 | GBIF:732 | GBIF:9701 | GBIF:5219142 | GBIF:5219173kingdom | phylum | class | order | family | genus | species http://www.gbif.org/species/5219173
 ```

+### validate taxonCache and taxonMap
+
+To validate terms (aka TaxonCache) and term links (aka TaxonMap) for use with the offline term matchers, use the ```validate-term``` and ```validate-term-link``` commands.
+
+For instance, to validate the first 10 lines of the taxonCache as published in https://zenodo.org/record/1213465, run:
+
+```curl -L "https://zenodo.org/record/1213465/files/taxonCacheFirst10.tsv" | java -jar nomer.jar validate-term```
+
+The expected result looks something like ```[FAIL|OK]\t[validation test]\t[...]```, where [...] is the validated line. Part of the result of the above command is:
+
+```
+OK 9 columns 4701dc84-660a-4c51-bd16-593997f2370b Coelomomyces iliensis species Fungi | Chytridiomycota | Blastocladiomycetes | Blastocladiales | Coelomomycetaceae | Coelomomyces | Coelomomyces iliensis urn:lsid:indexfungorum.org:names:90156 | urn:lsid:indexfungorum.org:names:90736 | urn:lsid:indexfungorum.org:names:90742 | urn:lsid:indexfungorum.org:names:90414 | urn:lsid:indexfungorum.org:names:80619 | urn:lsid:indexfungorum.org:names:20136 | 4701dc84-660a-4c51-bd16-593997f2370b kingdom | phylum | class | order | family | genus | species
+FAIL supported id 4701dc84-660a-4c51-bd16-593997f2370b Coelomomyces iliensis species Fungi | Chytridiomycota | Blastocladiomycetes | Blastocladiales | Coelomomycetaceae | Coelomomyces | Coelomomyces iliensis urn:lsid:indexfungorum.org:names:90156 | urn:lsid:indexfungorum.org:names:90736 | urn:lsid:indexfungorum.org:names:90742 | urn:lsid:indexfungorum.org:names:90414 | urn:lsid:indexfungorum.org:names:80619 | urn:lsid:indexfungorum.org:names:20136 | 4701dc84-660a-4c51-bd16-593997f2370b kingdom | phylum | class | order | family | genus | species
+FAIL prefixed id 4701dc84-660a-4c51-bd16-593997f2370b Coelomomyces iliensis species Fungi | Chytridiomycota | Blastocladiomycetes | Blastocladiales | Coelomomycetaceae | Coelomomyces | Coelomomyces iliensis urn:lsid:indexfungorum.org:names:90156 | urn:lsid:indexfungorum.org:names:90736 | urn:lsid:indexfungorum.org:names:90742 | urn:lsid:indexfungorum.org:names:90414 | urn:lsid:indexfungorum.org:names:80619 | urn:lsid:indexfungorum.org:names:20136 | 4701dc84-660a-4c51-bd16-593997f2370b kingdom | phylum | class | order | family | genus | species
+```
+
+This validation report tells us that the line starting with ```4701dc84-660a-4c51-bd16-593997f2370b Coelomomyces iliensis``` has the expected 9 columns, but that its id is not supported by nomer and does not conform to the ```[some namespace]:[some id]``` format. Note that the GloBI Taxon Graph publication at http://doi.org/10.5281/zenodo.1213465 prompted the development of the validation features. See https://github.com/globalbioticinteractions/nomer/issues/5 for more historical context.
+
+A similar feature for term links (aka TaxonMap) is available through the ```validate-term-link``` command.
+
 ## Contribute

 Feel free to join in. All welcome. Open an [issue](https://github.com/globalbioticinteractions/nomer/issues)!
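
The added README section describes validation output as tab-separated lines of the form `[FAIL|OK]\t[validation test]\t[...]`. As a rough sketch of how such a report could be summarized, the helper below counts lines per status and validation test. The function name `summarize_validation` is hypothetical and not part of nomer; the sketch assumes only the tab-separated layout stated in the README text.

```shell
# Hypothetical helper (not part of nomer): reads a validation report on stdin
# and prints one line per (status, validation test) pair with its line count.
# Assumes the tab-separated layout [FAIL|OK]\t[validation test]\t[...].
summarize_validation() {
  awk -F '\t' '
    { count[$1 "\t" $2]++ }                      # key on status + test name
    END { for (k in count) print k "\t" count[k] }
  ' | LC_ALL=C sort                              # stable, locale-independent order
}

# Possible usage, following the README example:
#   curl -L "https://zenodo.org/record/1213465/files/taxonCacheFirst10.tsv" \
#     | java -jar nomer.jar validate-term \
#     | summarize_validation
```

Piping the report through a summary like this would make it easier to see which validation tests fail most often before inspecting individual lines.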
