v0.0.1.0 (#4)
* initial commit and spellcheck

* set.version now dynamically writes current packageVersion()

* set.version now dynamically writes packageVersion() to EML

* replace 1:length() with seq_along() to better handle edge cases (e.g. where the vector is empty)

* 1:length to seq_along to deal with edge cases in for loops - specifically empty vectors

* add set.Title. fix spelling on set.DRRdoi

* fix spelling for get.DRRdoi

* minor spelling changes and updates to documentation

* initial commit

* add set.title via document()

* update set.parkUnits: now includes id=UnitConnections; includes required child tags with dummy bounding coordinates set to 0.

* update set.parkUnits documentation

* update set.parkUnits .Rd via devtools::document()

* update set.DRRdoi to create required child elements for EML validation.

* update set.DRRdoi documentation via devtools::document()

* limit CUI argument options in set.CUI to one of 6 accepted CUI codes.

* update set.version so that it results in schema-valid EML. I think.

* update set.CUI to generate schema-valid EML. I hope.

* update set.version documentation

* set.NPSpublisher now yields schema-valid EML (I hope)

* update documentation via devtools::document()

* set.parkUnits now accepts a list of strings as park units, generates a separate geographic coverage element for each unit, auto-populates the corresponding bounding boxes for each unit.

* add get.UnitPolygon

* add get.unitPolygon (from IMDQC: could use remotes IF IMDQC was confirmed to work)

* get.UnitPolygon to get.unitPolygon in set.parkUnits

* update get.parkUnits to handle multiple separate <geographicCoverage> elements.

* fix set.version. AGAIN.

* turn set.version back on after bug fix

* add get.unitPolygon from utils.R

* initial commit

* update via devtools::document()

* fix reference to eml_validate

* fix set.CUI: now works no matter whether additionalMetadata elements already exist. Cannot overwrite existing CUI, just stops.

* fix set.version to work no matter the number of additionalMetadata elements (0, 1, 2 ... N)

* minor spelling fix in documentation

* set.Publisher now checks just eml$dataset$publisher and replaces it with desired info. Leaves other instances of $publisher alone.

* add license file

* add mit license; add Imports httr, readr, sf

* initial commit

* update edit.DOI documentation

* update set.NPSpublisher documentation

* change edit.DOI to new.DOI

* edit.DOI to new.DOI; add NPSONLY as a CUIcode to set.CUI

* minor edits during package checking/debug

* remove aliased functions during debug

* deleted: changed function name to new.DOI

* removed aliased functions during debug

* minor documentation changes during debug

* initial commit

* minor changes in documentation to reflect new CUIcode

* add set.forByNPS

* add set.forByNPS function; make set.NPSpublisher call the set.forByNPS subfunction.

* update documentation to include subfunction set.forByNPS

* initial commit

* change set.byOrForNPS to set.forByNPS

* update set.byOrForNPS to set.forByNPS

* update set.forByNPS

* update documentation via devtools::document()

* version 0.0.0.9000 to v0.0.1.0

* version incremented to 0.0.1.0. Info about "by or for NPS" added.
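
A note on the `seq_along()` change listed above: the following is a minimal sketch of the edge case, using a hypothetical empty vector `x` that is not taken from the package. With an empty vector, `1:length(x)` evaluates to `1:0`, so the loop body runs twice instead of not at all, whereas `seq_along(x)` yields a zero-length sequence.

```r
# Hypothetical empty input, e.g. a tag that returned no matches
x <- character(0)

# 1:length(x) is 1:0, i.e. c(1, 0), so this loop runs twice
count_colon <- 0
for (i in 1:length(x)) {
  count_colon <- count_colon + 1
}

# seq_along(x) is integer(0), so this loop never runs
count_seq <- 0
for (i in seq_along(x)) {
  count_seq <- count_seq + 1
}
```

Inside the `1:length(x)` loop, `x[i]` would be `NA` or empty, which is the kind of silent misbehaviour the change is meant to avoid.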
RobLBaker authored Sep 29, 2022
1 parent 0542771 commit ceb9a1c
Showing 32 changed files with 547 additions and 267 deletions.
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -1,3 +1,4 @@
^EMLeditor\.Rproj$
^\.Rproj\.user$
^README\.Rmd$
^LICENSE\.md$
9 changes: 6 additions & 3 deletions DESCRIPTION
@@ -1,12 +1,11 @@
Package: EMLeditor
Title: View and Edit EML
Version: 0.0.0.9000
Version: 0.0.1.0
Authors@R:
person("Robert", "Baker", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0001-7591-5035"))
Description: This package will be of most use to the U.S. National Park Service data scientists and managers seeking to generate EML-formatted metadata for datapackages. EML-formatted .xml files are typically constructed using EDI's EMLassemblyline package and then imported as an R-object using the EML package. EMLeditor allows the user to view the contents of the R object and add aspects of metadata crucial for publication in the U.S. National Park Service DataStore repository. For instance, a user can view and edit a DOI, a link to a DRR, Park Unit connections, information about Confidential Unclassified Information (CUI), and more. EMLeditor allows the user to write a mockup of a README.txt to preview what the README automatically generated by DataStore upon upload will look like.
License: `use_mit_license()`, `use_gpl3_license()` or friends to pick a
license
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.1
@@ -20,3 +19,7 @@ Depends:
lubridate,
reader
Remotes: NCEAS/arcticdatautils
Imports:
httr,
readr,
sf
2 changes: 2 additions & 0 deletions LICENSE
@@ -0,0 +1,2 @@
YEAR: 2022
COPYRIGHT HOLDER: EMLeditor authors
21 changes: 21 additions & 0 deletions LICENSE.md
@@ -0,0 +1,21 @@
# MIT License

Copyright (c) 2022 EMLeditor authors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
7 changes: 4 additions & 3 deletions NAMESPACE
@@ -1,8 +1,5 @@
# Generated by roxygen2: do not edit by hand

S3method(edit,DOI)
export(eml_getNPS)
export(eml_get_simpleNPS)
export(get.CUI)
export(get.DOI)
export(get.DRRdoi)
@@ -16,12 +13,16 @@ export(get.fileInfo)
export(get.lit)
export(get.parkUnits)
export(get.title)
export(get.unitPolygon)
export(new.DOI)
export(set.CUI)
export(set.DOI)
export(set.DRRdoi)
export(set.NPSpublisher)
export(set.abstract)
export(set.forByNPS)
export(set.lit)
export(set.parkUnits)
export(set.title)
export(set.version)
export(write.readMe)
210 changes: 163 additions & 47 deletions R/editEMLfunctions.R

Large diffs are not rendered by default.

71 changes: 45 additions & 26 deletions R/getEMLfunctions.R
@@ -5,9 +5,10 @@
#' @details returns the date from the <beginDate> tag. Although dates should be formatted according to ISO-8601 (YYYY-MM-DD) it will also check for a few other common formats and return the date as a text string: "DD Month YYYY"
#'
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#'
#' @return a text string
#' @export
#' @examples
#' @example
#' get.beginDate(emlObject)
get.beginDate<-function(emlObject){
begin<-arcticdatautils::eml_get_simple(emlObject, "beginDate")
@@ -26,10 +27,10 @@ get.beginDate<-function(emlObject){
#'
#' @details returns the date from the <endDate> tag. Although dates should be formatted according to ISO-8601 (YYYY-MM-DD) it will also check a few other common formats and return the date as a text string: "DD Month YYYY"
#'
#' #' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#' @return a text string
#' @export
#' @examples
#' @example
#' get.endDate(emlObject)
get.endDate<-function(emlObject){
end<-arcticdatautils::eml_get_simple(emlObject, "endDate")
@@ -64,7 +65,7 @@ get.abstract<-function(emlObject){
else{
Encoding(doc)<-"UTF-8" #helps with weird characters
txt<-NULL
for(i in 1:length(doc)){
for(i in seq_along(doc)){
if(nchar(doc[i])>0){
mypara <- gsub("[\r?\n|\r]", "", doc[i]) #get rid of line breaks and carriage returns
mypara <- gsub("&#13;", " ", mypara) #get rid of carriage symbols
@@ -86,7 +87,7 @@ get.abstract<-function(emlObject){
#'
#' @details accesses all of the <title> tags (there can be several, if each file was given a separate title). Assumes that the first instance of <title> refers to the entire data package and returns it as a text string, ignoring the contents of all other <title> tags.
#'
#' #' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#' @return a text string
#' @export
#' @example
@@ -117,7 +118,7 @@ get.DSRefID<-function(emlObject){
RefID<-NA # to do: check write.readMe whether NA needs to be in quotes.
}
else{
for(i in 1:length(pid)){
for(i in seq_along(pid)){
if(stringr::str_detect(pid[i], "doi: ")){
doi<-pid[i]
}
@@ -137,7 +138,7 @@ get.DSRefID<-function(emlObject){
#'
#' @details allows the user to preview what the citation will look like. The Harper's Ferry Style Guide recommends using the Chicago Manual of Style for formatting citations. The citation is formatted according to a modified version of the Chicago Manual of Style's Author-Date journal article format because currently there is no Chicago Manual of Style format specified for datasets or data packages. In compliance with DataCite's recommendations regarding including DOIs in citations, the citation displays the entire DOI as "https://www.doi.org/10.58370/xxxxxx".
#'
#' #' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#' @return a text string
#' @export
#' @example
@@ -189,7 +190,8 @@ get.citation<-function(emlObject){
#'
#' @details get.authorList assumes every author has at least 1 first name (either <givenName> or <givenName1>) and only one last name (<surName>). Middle names (<givenName2>) are optional. The author list is formatted with the last name, comma, first name for the first author and the first name, last name for all subsequent authors. The last author's name is preceded by an 'and'.
#'
#' #' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#'
#' @return a text string
#' @export
#' @example
@@ -210,7 +212,7 @@ get.authorList<-function(emlObject){
#extract givenName; should handle middle names too!
FirstName<-NULL
first<-NULL
for(i in 1:length(authors)){
for(i in seq_along(authors)){
if(stringr::str_detect(names(authors)[i], "givenName\\b")){
FirstName<-append(FirstName, authors[i][[1]])
}
@@ -226,7 +228,7 @@ get.authorList<-function(emlObject){

#extract surName
LastName<-NULL
for(i in 1:length(authors)){
for(i in seq_along(authors)){
if(stringr::str_detect(names(authors)[i], "surName")){
LastName<-append(LastName, authors[i][[1]])
}
@@ -244,7 +246,7 @@ get.authorList<-function(emlObject){

#multi-author:
else{
for(i in 1:length(LastName)){
for(i in seq_along(LastName)){
if(i==1){
}
if(i>1 && i<length(LastName)){
@@ -269,6 +271,7 @@ get.authorList<-function(emlObject){
#' @details accesses the contents of the <alternateIdentifier> tag and does some text manipulation to return a string with the DOI including the URL and prefaced by 'doi: '.
#'
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#'
#' @return a text string
#' @export
#' @example
@@ -283,7 +286,7 @@ get.DOI<-function(emlObject){
else{
mylist<-NULL
if(length(pid)>=1){
for(i in 1:length(pid)){
for(i in seq_along(pid)){
if(stringr::str_detect(pid[i], "doi:" )){
mylist<-append(mylist, pid[i])
}
@@ -313,18 +316,32 @@ get.parkUnits<-function(emlObject){
punits<-NA #to do: test whether NA needs quotes for write.README.
}
else{
punits<-NULL
for(i in 1:length(units)){
#pull out just geographic description for unit connections:
unitcons<-NULL
for(i in seq_along(units)){
if(stringr::str_detect(units[i], "NPS Unit Connections:")){
punits<-units[i]
unitcons<-append(unitcons, units[i])
}
}
if(is.null(punits)){
warning("No Park Unit Connections specified. Use the set.parkUnits() function to add Park Unit Connections.")
punits<-NA #to do: test whether NA needs quotes for write.README.

#make a string that is just comma separated unit connection codes:
punits<-NULL
for(i in seq_along(unitcons)){
if(unitcons[i]== tail(unitcons, 1)){
remtext<-sub('NPS Unit Connections: ', '', unitcons[i])
punits<-append(punits, remtext)
}
else{
remtext<-sub('NPS Unit Connections: ', '', unitcons[i])
punits<-append(punits, paste0(remtext, ", "))
}
}
list.units<-paste(unlist(punits), collapse="", sep=",")

#add "NPS Unit Connections: " prefix back in to the string:
list.units<-paste0("NPS Unit Connections: ", list.units)
}
return(punits[[1]])
return(list.units)
}
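
The unit-collapsing logic added to `get.parkUnits` in this hunk could probably be written more compactly. The following is only a sketch under stated assumptions, not the package's code: `collapse_unit_connections` is a hypothetical helper, and it assumes `units` is a plain character vector of geographic descriptions. It uses vectorised base R calls instead of the two loops, and returns `NA` when no entry carries the "NPS Unit Connections: " prefix.

```r
# Hypothetical helper (not part of EMLeditor): collapse all
# "NPS Unit Connections: <code>" entries into one comma-separated string.
collapse_unit_connections <- function(units) {
  prefix <- "NPS Unit Connections: "

  # keep only the entries that contain the prefix
  unitcons <- units[grepl(prefix, units, fixed = TRUE)]
  if (length(unitcons) == 0) {
    return(NA_character_)
  }

  # strip the prefix, join the codes, then add the prefix back once
  codes <- sub(prefix, "", unitcons, fixed = TRUE)
  paste0(prefix, paste(codes, collapse = ", "))
}

# Usage with made-up descriptions:
example_units <- c("NPS Unit Connections: ROMO",
                   "some other geographic description",
                   "NPS Unit Connections: YELL")
result <- collapse_unit_connections(example_units)
```

Because the codes are joined with `paste(collapse = ", ")`, there is no need to compare each element against `tail(unitcons, 1)` to decide whether a trailing comma is required.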

#' returns a CUI statement
@@ -377,6 +394,7 @@ get.CUI<-function(emlObject){
#' @details returns the file names (listed in the <objectName> tag), the size of the files (listed in the <size> tag) and converts it from bytes (B) to a more easily interpretable unit (KB, MB, GB, etc). Technically this uses powers of 2^10 so that KB is actually a kibibyte (1024 bytes) and not a kilobyte (1000 bytes). Similarly MB is a mebibyte not a megabyte, GB is a gibibyte not a gigabyte, etc. But for most practical purposes this is probably irrelevant. Finally, a short description is provided for each file (from the <entityDescription> tag).
#'
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#'
#' @return a text string
#' @export
#' @example
@@ -415,21 +433,22 @@ get.fileInfo<-function(emlObject){
#'
#' @description get.DRRdoi returns a text string with the associated Data Release Report (DRR)'s DOI.
#'
#' @details get.DRRdoi accesses the <useageCitation> tag(s) and searches for the string "DRR: https://doi.org/". If that string is found, the contents of that tag are returned. If the <useageCitation> tag is empty or not present, the user is informed and pointed to the set.DRRdoi() function to add the DOI of an associated DRR.
#' @details get.DRRdoi accesses the <usageCitation> tag(s) and searches for the string "DRR: https://doi.org/". If that string is found, the contents of that tag are returned. If the <usageCitation> tag is empty or not present, the user is informed and pointed to the set.DRRdoi() function to add the DOI of an associated DRR.
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#'
#' @return a text string
#' @export
#' @example
#' get.DRRdoi(emlObject)
get.DRRdoi<-function(emlObject){
doi<-arcticdatautils::eml_get_simple(emlObject, "useageCitation")
doi<-arcticdatautils::eml_get_simple(emlObject, "usageCitation")
if(is.null(doi)){
warning("You have not specified a DRR associated with this data package. If you have an associated DRR, specify its DOI using set.DRRdoi.")
DRRdoi<-NA #to do: test whether NA needs quotes for write.README.
}
else{
DRRdoi<-NULL
for(i in 1:length(doi)){
for(i in seq_along(doi)){
if(stringr::str_detect(doi[i], "DRR: https://doi.org/")){
DRRdoi<-doi[i]
}
@@ -443,15 +462,15 @@ get.DRRdoi<-function(emlObject){
#'
#' @description get.lit prints bibtex formatted literature cited to the screen.
#'
#' @details get.lit currently only supports bibtex formated references. get.lit gets items from the <literatureCited> tag and prints them to the screen.
#' @details get.lit currently only supports bibtex formatted references. get.lit gets items from the <literatureCited> tag and prints them to the screen.
#'
#' @param emlObject is an R object imported (typically from an EML-formatted .xml file) using EmL::read_eml(<filename>, from="xml").
#'
#' @return character string
#' @export
#'
#' @examples lit<-get.lit(emLObject); writeLines(lit)
#'
#' @examples
#' get.lit(emlObject)
get.lit<-function(emlObject){
lit<-eml_get_simple(emlObject, "literatureCited")
lit<-arcticdatautils::eml_get_simple(emlObject, "literatureCited")
}