General Datateam Training Fixes #219

Merged
merged 14 commits into from
Sep 28, 2020
2 changes: 2 additions & 0 deletions training/01_introduction.Rmd
@@ -67,6 +67,8 @@ How to generate a reprex:
3. fix until everything runs smoothly
4. copy the result to ask your question

When copying and pasting code into Slack messages or GitHub issues, use three backticks for code blocks and a pair of backticks (one on each side) for a small piece of code. This prevents issues with how Slack formats quotation marks.

For more information and examples check out more of Jenny Bryan's [slides](https://speakerdeck.com/jennybc/reprex-help-me-help-you)
or watch the [video](https://vimeo.com/208749032) starting at about the 10 min mark.
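
As a rough illustration of the steps above, here is a minimal sketch of a self-contained snippet that could be pasted into a code block. It uses base R only; the data frame `df` and its columns are made up for the example and do not come from any real dataset.

```r
# Hypothetical toy data: small enough to paste into a message,
# but still enough to show the behaviour being asked about.
df <- data.frame(
  site = c("A", "B", "C"),
  depth_m = c(1.5, NA, 3.0),
  stringsAsFactors = FALSE
)

# dput() prints code that recreates the object, so a helper can
# rebuild `df` without access to the original file.
dput(df)

# The smallest piece of code that demonstrates the question.
mean_depth <- mean(df$depth_m, na.rm = TRUE)
print(mean_depth)
```

If the reprex package is installed, copying a snippet like this and calling `reprex::reprex()` should render it together with its output, ready to paste.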

4 changes: 3 additions & 1 deletion training/04_editing_eml.Rmd
@@ -42,8 +42,10 @@ Add the following lines to all of your data processing scripts.
The metadata for the dataset created earlier in Exercise 2 was not very complete. Here we will add an attribute list and a physical section to our entity (the csv file).

* Make sure your package from [before](#exercise-2) is loaded into R.
* Convert `otherEntity` into `dataTable`.
* Replace the existing `dataTable` with a new `dataTable` object with an `attributeList` and `physical` section you write in R using the above commands.
* We will continue using the objects created and updated in this exercise in 3b
* Add semantic annotations for attribute area.
* We will continue using the objects created and updated in this exercise in 3b.

```{r, child = '../workflows/edit_eml/edit_spatialdata.Rmd'}
```
2 changes: 1 addition & 1 deletion training/09_first_ticket.Rmd
@@ -14,7 +14,7 @@ Before opening a R script first look over the initial checklist first to identif

We have developed some partially filled R scripts to get you started on working on your first dataset. They outline common functions used in processing a dataset. However, it will differ depending on the dataset.

You can use this template where you can [fill in the blanks](data/dataset_processing_example_blanks.R) to get familiar with the functions we use and workflow at first. We also have a more minimal example [A filled example](dadataset_processing_example_skeleton.R_.R) as a intermediate step. You can look at the [filled example](data/dataset_processing_example_filled.R) if you get stuck or message the #datateam.
You can use this template where you can [fill in the blanks](data/dataset_processing_example_blanks.R) to get familiar with the functions we use and workflow at first. We also have a more minimal example [A filled example](data/dataset_processing_example_skeleton.R) as an intermediate step. You can look at the [filled example](data/dataset_processing_example_filled.R) if you get stuck or message the #datateam.

Once you have updated the dataset to your satisfaction and reviewed the Final Checklist, post the link to the dataset on #datateam for peer review.

6 changes: 2 additions & 4 deletions workflows/edit_eml/edit_attributelists.Rmd
@@ -81,7 +81,8 @@ data <- read.csv(text=rawToChar(getObject(adc_test, pkg$data)))
EML::shiny_attributes(data = data)

# From an existing attribute table
EML::shiny_attributes(attributes = attributes)
attributeList <- get_attributes(doc$dataset$dataTable[[i]]$attributeList)
EML::shiny_attributes(data = NULL, attributes = attributeList$attributes)

# From scratch
atts <- EML::shiny_attributes()
@@ -99,7 +100,6 @@ new_attribute <- datamgmt::edit_attribute(doc$dataset$dataTable[[1]]$attributeLi
doc$dataset$dataTable[[1]]$attributeList$attribute[[1]] <- new_attribute

```
<<<<<<< HEAD

### Edit custom units

@@ -144,8 +144,6 @@ Custom units are then added to `additionalMetadata` using the following command:
unitlist <- set_unitList(custom_units, as_metadata = TRUE)
doc$additionalMetadata <- list(metadata = list(unitList = unitlist))
```
=======
>>>>>>> 9f923bd81a51f2f7215cd9e1881ac8d79c60ad24

### Edit factors

12 changes: 6 additions & 6 deletions workflows/edit_eml/edit_custom_units.Rmd
@@ -21,12 +21,12 @@ To manually generate the custom units list, create a dataframe with the fields m
```{r, eval = FALSE}
custom_units <- data.frame(

id = c('siemensPerMeter', 'decibar'),
unitType = c('resistivity', 'pressure'),
parentSI = c('ohmMeter', 'pascal'),
multiplierToSI = c('1','10000'),
abbreviation = c('S/m','decibar'),
description = c('siemens per meter', 'decibar'),
id = c('partsPerThousand', 'decibar', 'wattsPerSquareMeter', 'micromolesPerGram', 'practicalSalinityUnit'),
unitType = c('dimensionless', 'pressure', 'power', 'amountOfSubstanceWeight', 'dimensionless'),
parentSI = c(NA, 'pascal', 'watt', 'molesPerKilogram', NA),
multiplierToSI = c(NA, '10000', '1', '0.001', NA),
abbreviation = c('ppt', 'decibar', 'W/m^2', 'umol/g', 'PSU'),
description = c('parts per thousand', 'decibar', 'watts per square meter', 'micro moles per gram', 'used to describe the concentration of dissolved salts in water, the UNESCO Practical Salinity Scale of 1978 (PSS78) defines salinity in terms of a conductivity ratio'),

stringsAsFactors = FALSE)
```
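
One way to sanity-check a row of this table is to apply the multiplier by hand, on the understanding that `multiplierToSI` is the factor that converts a value in the custom unit to its parent SI unit. The sketch below uses base R and a cut-down copy of the data frame; `to_parent_si` is a hypothetical helper written for this example and is not part of the EML package.

```r
# Cut-down copy of the custom units table (two rows only).
custom_units <- data.frame(
  id = c("decibar", "wattsPerSquareMeter"),
  parentSI = c("pascal", "watt"),
  multiplierToSI = c("10000", "1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of EML): convert a value in a custom
# unit to its parent SI unit using the multiplier stored in the table.
to_parent_si <- function(value, unit_id, units = custom_units) {
  row <- units[units$id == unit_id, ]
  if (nrow(row) != 1) stop("unit id not found or duplicated: ", unit_id)
  # multiplierToSI is stored as a character column, so convert it first.
  value * as.numeric(row$multiplierToSI)
}

# 5 decibar expressed in pascal.
pressure_pa <- to_parent_si(5, "decibar")
print(pressure_pa)
```

Checking one or two known values this way can catch a multiplier that was entered in the wrong direction.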
2 changes: 1 addition & 1 deletion workflows/edit_eml/set_coverages.Rmd
@@ -38,4 +38,4 @@ coverage <- EML::set_coverage(beginDate = '2012-01-01',
doc$dataset$coverage$geographicCoverage <- list(geocov1, geocov2)
```

For arctic circle geographic coverage, we only have the starting vertical line of the circle shown in the projection. <a href 'https://arcticdata.io/catalog/view/doi%3A10.18739%2FA2QJ77Z7P' target='_blank'>Here</a> is an example with arctic circle geographic coverage.
For arctic circle geographic coverage, we only have the starting vertical line of the circle shown in the projection. <a href = 'https://arcticdata.io/catalog/view/doi%3A10.18739%2FA2QJ77Z7P' target='_blank'>Here</a> is an example with arctic circle geographic coverage.
6 changes: 3 additions & 3 deletions workflows/edit_eml/set_project.Rmd
@@ -1,10 +1,10 @@
## Set the Project section

The project section in an EML document is automatically filled out by the metacatUI editor. It sets the project title and project personell to the submission's title and creators. Most of the time at least some of this information is incorrect and we need to update it.
The project section in an EML document is automatically filled out by the metacatUI editor. It sets the project title and project personnel to the submission's title and creators. Most of the time at least some of this information is incorrect and we need to update it.

Start by searching for the funding information using <a href = 'https://www.nsf.gov/awardsearch/' target='_blank'>NSF's award search</a>. This will give us the project title, abstract, and personell - along with some additional metadata.
Start by searching for the funding information using <a href = 'https://www.nsf.gov/awardsearch/' target='_blank'>NSF's award search</a>. This will give us the project title, abstract, and personnel - along with some additional metadata.

Using this information we will set the title, personell, and funding number. For NSF funded projects prepend the funding number with "NSF". If there are multiple awards associated with one dataset then additional `funding`, `title`, and `personell` elements should be added to reflect the additional awards.
Using this information we will set the title, personnel, and funding number. For NSF funded projects prepend the funding number with "NSF". If there are multiple awards associated with one dataset then additional `funding`, `title`, and `personnel` elements should be added to reflect the additional awards.


```
2 changes: 1 addition & 1 deletion workflows/explore_eml/navigate_through_eml.Rmd
@@ -7,7 +7,7 @@ library(dataone)
library(EML)
```

```{r}
```{r, eval = F}
# Need to be in this member node to explore file
cn_staging <- CNode('STAGING')
adc_test <- getMNode(cn_staging,'urn:node:mnTestARCTIC')
Original file line number Diff line number Diff line change
@@ -62,6 +62,10 @@ File contents and relationships among files are clear

> Could you provide a short description of the files submitted? Information about how each file was generated (what software, source files, etc.) will help us create more robust metadata for long term use.

Data layout

> Would you be able to clarify how the data in your files is laid out? Specifically, what do the rows and columns represent?

We try not to prescribe how researchers must format their data, as long as the format is reasonable. However, in extreme cases (for example Excel spreadsheets with data and charts all in one sheet) we will want to kindly ask them to reformat.

> We would like to suggest a couple of modifications to the structure of your data. This will allow others to re-use it most effectively. [DESCRIBE WHAT MAY NEED TO BE CHANGED IN THE DATA SET]. Our data submission guidelines page (<a href = 'https://arcticdata.io/submit/' target='_blank'>https://arcticdata.io/submit/</a>) outlines best practices for data submissions to the Arctic Data Center. Let us know if you have any questions or if we can be of any help.
Original file line number Diff line number Diff line change
@@ -25,7 +25,7 @@

> <a href = 'https://doi.org/10.18739/A20X0X' target='_blank'>https://doi.org/10.18739/A20X0X</a>

> Please let us know if you need any further assistance.
> Please let us know if you need any further assistance. Note that any further changes to the dataset will result in a new DOI. If you would like to maintain the same DOI, please let us know.

*New Submission: Abstract, methods, excel to csv, and attributes*
> Thank you for your submission to the Arctic Data Center. From my preliminary examination of your dataset a few fields need to be updated before we can assign a DOI.
9 changes: 6 additions & 3 deletions workflows/pi_correspondence/final_review_checklist.Rmd
@@ -11,8 +11,8 @@ the format ids are correct
### General EML
Included lines for FAIR:
```{r eval=F}
doc <- eml_add_publisher(doc)
doc <- eml_add_entity_system(doc)
doc <- eml_add_publisher(doc)
doc <- eml_add_entity_system(doc)
```
### Title

@@ -34,7 +34,7 @@ Included lines for FAIR:
- **Variables** match what is in the file
- **Measurement domain** - if appropriate (i.e., dateTime correct)
- **Missing Value Code** - accounted for if applicable
- appropriate semantic annotations added
- **Semantic Annotation** - appropriate semantic annotations added

### People
- complete information for each person in each section
@@ -62,3 +62,6 @@ Included lines for FAIR:
- Granted access to PI using `set_rights_and_access()`
+ make sure it is `http://` (no s)
- **note** if it is a part of portals there might be specific access requirements for it to be visible using `set_access()`
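
The `http://` note above can be checked before granting access. The following is a minimal sketch in base R, assuming the subject passed to `set_rights_and_access()` is an ORCID URI; `is_valid_subject` is a hypothetical helper written for this example, not part of arcticdatautils, and the ORCID shown is a placeholder.

```r
# Hypothetical helper: TRUE only when the subject is an ORCID URI that
# starts with "http://" (no "s") and ends in a 16-character ORCID.
is_valid_subject <- function(subject) {
  pattern <- "^http://orcid\\.org/[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$"
  grepl(pattern, subject)
}

# Placeholder ORCID values, for illustration only.
ok  <- is_valid_subject("http://orcid.org/0000-0000-0000-0000")
bad <- is_valid_subject("https://orcid.org/0000-0000-0000-0000")
print(c(ok = ok, bad = bad))
```

Running a check like this on the subject string first makes the `https://` mistake visible before any access rules are changed.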

### SFTP Files
- if there are files transferred to us via SFTP, delete those files when the ticket is resolved
2 changes: 1 addition & 1 deletion workflows/pi_correspondence/initial_review_checklist.Rmd
@@ -33,7 +33,7 @@ Before responding to a new submission use this checklist to review the submissio
* Coverages
+ Includes coverages that make sense
- Start date BEFORE end date
- Temporal coverage matches geographic description (check hemispheres)
- Spatial coverage matches geographic description (check hemispheres)
- Geographic description is from the local to state or country level, at the least
- Taxonomic coverage if appropriate
* Project Information