Proposed new design for /datasets API: dataset metadata is still stored as JSON on S3, but generated by a separate GitHub repo

**Background:**
Right now `/v1/datasets` returns datasets generated by the dataset metadata generator lambda and stored on S3 (e.g. `dev-dataset-metadata.json`).

What does the metadata lambda do now?
- Generates a temporal domain: the list of dates that are valid for a particular dataset.
- Does something similar for "sites".
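
To make "temporal domain" concrete, here is a minimal sketch of how a list of valid dates could be derived from object keys. It is not the code in the existing lambda: the `YYYY_MM_DD` key pattern and the `temporal_domain` function name are assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumption: each object key embeds its date as YYYY_MM_DD.
DATE_PATTERN = re.compile(r"(\d{4})_(\d{2})_(\d{2})")


def temporal_domain(keys):
    """Return the sorted, de-duplicated ISO dates found in the given keys."""
    dates = set()
    for key in keys:
        match = DATE_PATTERN.search(key)
        if match is None:
            continue  # skip keys without a recognizable date
        try:
            parsed = datetime.strptime("-".join(match.groups()), "%Y-%m-%d")
        except ValueError:
            continue  # skip impossible dates such as 2020_13_40
        dates.add(parsed.date().isoformat())
    return sorted(dates)
```

Keys without a date, or with an impossible one, are skipped rather than failing the whole run; whether that is the desired behavior is an open design choice.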

**Goals:**
- Users can PR new or updated datasets and have them automatically picked up by the datasets API (when merged to a `main` branch).
- Users can have a mosaicjson endpoint to visualize their dataset / data collection.

**Problem with this approach:**
- If people want to add _new_ layers to the dashboard, they would still need to open a PR and have it reviewed and approved.
- Alternative: people could POST new datasets to the datasets API (but these might not work).

**Acceptance criteria:**
* `<env>-dataset-metadata.json` stored on S3 is updated whenever a PR is merged to the new `dashboard-datasets-starter` repo.
* Config files PR'd to this new repo can include a STAC API URL and query parameters to generate a mosaic. The lambda will generate the mosaic endpoint and include that endpoint in `<env>-dataset-metadata.json`.
* For MAAP: a user can create a PR to `dashboard-datasets-maap` for data with an existing tiles endpoint, and it will add the layer to the dashboard (once merged).
* For MAAP: a user can create a PR to `dashboard-datasets-maap` for an SRTM data mosaic (using the STAC API and query parameters) to add an SRTM layer to the dashboard (once merged).
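
One way to read the two MAAP criteria above: a config either carries a ready-made tiles endpoint, or a STAC API URL plus query parameters from which a mosaic endpoint is derived. The sketch below illustrates only that branching. The field names (`tiles`, `stac_api_url`, `stac_query`) and the `create_mosaic` callable are hypothetical placeholders, since the revised config format has not been designed yet.

```python
def resolve_tiles_endpoint(config, create_mosaic):
    """Return the tiles endpoint to store in the metadata for one dataset config.

    `create_mosaic` is a hypothetical callable (stac_api_url, stac_query) -> URL;
    in a real lambda it would wrap whichever mosaic service is chosen.
    """
    if "tiles" in config:
        # Data with an existing tiles endpoint: use it unchanged.
        return config["tiles"]
    if "stac_api_url" in config and "stac_query" in config:
        # Mosaic case: derive the endpoint from the STAC API URL and query.
        return create_mosaic(config["stac_api_url"], config["stac_query"])
    raise ValueError("config needs either 'tiles' or 'stac_api_url' + 'stac_query'")
```

Passing the mosaic step in as a callable keeps the branching testable without network access, and leaves open whether the lambda or the user ends up creating the mosaic.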

**Proposed solution:**

![MAAP viz + dashboard use case design](https://user-images.githubusercontent.com/15016780/120356702-799cf580-c2b9-11eb-86d5-1aac56bcb99c.png)

**Tasks:**
- [x] Create a new repo in NASA-IMPACT, “dashboard-datasets(-starter)”, and reference / reuse code from the dashboard-api-starter dataset metadata generator lambda for generating metadata via a lambda function and storing it on S3. This code could be copy/pasted, but the acceptance criterion for this first task is just that the lambda takes the existing dataset config file(s) (e.g. for MODIS) stored in the same repo and updates the S3 metadata file. The S3 bucket location should be configurable.
    - Lambda code (which should also be removed from dashboard-api-starter): https://github.com/NASA-IMPACT/dashboard-api-starter/blob/main/lambda/dataset_metadata_generator
    - https://github.com/NASA-IMPACT/dashboard-api-starter/blob/main/dashboard_api/db/static/
- [ ] Design what a revised config file should look like (e.g. one that can use STAC API and titiler endpoints)
- [ ] Include in the new repo a GitHub workflow that triggers the lambda whenever a dataset is created or updated on the `main` branch
- [ ] Metadata generator lambda generates mosaic(s), creates or updates the dataset JSON metadata, and updates the dataset JSON on S3
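
The "creates or updates" step of the last task could look roughly like the sketch below. It covers only the merge and serialization, not the S3 upload. The `{"datasets": [...]}` wrapper and the use of `id` as the unique key are assumptions; the actual layout of the metadata file may differ.

```python
import json


def metadata_key(env):
    """Name of the metadata object on S3, e.g. 'dev-dataset-metadata.json'."""
    return f"{env}-dataset-metadata.json"


def merge_datasets(existing, changed):
    """Create or update entries by `id`, keeping untouched datasets as they are."""
    by_id = {dataset["id"]: dataset for dataset in existing}
    for dataset in changed:
        by_id[dataset["id"]] = dataset  # new id: create; known id: update
    return sorted(by_id.values(), key=lambda dataset: dataset["id"])


def render_metadata(existing, changed):
    """Serialize the merged list; the {'datasets': [...]} wrapper is an assumption."""
    return json.dumps({"datasets": merge_datasets(existing, changed)}, indent=2)
```

Sorting by `id` keeps the output stable between runs, which makes diffs of the generated file easier to review.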

**Improvements:**
- [ ] GitHub linting of new or updated dataset config files, based on some basic checks
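
As an illustration of what "basic checks" might mean, the sketch below validates a single config represented as a dict. The required fields (`id`, `name`) and the source fields (`tiles`, `stac_api_url`) are hypothetical, since the config format is still to be designed; a real linter would follow whatever schema is agreed on.

```python
def lint_config(config):
    """Return a list of human-readable problems; an empty list means the config passes."""
    problems = []
    # Assumed required string fields.
    for field in ("id", "name"):
        value = config.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty '{field}'")
    # Assumed rule: a dataset has exactly one source, tiles or STAC.
    if ("tiles" in config) == ("stac_api_url" in config):
        problems.append("expected exactly one of 'tiles' or 'stac_api_url'")
    # Any source that is present should be an http(s) URL.
    for field in ("tiles", "stac_api_url"):
        url = config.get(field)
        if url is not None and not str(url).startswith(("http://", "https://")):
            problems.append(f"'{field}' must be an http(s) URL")
    return problems
```

Returning all problems at once, rather than raising on the first, lets a CI job report every issue in a PR in a single run.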

**Questions:** 
- Can we remove all the `/sites` code for now and re-implement when requested?
- Can we assume the generation of the temporal domain (e.g. this function https://github.com/NASA-IMPACT/dashboard-api-starter/blob/main/lambda/dataset_metadata_generator/src/main.py#L209) can and should still work?
- Is it the right approach to have the lambda call `POST <TITILER>/mosaic`, or will users always do that themselves and then PR a config file with the mosaic URL already defined?
- For MAAP, we might want to restrict the datasets that can be visualized to those published in CMR. How can we quality control datasets?

