Commit 547b4bb
feat: ingestor component for datasets (#2040)
## Description
This is a big PR that introduces two changes.
1. A new ingestor component. This is a large change: it adds many new
components and a dependency on another backend. It can, however, be
turned off in the `config.json` file by setting:
```
"ingestorComponent": {
  "ingestorEnabled": false
}
```
2. At SciCatCon 2025, we discussed a new [project to simplify
ingestion](https://github.com/orgs/SciCatProject/projects/20/). This PR
also changes how the "Create Dataset" button below the filter sidebar
on the Dataset page behaves, according to #1912. (The option that
controls it, `addDatasetEnabled`, is already present in the config.)
## Motivation
At PSI with OpenEM we have been working on a new ingestor backend that
allows data ingestion from sites other than the host of the SciCat
catalog. This is Point 1. Adding the Ingestor backend repo to
SciCatProject [is
planned](https://github.com/orgs/SciCatProject/projects/20/views/1?pane=issue&itemId=117173574&issue=SciCatProject%7Cscicat-backend-next%7C2014)
as well.
## Changes:
__For point 2__:
* When a user wants to ingest a dataset into SciCat from the frontend, a
dialog opens where the user enters dataset-specific information.
* The user can provide a URL to a JSON schema for the scientific
metadata. We only check that the provided JSON is a valid object.
* If the user provided a schema, an additional set of questions is
generated from that schema (with JSON Forms), and the user can fill in
the scientific metadata details based on it.
* A confirmation page lets the user review the entered metadata.
* The dataset is added once the form is submitted.
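
The "valid object" check on the fetched schema can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the helper names `isJsonObject` and `parseSchemaText` are hypothetical, and it assumes that arrays and `null` do not count as objects.

```typescript
// Hypothetical helpers, for illustration only (not the PR's actual code).

// True for plain JSON objects; arrays and null are rejected (assumption).
function isJsonObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parse the raw text downloaded from the user-supplied schema URL.
// Returns the parsed object, or null when the text is not a JSON object.
function parseSchemaText(text: string): Record<string, unknown> | null {
  try {
    const parsed: unknown = JSON.parse(text);
    return isJsonObject(parsed) ? parsed : null;
  } catch {
    return null; // the text is not valid JSON at all
  }
}
```

Note that a check like this does not verify that the object is a well-formed JSON schema, only that it parses to an object.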
__For point 1__:
* `config.json` changes include this new object:
```
"ingestorComponent": {
"ingestorEnabled": true,
"ingestorAutodiscoveryOptions": [
{
"mailDomain": "university.org",
"description": "University/facility of Choice",
"facilityBackend": "https://facility-ingestor.facility.org"
}
]
},
```
* `ingestorEnabled` is the main switch for the whole component. When it is `false`, calls to the ingestor are redirected to 404. When it is `true`, the ingestor component is available at `/ingestor/`, with a link in the hamburger menu.
* `ingestorAutodiscoveryOptions` is an optional argument: an array of the available facilities running the ingestor software.
  * `facilityBackend` is a reachable backend of the ingestor service.
  * `mailDomain` is treated as a regular expression and matched against the email of the logged-in user; on a match, the frontend automatically connects to the respective backend. A regular expression is used so that emails of the form "staff.university.org" or similar also match.
  * `description` is optional; when `mailDomain` matches, it prefills the `creationLocation` property in the dataset schema.
* The ingestor component (when used with the backend) looks similar to __Point 2__: a set of dialogs for ingesting a SciCat dataset and its scientific metadata, with most of the information prefilled. For this, it interacts with the ingestor backend, which does all the hard work, such as:
  * loading the set of available methods, which correspond to the available metadata extractors;
  * upon method selection, extracting the metadata into a newly generated JSON object that is used in the `scientificMetadata`;
  * creating the dataset in SciCat;
  * creating a job that transfers the data to tape.
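
The autodiscovery matching described for `mailDomain` could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: the `discoverFacility` helper is hypothetical, the pattern is applied unanchored and case-insensitively to the whole email, and entries with an invalid pattern are skipped.

```typescript
// Shape of one `ingestorAutodiscoveryOptions` entry, as shown in config.json.
interface AutodiscoveryOption {
  mailDomain: string; // treated as a regular expression
  description?: string; // optional; prefills creationLocation on a match
  facilityBackend: string; // URL of the facility's ingestor backend
}

// Hypothetical helper: returns the first option whose `mailDomain` pattern
// matches the user's email, or undefined when nothing matches.
function discoverFacility(
  email: string,
  options: AutodiscoveryOption[] = [],
): AutodiscoveryOption | undefined {
  return options.find((option) => {
    try {
      // Assumption: unanchored, case-insensitive match on the full email.
      return new RegExp(option.mailDomain, "i").test(email);
    } catch {
      return false; // skip entries whose pattern is not a valid regex
    }
  });
}
```

With the example config above, an address such as `jane@staff.university.org` would resolve to `https://facility-ingestor.facility.org`, while an address at an unrelated domain would resolve to nothing and leave the backend to be chosen manually.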
## Tests included
- [ ] Included for each change/fix?
- [ ] Passing? (Merge will not be approved unless this is checked)
## Documentation
- [ ] swagger documentation updated \[required\]
- [ ] official documentation updated \[nice-to-have\]
### official documentation info
If you have updated the official documentation, please provide PR # and URL of the pages where the updates are included
## Backend version
- [ ] Does it require a specific version of the backend
- which version of the backend is required:
## Ingestor backend:
https://github.com/SwissOpenEM/Ingestor
---------
Co-authored-by: martintrajanovski <[email protected]>
Co-authored-by: Jay <[email protected]>
Co-authored-by: Max Novelli <[email protected]>
Co-authored-by: Spencer Bliven <[email protected]>
Co-authored-by: Despina <[email protected]>
Co-authored-by: David Wiessner <[email protected]>
Co-authored-by: dwiessner-unibe <[email protected]>
Co-authored-by: consolethinks <[email protected]>
Co-authored-by: Philipp Wissmann <[email protected]>
Co-authored-by: Philipp Wissmann <[email protected]>
Co-authored-by: phwissmann <[email protected]>
1 parent 44b3008 commit 547b4bb
File tree
132 files changed, +13115 -1927 lines changed
- cypress/e2e/published-data
- src
- app
- _layout/app-header
- app-routing
- lazy/ingestor-routing
- datasets
- dashboard
- datafiles-actions
- datafiles
- dataset-detail
- dataset-detail-dynamic
- dataset-detail
- help/help
- ingestor
- ingestor-dialogs
- confirmation-dialog
- creation-dialog
- creation-pages
- dialog-mounting-components
- ingestor-file-browser
- transfer-detail-view
- ingestor-metadata-editor
- ingestor-page
- helper
- state-management
- actions
- effects
- models
- reducers
- selectors
- state
- assets