Commit 547b4bb

sofyalaski, martin-trajanovski, Junjiequan, nitrosx, sbliven authored
feat: ingestor component for datasets (#2040)
## Description

This is a big PR that introduces two changes.

1. A new ingestor component. This is a big change: it introduces many new components and a dependency on another backend. It can, however, be turned off in the `config.json` file by setting:
   ```
   "ingestorComponent": {
     "ingestorEnabled": false
   }
   ```
2. At SciCatCon 2025, we discussed a new [project to simplify ingestion](https://github.com/orgs/SciCatProject/projects/20/). This PR also changes how the "Create Dataset" button below the filter sidebar on the Dataset page behaves, according to #1912. (The option to control it is already present in the config: `addDatasetEnabled`.)

## Motivation

At PSI with OpenEM we have been working on a new ingestor backend that will allow data ingestion from sites other than the host of the SciCat catalog. This is represented by Point 1. Adding the Ingestor backend repo to SciCatProject [is planned](https://github.com/orgs/SciCatProject/projects/20/views/1?pane=issue&itemId=117173574&issue=SciCatProject%7Cscicat-backend-next%7C2014) as well.

## Changes:

__For point 2__:

* When a user wants to ingest a dataset into SciCat from the frontend, a dialog opens where the user enters dataset-specific information.
* The user can provide a URL to a JSON schema for scientific metadata. We only check that the provided JSON is a valid object.
* If the user provided a schema, an additional set of questions is created from that schema (with JSON Forms), and the user can specify the scientific metadata details based on it.
* A confirmation page lets the user review the entered metadata.
* The dataset is added after the form is submitted.

__For point 1__:

* `config.json` changes include this new object:
  ```
  "ingestorComponent": {
    "ingestorEnabled": true,
    "ingestorAutodiscoveryOptions": [
      {
        "mailDomain": "university.org",
        "description": "University/facility of Choice",
        "facilityBackend": "https://facility-ingestor.facility.org"
      }
    ]
  },
  ```
* The `ingestorEnabled` value is the main option and turns the component off entirely. When it is off, calls to the ingestor are redirected to 404. When it is on, the ingestor component is available at `/ingestor/`, with a link in the hamburger menu.
* `ingestorAutodiscoveryOptions` is optional and holds an array of available facilities running the ingestor software.
* `facilityBackend` is a reachable backend of the ingestor service.
* `mailDomain` is treated as a regular expression and matched against the email of the logged-in user; on a match, the frontend automatically connects to the respective backend. A regular expression is used so that emails of the form "staff.university.org" or similar also match.
* `description` is optional, but when `mailDomain` matches, it prefills the `creationLocation` property in the dataset schema.
* The ingestor component (when used with the backend) looks similar to __Point 2__: a set of dialogs for SciCat dataset and scientific metadata ingestion, with most of the information prefilled. For this, it interacts with the ingestor backend, which does all the hard work, such as:
  * loading the set of available methods, which correspond to the available metadata extractors;
  * upon method selection, extracting the metadata into a newly generated JSON object that is used in `scientificMetadata`;
  * creating a dataset on SciCat;
  * creating a job that transfers the data to tape.

## Tests included

- [ ] Included for each change/fix?
- [ ] Passing? (Merge will not be approved unless this is checked)

## Documentation

- [ ] swagger documentation updated \[required\]
- [ ] official documentation updated \[nice-to-have\]

### official documentation info

If you have updated the official documentation, please provide PR # and URL of the pages where the updates are included

## Backend version

- [ ] Does it require a specific version of the backend
- which version of the backend is required:

## Ingestor backend: https://github.com/SwissOpenEM/Ingestor

---------

Co-authored-by: martintrajanovski <[email protected]>
Co-authored-by: Jay <[email protected]>
Co-authored-by: Max Novelli <[email protected]>
Co-authored-by: Spencer Bliven <[email protected]>
Co-authored-by: Despina <[email protected]>
Co-authored-by: David Wiessner <[email protected]>
Co-authored-by: dwiessner-unibe <[email protected]>
Co-authored-by: consolethinks <[email protected]>
Co-authored-by: Philipp Wissmann <[email protected]>
Co-authored-by: Philipp Wissmann <[email protected]>
Co-authored-by: phwissmann <[email protected]>
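
The `mailDomain` autodiscovery described in the commit message could look roughly like the sketch below. It is an illustration only, not the code from this commit: the `AutodiscoveryOption` interface and the `findFacilityForEmail` helper are hypothetical names, and the case-insensitive, unanchored match is an assumption about how the pattern is applied.

```typescript
// Hypothetical shape of one entry in `ingestorAutodiscoveryOptions` (config.json).
interface AutodiscoveryOption {
  mailDomain: string; // treated as a regular expression
  description?: string; // optional; would prefill `creationLocation` on a match
  facilityBackend: string; // URL of the facility's ingestor backend
}

// Hypothetical helper: returns the first option whose `mailDomain` pattern
// matches the user's email, or undefined when nothing matches.
function findFacilityForEmail(
  email: string,
  options: AutodiscoveryOption[],
): AutodiscoveryOption | undefined {
  return options.find((option) => {
    try {
      // Assumption: unanchored, case-insensitive match.
      return new RegExp(option.mailDomain, "i").test(email);
    } catch {
      // Skip entries whose pattern is not a valid regular expression.
      return false;
    }
  });
}

// Example values taken from the config.json fragment in the commit message.
const autodiscoveryOptions: AutodiscoveryOption[] = [
  {
    mailDomain: "university.org",
    description: "University/facility of Choice",
    facilityBackend: "https://facility-ingestor.facility.org",
  },
];

const staffMatch = findFacilityForEmail(
  "alice@staff.university.org",
  autodiscoveryOptions,
);
const noMatch = findFacilityForEmail("bob@example.com", autodiscoveryOptions);
```

Because the pattern is unanchored, "university.org" also matches subdomains such as "staff.university.org", which is the behavior the commit message describes. A stricter deployment might anchor the pattern (for example, ending it with `$`) to avoid matching unrelated addresses that merely contain the string.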
1 parent 44b3008 commit 547b4bb

132 files changed: +13115 -1927 lines changed

.eslintrc.json

Lines changed: 3 additions & 1 deletion
@@ -1,6 +1,8 @@
 {
   "root": true,
-  "ignorePatterns": ["projects/**/*"],
+  "ignorePatterns": [
+    "projects/**/*",
+    "src/app/shared/sdk/models/ingestor/**"], // Ignore autogenerated files
   "parserOptions": {
     "ecmaVersion": 2020,
     "sourceType": "module",

cypress/e2e/published-data/published-data.cy.js

Lines changed: 2 additions & 0 deletions
@@ -151,6 +151,8 @@ describe("Datasets general", () => {
 
     cy.get("a.button").click();
 
+    cy.finishedLoading();
+
     cy.get('[data-cy="batch-table"] mat-row').its("length").should("eq", 2);
 
     cy.get("#saveChangesButton").click();
