Search within an individual work in the Universal Viewer/IIIF #2209

Open
4 tasks
eporter23 opened this issue Sep 27, 2023 · 0 comments

eporter23 commented Sep 27, 2023

Stories

As an End User, I want to be able to search the full text of a work I am currently viewing, so that I can locate matching terms without having to read or download the entire volume

As an End User, I want to be able to pass my search query from the main results list to an individual work, so that I do not have to re-enter my search when viewing an individual record

As an End User, I want to see an indication that a given work is full-text searchable in the main repository search results, so that I know I can perform additional searches within a work

As an End User, I want to be able to see which pages in the work I am currently viewing contain matches, so that I can quickly navigate to those pages

As a Repository Administrator, I want to have a self-service option in the Curate UI that allows me to selectively reindex works for IIIF searching, so that I can manage resource intensive indexing jobs while trying to maintain ongoing ingests

Acceptance Criteria


The following components are anticipated to support this feature:

  • Indexing process to generate and store page-level text and word coordinate data in Solr and IIIF manifests
  • Curate interface provides Administrator-level users the ability to selectively reindex individual works for IIIF searching, either as part of the existing Index for Full-text Search process or a new option
  • Curate user interface supports full-text matches within a work in search results and View Work pages [wireframe TBD]
  • Lux user interface provides full text matches in search results and allows users to send their search to an individual work [wireframe]
  • IIIF/Universal Viewer user interface allows users to see which pages contain matches. Optimally, the user will see highlighted matching terms on each page [wireframe]
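
As a rough illustration of the last component, the sketch below turns stored page-level word coordinates into a search response that a IIIF viewer could use to mark matching pages and highlight terms. It assumes the IIIF Content Search API 1.0 response shape (an `sc:AnnotationList` whose annotations target a canvas region via `#xywh=`). The function name, the word dictionary layout, and the example.org URIs are hypothetical and are not part of Curate or Lux.

```python
from urllib.parse import quote


def search_page_words(query, pages, base_uri):
    """Build a Content Search 1.0 style response for a single-term query.

    pages: mapping of canvas URI -> list of word dicts with keys
           text, x, y, w, h (hypothetical layout for this sketch).
    base_uri: URI of the search service for this work (assumed).
    """
    term = query.strip().lower()
    resources = []
    for canvas_uri, words in pages.items():
        for word in words:
            # Compare case-insensitively and ignore surrounding punctuation.
            normalized = word["text"].lower().strip(".,;:!?\"'()")
            if not term or normalized != term:
                continue
            region = f"{word['x']},{word['y']},{word['w']},{word['h']}"
            resources.append({
                "@id": f"{base_uri}/annotation/{len(resources)}",
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "resource": {"@type": "cnt:ContentAsText", "chars": word["text"]},
                # The xywh fragment tells the viewer where to draw the highlight.
                "on": f"{canvas_uri}#xywh={region}",
            })
    return {
        "@context": "http://iiif.io/api/search/1/context.json",
        "@id": f"{base_uri}?q={quote(query)}",
        "@type": "sc:AnnotationList",
        "resources": resources,
    }


pages = {
    "https://example.org/work1/canvas/1": [
        {"text": "annual", "x": 100, "y": 200, "w": 150, "h": 40},
        {"text": "Yearbook,", "x": 260, "y": 200, "w": 210, "h": 40},
    ],
    "https://example.org/work1/canvas/2": [
        {"text": "index", "x": 90, "y": 120, "w": 130, "h": 38},
    ],
}
result = search_page_words("yearbook", pages, "https://example.org/work1/search")
matching_canvases = sorted({r["on"].split("#")[0] for r in result["resources"]})
```

The set of distinct canvas URIs in the response (`matching_canvases` here) is what would drive the "which pages contain matches" indication. A real implementation would query Solr rather than scan words in memory, and would need phrase and multi-term handling.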

Notes

This work follows the first phase of work supporting full text searching across works in the entire repository.
See prior research notes related to IIIF/Universal Viewer searching. See also the work planning notes for full-text searching at both the repository and work levels.

Digitized book material currently consists of a mix of volume-level and page-level files. For digitized book material, we always ingest a PDF of the entire volume along with page-level image files (e.g. TIFF). The digitization team has also started producing OCR and TXT outputs for the entire volume, but previously ingested books will not have these files.

For Kirtas-digitized books (example from Yellowbacks) we also ingest:

  • TXT contents for each individual page
  • OCR file for entire volume

For LIMB-digitized books (example from Yearbooks):

  • METS file for entire volume
  • TXT contents of each individual page
  • ALTO XML (OCR) file for each page
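
For the LIMB case, the per-page ALTO files already carry word coordinates. The sketch below shows one way those could be read, assuming standard ALTO `String` elements with `CONTENT`, `HPOS`, `VPOS`, `WIDTH`, and `HEIGHT` attributes. The sample document and the `extract_words` helper are made up for illustration; real files may use a different ALTO version or a measurement unit other than pixels, which would need converting before use as image coordinates.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line ALTO page, trimmed to the elements this sketch reads.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="P1" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="annual" HPOS="100" VPOS="200" WIDTH="150" HEIGHT="40"/>
            <SP/>
            <String CONTENT="yearbook" HPOS="260.4" VPOS="200" WIDTH="210" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so any ALTO version is matched."""
    return tag.rsplit("}", 1)[-1]


def extract_words(alto_xml):
    """Return one dict per ALTO String element: text plus bounding box."""
    root = ET.fromstring(alto_xml)
    words = []
    for element in root.iter():
        if local_name(element.tag) != "String":
            continue
        words.append({
            "text": element.get("CONTENT", ""),
            # ALTO allows fractional positions; round to whole units.
            "x": round(float(element.get("HPOS", 0))),
            "y": round(float(element.get("VPOS", 0))),
            "w": round(float(element.get("WIDTH", 0))),
            "h": round(float(element.get("HEIGHT", 0))),
        })
    return words


words = extract_words(SAMPLE_ALTO)
page_text = " ".join(word["text"] for word in words)
```

The joined `page_text` is the kind of value that could be indexed per page, while the coordinate dicts could be stored alongside it for highlighting. Kirtas-digitized books only have page-level TXT, so they would need a separate derivative step to obtain coordinates.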

Links to Additional Information

Preliminary Wireframes
Search result: handoff to individual work result
View search results in Universal Viewer for individual work

Checklist

  • Build an indexing process that will provide page-level searching
  • Build a derivatives process that will generate page-level word coordinate data for showing highlighted terms within individual pages in IIIF search, for works that do not already have suitable page-level files ingested
  • If feasible, ensure that new processes are compatible or combined with the already-implemented Reindex for Fulltext Search process
  • Ensure that any adjustments to our Universal Viewer configuration(s) do not disrupt current functionality, access controls, etc.
@eporter23 eporter23 added the Epic label Oct 30, 2023
@eporter23 eporter23 changed the title Search within an individual work in the Universal Viewer/IIIF [placeholder] Search within an individual work in the Universal Viewer/IIIF Nov 9, 2023
@eporter23 eporter23 changed the title Search within an individual work in the Universal Viewer/IIIF Search within an individual work in the Universal Viewer/IIIF [placeholder] Nov 9, 2023
@eporter23 eporter23 changed the title Search within an individual work in the Universal Viewer/IIIF [placeholder] Search within an individual work in the Universal Viewer/IIIF Nov 9, 2023