System Architecture
The AMNH API Portal currently hosts only the unified AMNH Library API. This page documents a model of the Library systems, the existing infrastructure and architecture of the Library API, and the Library MetaSearch prototype application that runs on top of the Library API.
The AMNH Research Library holds an extensive collection of analog publications and archives, most of which can be found in catalogs available online (Sierra). Additionally, the full digitized text of Museum publications is available in the Library's digital repository (DSpace). The Digital Special Collections (Omeka) holds thousands of images from digitized photographic negatives, archives, and rare books. To support archive materials and primary resources, the Library has been creating unique entity records of museum expeditions and expedition personnel, departments, permanent halls, and temporary exhibitions (xEAC). These entity records provide general descriptions that offer context and, in some cases, links to related resources in the museum.
For example, the Library has resources about the Whitney South Sea Expedition in all of these systems. Currently you have to query each system separately to gather this information.
The Library API uses the APIs of the existing Library Systems to scrape relevant metadata from those systems. The metadata is scraped once a week and stored in Elasticsearch indexes, where it becomes searchable. The Library API primarily relies on the unique entity records from xEAC to return collated results; ultimately the goal is to return related search results based upon the relationships between xEAC entities, as shown in the image referenced in the previous section. An API server application listens for RESTful requests and responds with data about People, Departments, Expeditions, and Exhibitions, or with results for text or image searches. The API server queries Elasticsearch both to answer search requests and to provide detail results for individual xEAC entities.
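The idea of collating results around xEAC entities can be sketched in a few lines of Python. This is only an illustration, not the code in this repository: the record shape, the field names ("system", "entity_id", "title"), the entity identifier, and the sample titles are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical scraped records. Field names and values are placeholders,
# not the real schema used by the Library API.
RECORDS = [
    {"system": "xeac", "entity_id": "whitney-expedition", "title": "Example entity record"},
    {"system": "sierra", "entity_id": "whitney-expedition", "title": "Example catalog record"},
    {"system": "omeka", "entity_id": "whitney-expedition", "title": "Example photograph"},
    {"system": "dspace", "entity_id": None, "title": "Example unlinked publication"},
]


def collate_by_entity(records):
    """Group record titles by xEAC entity, then by the system they came from."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        entity_id = record.get("entity_id")
        if entity_id is None:
            # Records with no link to an xEAC entity cannot be collated.
            continue
        grouped[entity_id][record["system"]].append(record["title"])
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {eid: dict(by_system) for eid, by_system in grouped.items()}


if __name__ == "__main__":
    for entity_id, by_system in collate_by_entity(RECORDS).items():
        print(entity_id, sorted(by_system))
```

With a grouping like this, a single request about one expedition could return the related records from every system at once, instead of requiring a separate query per system.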
An Elasticsearch server stores multiple indexes of relevant metadata, sometimes several indexes for a single Library System. The API server application can query some or all of these indexes. On internal AMNH networks, Kibana can also be used to explore this Elasticsearch server and create visualizations. Python scripts scrape the Library Systems using their individual APIs and then create the necessary search indexes; these scripts are in the "scrape" directory of this repository and run as weekly cron tasks.
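As a rough illustration of the indexing step, the sketch below builds a request body in the newline-delimited JSON format that the Elasticsearch bulk API accepts (an action line followed by a document line, with a trailing newline). It is not taken from the "scrape" directory: the index name, the "id" field, and the sample record are assumptions, and the code only builds the body without sending it to any server.

```python
import json


def build_bulk_body(index_name, records, id_field="id"):
    """Return an NDJSON body suitable for POSTing to Elasticsearch's /_bulk endpoint."""
    lines = []
    for record in records:
        # Action line: index this document into index_name under its own id.
        action = {"index": {"_index": index_name, "_id": record[id_field]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(record))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical index name and record, for illustration only.
    sample = [{"id": "e1", "name": "Whitney South Sea Expedition"}]
    print(build_bulk_body("example-expeditions", sample), end="")
```

A weekly scrape script could build one such body per index and send it with the "Content-Type: application/x-ndjson" header; whether the actual scripts use the bulk API or index documents one at a time is not stated here.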
The API server is a Node.js Express application that runs as a systemd service on Library VM servers. The majority of the code in this repository is for the API server application. It receives and processes REST requests and queries Elasticsearch for search and detail results.
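The split between search requests and detail requests can be sketched as a small translation step from a request URL to an Elasticsearch call. The real server is written in Node.js with Express; Python is used here only to keep the examples on this page in one language. The route shapes ("/search?q=..." and "/<entity type>/<id>"), the index names, and the searched fields are all assumptions for illustration. Only the Elasticsearch endpoints themselves ("/<index>/_doc/<id>" and "/<index>,<index>/_search" with a "multi_match" query) are standard.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping from entity type to the index holding its xEAC records.
ENTITY_INDEXES = {
    "people": "example-people",
    "departments": "example-departments",
    "expeditions": "example-expeditions",
    "exhibitions": "example-exhibitions",
}

# Hypothetical indexes covered by a text search.
SEARCH_INDEXES = ["example-sierra", "example-dspace", "example-omeka"]


def plan_request(url):
    """Translate an API request URL into an Elasticsearch path and query body."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    params = parse_qs(parsed.query)

    # Detail request: fetch one entity document by id.
    if len(parts) == 2 and parts[0] in ENTITY_INDEXES:
        index = ENTITY_INDEXES[parts[0]]
        return {"path": "/" + index + "/_doc/" + parts[1], "body": None}

    # Search request: run one query across several indexes at once.
    if parts == ["search"] and "q" in params:
        query = {"multi_match": {"query": params["q"][0], "fields": ["title", "description"]}}
        return {"path": "/" + ",".join(SEARCH_INDEXES) + "/_search", "body": {"query": query}}

    raise ValueError("unsupported request: " + url)


if __name__ == "__main__":
    print(plan_request("/expeditions/e1"))
    print(plan_request("/search?q=whitney+expedition"))
```

Keeping this translation separate from the HTTP layer makes it straightforward to test without a running Elasticsearch server.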
The Library MetaSearch application is a React application that calls the Library API server to perform searches and display the results. You can learn more about the Library MetaSearch application in its project wiki.