Discussion: Making Core Search logic more modular #10804

Closed
whatisgalen opened this issue Apr 19, 2024 · 0 comments
Labels
Needs Discussion The change proposed needs further discussion before it can be validated

Comments

@whatisgalen
Member

whatisgalen commented Apr 19, 2024

As a developer, you can create or customize search components as you like; you only have to set the sortorder correctly when one search filter/component depends on another. Where that customization ends is what I would call the "core search logic". This logic, which exists outside of any search component, governs the following:

  • what properties of the document should be included
  • what exactly happens when the search query is executed, and whether it should be executed only once (for example)
  • how many results should be returned (which might then later get paginated by the paging-filter)
  • how localized descriptors should be applied
  • what properties are on the response object

It's conceivable that a developer would want to implement one or all of those differently, but doing so requires overriding methods like search_results(), export_results() or the entire SearchView. It would be easier, and more modular, to package the core logic as essentially a core search component.
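To illustrate the idea, here is a minimal sketch in which the core logic is just another component that runs in sortorder alongside the rest. Everything here (`SearchComponent`, `run_components`, the `query` dict and its keys, the numeric values) is hypothetical and not the Arches API; it only shows how ordering lets one component depend on what an earlier one set.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SearchComponent:
    """Hypothetical stand-in for a search component; not the Arches class."""
    name: str
    sortorder: int
    apply: Callable[[dict], None]


def run_components(components, query):
    """Apply each component to the shared query dict in ascending sortorder."""
    for component in sorted(components, key=lambda c: c.sortorder):
        component.apply(query)
    return query


def core_apply(query):
    # Core behaviour expressed as an ordinary component:
    # choose document properties and a result limit (values are made up).
    query.setdefault("includes", []).extend(["graph_id", "displayname"])
    query["limit"] = 30


def paging_apply(query):
    # Depends on the core component having set "limit" first.
    query["page_size"] = min(query["limit"], 10)


# Registered out of order on purpose; sortorder decides the execution order.
components = [
    SearchComponent("paging-filter", sortorder=2, apply=paging_apply),
    SearchComponent("core-search", sortorder=1, apply=core_apply),
]
query = run_components(components, {})
```

With this shape, swapping in a different core component means registering a different entry rather than overriding a view method.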

Here's some of that logic:

    # Restrict the returned documents to the properties that search needs.
    dsl.include("graph_id")
    dsl.include("root_ontology_class")
    dsl.include("resourceinstanceid")
    dsl.include("points")
    dsl.include("permissions.users_without_read_perm")
    dsl.include("permissions.users_without_edit_perm")
    dsl.include("permissions.users_without_delete_perm")
    dsl.include("permissions.users_with_no_access")
    dsl.include("geometries")
    dsl.include("displayname")
    dsl.include("displaydescription")
    dsl.include("map_popup")
    dsl.include("provisional_resource")
    if load_tiles:
        dsl.include("tiles")
    # Exports and multi-page requests go through the Elasticsearch scroll API.
    if for_export or pages:
        results = dsl.search(index=RESOURCES_INDEX, scroll="1m")
        scroll_id = results["_scroll_id"]
        if not pages:
            if total <= settings.SEARCH_EXPORT_LIMIT:
                pages = (total // settings.SEARCH_RESULT_LIMIT) + 1
            if total > settings.SEARCH_EXPORT_LIMIT:
                pages = int(settings.SEARCH_EXPORT_LIMIT // settings.SEARCH_RESULT_LIMIT) - 1
        for page in range(int(pages)):
            results_scrolled = dsl.se.es.scroll(scroll_id=scroll_id, scroll="1m")
            results["hits"]["hits"] += results_scrolled["hits"]["hits"]
    else:
        results = dsl.search(index=RESOURCES_INDEX, id=resourceinstanceid)

    ret = {}
    # Normalize other response shapes so that results always has hits.hits.
    if results is not None:
        if "hits" not in results:
            if "docs" in results:
                results = {"hits": {"hits": results["docs"]}}
            else:
                results = {"hits": {"hits": [results]}}
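
The two pieces of that snippet that are easiest to lift out are the page-count arithmetic and the response normalization. As a rough sketch, they could be written as standalone functions like the ones below. The function names and the limit values in the usage lines are assumptions for illustration; the branching mirrors the snippet above.

```python
def scroll_page_count(total, export_limit, result_limit):
    """Number of scroll requests to make, mirroring the branches in the snippet."""
    if total <= export_limit:
        return (total // result_limit) + 1
    return int(export_limit // result_limit) - 1


def normalize_results(results):
    """Coerce a response into the {"hits": {"hits": [...]}} shape."""
    if results is None:
        return None
    if "hits" in results:
        return results
    if "docs" in results:
        # Multi-document response: each entry becomes a hit.
        return {"hits": {"hits": results["docs"]}}
    # Single-document response: wrap it as the only hit.
    return {"hits": {"hits": [results]}}


# Illustrative values only; these are not the Arches defaults.
small = scroll_page_count(total=250, export_limit=100000, result_limit=100)
capped = scroll_page_count(total=200000, export_limit=100000, result_limit=10000)
wrapped = normalize_results({"_id": "abc"})
```

Pulled out this way, either function could be replaced by a custom component without touching the rest of the view.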

A few ways to implement what I'm talking about would be to:

  • let the search filters determine which document properties/mappings to include or exclude
  • pass in the response object ret instead of just the results object in the post_search_hooks of each filter
  • let the search filters determine how many results to collect from the query and other mechanisms like search result caching
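
A rough sketch of those three ideas together, using a hypothetical filter interface: `BaseFilter`, its `includes` and `result_limit` attributes, the helper functions, and the `post_search_hook(ret, results)` signature are all assumptions made for illustration, not the existing Arches classes or signatures.

```python
class BaseFilter:
    """Hypothetical filter interface for illustration; not the Arches base class."""
    includes = ()        # document properties this filter needs
    result_limit = None  # optional cap on the number of hits to collect

    def post_search_hook(self, ret, results):
        """Receive the response object `ret` as well as the raw results."""


class MapFilter(BaseFilter):
    includes = ("points", "geometries", "map_popup")


class SummaryFilter(BaseFilter):
    includes = ("displayname", "displaydescription")
    result_limit = 50

    def post_search_hook(self, ret, results):
        # Adds a property to the response object itself.
        ret["total_results"] = len(results["hits"]["hits"])


def collect_includes(filters):
    """Union of the properties requested by all filters, in first-seen order."""
    seen = []
    for f in filters:
        for name in f.includes:
            if name not in seen:
                seen.append(name)
    return seen


def collect_limit(filters, default):
    """Smallest limit requested by any filter, or the default if none set one."""
    limits = [f.result_limit for f in filters if f.result_limit is not None]
    return min(limits) if limits else default


def build_response(filters, results):
    """Let every filter contribute to the response object after the query runs."""
    ret = {"results": results}
    for f in filters:
        f.post_search_hook(ret, results)
    return ret


filters = [MapFilter(), SummaryFilter()]
fake_results = {"hits": {"hits": [{"_id": "a"}, {"_id": "b"}]}}
includes = collect_includes(filters)
limit = collect_limit(filters, default=30)
response = build_response(filters, fake_results)
```

Taking the smallest requested limit is only one possible policy; a real implementation would need to decide how conflicting filters are reconciled.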

Obviously, if a developer's custom search component deviates too much in how it handles query execution and the response, other parts of Arches that use search could break. However, I don't think that's a good argument against customization; it just implies the need for more streamlined guidance on search component development in the Arches documentation.

The other implication of modularizing the core search logic on the backend is that the frontend would also need to be more aware of which search filters are in play, rather than hard-coding them. For example, the search-results component references specific properties it expects from each result. It would be more modular for it to interrogate the other search-filters (which it already could do, as term-filter and others do) and determine which properties each search-filter gives it access to.

To see how this could be implemented, take a look at my PR.

@chiatt chiatt added this to pipeline Apr 19, 2024
whatisgalen added a commit that referenced this issue Apr 19, 2024
whatisgalen added a commit that referenced this issue Apr 19, 2024
…o include request pos arg, re #10804"

This reverts commit 4426e32.

reverts because already has self.request, re #10804
whatisgalen added a commit that referenced this issue Apr 19, 2024
whatisgalen added a commit that referenced this issue Aug 12, 2024
whatisgalen added a commit that referenced this issue Aug 13, 2024
apeters added a commit that referenced this issue Aug 21, 2024
whatisgalen added a commit that referenced this issue Aug 21, 2024
@github-project-automation github-project-automation bot moved this to ✅ Done in pipeline Aug 22, 2024