docs(adr): propose unified search index mapping by dschmidt · Pull Request #2662 · opencloud-eu/opencloud

dschmidt · 2026-04-23T09:27:10Z

📄 Rendered ADR: https://github.com/dschmidt/opencloud/blob/docs/adr-reflection-based-search-mapping/docs/adr/0005-unified-search-index-mapping.md

Proposes ADR 0005 — a single central Go-struct + overrides map as the source of truth for the search index layout across bleve and OpenSearch, so both backends stop drifting on implicit per-backend defaults and adding new index features (geopoint, case-preserving aggregation siblings, ...) becomes a one-line change in one place.

Also records the end-to-end case-handling principle for facet data (indexed as case-preserving keywords so aggregation buckets return correct display values) and the cross-backend sibling-field pattern that makes features like geopoint trivially uniform across bleve and OpenSearch.

#2659 is a proof-of-concept implementation this ADR emerged from; scope and APIs there will be revisited once the decision here lands.

Deciders TBD — please add reviewers who should sign off on the technical direction.

Propose a single central Go-struct + overrides map as the source of truth for the search index layout across bleve and OpenSearch — the same definition drives the per-backend index mapping, the write-time adapter, the hit-decoding path, and the KQL compiler's case-folding rules, so the two backends cannot drift silently again. Also records the end-to-end case-handling principle for facet data (indexed as case-preserving keywords so aggregation buckets return correct display values), the sibling-field pattern for geopoint on Location, and the rationale for replacing the two backends' implicit defaults with an explicit, backend-agnostic contract. PR opencloud-eu#2659 is a proof-of-concept implementation the proposal emerged from; scope and APIs there will be revisited once this ADR lands.

sonarqubecloud · 2026-04-27T07:45:24Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

dschmidt · 2026-04-27T08:02:40Z

Updated it to make it more focused :)

codacy-production · 2026-05-12T07:00:57Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.

_{TIP This summary will be updated as you push new changes.}

codacy-production

Pull Request Overview

The ADR for unified search index mapping is technically sound and aligns with Codacy standards. However, the document is currently incomplete as the 'Deciders' list remains empty; formal identification of stakeholders is required before this can be approved. While the proposed architecture eliminates behavioral drift, it explicitly accepts a performance trade-off for Bleve write-paths and creates a temporary feature gap for WebDAV REPORTs until graph-search is implemented. These should be acknowledged by the leadership team. Additionally, all identified verification scenarios for the implementation phase are currently missing and must be addressed in subsequent PRs.

About this PR

The 'Deciders' list in the ADR is currently empty. Technical leadership stakeholders must be formally identified and added to ensure the proposal has the necessary oversight.
The reliance on a future graph-search implementation to resolve facet exposure in WebDAV REPORTs leaves a temporary feature gap that may impact consumers relying on those reports.
The proposal relies on a JSON round-trip for bleve write-paths. This is an acceptable performance trade-off for now but will require monitoring during high-volume ingestion to ensure it does not become a bottleneck.

Test suggestions

Verify that a single struct definition generates matching semantic mappings for both bleve and OpenSearch backends.
Ensure facet sub-fields (e.g., audio.artist) correctly preserve casing in the index to support accurate aggregation bucket labels.
Verify the KQL compiler correctly folds query values to lowercase for fields like 'Name' but preserves case for 'audio.artist' based on the shared mapping.
Validate that the search service fails to start if the overrides map contains typos or references non-existent struct fields.
Verify that existing indexes remain operational under their old mappings while new indexes adopt the unified schema.

Prompt proposal for missing tests

Consider implementing these tests if applicable:
1. Verify that a single struct definition generates matching semantic mappings for both bleve and OpenSearch backends.
2. Ensure facet sub-fields (e.g., audio.artist) correctly preserve casing in the index to support accurate aggregation bucket labels.
3. Verify the KQL compiler correctly folds query values to lowercase for fields like 'Name' but preserves case for 'audio.artist' based on the shared mapping.
4. Validate that the search service fails to start if the overrides map contains typos or references non-existent struct fields.
5. Verify that existing indexes remain operational under their old mappings while new indexes adopt the unified schema.

_{TIP Improve review quality by adding custom instructions}
_{TIP How was this review? Give us feedback}

codacy-production · 2026-05-12T07:02:04Z

+- OpenSearch facet sub-strings (`audio.*`, `photo.*`, `image.*`)
+  change from the dynamic `text + keyword` multi-field to
+  `keyword`-only, implementing the principle above. As noted in
+  the context section, the tokenized leg is not actually reachable


_{⚪ LOW RISK}

Suggestion: The parenthetical list of facet sub-strings is missing location.*. Including it ensures consistency with the facet definitions provided in the Context and Decision Outcome sections where location is explicitly mentioned.

The bullet enumerating which OpenSearch facet sub-fields change from text+keyword to keyword-only listed audio, photo and image but omitted location, which is in the same dynamic-template-handled facet block per the context section. Add it for consistency.

dschmidt · 2026-05-12T07:33:20Z

Test suggestions: out of scope for this PR. This is an ADR, so there is no code to test. The five suggestions are all behaviors of the future implementation that this ADR enables (struct-to-mapping equivalence across backends, case-preservation in facet sub-fields, KQL compiler folding behavior, startup validation of the overrides map, mixed-mapping coexistence). They belong on the implementation PR(s) that will follow this decision, not on the decision record itself imho.

WebDAV REPORT feature gap: already discussed in Follow-ups out of scope for this ADR. The WebDAV search endpoint currently renders none of the facet fields back to the client, so this is a pre-existing missing feature, not a regression introduced by the proposal. Resolution path is also called out: let the graph-search endpoint take over once graph search lands.

Bleve JSON round-trip performance trade-off: already discussed in Known trade-off. On hot paths it is measurable but not significant, and if it ever matters a direct reflection walker can replace the json round-trip without changing any call site.

Empty Deciders list: yes, we know. we haven't decided yet :)

Inline comment on line 222 (location.* missing): fixed

dschmidt requested review from butonic, dragotin, fschade, kulmann, micbar and rhafer April 23, 2026 09:27

dschmidt force-pushed the docs/adr-reflection-based-search-mapping branch from e94393b to 8954125 Compare April 27, 2026 07:44

dschmidt mentioned this pull request May 4, 2026

fix(search): preserve value case for non-lowercased bleve fields #2633

Merged

dschmidt requested review from aduffeck May 5, 2026 12:43

codacy-production Bot reviewed May 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs(adr): propose unified search index mapping#2662

docs(adr): propose unified search index mapping#2662
dschmidt wants to merge 2 commits into
opencloud-eu:mainfrom
dschmidt:docs/adr-reflection-based-search-mapping

dschmidt commented Apr 23, 2026 •

edited

Loading

Uh oh!

sonarqubecloud Bot commented Apr 27, 2026

Uh oh!

dschmidt commented Apr 27, 2026

Uh oh!

codacy-production Bot commented May 12, 2026 •

edited

Loading

Uh oh!

codacy-production Bot left a comment

Uh oh!

codacy-production Bot May 12, 2026

Uh oh!

Uh oh!

dschmidt commented May 12, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dschmidt commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sonarqubecloud Bot commented Apr 27, 2026

Quality Gate passed

Uh oh!

dschmidt commented Apr 27, 2026

Uh oh!

codacy-production Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Up to standards ✅

Uh oh!

codacy-production Bot left a comment

Choose a reason for hiding this comment

Pull Request Overview

About this PR

Test suggestions

Uh oh!

codacy-production Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dschmidt commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dschmidt commented Apr 23, 2026 •

edited

Loading

codacy-production Bot commented May 12, 2026 •

edited

Loading

dschmidt commented May 12, 2026 •

edited

Loading