Add positionIncrementGap to keyword field #256

Open
JCRPaquin wants to merge 1 commit into main from jcrpaquin/bug/keyword-gap
@JCRPaquin
Member

Why?

Phrase searches on the keyword field can match across multiple independent values of a single document: if a document has "chemistry" and "mars" as two separate keywords, then keyword:"chemistry mars" will match the document. This is not the expected behavior.

What?

Adds a new field type to the Solr schema: ads_text_gap. This type sets positionIncrementGap (100 here), which makes Solr assign token positions to the separate values of a multiValued field as if the values were that many positions apart, so an exact phrase query can no longer match across two values.
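
To make the effect concrete, here is a minimal sketch of how a position increment gap changes phrase matching. It is a simplified model and not Solr code: it assumes whitespace tokenization with lowercasing, one position per token, and a gap added to the position counter between consecutive values. The function names `index_positions` and `phrase_matches` are hypothetical and exist only for this illustration.

```python
def index_positions(values, gap):
    """Assign a position to every token of a multi-valued field.

    Simplified model (an assumption, not the real ads_text analyzer):
    tokens are split on whitespace, each token advances the position by 1,
    and `gap` is added to the position counter between consecutive values.
    """
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # the position increment gap between two values
        for token in value.lower().split():
            pos += 1
            positions.append((token, pos))
    return positions


def phrase_matches(positions, phrase):
    """Return True if the phrase terms occur at consecutive positions (slop 0)."""
    terms = phrase.lower().split()
    if not terms:
        return False
    by_pos = {pos: token for token, pos in positions}
    return any(
        all(by_pos.get(start + k) == term for k, term in enumerate(terms))
        for start in by_pos
    )


keywords = ["chemistry", "mars"]  # two separate keyword values on one document

no_gap = index_positions(keywords, gap=0)
with_gap = index_positions(keywords, gap=100)

print(no_gap)                                      # [('chemistry', 0), ('mars', 1)]
print(phrase_matches(no_gap, "chemistry mars"))    # True: the bug described above
print(with_gap)                                    # [('chemistry', 0), ('mars', 101)]
print(phrase_matches(with_gap, "chemistry mars"))  # False: values no longer adjacent
```

Note that the gap only separates values by a fixed distance. A sloppy phrase query whose slop is at least as large as the gap could still match across values.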

@JCRPaquin JCRPaquin requested a review from kelockhart March 20, 2026 18:41
Member

@kelockhart kelockhart left a comment

See my longer inline comment, but basically: can we just add positionIncrementGap to ads_text instead of copying it into ads_text_gap?

Also, this definitely needs some unit tests.

</analyzer>
</fieldType>

<fieldType name="ads_text_gap" class="solr.TextField" positionIncrementGap="100">
Member

This is a copy of ads_text, which is a little worrisome, since presumably the two need to stay in sync. Is it OK to just set positionIncrementGap on ads_text itself instead of creating a new ads_text_gap field type? I'm guessing this only affects multiValued=true fields, which currently include title, comment, pubnote, all (what is this? do we use it?), caption, gpn, planetary_feature, and unfielded_search. I don't know about all and unfielded_search, but for the rest this change should be neutral or positive (for example, for gpn and planetary_feature we will want to use ads_text_gap anyway, for the same reasons as for keyword).
