Add positionIncrementGap to keyword field #256
kelockhart left a comment
See my longer inline comment, but in short: can we incorporate positionIncrementGap into ads_text instead of creating a copy of it as ads_text_gap?
Also, this definitely needs some unit tests.
  </analyzer>
</fieldType>

<fieldType name="ads_text_gap" class="solr.TextField" positionIncrementGap="100">
This is a copy of ads_text, which is a little worrisome, since the two presumably need to stay in sync. Would it be OK to set positionIncrementGap on ads_text itself instead of creating a new ads_text_gap field type? I'm guessing this only affects multiValued=true fields, which currently include title, comment, pubnote, all (what is this? do we use it?), caption, gpn, planetary_feature, and unfielded_search. I don't know about all and unfielded_search, but for the rest this change should be neutral or positive (for example, for gpn and planetary_feature we will want to use ads_text_gap anyway, for the same reasons as for keyword).
Why?
Keyword phrase searches can span multiple independent values of a single document: if a document has "chemistry" and "mars" as two separate keywords, then keyword:"chemistry mars" matches the document. This is not the expected behavior.
What?
Adds a new field type to the Solr schema: ads_text_gap. This type includes a position increment gap, which causes Solr to calculate token positions for separate field values as if they were separated by the gap value.
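
To illustrate the effect, here is a minimal Python sketch of how token positions are assigned across the values of a multi-valued field, with and without a gap. It is a simplified model, not Solr's actual analysis chain: it assumes whitespace tokenization, a position increment of 1 per token, and exact (slop 0) phrase matching. The function names `token_positions` and `phrase_matches` are hypothetical and exist only for this example.

```python
def token_positions(values, gap=0):
    """Assign a position to each token of a multi-valued field.

    Simplified model (an assumption, not Solr's real analyzer):
    whitespace tokenization, each token advances the position by 1,
    and `gap` is added between consecutive field values.
    """
    positions = []
    position = -1
    for index, value in enumerate(values):
        if index > 0:
            # Extra distance inserted between two separate values.
            position += gap
        for token in value.lower().split():
            position += 1
            positions.append((token, position))
    return positions


def phrase_matches(values, phrase, gap=0):
    """Return True if the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms:
        return False
    indexed = set(token_positions(values, gap))
    return any(
        all((term, start + offset) in indexed for offset, term in enumerate(terms))
        for token, start in indexed
        if token == terms[0]
    )


keywords = ["chemistry", "mars"]  # two separate keyword values

print(token_positions(keywords, gap=0))    # [('chemistry', 0), ('mars', 1)]
print(token_positions(keywords, gap=100))  # [('chemistry', 0), ('mars', 101)]
print(phrase_matches(keywords, "chemistry mars", gap=0))    # True
print(phrase_matches(keywords, "chemistry mars", gap=100))  # False
```

With a gap of 0 the two keywords sit at adjacent positions, so the phrase matches across values. With a gap of 100 they are far apart, and only phrases inside a single value match. A phrase query with a slop at least as large as the gap could still bridge two values, which is worth keeping in mind when choosing the gap value.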