Hamlet returned instead of the country for query: Hungary #3266

Closed
gabortim opened this issue Dec 4, 2023 · 3 comments · Fixed by #3332


gabortim commented Dec 4, 2023

What did you search for?

I searched for Hungary.

What result did you get?

The first result is a single node (hamlet) located in the United States. It's reproducible when the accept-language header is set to hu-HU. You can verify this with the following links:

What result did you expect?

Hungary, the country as a first result:

Further details

This issue did not occur until this summer; before that, the search worked correctly. The problem was originally identified in Overpass Turbo, which uses the first Nominatim search result for geocoding. Consequently, queries such as tourism=museum in Hungary have been broken since then for most people.

For more information, please check the Overpass Turbo extension documentation here.

Collaborator

mtmail commented Dec 5, 2023

Just to explain the Overpass Turbo interaction for other readers: The Overpass Turbo Wizard feature converts the text tourism=museum in Hungary into

[out:json][timeout:25];
{{geocodeArea:Hungary}}->.searchArea;
nwr["tourism"="museum"](area.searchArea);
out geom;

When you click "> RUN", the browser (JavaScript) sends an HTTP GET request to https://nominatim.openstreetmap.org/search?X-Requested-With=overpass-turbo&format=json&q=Hungary and selects the first result.

I see my browser automatically adds the HTTP header for Accept-Language. In your browser that's hu-HU or is it a mix of languages? https://www.whatismybrowser.com/detect/what-http-headers-is-my-browser-sending

I know you're searching in English and the Overpass Turbo website is in English, but the request is telling Nominatim: search for something in Hungarian.
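To make the interaction concrete, here is a minimal Python sketch of the same request pattern: a GET to the Nominatim search endpoint with an Accept-Language header, taking the first entry of the JSON list. This is not Overpass Turbo's code. The function names and the User-Agent string are invented for illustration, and the X-Requested-With parameter from the URL above is left out.

```python
import json
import urllib.parse
import urllib.request

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_request(query, accept_language=None):
    """Build (but do not send) a search request like the one described above."""
    params = urllib.parse.urlencode({"format": "json", "q": query})
    request = urllib.request.Request(f"{NOMINATIM_SEARCH}?{params}")
    # Hypothetical client name; pick one that identifies your application.
    request.add_header("User-Agent", "example-geocoder/0.1")
    if accept_language:
        # A browser adds this header on its own; here it is set explicitly.
        request.add_header("Accept-Language", accept_language)
    return request


def first_result(results):
    """Return the first search result, or None if the list is empty."""
    return results[0] if results else None


def geocode_first(query, accept_language=None):
    """Send the request and return the first result (needs network access)."""
    request = build_search_request(query, accept_language)
    with urllib.request.urlopen(request) as response:
        return first_result(json.load(response))
```

Calling geocode_first("Hungary", "hu-HU") and geocode_first("Hungary", "en") should make it easy to compare how the first result depends on the language header.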

Author

gabortim commented Dec 5, 2023

In your browser that's hu-HU or is it a mix of languages?

Mixed: hu,en-GB;q=0.9,en;q=0.8,hu-HU;q=0.7,en-US;q=0.6. See the cURL-formatted version of the request:

curl 'https://nominatim.openstreetmap.org/search?X-Requested-With=overpass-turbo&format=json&q=Hungary' \
  -H 'authority: nominatim.openstreetmap.org' \
  -H 'accept: */*' \
  -H 'accept-language: hu,en-GB;q=0.9,en;q=0.8,hu-HU;q=0.7,en-US;q=0.6' \
  -H 'dnt: 1' \
  -H 'origin: https://overpass-turbo.eu' \
  -H 'referer: https://overpass-turbo.eu/' \
  -H 'sec-ch-ua: "Google Chrome";v="119", "Chromium";v="119", "Not?A_Brand";v="24"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Windows"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: cross-site' \
  -H 'user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36' \
  --compressed
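For readers unfamiliar with the q-values in that accept-language header, the following Python sketch orders the language tags by preference. It is a simplified parser written for illustration, not the one Nominatim uses, and parse_accept_language is a made-up name. A tag without a q parameter defaults to q=1.0.

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value (default q=1.0)."""
    entries = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable weight as "not acceptable"
        entries.append((tag.strip(), quality, position))
    # Highest weight first; ties keep the order in which they appeared.
    entries.sort(key=lambda entry: (-entry[1], entry[2]))
    return [tag for tag, quality, _ in entries if quality > 0]


header = "hu,en-GB;q=0.9,en;q=0.8,hu-HU;q=0.7,en-US;q=0.6"
print(parse_accept_language(header))
# ['hu', 'en-GB', 'en', 'hu-HU', 'en-US']
```

Hungarian comes first, so the request asks for Hungarian before any English variant, even though the search term itself is English.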

A possible workaround is to use tourism=museum in Magyarország. In this case, the search string is the native name of the country, but my experience with that has been hit-and-miss.

Member

lonvia commented Feb 6, 2024

A workaround for this specific query would be to search for the country code: https://nominatim.osm.org/ui/search.html?q=HU&accept-language=hu-HU

Interestingly, this example is roughly the opposite case of #3210. Here you searched for a country name in a different language than the one requested. Because of the difference, the result is ranked lower. In #3210 a city name is requested that happens to be the same as the country name in another language. There the country is ranked higher because countries have a higher prominence and the difference in spelling is not so big. Tuning the search algorithm to improve the results for Hungary will make results for Brasilia worse and vice versa. Tricky one.

lonvia added a commit to lonvia/Nominatim that referenced this issue Feb 6, 2024
Move the first cutting of the result list before reranking
by result match. This means that results with significantly
less importance are removed early and independently of the
fact how well they match the original query.

Fixes osm-search#3266.
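
The effect described in the commit message can be illustrated with a toy model in Python. This is not Nominatim's code: the Result fields, the scoring formula, the max_gap threshold, and all numbers are invented for illustration. It only shows why cutting by importance before reranking by match can change which result comes first.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    importance: float     # higher means more prominent (made-up scale)
    match_penalty: float  # higher means a worse match with the query text


def cut_by_importance(results, max_gap):
    """Drop results whose importance is more than max_gap below the best one."""
    if not results:
        return []
    best = max(r.importance for r in results)
    return [r for r in results if best - r.importance <= max_gap]


def rerank(results):
    """Order by a combined score: importance minus match penalty."""
    return sorted(results, key=lambda r: r.importance - r.match_penalty,
                  reverse=True)


def cut_then_rerank(results, max_gap=0.5):
    """Apply the importance cut first, then rerank what is left."""
    return rerank(cut_by_importance(results, max_gap))


candidates = [
    # Exact name match, but very low importance.
    Result("hamlet named Hungary", importance=0.1, match_penalty=0.0),
    # High importance, but penalised because the name is in another language.
    Result("country Hungary", importance=0.8, match_penalty=0.8),
]

print([r.name for r in rerank(candidates)])
# ['hamlet named Hungary', 'country Hungary']
print([r.name for r in cut_then_rerank(candidates)])
# ['country Hungary']
```

With reranking alone, the exact-match hamlet wins. With the cut applied first, the hamlet is removed for having far lower importance, regardless of how well it matches.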