fix: add DOI to prefix_mapping in _get_not_found_ids#115
Closed
lior-airis wants to merge 1 commit intodanielnsilva:masterfrom
Closed
fix: add DOI to prefix_mapping in _get_not_found_ids#115lior-airis wants to merge 1 commit intodanielnsilva:masterfrom
lior-airis wants to merge 1 commit intodanielnsilva:masterfrom
Conversation
The `_get_not_found_ids` method was missing DOI from its `prefix_mapping` dict. When a paper was looked up using a DOI-prefixed ID (e.g., `DOI:10.1145/792550.792552`), the method would add only the bare DOI value to `found_ids` (without the `DOI:` prefix), causing the input ID to never match. This resulted in a false "IDs not found" warning for every DOI-prefixed lookup, even when the paper was successfully returned. Also changed the matching logic to always add bare external ID values alongside prefixed forms, so both `DOI:10.1145/...` and `10.1145/...` inputs match correctly. This is consistent with how the Semantic Scholar API accepts both forms. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bfe2822 to
ef052b3
Compare
Author
|
Closing in favor of #116 which fixes the Python 3.14 nest_asyncio incompatibility first. Will rebase the DOI prefix fix on top of that and reopen as a separate PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_get_not_found_idswas missingDOIfrom itsprefix_mappingdict, causing false "IDs not found" warnings for every DOI-prefixed lookup (e.g.,DOI:10.1145/792550.792552)found_idswithout theDOI:prefix, so it never matched the prefixed input ID'DOI': 'DOI'to the mappingDOI:10.1145/...and10.1145/...inputs match correctly (preserving backward compatibility with bare DOI inputs used in existing tests)Reproduction
The paper data IS returned, but the spurious warning is confusing and the
return_not_found=Trueparameter falsely includes DOI-prefixed IDs in the not-found list.Test plan
test_get_papers_doi_prefix_not_false_positive(sync)test_get_papers_doi_prefix_not_false_positive_async(async)10.2139/ssrn.2250500still match correctly)