Skip to content

verify_url Hotfix#20

Merged
tjacovich merged 5 commits intoadsabs:mainfrom
tjacovich:verify_url_hotfix
Jun 12, 2025
Merged

verify_url Hotfix#20
tjacovich merged 5 commits intoadsabs:mainfrom
tjacovich:verify_url_hotfix

Conversation

@tjacovich
Copy link
Copy Markdown
Contributor

There was a subtle change in urljoin between python 2 and python 3 when proc specs are mid-url that resulted in urls that used to be joined to form

http://resolver-service/2016nsf....1557792F/verify_url:http://dx.doi.org/10.1523/ENEURO.0220-17.2017

now being interpreted as

http://resolver-service/2016nsf....1557792F/verify_url:http:/dx.doi.org/10.1523/ENEURO.0220-17.2017

To remedy this, we have added a regex that catches those specific urls and handles them differently.

@tjacovich tjacovich requested a review from kelockhart June 12, 2025 14:30
@tjacovich tjacovich requested review from femalves June 12, 2025 14:40
@coveralls
Copy link
Copy Markdown

coveralls commented Jun 12, 2025

Pull Request Test Coverage Report for Build 15618963613

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 6 of 10 (60.0%) changed or added relevant lines in 1 file are covered.
  • 9 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.2%) to 65.006%

Changes Missing Coverage Covered Lines Changed/Added Lines %
apigateway/utils.py 6 10 60.0%
Files with Coverage Reduction New Missed Lines %
apigateway/services.py 9 75.86%
Totals Coverage Status
Change from base Build 15587748472: -0.2%
Covered Lines: 1148
Relevant Lines: 1766

💛 - Coveralls

Copy link
Copy Markdown
Member

@kelockhart kelockhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's an issue with the regex here.

Comment thread apigateway/utils.py
#This block exists because of an incompatibility between urlparse.urljoin and urllib.parse.urljoin
#This incompatibility results in http(s):// -> http(s):/ if the proc spec occurs in the middle of the url.
try:
resolver_check = verify_url_regex.match(path)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the bibcode+verify_url:https://path... always be at the beginning of the path?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the particular issue we are handling, yes. We will need to adapt this when we move away from bibcodes, but I would hope it would just be mimicking the fix in resolver-gateway

Comment thread apigateway/utils.py
try:
resolver_check = verify_url_regex.match(path)
if resolver_check:
resolver_groups = resolver_check.groups()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this will only ever return one item in the tuple, since there's only one capturing group in your regex

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... not sure what happened but an older version of the regex was pushed with the newer normalizing code. Should be fixed now.

@tjacovich tjacovich requested a review from kelockhart June 12, 2025 19:15
@tjacovich tjacovich merged commit 7371ae8 into adsabs:main Jun 12, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants