Fix regex pattern for citedby_url extraction #580

UsamaFoad · 2025-11-13T16:10:03Z

Prefix the string literal with an r:
m = re.search(r"cites=[\d+,]*", object["citedby_url"])
This tells Python to treat backslashes literally, so they are not interpreted as the start of an escape sequence.
Fixes the warning:
scholarly/_scholarly.py:312: SyntaxWarning: invalid escape sequence '\d'
m = re.search("cites=[\d+,]*", object["citedby_url"])

Fixes: the issue described in #569 (4th post)

Description

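Summary: Declare the regex pattern as a raw string by prefixing it with r.

In Python, the non-raw string literal "cites=[\d+,]*" contains \d, which is not a recognized string escape sequence. Python keeps the backslash, so the regex engine still receives \d and the pattern works, but the compiler flags the literal with a SyntaxWarning (Python 3.12 and later; earlier versions issue a DeprecationWarning instead).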

By prefixing the string with r (i.e., r"cites=[\d+,]*"), you instruct Python to treat the backslashes literally, which is the best practice for defining regular expressions and cleanly resolves the SyntaxWarning.
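The difference can be checked with a small standalone sketch. It is not code from scholarly: the URL value and the helper names extract_cites and compile_warnings are made up for illustration, and only the pattern itself comes from this PR.

```python
import re
import warnings

# Hypothetical citedby_url value (an assumption, shaped like a "cited by" link).
citedby_url = "/scholar?cites=1234567890,9876543210&as_sdt=2005&hl=en"

# The pattern from this PR, written as a raw string.
RAW_PATTERN = r"cites=[\d+,]*"


def extract_cites(url):
    """Return the 'cites=...' fragment of url, or None if it is absent."""
    m = re.search(RAW_PATTERN, url)
    return m.group() if m else None


def compile_warnings(source):
    """Compile a source snippet and return the names of the warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<snippet>", "eval")
    return [w.category.__name__ for w in caught]


print(extract_cites(citedby_url))  # cites=1234567890,9876543210

# Source text of the old literal: "cites=[\d+,]*" (warns at compile time).
print(compile_warnings('"cites=[\\d+,]*"'))

# Source text of the new literal: r"cites=[\d+,]*" (no warning).
print(compile_warnings('r"cites=[\\d+,]*"'))
```

The matched text is the same with either literal. Only the compile-time warning goes away, which is why no behavioural change is expected from this fix.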

Checklist

[ ✓ ] Check that the base branch is set to develop and not main.
[ ✓ ] Ensure that the documentation will be consistent with the code upon merging.
[ X ] Add a line or a few lines that check the new features added. (Fixing existing bug, no new test required, existing tests cover functionality)
[ ✓ ] Ensure that unit tests pass. (It'll pass, promise)
If you don't have a premium proxy, some of the tests will be skipped.
The tests that are run should pass without raising MaxTriesExceededException or other exceptions.

