Add `][` characters to page regex of `NY Slip Op` and other NY reporters
#206
Comments
When we ingest a scraped opinion we do:

```python
return Citation(
    cluster=cluster,
    volume=citation_objs[0].groups["volume"],
    reporter=citation_objs[0].corrected_reporter(),
    page=citation_objs[0].groups["page"],
    ...
)
```

When we try to match citations, we do:

```python
filters.append(
    Q(
        "match_phrase",
        **{"citation.exact": full_citation.corrected_citation()},
    )
)
```

The correction in eyecite models:

```python
def corrected_citation(self):
    """Return citation with corrected reporter."""
    if self.edition_guess:
        return self.matched_text().replace(
            self.groups["reporter"], self.edition_guess.short_name
        )
    return self.matched_text()
```

So, we could probably add a

Which for now would only standardize NY pages so we can actually match them.

What would we need to set a canonical page format in reporters-db? Or we could just hardcode in eyecite the values we know in NY, and prefer
See #207

I think the answer here is to improve the regexes to enable both options, which is what we did. We will just add more variations as we see them instead of hard-coding anything.
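For illustration, here is a minimal sketch of a page pattern that "enables both options". It is not the regex actually used in reporters-db; the name `NY_PAGE`, the helper `extract_page`, and the sample page strings are assumptions made for this example.

```python
import re

# Hypothetical page pattern: digits, then an optional one-letter suffix
# wrapped in either parentheses or square brackets, e.g. 50123(U) or 50123[U].
NY_PAGE = re.compile(r"(?P<page>\d+(?:\([UA]\)|\[[UA]\])?)")


def extract_page(text):
    """Return the page portion found at the start of text, or None."""
    match = NY_PAGE.match(text)
    return match.group("page") if match else None


print(extract_page("50123(U)"))
print(extract_page("50123[U]"))
print(extract_page("1234[A]"))
print(extract_page("50123"))
```

Listing the two bracket styles as separate alternatives, rather than as a character class such as `[\[(]U[\])]`, keeps mismatched pairs like `(U]` from being accepted as a suffix.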
We may be failing to resolve (or resolve incorrectly) NY citations due to a variation in the page format in the following scenarios:

- `NY Slip Op` citations: we accept an optional `(U)`, but the opinion text may show an optional `[U]`
- `Misc 3d` citations: we accept an optional `(A)`, but the opinion text may show an optional `[A]`

Currently this happens, which may lead to incorrect matches.

However, how could we standardize the page after extraction?
reporters-db/reporters_db/data/reporters.json
Line 18833 in d009aae
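One possible way to standardize the page after extraction is to normalize it to a single form both when storing a citation and when building the match query. The sketch below is only an illustration: `corrected_page` is a hypothetical helper that does not exist in eyecite, the standalone `corrected_citation` function here is a simplified stand-in for the method of the same name, the choice of parentheses as the canonical form is an assumption, and the citation strings are made-up samples.

```python
import re

# Assumption: parentheses are the canonical form, since the existing regexes
# already accept "(U)" and "(A)". The helper name is hypothetical.
_BRACKET_SUFFIX = re.compile(r"\[([A-Z])\]$")


def corrected_page(page):
    """Rewrite a trailing [X] suffix as (X); leave other pages untouched."""
    return _BRACKET_SUFFIX.sub(r"(\1)", page)


def corrected_citation(volume, reporter, page):
    """Build the string used for exact matching from normalized parts."""
    return f"{volume} {reporter} {corrected_page(page)}"


# Sample values only: one citation as it might be stored, one as it might
# appear in opinion text with square brackets.
stored = corrected_citation("2022", "NY Slip Op", "50123(U)")
queried = corrected_citation("2022", "NY Slip Op", "50123[U]")
print(stored == queried)
```

Because both sides pass through the same normalization, an exact `match_phrase` comparison no longer depends on which bracket style the source text happened to use.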
Example (screenshot): the `NY Slip Op` citation is highlighted; to its left is the `Misc 3d` citation.

Another example (screenshot).