fix(tokenizers): discard citation from nominative reporter on overlap #237

Merged on Mar 6, 2025 (1 commit)

@grossir grossir commented Mar 6, 2025

Solves #221 and #174

Uses a list of problematic nominative reporters to resolve overlaps.

Due to the way we tokenize, an overlap was always resolved in favor of the first token. With nominative reporters, this meant that when a party name matched a reporter's name, a spurious CitationToken was found for the party name and the actual citation was discarded.

This could be solved in a cleaner way by consistently tagging nominative reporters in reporters-db.
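
The overlap rule described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `Token` class, the `resolve_overlaps` function, and the contents of `NOMINATIVE_REPORTERS` are all hypothetical names chosen for the example (the reporter names are borrowed from the report in this thread).

```python
from dataclasses import dataclass

# Hypothetical stand-in for the PR's list of problematic nominative reporters.
NOMINATIVE_REPORTERS = {"Thompson", "Chase", "Holmes"}


@dataclass(frozen=True)
class Token:
    """Simplified stand-in for a citation token: matched text plus offsets."""
    text: str
    start: int
    end: int
    reporter: str


def overlaps(a: Token, b: Token) -> bool:
    """True when the two half-open spans [start, end) share a character."""
    return a.start < b.end and b.start < a.end


def resolve_overlaps(tokens: list[Token]) -> list[Token]:
    """Return non-overlapping tokens ordered by start offset.

    The earlier token normally wins an overlap. The exception: if the
    earlier token comes from a listed nominative reporter and the later
    one does not, the later token replaces it.
    """
    kept: list[Token] = []
    for tok in sorted(tokens, key=lambda t: (t.start, t.end)):
        if kept and overlaps(kept[-1], tok):
            prev = kept[-1]
            if (prev.reporter in NOMINATIVE_REPORTERS
                    and tok.reporter not in NOMINATIVE_REPORTERS):
                kept[-1] = tok  # discard the nominative match
            continue  # otherwise the first token wins, as before
        kept.append(tok)
    return kept


text = "Thompson, 501 U.S. 722"


def make_token(fragment: str, reporter: str) -> Token:
    """Build a token from a fragment of the sample text."""
    start = text.index(fragment)
    return Token(fragment, start, start + len(fragment), reporter)


party_match = make_token("Thompson, 501", "Thompson")
real_citation = make_token("501 U.S. 722", "U.S.")

result = resolve_overlaps([party_match, real_citation])
print([t.text for t in result])  # ['501 U.S. 722']
```

Keeping "first token wins" as the default and special-casing only the listed reporters limits the behavior change to the known problem cases, which fits the description's note that consistent tagging in reporters-db would be the cleaner long-term fix.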

@grossir grossir force-pushed the 221-handle-volume-nominative-overlaps branch from 1612e88 to 7381c86 Compare March 6, 2025 02:42

github-actions bot commented Mar 6, 2025

The Eyecite Report 👁️

Gains and Losses

There were 12 gains and 15 losses.

id Gain Loss
4678352 Thompson, 9
4678352 9 S.W.3d at 814
5740066 43 AD3d 467
5740066 Chase, 43
5970983 Holmes, 181
5970983 181 AD2d 27
2663630 Thompson, 224
2663630 224 F.R.D. 236
2496102 Thompson, 501
2496102 501 U.S. 722
2813797 Thompson, 99
2813797 99 AD3d 819
3018014 Thompson, 60
3018014 60 F.3d 514
1145042 Thompson, 25
1145042 25 Or. App. 511
4393773 264 Neb. 831
4393773 Thompson, 264
2737168 141 Ill. 2d at 242
2737168 141 Ill. 2d 204
2737168 Holmes, 141
6596585 Beckwith
6596585 18 W. Va. 103
6596585 Thompson, 18
6596585 18 W. Va. 135
6849206 Chase, 174
6849206 174 Wash. 363

Time Chart

[Image: time chart]

Generated Files

Branch 1 Output
Branch 2 Output
Full Output CSV

@flooie flooie merged commit caa9097 into main Mar 6, 2025
14 checks passed
@flooie flooie deleted the 221-handle-volume-nominative-overlaps branch March 6, 2025 17:03