API results for dockets include two different "description" fields #4886

Open
s-taube opened this issue Jan 3, 2025 · 6 comments
@s-taube
Contributor

s-taube commented Jan 3, 2025

There are two different "description" fields in API results for dockets. Users would like to be able to differentiate between the two.

For example:

  • "description": "Declaration in Opposition to Motion"

and

  • "description": "DECLARATION of RUDOLPH W. GIULIANI in Opposition re: 170 MOTION for Order to Show Cause Why Defendant Rudolph W. Giuliani Should not be Held in Contempt for Failing to Comply with the Turnover Orders.. Document filed by Rudolph W. Giuliani. (Attachments: # 1 Exhibit EXHIBIT 1, # 2 Exhibit EXHIBIT 2, # 3 Exhibit EXHIBIT 3, # 4 Exhibit EXHIBIT 4, # 5 Exhibit EXHIBIT 5, # 6 Exhibit EXHIBIT 6, # 7 Exhibit EXHIBIT 7, # 8 Exhibit EXHIBIT 8, # 9 Exhibit EXHIBIT 9, # 10 Exhibit EXHIBIT 10, # 11 Exhibit EXHIBIT 11)..(Cammarata, Joseph) (Entered: 12/24/2024)"

are both from:

{
  "resource_uri": "https://www.courtlistener.com/api/rest/v4/docket-entries/411967648/",
  "id": 411967648,
  "docket": "https://www.courtlistener.com/api/rest/v4/dockets/69015293/",
  "recap_documents": [
    {
      "resource_uri": "https://www.courtlistener.com/api/rest/v4/recap-documents/425199869/",
      "id": 425199869,
      "filepath_local": "recap/gov.uscourts.nysd.626017/gov.uscourts.nysd.626017.204.0.pdf",
      "document_number": "204",
      "attachment_number": null,
      "pacer_doc_id": "127036708025",
      "description": "Declaration in Opposition to Motion",
      … … …
    },
    … … … … …
  ],
  …
  "entry_number": 204,
  "recap_sequence_number": "2024-12-24.004",
  "pacer_sequence_number": 570,
  "description": "DECLARATION of RUDOLPH W. GIULIANI in Opposition re: 170 MOTION for Order to Show Cause Why Defendant Rudolph W. Giuliani Should not be Held in Contempt for Failing to Comply with the Turnover Orders.. Document filed by Rudolph W. Giuliani. (Attachments: # 1 Exhibit EXHIBIT 1, # 2 Exhibit EXHIBIT 2, # 3 Exhibit EXHIBIT 3, # 4 Exhibit EXHIBIT 4, # 5 Exhibit EXHIBIT 5, # 6 Exhibit EXHIBIT 6, # 7 Exhibit EXHIBIT 7, # 8 Exhibit EXHIBIT 8, # 9 Exhibit EXHIBIT 9, # 10 Exhibit EXHIBIT 10, # 11 Exhibit EXHIBIT 11).(Cammarata, Joseph) (Entered: 12/24/2024)",
}
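One way a client could tell the two fields apart is to copy them into differently named keys as soon as the response is parsed. The following is only a sketch: `split_descriptions`, `entry_description`, and `document_descriptions` are hypothetical names, not part of the API, and the sample dict is a trimmed stand-in for the result above with the long text shortened for illustration.

```python
from typing import Any


def split_descriptions(entry: dict[str, Any]) -> dict[str, Any]:
    """Give the entry-level and document-level descriptions distinct names.

    The output keys are hypothetical client-side names, not API fields.
    """
    return {
        "entry_number": entry.get("entry_number"),
        # Top-level "description": the long docket entry text.
        "entry_description": entry.get("description", ""),
        # Nested "description": one short text per item in "recap_documents".
        "document_descriptions": [
            doc.get("description", "") for doc in entry.get("recap_documents", [])
        ],
    }


# Trimmed stand-in for the docket entry shown above (long text shortened).
sample_entry = {
    "id": 411967648,
    "entry_number": 204,
    "description": "DECLARATION of RUDOLPH W. GIULIANI in Opposition re: 170 MOTION (shortened)",
    "recap_documents": [
        {
            "id": 425199869,
            "document_number": "204",
            "description": "Declaration in Opposition to Motion",
        },
    ],
}

result = split_descriptions(sample_entry)
print(result["entry_description"])
print(result["document_descriptions"])
```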

@s-taube s-taube converted this from a draft issue Jan 3, 2025
@mlissner
Member

mlissner commented Jan 4, 2025

@anseljh you said this caught you, and @johnhawkinson, you griped about it, so I'm trying to find an actionable thing that would have helped. The PACER API documentation already has a note about the two description fields:

[Image: screenshot of the note in the PACER API documentation]

(From: https://www.courtlistener.com/help/api/rest/pacer/#docket-entry-endpoint)

Should we put it somewhere else or is there a tweak that would help?

@mlissner mlissner moved this from General Backlog to Backlog Jan 13 - Jan 24 in Sprint (Web Team) Jan 4, 2025
@johnhawkinson
Contributor

@anseljh you said this caught you, and @johnhawkinson, you griped about it, so I'm trying to find an actionable thing that would have helped.

An actionable thing that would have helped is not to have two fields named the same thing.
It is already super-confusing that there are two different schemas that can both contain the relevant field (docket-entries and, within it, recap-documents), but the fact that both schemas have a description is an unforced error.

Changing the schema such that both fields appear as peers is probably extremely challenging (and arguably not correct, although CM/ECF's data model has them as peers, so I don't think that argument holds water), but renaming one or both of the fields so they have different names seems worthy of consideration.

The PACER API documentation already has a note about the two description fields:

People tend to glance at the API output and make decisions based on what they see rather than read the API documentation. I think reading the documentation may unfortunately be an unreasonable expectation.

@mlissner
Member

mlissner commented Jan 4, 2025

So, I'm about 99.9% sure we won't be tweaking the model for this, but maybe explaining the design will help move the conversation forward. As I think you know, there's a 1-to-N relationship between docket entries and documents, and each object can have descriptions.

This means you have something like this:

[Image: diagram of one docket entry with four documents linked to it]

Where you have one docket entry with four documents fk'ed to it. Each document has a short description, and the entry itself has a long one.

Now, in our search engine, this all gets flattened, so we had to resolve it, and we wound up with short_description (for documents) and description (for entries), so maybe that's helpful, but I don't think the current fields are bad. We have id fields on practically every model too, and people don't get confused by those (much!).
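A minimal sketch of that 1-to-N model and the flattening step, assuming nothing about the real implementation: the `Document` and `DocketEntry` dataclasses, the `flatten` helper, and the sample values are all hypothetical. Only the output field names `short_description` and `description` come from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    document_number: str
    description: str  # short, per-document description


@dataclass
class DocketEntry:
    entry_number: int
    description: str  # long, entry-level description
    documents: list[Document] = field(default_factory=list)


def flatten(entry: DocketEntry) -> list[dict[str, object]]:
    """Produce one flat row per document, renaming the colliding fields."""
    return [
        {
            "entry_number": entry.entry_number,
            "document_number": doc.document_number,
            "short_description": doc.description,  # from the document
            "description": entry.description,  # from the docket entry
        }
        for doc in entry.documents
    ]


# Hypothetical data: one entry with four documents, as in the diagram.
entry = DocketEntry(
    entry_number=204,
    description="Long entry-level text",
    documents=[Document("204", f"Short text {i}") for i in range(1, 5)],
)
rows = flatten(entry)
for row in rows:
    print(row["short_description"], "|", row["description"])
```

Each flattened row repeats the entry's long description, while the short description varies per document, which is why the two need different names once they sit side by side.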

I think reading the documentation may unfortunately be an unreasonable expectation.

😭.

Like I say, I don't think this is particularly a bad model, but even if it were, it probably wouldn't be worth changing (we'd need to release API v5, and that's big and difficult). I think people just need to read the docs instead of just playing with the APIs and hoping for the best. FWIW, this isn't a mistake I see often.

@anseljh
Member

anseljh commented Jan 4, 2025 via email

@mlissner
Member

mlissner commented Jan 4, 2025

This may have been a weird fluke based on how or what I was querying, because I’m mostly not running into this anymore with a different query.

It's not so weird. We get a lot of content from the PACER RSS feeds, which only have the short description, not the full one, so this happens a lot, actually.
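For clients that want some text either way, a fallback along these lines may help. It is a sketch under assumptions: it assumes a missing full description shows up as an empty string, `null`, or an absent key in the docket entry, and `best_description` is a hypothetical helper, not an API feature.

```python
from typing import Any


def best_description(entry: dict[str, Any]) -> str:
    """Prefer the entry's long description; otherwise use a document's short one."""
    long_text = (entry.get("description") or "").strip()
    if long_text:
        return long_text
    # Assumption: entries sourced only from RSS have no long description,
    # so fall back to the first non-empty document description.
    for doc in entry.get("recap_documents") or []:
        short_text = (doc.get("description") or "").strip()
        if short_text:
            return short_text
    return ""


# Hypothetical entry that only carries the short, document-level text.
rss_only_entry = {
    "description": "",
    "recap_documents": [{"description": "Declaration in Opposition to Motion"}],
}
fallback = best_description(rss_only_entry)
print(fallback)
```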

I do think the docs are really long and could use a fresh pair of eyes to consider reorganizing them.

Yeah, maybe fresh eyes will help. I just did a huge documentation overhaul maybe six months ago. It made them a lot longer overall, but it did make the main API page significantly shorter.

@mlissner mlissner moved this from Backlog Jan 13 - Jan 24 to General Backlog in Sprint (Web Team) Jan 9, 2025
@mlissner
Member

mlissner commented Jan 9, 2025

I think this winds up being a @s-taube issue, for user research, but I don't think it's a priority because we've recently invested so much time in the API docs.

@mlissner mlissner removed the status in Sprint (Web Team) Jan 9, 2025
@mlissner mlissner moved this to General Backlog in Sprint (Web Team) Jan 9, 2025