184 changes: 104 additions & 80 deletions skills/huggingface-papers/SKILL.md
@@ -1,6 +1,6 @@
---
name: huggingface-papers
description: Look up and read Hugging Face paper pages in markdown, and use the papers API for structured metadata such as authors, linked models/datasets/spaces, Github repo and project page. Use when the user shares a Hugging Face paper page URL, an arXiv URL or ID, or asks to summarize, explain, or analyze an AI research paper.
description: Look up, read, search, and list Hugging Face paper pages using the `hf papers` CLI or the papers REST API. Fetch paper content as markdown, get structured metadata (authors, linked models/datasets/spaces, Github repo, project page), search papers by keyword, and browse the daily papers feed. Use when the user shares a Hugging Face paper page URL, an arXiv URL or ID, or asks to summarize, explain, analyze, search, or list AI research papers.
Collaborator


I just want to add AI/ML so that the agent is more encouraged to use it without mention of HF papers.

Suggested change
description: Look up, read, search, and list Hugging Face paper pages using the `hf papers` CLI or the papers REST API. Fetch paper content as markdown, get structured metadata (authors, linked models/datasets/spaces, Github repo, project page), search papers by keyword, and browse the daily papers feed. Use when the user shares a Hugging Face paper page URL, an arXiv URL or ID, or asks to summarize, explain, analyze, search, or list AI research papers.
description: Look up, read, search, and list artificial intelligence and machine learning paper pages on Hugging Face using the `hf papers` CLI. Fetch paper content as markdown, get structured metadata (authors, linked models/datasets/spaces, Github repo, project page), search papers by keyword, and browse the daily papers feed. Use when the user shares a Hugging Face paper page URL, an arXiv URL or ID, or asks to summarize, explain, analyze, search, or list AI research papers.

---

# Hugging Face Paper Pages
@@ -13,7 +13,7 @@ Hugging Face Paper pages (hf.co/papers) is a platform built on top of arXiv (arx

Whenever someone mentions an HF paper or arXiv abstract/PDF URL in a model card, dataset card, or README of a Space repository, the paper is automatically indexed. Note that not all papers indexed on Hugging Face are also submitted to daily papers; submission is mainly a way of promoting a research paper. Papers can only be submitted to daily papers up to 14 days after their publication date on arXiv.

The Hugging Face team has built an easy-to-use API to interact with paper pages. Content of the papers can be fetched as markdown, or structured metadata can be returned such as author names, linked models/datasets/spaces, linked Github repo and project page.
The Hugging Face team has built an easy-to-use API and CLI to interact with paper pages. Content of the papers can be fetched as markdown, or structured metadata can be returned such as author names, linked models/datasets/spaces, linked Github repo and project page.

## When to Use
Collaborator


The agent decides whether to load the skill based only on the frontmatter description, so this "When to Use" section can't influence triggering, no?


@@ -22,6 +22,7 @@ The Hugging Face team has built an easy-to-use API to interact with paper pages.
- User shares an arXiv URL (e.g. `https://arxiv.org/abs/2602.08025` or `https://arxiv.org/pdf/2602.08025`)
- User mentions an arXiv ID (e.g. `2602.08025`)
- User asks you to summarize, explain, or analyze an AI research paper
- User asks to search, list, or browse AI research papers

## Parsing the paper ID

@@ -36,142 +37,165 @@ It's recommended to parse the paper ID (arXiv ID) from whatever the user provide
| `2602.08025v1` | `2602.08025v1` |
| `2602.08025` | `2602.08025` |

This allows you to provide the paper ID into any of the hub API endpoints mentioned below.
This allows you to provide the paper ID into any of the CLI commands or API endpoints mentioned below.
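
As a minimal sketch of the parsing step, the helper below pulls an arXiv ID (keeping any version suffix) out of a URL or a bare ID. The function name `extract_arxiv_id` is hypothetical and not part of the `hf` CLI, and the regular expression assumes new-style arXiv IDs only (four digits, a dot, four or five digits, optional `vN`).

```bash
# Hypothetical helper (not part of the `hf` CLI): print the first arXiv ID
# found in the argument, keeping any version suffix. Prints nothing if no
# ID is found.
extract_arxiv_id() {
  # Assumed ID shape: 4 digits, a dot, 4-5 digits, optional vN suffix.
  printf '%s\n' "$1" | grep -oE '[0-9]{4}\.[0-9]{4,5}(v[0-9]+)?' | head -n 1
}
```

For example, `hf papers read "$(extract_arxiv_id 'https://arxiv.org/abs/2602.08025')"` would pass `2602.08025` to the CLI.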

### Fetch the paper page as markdown
## `hf papers` CLI (preferred)

The `hf` CLI (part of the `huggingface_hub` package) provides a convenient way to interact with papers directly from the terminal. Prefer the CLI over raw API calls when possible.

The content of a paper can be fetched as markdown like so:
### Read a paper as markdown

```bash
curl -s "https://huggingface.co/papers/{PAPER_ID}.md"
hf papers read {PAPER_ID}
```

This should return the Hugging Face paper page as markdown. This relies on the HTML version of the paper at https://arxiv.org/html/{PAPER_ID}.
This prints the full paper content as markdown to stdout. It relies on the HTML version of the paper at https://arxiv.org/html/{PAPER_ID}.

There are 2 exceptions:
- Not all arXiv papers have an HTML version. If the HTML version of the paper does not exist, then the content falls back to the HTML of the Hugging Face paper page.
- If it results in a 404, it means the paper is not yet indexed on hf.co/papers. See [Error handling](#error-handling) for info.
- If the paper is not indexed on hf.co/papers, the command exits with an error: `Paper '{PAPER_ID}' not found on the Hub.`

Alternatively, you can request markdown from the normal paper page URL, like so:
### Get structured metadata (JSON)

```bash
curl -s -H "Accept: text/markdown" "https://huggingface.co/papers/{PAPER_ID}"
hf papers info {PAPER_ID}
```

### Paper Pages API Endpoints

All endpoints use the base URL `https://huggingface.co`.
This prints structured JSON metadata that can include:

#### Get structured metadata
- authors (names and Hugging Face usernames, in case they have claimed the paper)
- media URLs (uploaded when submitting the paper to Daily Papers)
- summary (abstract) and AI-generated summary
- project page and GitHub repository
- organization and engagement metadata (number of upvotes)

Fetch the paper metadata as JSON using the Hugging Face REST API:
### Search papers

```bash
curl -s "https://huggingface.co/api/papers/{PAPER_ID}"
hf papers search "vision language"
hf papers search "attention mechanism" --limit 10
hf papers search "diffusion" --format json
```

This returns structured metadata that can include:
This performs hybrid semantic and full-text search over paper titles, authors, and content.

- authors (names and Hugging Face usernames, in case they have claimed the paper)
- media URLs (uploaded when submitting the paper to Daily Papers)
- summary (abstract) and AI-generated summary
- project page and GitHub repository
- organization and engagement metadata (number of upvotes)
Options:
- `--limit` (default 20): number of results
- `--format` (`table` or `json`): output format
- `--quiet`: only print paper IDs

To find models linked to the paper, use:
### List daily papers

```bash
curl https://huggingface.co/api/models?filter=arxiv:{PAPER_ID}
hf papers ls
hf papers ls --sort trending
hf papers ls --date 2025-01-23
hf papers ls --date today
hf papers ls --week 2025-W09
hf papers ls --month 2025-02
hf papers ls --submitter akhaliq
hf papers ls --format json
```

To find datasets linked to the paper, use:
Options:
- `--date`: date in ISO format (`YYYY-MM-DD`) or `today`
- `--week`: ISO week, e.g. `2025-W09`
- `--month`: month in ISO format, e.g. `2025-02`
- `--submitter`: filter by Hub username of the submitter
- `--sort`: `publishedAt` (default) or `trending`
- `--limit` (default 50): number of results
- `--format` (`table` or `json`): output format
- `--quiet`: only print paper IDs

## Paper Pages REST API

The REST API can be used as an alternative to the CLI, or for endpoints not yet covered by the CLI. All endpoints use the base URL `https://huggingface.co`.

### Fetch the paper page as markdown

```bash
curl https://huggingface.co/api/datasets?filter=arxiv:{PAPER_ID}
curl -s "https://huggingface.co/papers/{PAPER_ID}.md"
```

To find spaces linked to the paper, use:
Alternatively, you can request markdown from the normal paper page URL:

```bash
curl https://huggingface.co/api/spaces?filter=arxiv:{PAPER_ID}
curl -s -H "Accept: text/markdown" "https://huggingface.co/papers/{PAPER_ID}"
```

#### Claim paper authorship
### Get structured metadata

```bash
curl -s "https://huggingface.co/api/papers/{PAPER_ID}"
```

Claim authorship of a paper for a Hugging Face user:
### Find linked models, datasets, and spaces

```bash
curl "https://huggingface.co/api/settings/papers/claim" \
--request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $HF_TOKEN" \
--data '{
"paperId": "{PAPER_ID}",
"claimAuthorId": "{AUTHOR_ENTRY_ID}",
"targetUserId": "{USER_ID}"
}'
curl "https://huggingface.co/api/models?filter=arxiv:{PAPER_ID}"
curl "https://huggingface.co/api/datasets?filter=arxiv:{PAPER_ID}"
curl "https://huggingface.co/api/spaces?filter=arxiv:{PAPER_ID}"
```

- Endpoint: `POST /api/settings/papers/claim`
- Body:
- `paperId` (string, required): arXiv paper identifier being claimed
- `claimAuthorId` (string): author entry on the paper being claimed, 24-char hex ID
- `targetUserId` (string): HF user who should receive the claim, 24-char hex ID
- Response: paper authorship claim result, including the claimed paper ID
### Search papers

#### Get daily papers
```bash
curl -s "https://huggingface.co/api/papers/search?q=vision+language&limit=20"
```

Fetch the Daily Papers feed:
- Endpoint: `GET /api/papers/search`
- Query parameters:
- `q` (string): search query, max length 250
- `limit` (integer): number of results, between 1 and 120
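
The documented limits on `q` and `limit` can be checked locally before a request is sent. The sketch below is one possible way to do that; `papers_search_url` is a hypothetical helper, the default of 20 simply mirrors the example above, and the encoding is deliberately minimal (spaces become `+`, other reserved characters are not encoded).

```bash
# Hypothetical helper: build a URL for GET /api/papers/search after checking
# the documented limits (query up to 250 characters, limit between 1 and 120).
papers_search_url() {
  query=$1
  limit=${2:-20}  # assumed default, mirroring the example request
  if [ "${#query}" -gt 250 ]; then
    echo "query longer than 250 characters" >&2
    return 1
  fi
  # Reject anything that is not a plain non-negative integer.
  case $limit in
    ''|*[!0-9]*)
      echo "limit must be an integer" >&2
      return 1
      ;;
  esac
  if [ "$limit" -lt 1 ] || [ "$limit" -gt 120 ]; then
    echo "limit must be between 1 and 120" >&2
    return 1
  fi
  # Minimal encoding only: spaces become '+'.
  encoded=$(printf '%s' "$query" | tr ' ' '+')
  printf 'https://huggingface.co/api/papers/search?q=%s&limit=%s\n' "$encoded" "$limit"
}
```

A request could then be issued with `curl -s "$(papers_search_url 'vision language' 10)"`.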

### Get daily papers

```bash
curl -s -H "Authorization: Bearer $HF_TOKEN" \
"https://huggingface.co/api/daily_papers?p=0&limit=20&date=2017-07-21&sort=publishedAt"
curl -s "https://huggingface.co/api/daily_papers?limit=20&date=2025-01-23&sort=publishedAt"
```

- Endpoint: `GET /api/daily_papers`
- Query parameters:
- `p` (integer): page number
- `limit` (integer): number of results, between 1 and 100
- `date` (string): RFC 3339 full-date, for example `2017-07-21`
- `week` (string): ISO week, for example `2024-W03`
- `month` (string): month value, for example `2024-01`
- `date` (string): RFC 3339 full-date, for example `2025-01-23`
- `week` (string): ISO week, for example `2025-W09`
- `month` (string): month value, for example `2025-02`
- `submitter` (string): filter by submitter
- `sort` (enum): `publishedAt` or `trending`
- Response: list of daily papers

#### List papers

List arXiv papers sorted by published date:
### List papers

```bash
curl -s -H "Authorization: Bearer $HF_TOKEN" \
"https://huggingface.co/api/papers?cursor={CURSOR}&limit=20"
curl -s "https://huggingface.co/api/papers?cursor={CURSOR}&limit=20"
```

- Endpoint: `GET /api/papers`
- Query parameters:
- `cursor` (string): pagination cursor
- `limit` (integer): number of results, between 1 and 100
- Response: list of papers

#### Search papers

Perform hybrid semantic and full-text search on papers:
### Claim paper authorship

```bash
curl -s -H "Authorization: Bearer $HF_TOKEN" \
"https://huggingface.co/api/papers/search?q=vision+language&limit=20"
curl "https://huggingface.co/api/settings/papers/claim" \
--request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $HF_TOKEN" \
--data '{
"paperId": "{PAPER_ID}",
"claimAuthorId": "{AUTHOR_ENTRY_ID}",
"targetUserId": "{USER_ID}"
}'
```

This searches over the paper title, authors, and content.

- Endpoint: `GET /api/papers/search`
- Query parameters:
- `q` (string): search query, max length 250
- `limit` (integer): number of results, between 1 and 120
- Response: matching papers
- Endpoint: `POST /api/settings/papers/claim`
- Body:
- `paperId` (string, required): arXiv paper identifier being claimed
- `claimAuthorId` (string): author entry on the paper being claimed, 24-char hex ID
- `targetUserId` (string): HF user who should receive the claim, 24-char hex ID

#### Index a paper
### Index a paper

Insert a paper from arXiv by ID. If the paper is already indexed, only its authors can re-index it:

@@ -189,9 +213,8 @@ curl "https://huggingface.co/api/papers/index" \
- Body:
- `arxivId` (string, required): arXiv ID to index, for example `2301.00001`
- Pattern: `^\d{4}\.\d{4,5}$`
- Response: empty JSON object on success
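
Because the `arxivId` pattern has no version suffix, an ID such as `2602.08025v1` would not match it as-is. The sketch below normalizes an ID before it is sent; `to_indexable_arxiv_id` is a hypothetical helper, and stripping the suffix (rather than rejecting the ID) is an assumption about the desired behavior, not something the API documents.

```bash
# Hypothetical helper: normalize an ID for the `arxivId` field of the index
# endpoint. Stripping the version suffix is an assumption, not documented API
# behavior.
to_indexable_arxiv_id() {
  # Drop a trailing version suffix such as v1.
  base=$(printf '%s' "$1" | sed -E 's/v[0-9]+$//')
  # Mirror the documented pattern ^\d{4}\.\d{4,5}$ with POSIX character classes.
  if printf '%s\n' "$base" | grep -Eq '^[0-9]{4}\.[0-9]{4,5}$'; then
    printf '%s\n' "$base"
  else
    echo "invalid arXiv ID for indexing: $1" >&2
    return 1
  fi
}
```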

#### Update paper links
### Update paper links

Update the project page, GitHub repository, or submitting organization for a paper. The requester must be the paper author, the Daily Papers submitter, or a papers admin:

@@ -214,11 +237,11 @@ curl "https://huggingface.co/api/papers/{PAPER_OBJECT_ID}/links" \
- `githubRepo` (string, nullable): GitHub repository URL
- `organizationId` (string, nullable): organization ID, 24-char hex ID
- `projectPage` (string, nullable): project page URL
- Response: empty JSON object on success

## Error Handling

- **404 on `https://huggingface.co/papers/{PAPER_ID}` or `md` endpoint**: the paper is not indexed on Hugging Face paper pages yet.
- **CLI errors**: `hf papers info` and `hf papers read` print `Paper '{PAPER_ID}' not found on the Hub.` when the paper does not exist.
- **404 on `https://huggingface.co/papers/{PAPER_ID}` or `.md` endpoint**: the paper is not indexed on Hugging Face paper pages yet.
- **404 on `/api/papers/{PAPER_ID}`**: the paper may not be indexed on Hugging Face paper pages yet.
- **Paper ID not found**: verify the extracted arXiv ID, including any version suffix
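
One way to act on these cases in a script is to capture the HTTP status with curl's `-w '%{http_code}'` option and map it to a hint. The sketch below assumes only the statuses described above; `explain_paper_status` and its messages are hypothetical.

```bash
# Hypothetical helper: turn an HTTP status from a paper endpoint into a hint.
explain_paper_status() {
  case "$1" in
    200) echo "ok" ;;
    404) echo "not indexed on Hugging Face paper pages yet; verify the arXiv ID and any version suffix" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Example usage (performs a network request, so it is left commented out):
# status=$(curl -s -o /dev/null -w '%{http_code}' "https://huggingface.co/api/papers/2602.08025")
# explain_paper_status "$status"
```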

@@ -233,7 +256,8 @@ If the Hugging Face paper page does not contain enough detail for the user's que

## Notes

- No authentication is required for public paper pages.
- Write endpoints such as claim authorship, index paper, and update paper links require `Authorization: Bearer $HF_TOKEN`.
- Prefer the `.md` endpoint for reliable machine-readable output.
- Prefer `/api/papers/{PAPER_ID}` when you need structured JSON fields instead of page markdown.
- Prefer the `hf papers` CLI for reading, searching, and listing papers.
- No authentication is required for public paper pages (CLI or REST API).
- Write endpoints such as claim authorship, index paper, and update paper links require `Authorization: Bearer $HF_TOKEN` (REST API only, not yet available in the CLI).
- Prefer `hf papers read {PAPER_ID}` or the `.md` endpoint for reliable machine-readable output.
- Prefer `hf papers info {PAPER_ID}` or `/api/papers/{PAPER_ID}` when you need structured JSON fields instead of page markdown.