Commit de86c4c

Contributing pass (#553)
* Update README.md * Quick start of CONTRIBUTING.md. * WIP CONTRIBUTING.md * UI options in CONTRIBUTING.md. * Update CONTRIBUTING.md * Update CONTRIBUTING.md * Update CONTRIBUTING.md * Add programmatic example. * Update README.md * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update README.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update README.md Co-authored-by: Victor SANH <[email protected]> * Update README.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> * Update CONTRIBUTING.md Co-authored-by: Victor SANH <[email protected]> Co-authored-by: Victor SANH <[email protected]>
1 parent d239a8a commit de86c4c

2 files changed

+121
-73
lines changed

CONTRIBUTING.md

+94-62
Original file line number | Diff line number | Diff line change
@@ -1,68 +1,57 @@
11
# Contributing
22

3-
One of the best ways to contribute is by writing templates!
3+
One of the best ways to contribute is by writing prompts!
44

5-
### What are Templates?
5+
### What are Prompts?
66

7-
A template is a piece of code written in a templating language called
7+
A prompt consists of a template (an input template and a target template), along with a collection of associated metadata. A template is a piece of code written in a templating language called
88
[Jinja](https://jinja.palletsprojects.com/en/3.0.x/). A template defines
99
a function that maps an example from a dataset in the
1010
[Hugging Face datasets library](https://huggingface.co/datasets) to two strings of
11-
text. The first is called the _prompt_ which provides all information that
11+
text. The first is called the _input_ which provides all information that
1212
will be available to solve a task, such as the instruction and the context.
13-
The second piece is called the _output_, which is the desired response to the
13+
The second piece is called the _target_, which is the desired response to the
1414
prompt.
1515

16-
### Quick-Start Guide to Writing Templates
16+
### Quick-Start Guide to Writing Prompts
1717

1818
1. **Set up the app.** Fork the app and set up using the
1919
[README](https://github.com/bigscience-workshop/promptsource/blob/main/README.md).
20-
1. **Select a dataset.** Go to the tracking spreadsheet
21-
[here](https://docs.google.com/spreadsheets/d/10SBt96nXutB49H52PV2Lvne7F1NvVr_WZLXD8_Z0JMw/)
22-
and find an unclaimed one. Put your name under "Who's Prompting it?" and
23-
mark it yellow to show it's in progress.
2420
1. **Examine the dataset.** Select or type the dataset into the dropdown in the app.
2521
If the dataset has subsets (subsets are not the same as splits), you can select
26-
which one to work on. Note that templates are subset-specific. You can find
22+
which one to work on. Note that prompts are subset-specific. You can find
2723
out background information on the dataset by reading the information in the
2824
app. The dataset is a collection of examples, and each example is a Python
2925
dictionary. The sidebar will tell you the schema that each example has.
30-
1. **Start a new template**. Enter a name for your first template and hit "Create."
31-
You can always update the name later. If you want to cancel the template, select
32-
"Delete Template."
33-
1. **Write the template**. In the box labeled "Template," enter a Jinja expression.
34-
See the [getting started guide](#getting-started-using-jinja-to-write-templates)
26+
1. **Start a new prompt**. Enter a name for your first prompt and hit "Create."
27+
You can always update the name later. If you want to cancel the prompt, select
28+
"Delete Prompt."
29+
1. **Write the prompt**. In the box labeled "Template," enter a Jinja expression.
30+
See the [getting started guide](#getting-started-using-jinja-to-write-prompts)
3531
and [cookbook](#jinja-cookbook) for details on how to write templates.
36-
1. **Add a reference.** If your template was inspired by a paper, note the
37-
reference in the "Template Reference" section. You can also add a description of
38-
what your template does.
39-
1. **Save the template**. Hit the "Save" button. The output of the template
32+
1. **Save the prompt**. Hit the "Save" button. The output of the prompt
4033
applied to the current example will appear in the right sidebar.
41-
1. **Verify the template**. Check that you didn't miss any case by scrolling
34+
1. **Verify the prompt**. Check that you didn't miss any case by scrolling
4235
through a handful of examples of the prompted dataset using the
4336
"Prompted dataset viewer" mode.
44-
1. **Write between 5 and 10 templates**. Repeat the steps 4 to 8 to create between 5
45-
and 10 (more if you want!) templates per dataset/subset. Feel free to introduce
37+
1. **Write between 5 and 10 prompts**. Repeat steps 3 to 6 to create between 5
38+
and 10 (more if you want!) prompts per dataset/subset. Feel free to introduce
4639
a mix of formats, some that follow the templates listed in the [best practices](#best-practices)
4740
and some that are more diverse in the format and the formulation.
48-
1. **Duplicate the template(s).** If the dataset you have chosen bear the same
41+
1. **Duplicate the prompt(s).** If the dataset you have chosen bears the same
4942
format as other datasets (for instance, `MNLI` and `SNLI` have identical formats),
50-
you can simply claim these datasets and duplicate the templates you have written
51-
to these additional datasets. The most straighforward way to do it is to copy-paste
52-
the `templates.yaml` file in right subfolder (the `templates` folder is broken down by dataset/subset).
53-
Please make sure you adapt the `dataset` and `subset` keys in the yaml file. You don't need to wory
54-
about the `id` as it is unique for a given dataset/subset.
43+
you can simply duplicate the prompts you have written to these additional datasets.
5544
1. **Upload the template(s).** Submit a PR using the instructions
56-
[here](#uploading-templates).
45+
[here](#uploading-prompts).
5746

58-
## Getting Started Using Jinja to Write Templates
47+
## Getting Started Using Jinja to Write Prompts
5948

6049
Here is a quick crash course on using [Jinja](https://jinja.palletsprojects.com/en/3.0.x/)
6150
to write templates. More advanced usage is in the [cookbook](#jinja-cookbook).
6251

6352
Generally, in a template, you'll want to use a mix of hard-coded data that is
6453
task-specific and stays the same across examples, and commands that tailor the
65-
prompt and output to a specific example.
54+
input and target to a specific example.
6655

6756
To write text that should be rendered as written, just write it normally. The
6857
following "template" will produce the same text every time:
@@ -107,40 +96,82 @@ The choices are {{"a"}}, {{"b"}}, and {{"c"}}.
10796
```
10897
You can leave binary options like yes/no, true/false, etc. unprotected.
10998

110-
Finally, remember that a template must produce two strings: a prompt and an output.
99+
Finally, remember that a template must produce two strings: an input and a target.
111100
To separate these two pieces, use three vertical bars `|||`.
112-
So, a complete template for AG News could be:
101+
So, a complete template for SQuAD could be:
113102
```jinja2
114-
{{text}}
115-
Is this a piece of news regarding {{"world politics"}}, {{"sports"}}, {{"business"}}, or {{"science and technology"}}? |||
116-
{{ ["World politics", "Sports", "Business", "Science and technology"][label] }}
103+
I'm working on the final exam for my class and am trying to figure out the answer
104+
to the question "{{question}}" I found the following info on Wikipedia and I think
105+
it has the answer. Can you tell me the answer?
106+
{{context}}
107+
|||
108+
{{answers["text"][0]}}
117109
```
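To make the role of the separator concrete, here is a minimal Python sketch of how rendered template text could be divided into an input and a target. The function name `split_rendered` and the whitespace stripping are assumptions made for illustration; this is not PromptSource's actual implementation.

```python
from typing import Tuple

SEPARATOR = "|||"  # the three vertical bars described above


def split_rendered(rendered: str) -> Tuple[str, str]:
    """Split rendered template text into (input, target).

    Hypothetical helper: assumes the separator appears exactly once and
    that surrounding whitespace is not significant.
    """
    parts = rendered.split(SEPARATOR)
    if len(parts) != 2:
        raise ValueError("expected exactly one '|||' separator")
    input_text, target_text = parts
    return input_text.strip(), target_text.strip()


# A hand-written string standing in for the output of a Jinja render.
rendered = "Can you tell me the answer to the question?\nSome context.\n|||\nAn answer"
input_text, target_text = split_rendered(rendered)
```

A template with no separator, or with more than one, raises an error in this sketch instead of silently picking a split point.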
118110

119-
## Best Practices
111+
## Options
112+
In addition to the template itself, you can fill out several other fields in the app.
113+
* **Prompt Reference.** If your template was inspired by a paper, note the
114+
reference in the "Prompt Reference" section. You can also add a description of
115+
what your template does.
116+
* **Original Task?** The checkbox should be checked if the template requires solving a
117+
task that the underlying dataset is used to study. For example, a template that asks a
118+
question from a question answering dataset would be an original task template, but one that asks
119+
to generate a question for a given answer would not.
120+
* **Choices in Template?** The checkbox should be checked if the input explicitly indicates
121+
the options for the possible outputs (regardless of whether `answer_choices` is used).
122+
* **Metrics.** Use the multiselect widget to select all metrics commonly used to evaluate
123+
this task. Choose “Other” if there is one that is not included in the list.
124+
* **Answer Choices.** If the prompt has a small set of possible outputs (e.g., Yes/No,
125+
class labels, entailment judgements, etc.), then the prompt should define and use answer
126+
choices as follows. This allows evaluation to consider just the possible targets for
127+
scoring model outputs. The answer choices field is a Jinja expression that should produce
128+
a `|||` separated list of all possible targets. If the choices don't change from example
129+
to example, then you can just list them. For example, the answer choices for AG News are
130+
```jinja2
131+
World News ||| Sports ||| Business ||| Science and Technology
132+
```
133+
Note that whitespace is stripped from the ends of the choices. If answer choices are set,
134+
then they are also available to Jinja in the prompt itself in the form of a list called
135+
`answer_choices`. You should use this list in both input and target templates so that the
136+
resulting inputs and targets match the answer choices field exactly. For example, a prompt
137+
for AG News could use `answer_choices` like this:
138+
```jinja2
139+
{{text}} Which of the following sections of a newspaper would
140+
this article likely appear in? {{answer_choices[0]}}, {{answer_choices[1]}},
141+
{{answer_choices[2]}}, or {{answer_choices[3]}}?
142+
|||
143+
{{ answer_choices[label] }}
144+
```
145+
Since Answer Choices is a Jinja expression that has access to the example, it can also be used
146+
to extract example-specific choices from the underlying data. For example, in AI2 ARC, we could
147+
use
148+
```jinja2
149+
{{choices.text | join("|||")}}
150+
```
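As a rough illustration of the Answer Choices field, the sketch below turns a rendered `|||`-separated string into a list and indexes it with a label, mirroring `answer_choices[label]` in the template above. The helper name `parse_answer_choices` is hypothetical and the code only approximates the behavior described here (splitting on `|||` and stripping whitespace from the ends of each choice).

```python
from typing import List


def parse_answer_choices(rendered_choices: str) -> List[str]:
    """Split a rendered answer-choices string on '|||' and strip each choice.

    Hypothetical helper for illustration; not the library's own code.
    """
    return [choice.strip() for choice in rendered_choices.split("|||")]


answer_choices = parse_answer_choices(
    "World News ||| Sports ||| Business ||| Science and Technology"
)

# An integer class label selects the target, as in {{ answer_choices[label] }}.
label = 2
target = answer_choices[label]
```

Because the same list feeds both the input and the target, the target text always matches one of the listed choices exactly.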
120151

152+
## Best Practices
121153

122-
* **Writing outputs.** When writing a template for a task that requires outputting
123-
a label, don't use articles or other stop words before the label name in the output.
124-
For example, in TREC, the output should be "Person", not "A person". The reason
125-
is that evaluations often look at the first word of the generated output to determine
126-
correctness.
127-
* **Skipping datasets.** You might find a dataset in the spreadsheet that it
128-
doesn't make sense to write templates for. For example, a dataset might just be
129-
text without any annotations. For other cases, ask on Slack. If skipping a dataset,
130-
mark it in red on the spreadsheet.
131-
* **Choosing input and output pairs.** Lots of datasets have multiple columns that can be
132-
combined to form different (input, output) pairs i.e. different "tasks". Don't hesitate to
154+
* **Writing target templates.** The target template should only contain the answer to the task.
155+
It should not contain any extra text such as “The answer is…” (unless that extra text is also in
156+
`answer_choices`). If `answer_choices` is populated, the output should only contain the values
157+
in `answer_choices`.
158+
* **Formatting multiple-choice questions.** If the target should match the name of the choice
159+
(e.g., “World News”), then it should list the choices either as part of a grammatical question
160+
or a list with the marker for each (e.g., dashes). If the target should indicate the choice from
161+
the list (e.g., “A,” “Explanation 1,” etc.), then it should list the choices with the indicator
162+
before each one.
163+
* **Choosing input and target pairs.** Lots of datasets have multiple columns that can be
164+
combined to form different (input, target) pairs i.e. different "tasks". Don't hesitate to
133165
introduce some diversity by prompting a given dataset into multiple tasks and provide some
134166
description in the "Template Reference" text box. An example is given
135167
in the already prompted `movie_rationales`.
136-
* **Task Template Checkbox** While there are many different ways to prompt the tasks, only some of them correspond to the original intention of the dataset. For instance, for a summary dataset you can generate a summary or hallucinate an article. However, only the first was the true original task for the dataset. Use the *Task template* check box to indicate true task prompts. (We realize there are some corner cases, for instance, if there was no original task, you should leave this blank. If there are multiple original tasks you can check it for each of them. If you are confused for your dataset, consult with us in slack.)
137-
* **Filtering templates.** If a template is applied to an example and produces an
138-
empty string, that template/example pair will be skipped. (Either the entire output
168+
* **Filtering prompts.** If a prompt is applied to an example and produces an
169+
empty string, that prompt/example pair will be skipped. (Either the entire target
139170
is whitespace or the text on either side of the separator `|||` is whitespace.)
140-
You can therefore create templates that only apply to a subset of the examples by
171+
You can therefore create prompts that only apply to a subset of the examples by
141172
wrapping them in Jinja if statements. For example, in the `TREC` dataset, there
142173
are fine-grained categories that are only applicable to certain coarse-grained categories.
143-
We can capture this with the following template:
174+
We can capture this with the following prompt:
144175
```jinja2
145176
{% if label_coarse == 0 %}
146177
Is this question asking for a {{"definition"}}, a {{"description"}}, a {{"manner of action"}}, or a {{"reason"}}?
@@ -149,9 +180,9 @@ Is this question asking for a {{"definition"}}, a {{"description"}}, a {{"manner
149180
{{ {0: "Manner", 7: "Definition", 9: "Reason", 12: "Description"}[label_fine] }}
150181
{% endif %}
151182
```
152-
* **Conditional generation format.** Always specify the output label `y` and separate it from the prompt
153-
by indicating the vertical bars `|||`. The `y` will be generated by a generative model
154-
conditioned on the prompted input you wrote. You can always transform an "infix" prompt format
183+
* **Conditional generation format.** Always specify the target and separate it from the prompt
184+
with three vertical bars `|||`. The target will be generated by a generative model
185+
conditioned on the input you wrote. You can always transform an "infix" prompt format
155186
```jinja2
156187
Given that {{premise}}, it {{ ["must be true", "might be true", "must be false"][label] }} that {{hypothesis}}
157188
```
@@ -181,10 +212,13 @@ Determine the relation between the following two sentences. The relations are en
181212
{{premise}}
182213
{{hypothesis}} ||| {{label}}
183214
```
215+
* **Setting variables.** You might want to use the Jinja statement `{% set %}` to define a variable. If you do,
216+
do it at the beginning of the prompt, outside any conditional statements, so that the automatic prompt checks
217+
recognize that the variable is defined.
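
The skipping rule from the "Filtering prompts" item can be sketched in Python as follows. This is only an approximation of the rule as stated there (skip when the whole output is whitespace, or when the text on either side of `|||` is whitespace); the function name `keep_pair` is an assumption and not part of PromptSource.

```python
def keep_pair(rendered: str) -> bool:
    """Return True if a rendered prompt/example pair should be kept.

    Hypothetical helper: a pair is dropped when the rendered text is all
    whitespace or when any side of the '|||' separator is all whitespace.
    """
    if not rendered.strip():
        return False
    return all(part.strip() for part in rendered.split("|||"))


# Hand-written strings standing in for rendered templates. The second one
# mimics a template whose Jinja `if` condition was false for the example.
rendered_examples = [
    "Is this question asking for a definition? ||| Definition",
    "\n\n",
    "Some input with no target ||| ",
]
kept = [text for text in rendered_examples if keep_pair(text)]
```

Only the first string survives the filter, which is how a conditional template ends up applying to a subset of the examples.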
184218

185219
## More Examples
186220

187-
Here are a few interesting examples of templates with explanations.
221+
Here are a few interesting examples of prompts with explanations.
188222

189223
Here's one for `hellaswag`:
190224
```jinja2
@@ -239,15 +273,14 @@ the label might be unknown, so the pieces are wrapped in if statements.
239273
Second, notice that the choices `Yes or No` are not escaped. Yes/no, true/false
240274
are choices that do not need to be escaped (unlike categories).
241275

242-
## Uploading Templates
276+
## Uploading Prompts
243277

244278
Once you save or modify a template, the corresponding file inside the `templates`
245279
directory in the repo will be modified. To upload it, follow these steps:
246280
1. Run `make style` and `make quality`.
247281
2. Commit the modified template files (anything under `templates`) to git.
248282
3. Push to your fork on GitHub.
249283
4. Open a pull request against `main` on the PromptSource repo.
250-
5. When the PR is merged into main, mark the dataset in green on the spreadsheet.
251284

252285

253286
## Jinja Cookbook
@@ -280,5 +313,4 @@ do_something_with_a_and_b
280313

281314
Jinja includes lots of complex features but for most instances you likely only
282315
need to use the methods above. If there's something you're not sure how to do,
283-
just message the prompt engineering group on Slack. We'll collect other frequent
284-
patterns here.
316+
just open an issue. We'll collect other frequent patterns here.

README.md

+27-11
Original file line number | Diff line number | Diff line change
@@ -27,24 +27,40 @@ To host a public streamlit app, launch it with
2727
streamlit run promptsource/app.py -- -r
2828
```
2929

30+
## Prompting an Example:
31+
You can use Promptsource with [Datasets](https://huggingface.co/docs/datasets/) to create
32+
prompted examples:
33+
```python
34+
# Get an example
35+
from datasets import load_dataset
36+
dataset = load_dataset("ag_news")
37+
example = dataset["train"][0]
38+
39+
# Prompt it
40+
from promptsource.templates import TemplateCollection
41+
# Get all the prompts
42+
collection = TemplateCollection()
43+
# Get all the AG News prompts
44+
ag_news_prompts = collection.get_dataset("ag_news")
45+
# Select a prompt by name
46+
prompt = ag_news_prompts["classify_question_first"]
47+
48+
result = prompt.apply(example)
49+
print("INPUT: ", result[0])
50+
print("TARGET: ", result[1])
51+
```
52+
3053
## Contributing
3154
Contribution guidelines and step-by-step *HOW TO* are described [here](CONTRIBUTING.md).
3255

33-
## Writing Templates
34-
A prompt template is expressed in [Jinja](https://jinja.palletsprojects.com/en/3.0.x/).
56+
## Writing Prompts
57+
A prompt is expressed in [Jinja](https://jinja.palletsprojects.com/en/3.0.x/).
3558

3659
It is rendered using an example from the corresponding Hugging Face datasets library
37-
(a dictionary). The separator ||| should appear once to divide the template into prompt
38-
and output. Generally, the prompt should provide information on the desired behavior,
60+
(a dictionary). The separator ||| should appear once to divide the template into input
61+
and target. Generally, the prompt should provide information on the desired behavior,
3962
e.g., text passage and instructions, and the output should be a desired response.
4063

41-
Here's an example for [AG News](https://huggingface.co/datasets/ag_news):
42-
```jinja
43-
{{text}}
44-
Is this a piece of news regarding {{"world politics"}}, {{"sports"}}, {{"business"}}, or {{"science and technology"}}? |||
45-
{{ ["World politics", "Sports", "Business", "Science and technology"][label] }}
46-
```
47-
4864
For more information, read the [Contribution guidelines](CONTRIBUTING.md).
4965

5066
## Known Issues
