CONTRIBUTING.md
+94 -62
@@ -1,68 +1,57 @@
1
1
# Contributing
2
2
3
-
One of the best ways to contribute is by writing templates!
3
+
One of the best ways to contribute is by writing prompts!
4
4
5
-
### What are Templates?
5
+
### What are Prompts?
6
6
7
-
A template is a piece of code written in a templating language called
7
+
A prompt consists of a template (an input template and a target template), along with a collection of associated metadata. A template is a piece of code written in a templating language called
8
8
[Jinja](https://jinja.palletsprojects.com/en/3.0.x/). A template defines
9
9
a function that maps an example from a dataset in the
10
10
[Hugging Face datasets library](https://huggingface.co/datasets) to two strings of
11
-
text. The first is called the _prompt_ which provides all information that
11
+
text. The first is called the _input_ which provides all information that
12
12
will be available to solve a task, such as the instruction and the context.
13
-
The second piece is called the _output_, which is the desired response to the
13
+
The second piece is called the _target_, which is the desired response to the
14
14
prompt.
15
15
16
-
### Quick-Start Guide to Writing Templates
16
+
### Quick-Start Guide to Writing Prompts
17
17
18
18
1. **Set up the app.** Fork the app and set up using the
and find an unclaimed one. Put your name under "Who's Prompting it?" and
23
-
mark it yellow to show it's in progress.
24
20
1. **Examine the dataset.** Select or type the dataset into the dropdown in the app.
25
21
If the dataset has subsets (subsets are not the same as splits), you can select
26
-
which one to work on. Note that templates are subset-specific. You can find
22
+
which one to work on. Note that prompts are subset-specific. You can find
27
23
out background information on the dataset by reading the information in the
28
24
app. The dataset is a collection of examples, and each example is a Python
29
25
dictionary. The sidebar will tell you the schema that each example has.
30
-
1. **Start a new template**. Enter a name for your first template and hit "Create."
31
-
You can always update the name later. If you want to cancel the template, select
32
-
"Delete Template."
33
-
1. **Write the template**. In the box labeled "Template," enter a Jinja expression.
34
-
See the [getting started guide](#getting-started-using-jinja-to-write-templates)
26
+
1. **Start a new prompt**. Enter a name for your first prompt and hit "Create."
27
+
You can always update the name later. If you want to cancel the prompt, select
28
+
"Delete Prompt."
29
+
1. **Write the prompt**. In the box labeled "Template," enter a Jinja expression.
30
+
See the [getting started guide](#getting-started-using-jinja-to-write-prompts)
35
31
and [cookbook](#jinja-cookbook) for details on how to write templates.
36
-
1. **Add a reference.** If your template was inspired by a paper, note the
37
-
reference in the "Template Reference" section. You can also add a description of
38
-
what your template does.
39
-
1. **Save the template**. Hit the "Save" button. The output of the template
32
+
1. **Save the prompt**. Hit the "Save" button. The output of the prompt
40
33
applied to the current example will appear in the right sidebar.
41
-
1. **Verify the template**. Check that you didn't miss any case by scrolling
34
+
1. **Verify the prompt**. Check that you didn't miss any case by scrolling
42
35
through a handful of examples of the prompted dataset using the
43
36
"Prompted dataset viewer" mode.
44
-
1. **Write between 5 and 10 templates**. Repeat the steps 4 to 8 to create between 5
45
-
and 10 (more if you want!) templates per dataset/subset. Feel free to introduce
37
+
1. **Write between 5 and 10 prompts**. Repeat steps 4 to 8 to create between 5
38
+
and 10 (more if you want!) prompts per dataset/subset. Feel free to introduce
46
39
a mix of formats, some that follow the templates listed in the [best practices](#best-practices)
47
40
and some that are more diverse in the format and the formulation.
48
-
1. **Duplicate the template(s).** If the dataset you have chosen bear the same
41
+
1. **Duplicate the prompt(s).** If the dataset you have chosen bears the same
49
42
format as other datasets (for instance, `MNLI` and `SNLI` have identical formats),
50
-
you can simply claim these datasets and duplicate the templates you have written
51
-
to these additional datasets. The most straighforward way to do it is to copy-paste
52
-
the `templates.yaml` file in right subfolder (the `templates` folder is broken down by dataset/subset).
53
-
Please make sure you adapt the `dataset` and `subset` keys in the yaml file. You don't need to wory
54
-
about the `id` as it is unique for a given dataset/subset.
43
+
you can simply duplicate the prompts you have written to these additional datasets.
55
44
1. **Upload the template(s).** Submit a PR using the instructions
56
-
[here](#uploading-templates).
45
+
[here](#uploading-prompts).
57
46
58
-
## Getting Started Using Jinja to Write Templates
47
+
## Getting Started Using Jinja to Write Prompts
59
48
60
49
Here is a quick crash course on using [Jinja](https://jinja.palletsprojects.com/en/3.0.x/)
61
50
to write templates. More advanced usage is in the [cookbook](#jinja-cookbook).
62
51
63
52
Generally, in a template, you'll want to use a mix of hard-coded data that is
64
53
task-specific and stays the same across examples, and commands that tailor the
65
-
prompt and output to a specific example.
54
+
input and target to a specific example.
66
55
67
56
To write text that should be rendered as written, just write it normally. The
68
57
following "template" will produce the same text every time:
@@ -107,40 +96,82 @@ The choices are {{"a"}}, {{"b"}}, and {{"c"}}.
107
96
```
108
97
You can leave binary options like yes/no, true/false, etc. unprotected.
109
98
110
-
Finally, remember that a template must produce two strings: a prompt and an output.
99
+
Finally, remember that a template must produce two strings: an input and a target.
111
100
To separate these two pieces, use three vertical bars `|||`.
112
-
So, a complete template for AG News could be:
101
+
So, a complete template for SQuAD could be:
113
102
```jinja2
114
-
{{text}}
115
-
Is this a piece of news regarding {{"world politics"}}, {{"sports"}}, {{"business"}}, or {{"science and technology"}}? |||
116
-
{{ ["World politics", "Sports", "Business", "Science and technology"][label] }}
103
+
I'm working on the final exam for my class and am trying to figure out the answer
104
+
to the question "{{question}}" I found the following info on Wikipedia and I think
105
+
it has the answer. Can you tell me the answer?
106
+
{{context}}
107
+
|||
108
+
{{answers["text"][0]}}
117
109
```
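To make the role of the `|||` separator concrete, here is a minimal Python sketch of how a rendered template could be divided into its input and target. The helper name `split_rendered_prompt`, the sample text, and the choice to strip surrounding whitespace from each piece are assumptions made for illustration; this is not the app's actual code.

```python
SEPARATOR = "|||"  # the separator between input and target described above


def split_rendered_prompt(rendered):
    """Split rendered template text into an (input, target) pair.

    Hypothetical helper for illustration only. Stripping whitespace
    around each piece is an assumption of this sketch.
    """
    pieces = rendered.split(SEPARATOR)
    if len(pieces) != 2:
        raise ValueError("expected exactly one '|||' separator")
    input_text, target_text = (piece.strip() for piece in pieces)
    return input_text, target_text


# Made-up rendered text, shaped like the output of a question answering template.
rendered = (
    "Can you tell me the answer?\n"
    "Hamlet is a tragedy by William Shakespeare.\n"
    "|||\n"
    "William Shakespeare\n"
)

input_text, target_text = split_rendered_prompt(rendered)
print(target_text)  # William Shakespeare
```

Everything before the separator becomes the input and everything after it becomes the target, so any instruction text belongs on the left-hand side.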
118
110
119
-
## Best Practices
111
+
## Options
112
+
In addition to the template itself, you can fill out several other fields in the app.
113
+
* **Prompt Reference.** If your template was inspired by a paper, note the
114
+
reference in the "Prompt Reference" section. You can also add a description of
115
+
what your template does.
116
+
* **Original Task?** The checkbox should be checked if the template requires solving a
117
+
task that the underlying dataset is used to study. For example, a template that asks a
118
+
question from a question answering dataset would be an original task template, but one that asks
119
+
to generate a question for a given answer would not.
120
+
* **Choices in Template?** The checkbox should be checked if the input explicitly indicates
121
+
the options for the possible outputs (regardless of whether `answer_choices` is used).
122
+
* **Metrics.** Use the multiselect widget to select all metrics commonly used to evaluate
123
+
this task. Choose “Other” if there is one that is not included in the list.
124
+
* **Answer Choices.** If the prompt has a small set of possible outputs (e.g., Yes/No,
125
+
class labels, entailment judgements, etc.), then the prompt should define and use answer
126
+
choices as follows. This allows evaluation to consider just the possible targets for
127
+
scoring model outputs. The answer choices field is a Jinja expression that should produce
128
+
a `|||` separated list of all possible targets. If the choices don't change from example
129
+
to example, then you can just list them. For example, the answer choices for AG News are
130
+
```jinja2
131
+
World News ||| Sports ||| Business ||| Science and Technology
132
+
```
133
+
Note that whitespace is stripped from the ends of the choices. If answer choices are set,
134
+
then they are also available to Jinja in the prompt itself in the form of a list called
135
+
`answer_choices`. You should use this list in both input and target templates so that the
136
+
resulting inputs and targets match the answer choices field exactly. For example, a prompt
137
+
for AG News could use `answer_choices` like this:
138
+
```jinja2
139
+
{{text}} Which of the following sections of a newspaper would
140
+
this article likely appear in? {{answer_choices[0]}}, {{answer_choices[1]}},
141
+
{{answer_choices[2]}}, or {{answer_choices[3]}}?
142
+
|||
143
+
{{ answer_choices[label] }}
144
+
```
145
+
Since Answer Choices is a Jinja expression that has access to the example, it can also be used
146
+
to extract example-specific choices from the underlying data. For example, in AI2 ARC, we could
147
+
use
148
+
```jinja2
149
+
{{choices.text | join("|||")}}
150
+
```
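As a rough illustration of the answer choices convention described above, the following Python sketch turns a rendered `|||`-separated string into a list and checks that a target matches one of the choices. The function names `parse_answer_choices` and `target_in_choices` are hypothetical and are not part of the app.

```python
def parse_answer_choices(rendered_choices):
    """Split a rendered answer choices string on '|||'.

    Whitespace is stripped from the ends of each choice, as noted above.
    Hypothetical helper for illustration only.
    """
    return [choice.strip() for choice in rendered_choices.split("|||")]


def target_in_choices(target, choices):
    """Return True if the target exactly matches one of the choices."""
    return target.strip() in choices


# The fixed AG News choices listed above.
ag_news_choices = parse_answer_choices(
    "World News ||| Sports ||| Business ||| Science and Technology"
)
print(ag_news_choices[1])  # Sports

# A target that adds extra text does not match any choice.
print(target_in_choices("Sports", ag_news_choices))  # True
print(target_in_choices("The answer is Sports", ag_news_choices))  # False
```

The same parsing applies to example-specific choices, since a `join("|||")` expression also renders to a single `|||`-separated string.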
120
151
152
+
## Best Practices
121
153
122
-
* **Writing outputs.** When writing a template for a task that requires outputting
123
-
a label, don't use articles or other stop words before the label name in the output.
124
-
For example, in TREC, the output should be "Person", not "A person". The reason
125
-
is that evaluations often look at the first word of the generated output to determine
126
-
correctness.
127
-
* **Skipping datasets.** You might find a dataset in the spreadsheet that it
128
-
doesn't make sense to write templates for. For example, a dataset might just be
129
-
text without any annotations. For other cases, ask on Slack. If skipping a dataset,
130
-
mark it in red on the spreadsheet.
131
-
* **Choosing input and output pairs.** Lots of datasets have multiple columns that can be
132
-
combined to form different (input, output) pairs i.e. different "tasks". Don't hesitate to
154
+
* **Writing target templates.** The target template should contain only the answer to the task.
155
+
It should not contain any extra text such as “The answer is…” (unless that extra text is also in
156
+
`answer_choices`). If `answer_choices` is populated, the target should contain only values
157
+
in `answer_choices`.
158
+
* **Formatting multiple-choice questions.** If the target should match the name of the choice
159
+
(e.g., “World News”), then the input should list the choices either as part of a grammatical question
160
+
or as a list with a marker for each (e.g., dashes). If the target should indicate the choice from
161
+
the list (e.g., “A,” “Explanation 1,” etc.), then the input should list the choices with the indicator
162
+
before each one.
163
+
* **Choosing input and target pairs.** Lots of datasets have multiple columns that can be
164
+
combined to form different (input, target) pairs, i.e., different "tasks". Don't hesitate to
133
165
introduce some diversity by prompting a given dataset into multiple tasks and provide some
134
166
description in the "Template Reference" text box. An example is given
135
167
in the already prompted `movie_rationales`.
136
-
* **Task Template Checkbox** While there are many different ways to prompt the tasks, only some of them correspond to the original intention of the dataset. For instance, for a summary dataset you can generate a summary or hallucinate an article. However, only the first was the true original task for the dataset. Use the *Task template* check box to indicate true task prompts. (We realize there are some corner cases, for instance, if there was no original task, you should leave this blank. If there are multiple original tasks you can check it for each of them. If you are confused for your dataset, consult with us in slack.)
137
-
* **Filtering templates.** If a template is applied to an example and produces an
138
-
empty string, that template/example pair will be skipped. (Either the entire output
168
+
* **Filtering prompts.** If a prompt is applied to an example and produces an
169
+
empty string, that prompt/example pair will be skipped. (Either the entire target
139
170
is whitespace or the text on either side of the separator `|||` is whitespace.)
140
-
You can therefore create templates that only apply to a subset of the examples by
171
+
You can therefore create prompts that only apply to a subset of the examples by
141
172
wrapping them in Jinja if statements. For example, in the `TREC` dataset, there
142
173
are fine-grained categories that are only applicable to certain coarse-grained categories.
143
-
We can capture this with the following template:
174
+
We can capture this with the following prompt:
144
175
```jinja2
145
176
{% if label_coarse == 0 %}
146
177
Is this question asking for a {{"definition"}}, a {{"description"}}, a {{"manner of action"}}, or a {{"reason"}}?
@@ -149,9 +180,9 @@ Is this question asking for a {{"definition"}}, a {{"description"}}, a {{"manner