Commit 80a5a3d

update README + typos
1 parent 86072bc commit 80a5a3d

3 files changed: +22 -26 lines changed

CONTRIBUTING.md

+8 -8
@@ -7,7 +7,7 @@ One of the best ways to contribute is by writing templates!
 A template is a piece of code written in a templating language called
 [Jinja](https://jinja.palletsprojects.com/en/3.0.x/). A template defines
 a function that maps an example from a dataset in the
-[HuggingFace library](https://huggingface.co/datasets) to two strings of
+[Hugging Face datasets library](https://huggingface.co/datasets) to two strings of
 text. The first is called the _prompt_ which provides all information that
 will be available to solve a task, such as the instruction and the context.
 The second piece is called the _output_, which is the desired response to the
@@ -23,7 +23,7 @@ and find an unclaimed one. Put your name under "Who's Prompting it?" and
 mark it yellow to show it's in progress.
 1. **Examine the dataset.** Select or type the dataset into the dropdown in the app.
 If the dataset has subsets (subsets are not the same as splits), you can select
-which one to work on. Note that templates are subset specific. You can find
+which one to work on. Note that templates are subset-specific. You can find
 out background information on the dataset by reading the information in the
 app. The dataset is a collection of examples, and each example is a Python
 dictionary. The sidebar will tell you the schema that each example has.
@@ -42,10 +42,10 @@ applied to the current example will appear in the right sidebar.
 through a handful of examples of the prompted dataset using the
 "Prompted dataset viewer" mode.
 1. **Write between 5 and 10 templates**. Repeat the steps 4 to 8 to create between 5
-and 10 (more if you want!) templates per dataset. Feel free to introduce some diversity
+and 10 (more if you want!) templates per dataset/subset. Feel free to introduce some diversity
 both in the format and the formulation.
 1. **Duplicate the template(s).** If the dataset you have chosen bear the same
-format as other datasets (for instance `MNLI` and `SNLI` have identical format),
+format as other datasets (for instance, `MNLI` and `SNLI` have identical formats),
 you can simply claim these datasets and duplicate the templates you have written
 to these additional datasets.
 1. **Upload the template(s).** Submit a PR using the instructions
@@ -56,7 +56,7 @@ to these additional datasets.
 Here is a quick crash course on using [Jinja](https://jinja.palletsprojects.com/en/3.0.x/)
 to write templates. More advanced usage is in the [cookbook](#jinja-cookbook).
 
-Generally in a template, you'll want to use a mix of hard-coded data that is
+Generally, in a template, you'll want to use a mix of hard-coded data that is
 task-specific and stays the same across examples, and commands that tailor the
 prompt and output to a specific example.

@@ -115,7 +115,7 @@ Is this a piece of news regarding {{"world politics"}}, {{"sports"}}, {{"busines
 
 A few miscellaneous things:
 
-* **Writing outputs.** When writing a template for an task that requires outputting
+* **Writing outputs.** When writing a template for a task that requires outputting
 a label, don't use articles or other stop words before the label name in the output.
 For example, in TREC, the output should be "Person", not "A person". The reason
 is that evaluations often look at the first word of the generated output to determine
@@ -145,7 +145,7 @@ Is this question asking for a {{"definition"}}, a {{"description"}}, a {{"manner
 ```
 * **Conditional generation format.** Always specify the output label `y` and separate it from the prompt
 by indicating the vertical bars `|||`. The `y` will be generated by a generative model
-conditionned on the prompted input you wrote. You can always transform an "infix" prompt format
+conditioned on the prompted input you wrote. You can always transform an "infix" prompt format
 ```jinja2
 Given that {{premise}}, it {{ ["must be true", "might be true", "must be false"][label] }} that {{hypothesis}}
 ```
@@ -158,7 +158,7 @@ Given that {{premise}}, it {{ "must be true, might be true, or must be false" }}
 ## Uploading Templates
 
 Once you save or modify a template, the corresponding file inside the `templates`
-directory in the repo will be modified. To upload it, following these steps:
+directory in the repo will be modified. To upload it, follow these steps:
 1. Run `make style` and `make quality`.
 2. Commit the modified template files (anything under `templates`) to git.
 3. Push to your fork on GitHub.

README.md

+14 -18
@@ -19,6 +19,13 @@ There are 3 modes in the app:
 - **Prompted dataset viewer**: check the templates you wrote or already written on entire dataset
 - **Sourcing**: write new prompts
 
+<img src="assets/promptsource_app.png" width="800">
+
+## Contributing
+Join the **Hackaprompt** and help writing templates!
+
+Contribution guidelines and step-by-step *HOW TO* are described [here](CONTRIBUTING.md).
+
 ## Writing Templates
 A prompt template is expressed in [Jinja](https://jinja.palletsprojects.com/en/3.0.x/).

@@ -28,28 +35,17 @@ and output. Generally, the prompt should provide information on the desired beha
 e.g., text passage and instructions, and the output should be a desired response.
 
 Here's an example for [AG News](https://huggingface.co/datasets/ag_news):
-```
+```jinja
 {{text}}
-Is this a piece of news regarding world politics, sports, business, or technology? |||
-{{ ["World politics", "Sport", "Business", "Technology"][label] }}
+Is this a piece of news regarding {{"world politics"}}, {{"sports"}}, {{"business"}}, or {{"science and technology"}}? |||
+{{ ["World politics", "Sports", "Business", "Science and technology"][label] }}
 ```
 
-## Contributing
-This is very much a work in progress, and help is needed and appreciated. Anyone wishing to
-contribute code can contact Steve Bach for commit access, or submit PRs from forks. Some particular
-places you could start:
-1. Try to express things! Explore a dataset and tell us what's hard to do to create templates you want
-2. Look in the literature. Are there prompt creation methods that do/do not fit well right now?
-3. Scalability testing. Streamlit is lightweight, and we're reading and writing all prompts on refresh.
-
-See also the [design doc](https://docs.google.com/document/d/1IQzrrAAMPS0XAn_ArOq2hyEDCVfeB7AfcvLUqgSnWxQ/).
-
-Before submitting a PR or pushing a new commit, please run style formattings and quality checks so that your newly added file look nice:
-```bash
-make style
-make quality
-```
+For more information, read the [Contribution guidelines](CONTRIBUTING.md).
 
+## Other documentation
+- Experimental context: [Experiment D](https://docs.google.com/document/d/1ar9cTRs9ZWxMkW-_9jF1ZtfY-u8IK_sO5QOtq1qnC2I/edit?usp=sharing)
+- Prompting interface: [design doc](https://docs.google.com/document/d/1IQzrrAAMPS0XAn_ArOq2hyEDCVfeB7AfcvLUqgSnWxQ/)
 ## Known Issues
 
 **Warning or Error about Darwin on OS X:** Try downgrading PyArrow to 3.0.0.

assets/promptsource_app.png

553 KB
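
For readers of this diff, here is a minimal sketch of how the updated AG News template from the README.md change might map one example to a prompt and an output. It assumes the `jinja2` package is installed. The helper `apply_template` and the sample example dictionary are hypothetical illustrations, not part of this repository's code; the split on `|||` follows the convention described in CONTRIBUTING.md.

```python
from jinja2 import Template

# Template text copied from the added lines of the README.md diff.
TEMPLATE = (
    "{{text}}\n"
    'Is this a piece of news regarding {{"world politics"}}, {{"sports"}}, '
    '{{"business"}}, or {{"science and technology"}}? |||\n'
    '{{ ["World politics", "Sports", "Business", "Science and technology"][label] }}'
)


def apply_template(template_text, example):
    """Hypothetical helper: render the template, then split prompt from output."""
    rendered = Template(template_text).render(**example)
    # The vertical bars `|||` separate the prompt from the desired output.
    prompt, output = (part.strip() for part in rendered.split("|||"))
    return prompt, output


# Made-up example in the shape of an AG News row (a `text` string and an integer `label`).
example = {"text": "Stocks rallied on Friday.", "label": 2}
prompt, output = apply_template(TEMPLATE, example)
print(prompt)
print(output)
```

With `label` set to 2, the list lookup in the template selects "Business", with no article or stop word before the label name, as the guidelines recommend.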
