| 1 | +# TAC 2011 |
| 2 | +[Homepage](https://tac.nist.gov/2011/Summarization/) |
| 3 | + |
| 4 | +For TAC 2011, we provide dataset readers for task 1 and the submitted AESOP values. |
| 5 | +```bash |
| 6 | +sacrerouge setup-dataset tac2011 \ |
| 7 | + <path-to-gigaword-root> \ |
| 8 | + <path-to-raw-data> \ |
| 9 | + <output-dir> |
| 10 | +``` |
| 11 | +The `<path-to-gigaword-root>` is the path to the root of `LDC2011T07/gigaword_eng_5`. |
| 12 | +The `<path-to-raw-data>` is the path to the root of the [DUC/TAC data repository](https://github.com/danieldeutsch/duc-tac-data) with the data already downloaded. |
| 13 | + |
| 14 | +The output files are the following: |
| 15 | +- `task1.X.jsonl`: The data for task 1 for document sets `X`, where `X` is `A` (set A only), `B` (set B only), or `A-B` (both). |
| 16 | +- `task1.X.summaries.jsonl`: The submitted peer and reference summaries for task 1. |
| 17 | +- `task1.X.metrics.jsonl`: The corresponding automatic and manual evaluation metrics for the peer and reference summaries for task 1. |
| 18 | +- `task1.X.pyramids.jsonl`: The Pyramids for the set of references for task 1. |
| 19 | +- `task1.X.pyramid-annotations.jsonl`: The Pyramid annotations for each submitted peer and reference summary for task 1. |
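
The output files listed above are JSON Lines files, so they can be inspected with the standard library alone. The following is a minimal sketch, not part of SacreROUGE: the helper names `read_jsonl` and `summarize_keys` are hypothetical, and the sketch assumes each line holds one JSON object. Because the field names are not documented here, it only counts the top-level keys rather than assuming any of them.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def read_jsonl(path: Path) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def summarize_keys(path: Path) -> Dict[str, int]:
    """Count how often each top-level key appears (assumes every record is a JSON object)."""
    counts: Dict[str, int] = {}
    for record in read_jsonl(path):
        for key in record:
            counts[key] = counts.get(key, 0) + 1
    return counts
```

For example, calling `summarize_keys` on the `task1.A.metrics.jsonl` file inside `<output-dir>` would show which fields each metrics record carries before any further processing is written against them.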
| 20 | + |
| 21 | +## Notes |
| 22 | +It appears that the Pyramid annotations were exhaustive: they also identify SCUs which are not present in the reference Pyramids. |
| 23 | +Those extra SCUs are not loaded here. |
| 24 | + |
| 25 | +There are Pyramids for the combined A-B summaries, which we do not load. |
| 26 | + |
| 27 | +The Pyramid annotations have incorrect SCU IDs, so they should be used with caution. |
| 28 | +Here is an example in which the same SCU has `uid="7"` in the Pyramid but `uid="41"` in the annotation: |
| 29 | +```xml |
| 30 | +<!-- Pyramid for D1112-B --> |
| 31 | +<scu uid="7" label="Jury did not believe Alvarez planned to hurt anyone (NONE)"> |
| 32 | + <contributor label="The jury foreman said at a news conference, after the trial...he did not believe Alvarez planned to kill anyone"> |
| 33 | + <part label="he did not believe Alvarez planned to kill anyone" start="323" end="372"/> |
| 34 | + <part label="The jury foreman said at a news conference, after the trial" start="263" end="322"/> |
| 35 | + </contributor> |
| 36 | + <contributor label="the jury...believed he didn't intend to kill anyone"> |
| 37 | + <part label="the jury" start="642" end="650"/> |
| 38 | + <part label="believed he didn't intend to kill anyone" start="713" end="753"/> |
| 39 | + </contributor> |
| 40 | + <contributor label="Jurors...didn't believe he meant to hurt anyone"> |
| 41 | + <part label="didn't believe he meant to hurt anyone" start="1288" end="1326"/> |
| 42 | + <part label="Jurors" start="1228" end="1234"/> |
| 43 | + </contributor> |
| 44 | +</scu> |
| 45 | + |
| 46 | +<!-- Annotation for system 22 -->
| 47 | +<peerscu uid="41" label="(3) Jury did not believe Alvarez planned to hurt anyone (NONE)"> |
| 48 | + <contributor label="some jurors in the Metrolink train derailment case last month said they really didn't think Alvarez intended to kill anyone"> |
| 49 | + <part label="some jurors in the Metrolink train derailment case last month said they really didn't think Alvarez intended to kill anyone" start="304" end="427"/> |
| 50 | + </contributor> |
| 51 | +</peerscu> |
| 52 | +``` |
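
Since the `uid` values disagree in the example above, one possible workaround is to match a peer annotation to its Pyramid SCU by label instead of by ID. The sketch below is an assumption-laden illustration, not how SacreROUGE loads the data: the function names are hypothetical, it assumes labels are unique within a Pyramid, and it assumes the peer label differs from the Pyramid label only by a leading parenthesized number such as `(3) `, as in the example.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Leading "(3) "-style prefix, as seen on the peer annotation label in the example.
_PREFIX = re.compile(r"^\(\d+\)\s*")


def build_label_index(pyramid_xml: str) -> Dict[str, str]:
    """Map each <scu> label in a Pyramid XML string to its uid (assumes unique labels)."""
    root = ET.fromstring(pyramid_xml)
    return {scu.get("label"): scu.get("uid") for scu in root.iter("scu")}


def resolve_peer_scu(peerscu_label: str, label_index: Dict[str, str]) -> Optional[str]:
    """Return the Pyramid uid whose label matches the peer label, or None if there is no match."""
    return label_index.get(_PREFIX.sub("", peerscu_label))
```

With the example above, the peer annotation labeled `(3) Jury did not believe Alvarez planned to hurt anyone (NONE)` would resolve to the Pyramid SCU with `uid="7"` rather than the annotation's own `uid="41"`. Labels that do not follow this pattern would return `None` and need manual inspection.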