
RNN for Text Summarization

Spring 2018 Deep Learning Project

Eduardo Fierro (eff254), Lisa Ren (tr1312), Caroline Roper (cer446)


Files

You can download the raw and pre-processed data used in this project with an NYU email account from the following link: https://bit.ly/2IEN7Ke

Preprocess

To pre-process the data, we followed Rico Sennrich's subword-nmt scripts, originally available here: https://github.com/rsennrich/subword-nmt. All the files are available in the Preprocess folder of this repository. Please refer to the README.md file inside that folder for replication instructions.
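As a rough illustration, learning and applying the byte-pair encoding with the subword-nmt scripts looks like the following. This is a minimal sketch: the 32,000 merge operations and the bpe.codes file name are assumptions for this example, not necessarily the settings we used; see the Preprocess folder for the exact pipeline.

# Learn a BPE merge table from the concatenated source and target training text
# (32000 merge operations is an assumed value for this sketch).
cat train.article.txt train.title.txt | python learn_bpe.py -s 32000 > bpe.codes

# Apply the learned merges to each data split.
python apply_bpe.py -c bpe.codes < train.article.txt > train.article.txt.bpe
python apply_bpe.py -c bpe.codes < train.title.txt > train.title.txt.bpe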

Open-NMT

The model follows the specification made publicly available by Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, and Alexander M. Rush, here: https://github.com/OpenNMT/OpenNMT-py. Due to recent changes for PyTorch 0.4 compatibility, the code was forked into this repository. For requirements, please refer to the README.md file of the repository.
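For reference, a typical setup might look like the following. This is a sketch that assumes the fork keeps OpenNMT-py's requirements.txt; check the README.md for the authoritative instructions.

# Clone the fork and install the Python dependencies.
git clone https://github.com/ceroper/DLProject.git
cd DLProject
pip install -r requirements.txt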

Training models

Once the requirements are met, you can pre-process the data to generate the lazy dataloaders with the following command:

python preprocess.py -train_src DIRECTORY_TO_SOURCE_TRAIN/train.article.txt.bpe \
                     -train_tgt DIRECTORY_TO_TARGET_TRAIN/train.title.txt.bpe \
                     -valid_src DIRECTORY_TO_SOURCE_VALIDATION/valid.article.filter.txt.bpe \
                     -valid_tgt DIRECTORY_TO_TARGET_VALIDATION/valid.title.filter.txt.bpe \
                     -save_data OUTPUT_DIRECTORY/bpe \
                     -src_seq_length 10000 \
                     -dynamic_dict \
                     -share_vocab \
                     -max_shard_size 524288000                                               

After that, the model can be trained using the following specification, where INPUT_DIRECTORY below is the same as OUTPUT_DIRECTORY above:

python train.py -data INPUT_DIRECTORY/bpe \
                -save_model SAVED_MODELS_DIRECTORY \
                -copy_attn \
                -reuse_copy_attn \
                -epochs 20 \
                -gpuid 0 \
                -bridge

You can optionally add the share_embeddings flag or leave it out; a sketch of the shared-embeddings variant follows. For comparison purposes, run the model both on the BPE data and on data that has not been converted into byte pairs.
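The shared-embeddings run might look like this. It is a sketch, not our exact configuration: share_embeddings ties the encoder and decoder embedding tables and requires data preprocessed with -share_vocab, as in the preprocessing step above.

python train.py -data INPUT_DIRECTORY/bpe \
                -save_model SAVED_MODELS_DIRECTORY \
                -copy_attn \
                -reuse_copy_attn \
                -share_embeddings \
                -epochs 20 \
                -gpuid 0 \
                -bridge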

Evaluation

To generate the summaries, you can run the following:

python translate.py -gpu 0 \
                    -batch_size 20 \
                    -beam_size 5 \
                    -model SAVED_MODELS_DIRECTORY \
                    -src PREDICTION_DATA \
                    -output OUTPUT_DIRECTORY \
                    -min_length 9 \
                    -verbose \
                    -stepwise_penalty \
                    -coverage_penalty summary \
                    -beta 5 \
                    -length_penalty wu \
                    -alpha 0.9 \
                    -block_ngram_repeat 3 \
                    -ignore_when_blocking "." "</t>" "<t>"

In the example above, the summaries are generated with alpha = 0.9 and a minimum output length of 9 tokens.

METEOR scores are obtained using the publicly available implementation authored by Michael Denkowski: https://github.com/cmu-mtlab/meteor. Please refer to the README for installation instructions and further details: http://www.cs.cmu.edu/~alavie/METEOR/README.html.
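As an example, METEOR can be fetched and unpacked along these lines. The 1.5 version number and download URL are assumptions here; follow the README linked above if they have changed.

# Download and unpack the METEOR release (assumed version and URL).
wget http://www.cs.cmu.edu/~alavie/METEOR/download/meteor-1.5.tar.gz
tar -xzf meteor-1.5.tar.gz
cd meteor-1.5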

The scores are obtained using the command below, where bpe_m9_a9.out is a text file of generated summaries and validation.title.bpe.txt is a text file of reference summaries.

java -Xmx2G -jar meteor-*.jar bpe_m9_a9.out validation.title.bpe.txt -l en -norm

About

Using Recurrent Neural Networks to Generate Headlines for Articles
