Vote here:

https://docs.google.com/spreadsheets/d/1S3afCApltgDAJx7vXIy7vSGV65dA6aind4aJjXi1NEw/edit?usp=sharing

Proposed by Becky:

All-but-the-Top: Simple and Effective Postprocessing for Word Representations

https://arxiv.org/pdf/1702.01417.pdf

Deep Learning for NLP Best Practices

http://ruder.io/deep-learning-nlp-best-practices/?utm_campaign=Artificial%2BIntelligence%2BWeekly&utm_medium=email&utm_source=Artificial_Intelligence_Weekly_66

Proposed by Mihai:

Adversarial Examples for Evaluating Reading Comprehension Systems

https://arxiv.org/pdf/1707.07328.pdf Reading comprehension systems are not comprehending

Understanding Black-box Predictions via Influence Functions

https://arxiv.org/pdf/1703.04730.pdf Fundamental for interpretability

On the State of the Art of Evaluation in Neural Language Models

https://arxiv.org/pdf/1707.05589.pdf LSTMs still the best

Multi-Task Video Captioning with Video and Entailment Generation

https://arxiv.org/abs/1704.07489 Multi-modal learning

Adversarial Sets for Regularising Neural Link Predictors

https://arxiv.org/abs/1707.07596 Nice example of how to use adversarial examples for RE

Reading Wikipedia to Answer Open-Domain Questions

https://nlp.stanford.edu/pubs/chen2017reading.pdf

Understanding Deep Learning Requires Rethinking Generalization

https://openreview.net/forum?id=Sy8gdB9xx&noteId=Sy8gdB9xx blog: https://blog.acolyer.org/2017/05/11/understanding-deep-learning-requires-re-thinking-generalization/

Data Programming: Machine Learning with Weak Supervision (blog)

http://hazyresearch.github.io/snorkel/blog/weak_supervision.html

On the Origin of Deep Learning

https://arxiv.org/pdf/1702.07800.pdf

Dynamic Coattention Networks for Question Answering

https://arxiv.org/pdf/1611.01604.pdf

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

https://arxiv.org/abs/1703.03864 blog: https://blog.openai.com/evolution-strategies/

Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling

https://arxiv.org/abs/1703.04826

Proposed by Ajay

Optimizing Multivariate Performance Measures for Learning Relation Extraction Models

https://www.semanticscholar.org/paper/Optimizing-Multivariate-Performance-Measures-for-Haffari-Nagesh/39a4df0e88eba81d5cea53eaf57295f914099586

Proposed by Dane:

PathNet: Evolution Channels Gradient Descent in Super Neural Networks

https://arxiv.org/abs/1701.08734

Be Precise or Fuzzy: Learning the Meaning of Cardinals and Quantifiers from Vision

https://arxiv.org/abs/1702.05270

Explaining Recurrent Neural Network Predictions in Sentiment Analysis

https://arxiv.org/abs/1706.07206

Proposed by Heather

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (DCGANs)

https://arxiv.org/abs/1511.06434

Proposed by Michael

Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics?

http://machinelearningtext.pbworks.com/w/file/fetch/48157747/CICLing2011-manning-tagging.pdf

Linguistically Naive != Language Independent: Why NLP Needs Linguistic Typology

http://www.aclweb.org/anthology/W/W09/W09-0106.pdf

Adversarial Generation of Natural Language (bad review from Goldberg)

https://arxiv.org/pdf/1705.10929.pdf

Language Generation with Recurrent Generative Adversarial Networks without Pre-training

https://arxiv.org/pdf/1706.01399.pdf

Adversarial Examples for Evaluating Reading Comprehension Systems

https://nlp.stanford.edu/pubs/jia2017adversarial.pdf

Wide and Deep Learning for Recommender Systems

paper: https://arxiv.org/pdf/1606.07792.pdf blog/tutorial: https://www.tensorflow.org/tutorials/wide_and_deep

What Analogies Reveal About Word Vectors and Their Compositionality

https://aclweb.org/anthology/S/S17/S17-1001.pdf

Proposed by Marco

Hierarchical clustering

Discovering Structure in High-Dimensional Data Through Correlation Explanation

http://arxiv.org/abs/1406.1222 https://github.com/gregversteeg/CorEx https://github.com/gregversteeg/bio_corex/

Maximally Informative Hierarchical Representations of High-Dimensional Data

http://arxiv.org/abs/1410.7404

Proposed by Gus

From ACL 2017

Verb Physics: Relative Physical Knowledge of Actions and Objects

http://www.aclweb.org/anthology/P17-1025

Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules

http://www.aclweb.org/anthology/P17-1006

Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix

http://www.aclweb.org/anthology/P17-1040

Abstract Syntax Networks for Code Generation and Semantic Parsing

http://aclanthology.info/papers/P17-1105/abstract-syntax-networks-for-code-generation-and-semantic-parsing

IE and Knowledge Graphs

A Graph-based algorithm for inducing lexical taxonomies from scratch

http://wwwusers.di.uniroma1.it/~navigli/pubs/IJCAI_2011_Navigli_Velardi_Faralli.pdf

Reading the web with learned syntactic-semantic inference rules

https://www.cs.cmu.edu/~nlao/publication/2012/2012.emnlp.paper.pdf

A review of relational machine learning for knowledge graphs

https://arxiv.org/pdf/1503.00759.pdf

Knowledge vault: A web-scale approach to probabilistic knowledge fusion

https://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf

Traversing knowledge graphs in vector space

http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP038.pdf

Logical Form

Language to Logical Form with Neural Attention

http://www.aclweb.org/anthology/P16-1004

LangPro: Natural Language Theorem Prover

https://arxiv.org/abs/1708.09417

Biomedical applications of NLP, etc.

Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4869115/pdf/srep26094.pdf

Machine Science

http://www.jstor.org.ezproxy4.library.arizona.edu/stable/40731977?pq-origsite=summon&seq=1#page_scan_tab_contents (requires NetID login)

Not NLP

Overcoming Catastrophic Forgetting in Neural Networks

https://arxiv.org/abs/1612.00796 http://rylanschaeffer.github.io/content/research/overcoming_catastrophic_forgetting/main.html

Random synaptic feedback weights support error backpropagation for deep learning

https://www.nature.com/articles/ncomms13276 http://www.breloff.com/no-backprop-part1/ http://www.breloff.com/no-backprop-part2/

Proposed by Adam

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

https://arxiv.org/pdf/1607.06520.pdf

Proposed by John

A Simple but Tough-to-Beat Baseline for Sentence Embeddings

https://openreview.net/pdf?id=SyK00v5xx

Ultradense Word Embeddings by Orthogonal Transformation

https://arxiv.org/pdf/1602.07572.pdf

Men Also Like Shopping: Reducing Gender Bias Using Corpus-level Constraints

https://arxiv.org/abs/1707.09457

Confounds and Consequences in Geotagged Twitter Data

https://arxiv.org/pdf/1506.02275.pdf

Proposed by Clay

Latent LSTM Allocation: Joint Clustering and Non-Linear Dynamic Modeling of Sequence Data (ICML 2017)

http://proceedings.mlr.press/v70/zaheer17a.html

Discovering Structure in High-Dimensional Data Through Correlation Explanation (AISTATS 2015)

https://arxiv.org/abs/1406.1222

The Information Sieve (related to above) (ICML 2016)

https://arxiv.org/abs/1507.02284

Proposed by Enrique

DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning

https://arxiv.org/abs/1707.06690

Learning how to Active Learn: A Deep Reinforcement Learning Approach

https://arxiv.org/abs/1708.02383

Task-Oriented Query Reformulation with Reinforcement Learning

https://arxiv.org/abs/1704.04572

Learning to Paraphrase for Question Answering

https://arxiv.org/abs/1708.06022

Proposed by Egoitz

Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks

EMNLP 2017 paper: https://arxiv.org/pdf/1707.09861.pdf extended version: https://arxiv.org/pdf/1707.06799.pdf

Hybrid computing using a neural network with dynamic external memory

https://www.nature.com/nature/journal/v538/n7626/pdf/nature20101.pdf

Proposed by Dongfang

Comparative Study of CNN and RNN for Natural Language Processing

https://arxiv.org/pdf/1702.01923.pdf

Fast and Accurate Entity Recognition with Iterated Dilated Convolutions

https://arxiv.org/pdf/1702.02098.pdf

Proposed by Vikas

Crowdsourcing Multiple Choice Science Questions

https://arxiv.org/pdf/1707.06209.pdf

Proposed by Mithun

Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning

http://www.aclweb.org/anthology/P/P17/P17-1066.pdf

Skip-Gram – Zipf + Uniform = Vector Additivity

http://www.aclweb.org/anthology/P/P17/P17-1007.pdf

Verb Physics: Relative Physical Knowledge of Actions and Objects

http://www.aclweb.org/anthology/P/P17/P17-1025.pdf

Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules

http://www.aclweb.org/anthology/P/P17/P17-1006.pdf

Automatically Generating Rhythmic Verse with Neural Networks

http://www.aclweb.org/anthology/P/P17/P17-1016.pdf