https://docs.google.com/spreadsheets/d/1S3afCApltgDAJx7vXIy7vSGV65dA6aind4aJjXi1NEw/edit?usp=sharing
All-but-the-Top: Simple and Effective Postprocessing for Word Representations
https://arxiv.org/pdf/1702.01417.pdf
Deep Learning for NLP Best Practices
http://ruder.io/deep-learning-nlp-best-practices/
Adversarial Examples for Evaluating Reading Comprehension Systems
https://arxiv.org/pdf/1707.07328.pdf
Reading comprehension systems are not comprehending
Understanding Black-box Predictions via Influence Functions
https://arxiv.org/pdf/1703.04730.pdf
Fundamental for interpretability
On the State of the Art of Evaluation in Neural Language Models
https://arxiv.org/pdf/1707.05589.pdf
LSTMs still the best
Multi-Task Video Captioning with Video and Entailment Generation
https://arxiv.org/abs/1704.07489
Multi-modal learning
Adversarial Sets for Regularising Neural Link Predictors
https://arxiv.org/abs/1707.07596
Nice example of how to use adversarial examples for RE
Reading Wikipedia to Answer Open-Domain Questions
https://nlp.stanford.edu/pubs/chen2017reading.pdf
Understanding Deep Learning Requires Rethinking Generalization
https://openreview.net/forum?id=Sy8gdB9xx&noteId=Sy8gdB9xx
blog: https://blog.acolyer.org/2017/05/11/understanding-deep-learning-requires-re-thinking-generalization/
Data Programming: Machine Learning with Weak Supervision (blog)
http://hazyresearch.github.io/snorkel/blog/weak_supervision.html
On the Origin of Deep Learning
https://arxiv.org/pdf/1702.07800.pdf
Dynamic Coattention Networks for Question Answering
https://arxiv.org/pdf/1611.01604.pdf
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
https://arxiv.org/abs/1703.03864
blog: https://blog.openai.com/evolution-strategies/
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
https://arxiv.org/abs/1703.04826
Optimizing Multivariate Performance Measures for Learning Relation Extraction Models
https://www.semanticscholar.org/paper/Optimizing-Multivariate-Performance-Measures-for-Haffari-Nagesh/39a4df0e88eba81d5cea53eaf57295f914099586
PathNet: Evolution Channels Gradient Descent in Super Neural Networks
https://arxiv.org/abs/1701.08734
Be Precise or Fuzzy: Learning the Meaning of Cardinals and Quantifiers from Vision
https://arxiv.org/abs/1702.05270
Explaining Recurrent Neural Network Predictions in Sentiment Analysis
https://arxiv.org/abs/1706.07206
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (DCGANs)
https://arxiv.org/abs/1511.06434
Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics?
http://machinelearningtext.pbworks.com/w/file/fetch/48157747/CICLing2011-manning-tagging.pdf
Linguistically Naive != Language Independent: Why NLP Needs Linguistic Typology
http://www.aclweb.org/anthology/W/W09/W09-0106.pdf
Adversarial Generation of Natural Language (bad review from Goldberg)
https://arxiv.org/pdf/1705.10929.pdf
Language Generation with Recurrent Generative Adversarial Networks without Pre-training
https://arxiv.org/pdf/1706.01399.pdf
Adversarial Examples for Evaluating Reading Comprehension Systems
https://nlp.stanford.edu/pubs/jia2017adversarial.pdf
Wide and Deep Learning for Recommender Systems
paper: https://arxiv.org/pdf/1606.07792.pdf
blog/tutorial: https://www.tensorflow.org/tutorials/wide_and_deep
What Analogies Reveal About Word Vectors and Their Compositionality
https://aclweb.org/anthology/S/S17/S17-1001.pdf
Hierarchical clustering
Discovering Structure in High-Dimensional Data Through Correlation Explanation
http://arxiv.org/abs/1406.1222
https://github.com/gregversteeg/CorEx
https://github.com/gregversteeg/bio_corex/
Maximally Informative Hierarchical Representations of High-Dimensional Data
http://arxiv.org/abs/1410.7404
From ACL 2017
Verb Physics: Relative Physical Knowledge of Actions and Objects
http://www.aclweb.org/anthology/P17-1025
Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules
http://www.aclweb.org/anthology/P17-1006
Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix
http://www.aclweb.org/anthology/P17-1040
Abstract Syntax Networks for Code Generation and Semantic Parsing
http://aclanthology.info/papers/P17-1105/abstract-syntax-networks-for-code-generation-and-semantic-parsing
IE and Knowledge Graphs
A Graph-based algorithm for inducing lexical taxonomies from scratch
http://wwwusers.di.uniroma1.it/~navigli/pubs/IJCAI_2011_Navigli_Velardi_Faralli.pdf
Reading the web with learned syntactic-semantic inference rules
https://www.cs.cmu.edu/~nlao/publication/2012/2012.emnlp.paper.pdf
A review of relational machine learning for knowledge graphs
https://arxiv.org/pdf/1503.00759.pdf
Knowledge vault: A web-scale approach to probabilistic knowledge fusion
https://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf
Traversing knowledge graphs in vector space
http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP038.pdf
Logical Form
Language to Logical Form with Neural Attention
http://www.aclweb.org/anthology/P16-1004
LangPro: Natural Language Theorem Prover
https://arxiv.org/abs/1708.09417
Biomedical applications of NLP, etc.
Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4869115/pdf/srep26094.pdf
http://www.jstor.org.ezproxy4.library.arizona.edu/stable/40731977?pq-origsite=summon&seq=1#page_scan_tab_contents (requires NetID login)
Not NLP
Overcoming Catastrophic Forgetting in Neural Networks
https://arxiv.org/abs/1612.00796
http://rylanschaeffer.github.io/content/research/overcoming_catastrophic_forgetting/main.html
Random synaptic feedback weights support error backpropagation for deep learning
https://www.nature.com/articles/ncomms13276
http://www.breloff.com/no-backprop-part1/
http://www.breloff.com/no-backprop-part2/
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
https://arxiv.org/pdf/1607.06520.pdf
A Simple but Tough-to-Beat Baseline for Sentence Embeddings
https://openreview.net/pdf?id=SyK00v5xx
Ultradense Word Embeddings by Orthogonal Transformation
https://arxiv.org/pdf/1602.07572.pdf
Men Also Like Shopping: Reducing Gender Bias Using Corpus-level Constraints
https://arxiv.org/abs/1707.09457
Confounds and Consequences in Geotagged Twitter Data
https://arxiv.org/pdf/1506.02275.pdf
Latent LSTM Allocation: Joint Clustering and Non-Linear Dynamic Modeling of Sequence Data (ICML 2017)
http://proceedings.mlr.press/v70/zaheer17a.html
Discovering Structure in High-Dimensional Data Through Correlation Explanation (NIPS 2014)
https://arxiv.org/abs/1406.1222
The Information Sieve (related to above) (ICML 2016)
https://arxiv.org/abs/1507.02284
DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning
https://arxiv.org/abs/1707.06690
Learning how to Active Learn: A Deep Reinforcement Learning Approach
https://arxiv.org/abs/1708.02383
Task-Oriented Query Reformulation with Reinforcement Learning
https://arxiv.org/abs/1704.04572
Learning to Paraphrase for Question Answering
https://arxiv.org/abs/1708.06022
Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks
EMNLP 2017 paper: https://arxiv.org/pdf/1707.09861.pdf
extended version: https://arxiv.org/pdf/1707.06799.pdf
Hybrid computing using a neural network with dynamic external memory
https://www.nature.com/nature/journal/v538/n7626/pdf/nature20101.pdf
Comparative Study of CNN and RNN for Natural Language Processing
https://arxiv.org/pdf/1702.01923.pdf
Fast and Accurate Entity Recognition with Iterated Dilated Convolutions
https://arxiv.org/pdf/1702.02098.pdf
Crowdsourcing Multiple Choice Science Questions
https://arxiv.org/pdf/1707.06209.pdf
Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning
http://www.aclweb.org/anthology/P/P17/P17-1066.pdf
Skip-Gram – Zipf + Uniform = Vector Additivity
http://www.aclweb.org/anthology/P/P17/P17-1007.pdf
Automatically Generating Rhythmic Verse with Neural Networks
http://www.aclweb.org/anthology/P/P17/P17-1016.pdf