<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" type="text/css" href="css/NLGSite.css" /> <style type="text/css">
<!--
A:link { text-decoration: none; color: #000099}
A:active { text-decoration: none; color: #000099}
A:visited { text-decoration: none; color: #000099}
A:hover { text-decoration: underline; color: #990099}
-->
</style>
<script type="text/javascript">
<!--
function exp_coll(ind) {
// Toggle the visibility of the abstract <span> with the given id.
var s = document.getElementById(ind);
if (s.style.display == 'none') {
s.style.display = 'block';
} else {
s.style.display = 'none';
}
}
-->
</script>
<title>NL Seminar</title>
</head>
<body text="#000033" link="#000099" vlink="#000099" alink="#000099">
<?php include('includes/usc-header.php'); ?><br><center><h2><b>USC/ISI NL Seminar</b></h2></center>
<?php include('includes/about.php'); ?> <div class="nlheader"><h3>Upcoming talks:</h3></div>
<table width=90% border=0 cellspacing=1 cellpadding=4 bgcolor="#FFFFFF" align=center>
<tr class="seminarTableHeader"><td align=left width=14%>
<b>Date</b>
</td><td align=left width=25%>
<b>Speaker</b>
</td><td align=left>
<b>Title</b>
</td></tr>
<tr class="speakerItem" border=0 >
<td align=left valign=top>08 Dec 2017</td>
<td align=left valign=top>Nasrin Mostafazadeh (BenevolentAI lab)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs08_Dec_2017');">
[Canceled] Language Comprehension & Language Generation in Eventful Contexts
</a><br>
<span id=abs08_Dec_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Building AI systems that can process user input, understand it, and generate an engaging and contextually-relevant output in response has been one of the longest-running goals in AI. Humans use a variety of modalities, such as language and visual cues, to communicate. A major trigger of our meaningful communications is "events" and how they cause/enable future events. In this talk, I will present my research on language comprehension and language generation around events, with a major focus on commonsense reasoning, world knowledge, and context modeling. I will focus on multiple context modalities such as narrative, conversational, and visual. Finally, I will highlight my recent work on language comprehension in the biomedical domain for finding cures for major diseases.
<p>
Bio: Nasrin Mostafazadeh is a senior research scientist at BenevolentAI labs. She recently received her PhD at the University of Rochester, working with James Allen in the conversational interaction and dialogue research group. During her PhD, she spent about a year at Microsoft and a summer at Google doing research on various NLP problems. Nasrin's research focuses on language comprehension, mainly studying events to predict what happens next. She has developed models for tackling various research tasks, pushing AI toward deeper language understanding, with applications ranging from story generation to vision & language. Recently, she has been working on language comprehension in the biomedical domain, with the goal of finding cures for major diseases such as cancer by leveraging millions of unstructured documents.
<br>
</font>
</span>
</td></tr></table><br><br>
<div class="nlheader"><h3>Past talks:</h3></div>
<table width=90% border=0 cellspacing=1 cellpadding=4 bgcolor="#FFFFFF" align=center>
<tr class="seminarTableHeader"><td align=left width=14%>
<b>Date</b>
</td><td align=left width=25%>
<b>Speaker</b>
</td><td align=left>
<b>Title</b>
</td></tr>
<tr class="speakerItem" border=0 >
<td align=left valign=top>20 Nov 2017</td>
<td align=left valign=top>Margaret Mitchell (Google)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs20_Nov_2017');">
Algorithmic Bias in Artificial Intelligence: The Seen and Unseen Factors Influencing Machine Perception of Images and Language
</a><br>
<span id=abs20_Nov_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> The success of machine learning has surged, with similar algorithmic approaches effectively solving a variety of human-defined tasks. Tasks testing how well machines can perceive images and communicate about them have exposed strong effects of different types of bias, such as selection bias and dataset bias. In this talk, I will unpack some of these biases, and how they affect machine perception today.
<p>
Bio: Margaret Mitchell is a Senior Research Scientist in Google's Research & Machine Intelligence group, working on artificial intelligence. Her research generally involves vision-language and grounded language generation, focusing on how to evolve artificial intelligence towards positive goals. This includes research on helping computers to communicate based on what they can process, as well as projects to create assistive and clinical technology from the state of the art in AI.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>17 Nov 2017</td>
<td align=left valign=top>Jonathan Gordon (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs17_Nov_2017');">
Learning and Reading
</a><br>
<span id=abs17_Nov_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> In recent years, a dramatic increase in the availability of digital text has created challenges and opportunities for learning for both humans and machines. My talk will describe research on learning commonsense knowledge from text -- despite our Gricean imperative to write down only what other people wouldn't know -- and using this for reasoning about language and the world. It will also address helping people to learn scientific knowledge by using implicit structure in a proliferation of articles, books, online courses, and other educational resources.
<p>
Bio: Jonathan Gordon is a postdoctoral researcher at the USC Information Sciences Institute, where he works with Jerry Hobbs and colleagues on the problems of learning and organizing knowledge from text. He completed a bachelor's degree in computer science at Vassar College and a Ph.D. in artificial intelligence at the University of Rochester, supervised by Lenhart Schubert.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>10 Nov 2017</td>
<td align=left valign=top>Anssi Yli-Jyrä</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs10_Nov_2017');">
On Real-Time Graph Transducers
</a><br>
<span id=abs10_Nov_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> NLP research has been fluctuating between two extreme models of computation:
finite computers and universal computers. Often a practical solution combines
both of these extremes, because formally powerful models are simulated by
physical machines that approximate them. This is especially true for recurrent
neural networks whose activation vector is the key to deeper understanding of
their emergent finite-state behavior. However, we currently have only a very
loose characterization for the finite-state property in neural networks.
In order to construct a hypothesis for a possible bottom-up organization of the
state-space of activation vectors of RNNs, I compare neural networks with
bounded Turing machines and finite-state machines, and quote recent results on
finite state models for semantic graphs. These models enjoy the nice closure
properties of weighted finite-state machines.
At the end of the talk, I sketch my vision for neural networks that perform
finite-state graph transductions in real time. Such transductions would have a
vast variety of applications in machine translation and semantic information
retrieval involving big data.
<p>
Anssi Yli-Jyrä holds the titles of Adjunct Professor (Docent) in Language
Technology at the University of Helsinki and Life Member of Clare Hall College at the University of Cambridge. He is currently a PI and a Research
Fellow of the Academy of Finland in a project concerning universality of
finite-state syntax. He has published a handbook on Hebrew and Greek morpheme
alignments in the Finnish Bible translation together with a group of Digital
Humanists, and then served the Finnish Electronic Library at CSC - IT Centre of
Science where he built an internet harvester and a search engine for the Finnish
WWW. In 2005, he earned his PhD from the University of Helsinki and then worked as a coordinator for the Language Bank of Finland at CSC. There he contributed to pushing his employer to what is now known as the CLARIN European
Research Infrastructure Consortium. He became the first President of SIGFSM in 2009, after fostering and organizing FSMNLP conferences for several years. In 2012-2013, he served as a Subject Head of Language Technology in his home university before visiting the Speech Group at the Department of Engineering, Cambridge University. He has supervised theses and contributed to the
theoretical basis of Helsinki Finite-State Transducer (HFST) library. In his
own research, Yli-Jyrä constantly pursues unexplored areas, applying
finite-state transducers to graphical language processing tasks such as
autosegmental phonology, constraint interaction, dependency syntax, and neural semantics. He is a qualified teacher and is interested in the occurrence of
flow in agile programming and simultaneous translation.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>03 Nov 2017</td>
<td align=left valign=top>Kai-Wei Chang (UCLA)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs03_Nov_2017');">
Structured Predictions: Practical Advancements and Applications in Natural Language Processing
</a><br>
<span id=abs03_Nov_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Many machine learning problems involve making joint
predictions over a set of mutually dependent output variables. The
dependencies between output variables can be represented by a
structure, such as a sequence, a tree, a clustering of nodes, or a
graph. Structured prediction models have been proposed for problems of
this type. In this talk, I will describe a collection
of results that improve several aspects of these approaches. Our
results lead to efficient and effective algorithms for learning structured
prediction models, which, in turn, support weak supervision signals and improve training and evaluation speed.
I will also discuss potential risks and challenges when using structured prediction models.
<p>
Bio: Kai-Wei Chang is an assistant professor in the Department of
Computer Science at the University of California, Los Angeles. He
has published broadly in machine learning and natural language processing. His
research has mainly focused on designing machine learning methods for
handling large and complex data. He has been involved in developing
several machine learning libraries, including LIBLINEAR, Vowpal
Wabbit, and Illinois-SL. He was an assistant professor at the University
of Virginia in 2016-2017. He obtained his Ph.D. from the University of
Illinois at Urbana-Champaign in 2015 and was a post-doctoral researcher at Microsoft Research in 2016.
Kai-Wei was awarded the EMNLP Best Long Paper Award (2017), KDD
Best Paper Award (2010), and the Yahoo! Key Scientific Challenges Award
(2011). Additional information is available at http://kwchang.net.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>13 Oct 2017</td>
<td align=left valign=top>Yangfeng Ji (University of Washington)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs13_Oct_2017');">
Context is Everything: From language modeling to language generation
</a><br>
<span id=abs13_Oct_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Contextual information is critical for language processing and generation. Particularly for large texts consisting of multiple sentences or paragraphs, capturing contextual information beyond sentence boundaries is important for building better language processing systems. This talk will discuss our recent effort on incorporating contextual information into language modeling and generation. It presents three models, each of which corresponds to a specific linguistic phenomenon of context shared in written texts: (i) local context from preceding sentences; (ii) semantic and pragmatic relations between adjacent sentences; and (iii) the evolution of entities (e.g., characters in novels) through coreference links in texts. The starting point of our model design is sentence-level recurrent neural network language models (RNNLMs). To capture these aspects of contextual information, we extend RNNLMs either by adding extra connections among existing network components or by adding dedicated components to encode specific linguistic information. Evaluation results show that these models outperform strong baselines and prior work on language modeling tasks. Their ability to capture contextual information is also verified by quantitative evaluation on each corresponding task, such as identifying the relation between sentences and resolving coreference ambiguity. Qualitative analysis is also included to demonstrate the ability of these models to generate text.
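<p>
To make the "extra connections" idea concrete, the following is a minimal sketch (an illustration under assumed parameter names and shapes, not the talk's actual models) of a single RNN language model step whose hidden state also receives a context vector summarizing preceding sentences:
<pre>
// h = tanh(Wx*x + Wh*hPrev + Wc*c + b); the Wc*c term is the extra
// connection that injects cross-sentence context into the hidden state.
function matVec(M, v) {
  return M.map(function (row) {
    var s = 0;
    for (var j = 0; j < v.length; j++) { s += row[j] * v[j]; }
    return s;
  });
}
function rnnStep(x, hPrev, c, p) {
  var a = matVec(p.Wx, x), b = matVec(p.Wh, hPrev), d = matVec(p.Wc, c);
  return a.map(function (ai, i) { return Math.tanh(ai + b[i] + d[i] + p.b[i]); });
}
</pre>
Dropping the Wc term recovers a plain sentence-level RNNLM step, which is the stated starting point of the model design.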
<p>
Bio: Yangfeng Ji is a postdoctoral researcher at the University of Washington, working with Noah Smith. His research interests lie in the interaction of natural language processing and machine learning. He is interested in designing machine learning models and algorithms for language processing, and is also fascinated by how linguistic knowledge helps build better learning models. He completed his Ph.D. in Computer Science at the Georgia Institute of Technology in 2016, advised by Jacob Eisenstein. He was one of the area co-chairs on Discourse and Pragmatics at ACL 2017.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>08 Sep 2017</td>
<td align=left valign=top>Leon Cheung, Nelson Liu (ISI Intern)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs08_Sep_2017');">
1) Improving Low Resource Neural Machine Translation 2) Language-Independent Translation of Out-of-Vocabulary Words
</a><br>
<span id=abs08_Sep_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> 1) Until recently, statistical models outperformed neural models in machine
translation; this changed with the introduction of the sequence-to-sequence
neural model. However, this model's performance suffers greatly
when starved of bilingual parallel data. This talk will discuss several
strategies that try to overcome this low-resource challenge, including
modifications to the sequence-to-sequence model, transfer learning, data
augmentation, and the use of monolingual data.
<p>
2) Neural machine translation is effective for language pairs with large datasets, but falls short of traditional methods (e.g. phrase or syntax-based machine translation) in the low-resource setting. However, these classic approaches struggle to translate out-of-vocabulary tokens, a limitation that is amplified when there is little training data. In this work, we augment a syntax-based machine translation system with a module that provides translations of out-of-vocabulary tokens. We present several language-independent strategies for translation of unknown tokens, and benchmark their accuracy on an intrinsic out-of-vocabulary translation task across a typologically diverse dataset of sixteen languages. Lastly, we explore the effects of using the module to add rules to a syntax-based machine translation system on overall translation quality.
<p>
Bio:
Leon Cheung is a second year undergraduate from UC San Diego. This
summer he has been working with Jon May and Kevin Knight to improve
neural machine translation for low resource languages.
<p>
Nelson Liu is an undergraduate at the University of Washington, where he works with Professor Noah Smith. His research interests lie at the intersection of machine learning and natural language processing. Previously, he worked at the Allen Institute for Artificial Intelligence on machine comprehension; he is currently a summer intern at ISI working with Professors Kevin Knight and Jonathan May.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>31 Aug 2017</td>
<td align=left valign=top>Yining Chen, Sasha Mayn (ISI Intern)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs31_Aug_2017');">
THURSDAY TALK: 1) Recurrent Neural Networks as Weighted Language Recognizers 2) Gloss-to-English: Improving Low Resource Language Translation Using Alignment Tables
</a><br>
<span id=abs31_Aug_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> 1)We investigate properties of a simple recurrent neural network (RNN) as a formal device for recognizing weighted languages. We focus on the single-layer, ReLU-activation, rational-weight RNN with softmax, a standard form of RNN used in language processing applications. We prove that many questions one may ask about such RNNs are undecidable, including consistency, equivalence, minimization, and finding the highest-weighted string. For consistent RNNs, finding the highest-weighted string is decidable, although the solution can be exponentially long in the length of the input RNN encoded in binary. Limiting to solutions of polynomial length, we prove that finding the highest-weighted string for a consistent RNN is NP-complete and APX-hard.
<p>
2) Neural Machine Translation has gained popularity in recent years and has been able to achieve impressive results. The only caveat is that millions of parallel sentences are needed in order to train the system properly, and in a low-resource scenario that amount of data simply may not be available. This talk will discuss strategies for addressing the data scarcity problem, particularly using alignment tables to make use of parallel data from higher-resource language pairs and creating synthetic in-domain data.
<p>
Bio: Yining Chen is a third-year undergraduate student at Dartmouth College. She is a summer intern at ISI working with Professor Kevin Knight and Professor Jonathan May.
<p>
Sasha Mayn is a summer intern at ISI's Natural Language Group. She is particularly interested in machine translation and language generation. Last summer Sasha interned at the PanLex Project in Berkeley, where she was responsible for pre-processing digital dictionaries and entering them into PanLex's multilingual database. This summer she has been working on improving neural machine translation strategies for low-resource languages under the supervision of Jon May and Kevin Knight.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>18 Aug 2017</td>
<td align=left valign=top>Marjan Ghazvininejad (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs18_Aug_2017');">
Neural Creative Language Generation
</a><br>
<span id=abs18_Aug_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Natural language generation (NLG) is a well-studied and still very challenging field in natural language processing. One of the less studied NLG tasks is the generation of creative texts such as jokes, puns, or poems. Multiple reasons contribute to the difficulty of research in this area. First, no immediate application exists for creative language generation. This has made the research on creative NLG extremely diverse, having different goals, assumptions, and constraints. Second, no quantitative measure exists for creative NLG tasks. Consequently, it is often difficult to tune the parameters of creative generation models and drive improvements to these systems. Finally, rule-based systems for creative language generation are not yet combined with deep learning methods.
<p>
In this work, we address these challenges for poetry generation which is one of the main areas of creative language generation. We introduce password poems as a novel application for poetry generation. Furthermore, we combine finite-state machinery with deep learning models in a system for generating poems for any given topic. We introduce a quantitative metric for evaluating the generated poems and build the first interactive poetry generation system that enables users to revise system generated poems by adjusting style configuration settings like alliteration, concreteness and the sentiment of the poem.
<p>
To improve the poetry generation system, we borrow ideas from human literature and develop a poetry translation system. We propose to study human poetry translation and to measure the language variation in this process. We will study how human poetry translation differs from human translation in general and whether a translator translates poetry more freely. Then we will use our findings to develop a machine translation system specifically for translating poetry and to propose metrics for evaluating the quality of poetry translation.
<p>
<p>
Bio: Marjan Ghazvininejad is a PhD student at ISI working with Professor Kevin Knight.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>11 Aug 2017</td>
<td align=left valign=top>Nima Pourdamghani (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs11_Aug_2017');">
Improving machine translation from low resource languages
</a><br>
<span id=abs11_Aug_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 6th Floor Conference Room [689]<br>
<b>Abstract:</b> Statistical machine translation (MT) often needs a large corpus of parallel translated sentences in order to achieve good performance. This limits the use of current MT technologies to a few resource-rich languages. Assume an incident happens in an area with a low-resource language. For a quick response, we need to build an MT system with available data, as finding or translating new parallel data is expensive and time consuming. For many languages this means that we only have a small amount of often out-of-domain parallel data (e.g. a Bible or Ubuntu manual). This talk is about ways to improve machine translation in low resource scenarios. I'll talk about the use of monolingual data and parallel data from related languages to improve machine translation from the low resource language into English.
<p>
Bio: Nima Pourdamghani is a fourth year Ph.D. student at ISI. He works with Professor Kevin Knight on machine translation from low resource languages.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>21 Jul 2017</td>
<td align=left valign=top>Xing Shi (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs21_Jul_2017');">
Neural Sequence Models: Interpretation and Augmentation
</a><br>
<span id=abs21_Jul_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Recurrent neural networks (RNN) have been successfully applied to various Natural Language Processing tasks, including language modeling, machine translation, text generation, etc. However, several obstacles still stand in the way: First, due to the RNN's distributional nature, few interpretations of its internal mechanism are obtained, and it remains a black box. Second, because of the large vocabulary sets involved, text generation is very time-consuming. Third, there is no flexible way to constrain the generation of the sequence model with external knowledge. Last, huge amounts of training data must be collected to guarantee the performance of these neural models, whereas annotated data, such as the parallel data used in machine translation, are expensive to obtain. This work aims to address the four challenges mentioned above.
<p>
To further understand the internal mechanism of the RNN, I choose neural machine translation (NMT) systems as a testbed. I first investigate how NMT outputs target strings of appropriate lengths, locating a collection of hidden units that learns to explicitly implement this functionality. Then I investigate whether NMT systems learn source language syntax as a by-product of training on string pairs. I find that both local and global syntactic information about source sentences is captured by the encoder. Different types of syntax are stored in different layers, with different degrees of concentration.
<p>
To speed up text generation, I propose two novel GPU-based algorithms: 1) utilizing source/target word alignment information to shrink the target-side run-time vocabulary; and 2) applying locality sensitive hashing to find nearest word embeddings. Both methods lead to a 2-3x speedup on four translation tasks without hurting machine translation accuracy as measured by BLEU. Furthermore, I integrate a finite state acceptor into the neural sequence model during generation, providing a flexible way to constrain the output, and I successfully apply this to poem generation in order to control the pentameter and rhyme.
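<p>
As a rough sketch of the locality sensitive hashing step (random-hyperplane signatures, a standard scheme; the sizes and bucket layout here are illustrative assumptions rather than the exact algorithm in this work):
<pre>
// Draw random hyperplanes; each contributes one bit of a vector's signature.
function randomHyperplanes(numBits, dim) {
  var planes = [];
  for (var i = 0; i < numBits; i++) {
    var v = [];
    for (var j = 0; j < dim; j++) { v.push(Math.random() * 2 - 1); }
    planes.push(v);
  }
  return planes;
}
// Each bit records which side of a hyperplane the embedding falls on.
// Vectors with equal signatures share a bucket, so nearest word embeddings
// can be retrieved from a small bucket instead of scanning the vocabulary.
function signature(vec, planes) {
  return planes.map(function (p) {
    var dot = 0;
    for (var j = 0; j < vec.length; j++) { dot += vec[j] * p[j]; }
    return dot >= 0 ? '1' : '0';
  }).join('');
}
</pre>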
<p>
Based on the above success, I propose to work on the following: 1) Go one step further towards interpretation: find unit/feature mappings, learn the units' temporal behavior, and understand different hyper-parameter settings. 2) Improve NMT performance on low-resource language pairs by fusing an external language model, feeding explicit target-side syntax, and utilizing better word embeddings.
<p>
<p>
<p>
Bio: Xing Shi is a PhD student at ISI working with Prof. Kevin Knight.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>14 Jul 2017</td>
<td align=left valign=top>Sorcha Gilroy (University of Edinburgh)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs14_Jul_2017');">
Parsing Graphs with Regular Graph Grammars
</a><br>
<span id=abs14_Jul_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Recently, several datasets have become available which represent natural language phenomena as graphs. Hyperedge Replacement Languages (HRL) have been the focus of much attention as a formalism to represent the graphs in these datasets. Chiang et al. (2013) prove that HRL graphs can be parsed in polynomial time with respect to the size of the input graph. We believe that HRL may be more expressive than is necessary to represent semantic graphs and we propose looking at Regular Graph Languages (RGL; Courcelle, 1991), which is a subfamily of HRL, as a possible alternative. We provide a top-down parsing algorithm for RGL that runs in time linear in the size of the input graph.
<p>
Bio: Sorcha is a 2nd year PhD student at the University of Edinburgh and is a student in the Center for Doctoral Training in Data Science. Her PhD is focused on formal languages of graphs for NLP and her supervisors are Adam Lopez and Sebastian Maneth. She completed her undergraduate degree in mathematical sciences at University College Cork and her master's degree in data science at the University of Edinburgh. She is at ISI as an intern in the NLP group.
<p>
Live here: http://webcastermshd.isi.edu/Mediasite/Play/c523b7ef95b443e8b29cfac3092e00081d
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>07 Jul 2017</td>
<td align=left valign=top>Amir Hossein Yazdavar (Wright State University)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs07_Jul_2017');">
Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media
</a><br>
<span id=abs07_Jul_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 6th Floor Large Conference Room [689]<br>
<b>Abstract:</b> With the rise of social media, millions of people routinely express their moods, feelings and daily struggles with mental health issues on social media platforms like Twitter. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of clinical depression from tweets obtained unobtrusively. Based on the analysis of tweets crawled from users with self-reported depressive symptoms in their Twitter profiles, we demonstrate the potential of detecting clinical depression symptoms which emulate the PHQ-9 questionnaire clinicians use today. Our study uses a semi-supervised statistical model to evaluate how the duration of these symptoms and their expression on Twitter (in terms of word usage patterns and topical preferences) align with the medical findings reported via the PHQ-9. Our proactive and automatic screening tool is able to identify clinical depressive symptoms with an accuracy of 68% and precision of 72%.
<p>
Bio: Amir is a 2nd year Ph.D. researcher at the Kno.e.sis Center, Wright State University, OH, under the guidance of Prof. Amit P. Sheth, the founder and executive director of the Kno.e.sis Center. He is broadly interested in machine learning (incl. deep learning) and the semantic web (incl. creation and use of knowledge graphs) and their applications to NLP/NLU and social media analytics. He has a particular interest in the extraction of subjective information, with applications to search, social, and biomedical/health applications. At the Kno.e.sis Center, he is working on several real-world projects mainly focused on studying human behavior on the web via natural language understanding and social media analytics, utilizing machine learning (deep learning) and knowledge graph techniques. In particular, his focus is to enhance statistical models via domain semantics and guidance from offline behavioral knowledge to understand users' behavior from unstructured and large-scale social data.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>16 Jun 2017</td>
<td align=left valign=top>Mayank Kejriwal (ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs16_Jun_2017');">
From Noisy Information Extraction to Rich Information Retrieval in Unusual Domains
</a><br>
<span id=abs16_Jun_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Information Extraction (IE), or the algorithmic extraction of named entities, relations, and attributes of interest from text-rich data, is an important natural language processing task. In this talk, I will discuss the relationship of IE to fine-grained Information Retrieval (IR), especially when the domain of interest is unusual, i.e. computationally under-studied, socially consequential, and difficult to analyze. In particular, such domains exhibit a significant long-tail effect, and their language models are obfuscated. Using real-world examples and results obtained in recent DARPA MEMEX evaluations, I will discuss how our search system uses semantic strategies to usefully facilitate complex information needs of investigative users in the human trafficking domain, even when IE outputs are extremely noisy. I briefly report recent results obtained from a user study conducted by DARPA, and the lessons learned thereof for both IE and IR research.
<p>
Bio: Mayank Kejriwal is a computer scientist in the Information integration group at ISI. He received his Ph.D. from the University of Texas at Austin under Daniel P. Miranker. His dissertation involved domain-independent linking and resolving of structured Web entities at scale, and was published as a book in the Studies in the Semantic Web series. At ISI, he is involved in the DARPA MEMEX, LORELEI and D3M projects. His current research sits at the intersection of knowledge graph construction, search, inference and analytics, especially over Web corpora in unusual social domains.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>09 Jun 2017</td>
<td align=left valign=top>Benjamin Girault (USC)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs09_Jun_2017');">
Introduction to Graph Signal Processing: Tools for Harmonic Analysis on Irregular Structures.
</a><br>
<span id=abs09_Jun_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 6th Floor Conference Room [689]<br>
<b>Abstract:</b> During the past few years, graph signal processing has been extending the field
of signal processing on Euclidean spaces to irregular spaces represented by
graphs. We have seen successes ranging from the Fourier transform, to
wavelets, vertex-frequency (time-frequency) decomposition, sampling theory,
uncertainty principle, or convolutive filtering. This presentation introduces
the field, the type of signals involved, and how harmonic analysis is
performed.
<p>
Bio: Benjamin Girault received his License (B.Sc.) and his Master (M.Sc.) in France
from École Normale Supérieure de Cachan, France, in 2009 and 2012 respectively
in the field of theoretical computer science. He then received his PhD in
computer science from École Normale Supérieure de Lyon, France, in December
2015. His dissertation entitled "Signal Processing on Graphs - Contributions
to an Emerging Field" focuses on extending the classical definition of
stationary temporal signals to stationary graph signals. Currently, he is a
postdoctoral scholar with Professors Antonio Ortega and Shri Narayanan at the
University of Southern California continuing his work on graph signal
processing with a focus on applying these tools to understanding human
behavior.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>26 May 2017</td>
<td align=left valign=top>Yannis Konstas (UW)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs26_May_2017');">
Building Adaptable and Scalable Natural Language Generation Systems
</a><br>
<span id=abs26_May_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Traditionally, computers communicate with humans by converting computer-readable input to human-interpretable output, for example via graphical user interfaces. My research focuses on building programs that automatically generate textual output from computer-readable input. The majority of existing Natural Language Generation (NLG) systems use hard-wired rules or templates in order to capture the input for every different application and rely on small manually annotated corpora. In this talk, I will present a framework for building NLG systems using Neural Network architectures. The approach makes no domain-specific modifications to the input and benefits from training on very large unannotated corpora. It achieves state-of-the-art performance on a number of tasks, including generating text from meaning representations and source code. Such a system can have direct applications to intelligent conversation agents, source code assistant tools, and semantic-based Machine Translation.
<p>
Bio: Ioannis Konstas is a postdoctoral researcher at the University of Washington, Seattle, collaborating with Prof. Luke Zettlemoyer since 2015. His main research interest focuses on the area of Natural Language Generation (NLG) with an emphasis on data-driven deep learning methods.
He received a BSc in Computer Science from AUEB (Greece) in 2007 and an MSc in Artificial Intelligence from the University of Edinburgh (2008). He continued his studies at the University of Edinburgh and received his Ph.D. in 2014. He previously worked as a Research Assistant at the University of Glasgow (2008) and as a postdoctoral researcher at the University of Edinburgh (2014).
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>05 May 2017</td>
<td align=left valign=top>Sayan Ghosh (USC/ICT)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs05_May_2017');">
Representation Learning for Human Affect Recognition (PhD Proposal Practice Talk)
</a><br>
<span id=abs05_May_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Recent advances in end-to-end representation learning have made impressive strides in achieving state-of-the-art results in perception problems on speech, image and natural language. However, the area of affect understanding has mostly relied on off-the-shelf features to solve problems in emotion recognition, multi-modal fusion and generative modeling of affective speech and language. The potential impact of representation learning approaches to this area remains ripe for exploration. My thesis proposal is an important step in this direction. Firstly, I present an overview of my work on AU (Action Unit) detection, speech emotion recognition and glottal inverse filtering through speech modeling. Secondly, I introduce Affect-LM, a novel neural language model for affective text generation which exploits prior knowledge through a dictionary of emotionally colored words (such as the LIWC tool). Finally, I state some upcoming problems in representation learning for affect from speech and multi-modal language modeling which I plan to work on for the remainder of my degree.
<p>
Sayan is a fourth-year PhD student at the University of Southern California, working in the Behavior Analytics and Machine Learning Group at ICT (Institute for Creative Technologies) with Prof. Stefan Scherer. He is working on research towards building learning systems for better sensing of human behavior and emotion, and integrating deep learning techniques with human affect. His areas of interest include, but are not limited to, deep learning, machine perception, affective computing, speech/signal processing, and generative modeling.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>28 Apr 2017</td>
<td align=left valign=top>Andreas Stuhlmüller (Stanford)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs28_Apr_2017');">
Modeling Dialog using Probabilistic Programs
</a><br>
<span id=abs28_Apr_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> How can we effectively explore the space of automated dialog systems? In this talk, I introduce WebPPL, a probabilistic programming language that provides a wide range of inference and optimization algorithms out of the box. This language makes it easy to express and combine probabilistic models, including regression and categorization models, highly structured cognitive models, models of agents that make sequential plans, and deep neural nets. I show that this also includes recent sequence-to-sequence architectures for dialog. I then use this framework to implement *dialog automation using workspaces*, a variation on these architectures that is aimed at dialogs that require sufficiently deep reasoning between utterances that it is difficult to learn how to automate them from transcripts alone.
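<p>
As a minimal sketch of what a WebPPL program looks like (the toy coin-weight model and the particular inference settings are illustrative assumptions, not examples from the talk):
<pre>
// Infer a coin's weight from three observed flips.
var model = function() {
  var weight = uniform(0, 1);              // prior over the coin's weight
  observe(Bernoulli({p: weight}), true);   // condition on observed flips
  observe(Bernoulli({p: weight}), true);
  observe(Bernoulli({p: weight}), false);
  return weight;                           // posterior quantity of interest
};
Infer({model: model, method: 'MCMC', samples: 1000});
</pre>
Swapping 'MCMC' for another method (e.g. SMC) changes the inference algorithm without changing the model, which is the out-of-the-box flexibility described above.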
<p>
Bio: Andreas Stuhlmüller is a post-doctoral researcher at Stanford, working in Prof. Noah Goodman's Computation & Cognition lab, and founder of Ought Inc. Previously, he received his Ph.D. in Brain and Cognitive Sciences from MIT, where he was part of Prof. Josh Tenenbaum's Computational Cognitive Science group. He has worked on the design and implementation of probabilistic programming languages, on their application to cognitive modeling, and recently on dialog systems. He is broadly interested in leveraging machine learning to help people think.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>21 Apr 2017</td>
<td align=left valign=top>Kallirroi Georgila (USC/ICT)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs21_Apr_2017');">
Reinforcement learning of negotiation dialogue policies
</a><br>
<span id=abs21_Apr_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> The dialogue policy of a dialogue system decides on what dialogue move (also called "action") the system should make given the dialogue context (also called "dialogue state"). Building hand-crafted dialogue policies is a hard task, and there is no guarantee that the resulting policies will be optimal. This issue has motivated the dialogue community to use statistical methods for automatically learning dialogue policies, the most popular of which is reinforcement learning (RL). However, to date, RL has mainly been used to learn dialogue policies in slot-filling applications (e.g., restaurant recommendation, flight reservation, etc.), largely ignoring other more complex genres of dialogue such as negotiation. This talk presents challenges in reinforcement learning of negotiation dialogue policies. The first part of the talk focuses on applying RL to a two-party multi-issue negotiation domain. Here the main challenges are the very large state and action space, and learning negotiation dialogue policies that can perform well for a variety of negotiation settings, including against interlocutors whose behavior has not been observed before. Good negotiators try to adapt their behaviors based on their interlocutors' behaviors. However, current approaches to using RL for dialogue management assume that the user's behavior does not change over time. In the second part of the talk, I will present an experiment that deals with this problem in a resource allocation negotiation scenario.
<p>
Kallirroi Georgila is a Research Assistant Professor at the Institute for Creative Technologies (ICT) at the University of Southern California (USC) and at USC's Computer Science Department. Before joining USC/ICT in 2009 she was a Research Scientist at the Educational Testing Service (ETS) and before that a Research Fellow at the School of Informatics at the University of Edinburgh. Her research interests include all aspects of spoken dialogue processing with a focus on reinforcement learning of dialogue policies, expressive conversational speech synthesis, and speech recognition. She has served on the organizing, senior, and program committees of many conferences and workshops. Her research work is funded by the National Science Foundation and the Army Research Office.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>14 Apr 2017</td>
<td align=left valign=top>Kevin Knight (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs14_Apr_2017');">
Why is it harder to build a tic-tac-toe playing robot than a tic-tac-toe playing program?
</a><br>
<span id=abs14_Apr_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> I wanted to understand why it's so hard to build working robots, so I programmed one to play tic-tac-toe. Now I understand a lot better! I thought I'd relate my experience right now, just in case I later become more knowledgeable and impossible to understand.
<p>
Kevin Knight is a Research Director at the Information Sciences Institute (ISI) of the University of Southern California (USC), and a Professor in the USC Computer Science Department. He received a PhD in computer science from Carnegie Mellon University and a bachelor's degree from Harvard University. Dr. Knight's research interests include statistical machine translation, natural language generation, automata theory, and decipherment of historical manuscripts.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>07 Apr 2017</td>
<td align=left valign=top>Reihane Boghrati (USC)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs07_Apr_2017');">
ConversAtion level Syntax SImilarity Metric (CASSIM)
</a><br>
<span id=abs07_Apr_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> The syntax and semantics of human language can illuminate many individual psychological differences and important dimensions of social interaction. Thus, analysis of language provides important insights into the underlying psychological properties of individuals and groups. Accordingly, psychological and psycholinguistic research has begun incorporating sophisticated representations of semantic content to better understand the connection between word choice and psychological processes. While the majority of language analysis work in psychology has focused on semantics, psychological information is encoded not just in what people say, but how they say it. We introduce ConversAtion level Syntax SImilarity Metric (CASSIM), a novel method for calculating conversation-level syntax similarity. CASSIM estimates the syntax similarity between conversations by automatically generating syntactical representations of the sentences in conversations, estimating the structural differences between them, and calculating an optimized estimate of the conversation-level syntax similarity. Also, we conduct a series of analyses with CASSIM to investigate syntax accommodation in social media discourse. Further, building off of CASSIM, we propose ConversAtion level Syntax SImilarity Metric-Group Representations (CASSIM-GR). This extension builds generalized representations of syntactic structures of documents, thus allowing researchers to distinguish between people and groups based on syntactic differences.
<p>
Bio: Reihane is a fourth-year Ph.D. student at USC, working with Morteza Dehghani in the Computational Social Science Laboratory. She is interested in introducing new methods and computational models to psychology, and more broadly to the social sciences. Her work spans the boundary between natural language processing and psychology, as does her intellectual curiosity.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>31 Mar 2017</td>
<td align=left valign=top>Danqi Chen (Stanford)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs31_Mar_2017');">
Towards the Machine Comprehension of Text
</a><br>
<span id=abs31_Mar_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved goal of NLP. The task of reading comprehension (i.e., question answering over unstructured text) has received vast attention recently, and some progress has been made thanks to the creation of large-scale datasets and development of attention-based neural networks.
In this talk, I'll first present how we advance this line of research. I'll show how simple models can achieve (nearly) state-of-the-art performance on recent benchmarks, including the CNN/Daily Mail datasets and the Stanford Question Answering Dataset. I'll focus on explaining the logical structure behind these neural architectures and discussing the advantages as well as the limits of current approaches.
Lastly, I'll talk about our recent work on scaling up machine comprehension systems, which attempt to answer open-domain questions at the full Wikipedia scale. We demonstrate the promise of our system, as well as set up new benchmarks, by evaluating on multiple existing QA datasets.
<p>
Bio: Danqi Chen is a Ph.D. candidate in Computer Science at Stanford University, advised by Prof. Christopher Manning. Her main research interests lie in deep learning for natural language processing and understanding, and she is particularly interested in the intersection between text understanding and knowledge reasoning. She has been working on machine comprehension, question answering, knowledge base population and dependency parsing. She is a recipient of the Facebook Fellowship and the Microsoft Research Women's Fellowship, and received an outstanding paper award at ACL'16. Prior to Stanford, she received her B.S. from Tsinghua University in 2012.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>27 Mar 2017</td>
<td align=left valign=top>Stephen Kobourov (Arizona)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs27_Mar_2017');">
Analyzing the Language of Food on Social Media
</a><br>
<span id=abs27_Mar_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> We investigate the predictive power behind the language of
food on social media. We collect a corpus of over three million
food-related posts from Twitter and demonstrate that many latent
population characteristics can be directly predicted from this data:
overweight rate, diabetes rate, political leaning, and home
geographical location of authors. For all tasks, our language-based
models significantly outperform the majority-class baselines.
Performance is further improved with more complex natural language
processing, such as topic modeling. We analyze which textual features
have most predictive power for these datasets, providing insight into
the connections between the language of food, geographic locale, and
community characteristics. Lastly, we design and implement an online
system for real-time query and visualization of the dataset.
Visualization tools, such as geo-referenced heatmaps,
semantics-preserving wordclouds and temporal histograms, allow us to
discover more complex, global patterns mirrored in the language of
food.
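<p>
As a concrete illustration of the baseline comparison mentioned above, a minimal sketch (assuming scikit-learn; a toy function, not the speakers' code) might pit a bag-of-words classifier against the majority-class baseline:
<pre>
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Compare a simple language-based model against the majority-class
# baseline for predicting a community characteristic from posts.
def compare_to_majority(posts, labels):
    bow = make_pipeline(CountVectorizer(min_df=2),
                        LogisticRegression(max_iter=1000))
    majority = DummyClassifier(strategy="most_frequent")
    return (cross_val_score(bow, posts, labels, cv=5).mean(),
            cross_val_score(majority, posts, labels, cv=5).mean())
</pre>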
<p>
Stephen Kobourov is a Professor of Computer Science at the University
of Arizona. He completed BS degrees in Mathematics and Computer
Science at Dartmouth College in 1995, and a PhD in Computer Science at
Johns Hopkins University in 2000. He has worked as a Research
Scientist at AT&T Research Labs, a Humboldt Fellow at the University
of Tübingen in Germany, and a Distinguished Fulbright Chair at Charles
University in Prague.
<p>
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>24 Mar 2017</td>
<td align=left valign=top>Sameer Singh (UCI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs24_Mar_2017');">
Intuitive Interactions with Black-box Machine Learning
</a><br>
<span id=abs24_Mar_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Machine learning is at the forefront of many recent advances in natural language processing, enabled in part by the sophisticated models and algorithms that have been recently introduced. However, as a consequence of this complexity, machine learning essentially acts as a black-box as far as users are concerned. It is incredibly difficult to understand, predict, or "fix" the behavior of NLP models that have been deployed. In this talk, I propose interpretable representations that allow users and machine learning models to interact with each other: enabling machine learning models to provide explanations as to why a specific prediction was made and enabling users to inject domain knowledge into machine learning. The first part of the talk introduces an approach to estimate local, interpretable explanations for black-box classifiers and describes an approach to summarize the behavior of the classifier by selecting which explanations to show to the user. I will also briefly describe work on "closing the loop", i.e., allowing users to provide feedback on the explanations to improve the model, for the task of relation extraction, an important subtask of natural language processing. In particular, we introduce approaches to both explain the relation extractor using logical statements and to inject symbolic domain knowledge into relational embeddings to improve the predictions. I present experiments to demonstrate that an interactive interface is effective in providing users an understanding of, and an ability to improve, complex black-box machine learning systems.
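<p>
The local-explanation idea can be sketched compactly. The toy version below (assuming scikit-learn and NumPy; names and details are illustrative, not the speaker's method) perturbs a sentence by masking words and fits a weighted linear surrogate whose coefficients score each word's influence on the black-box prediction:
<pre>
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(text, predict_proba, n_samples=500, seed=0):
    """`predict_proba` is any black-box function string -> P(class)."""
    rng = np.random.default_rng(seed)
    words = text.split()
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0, :] = 1  # keep the unperturbed instance in the sample
    variants = [" ".join(w for w, keep in zip(words, m) if keep)
                for m in masks]
    y = np.array([predict_proba(t) for t in variants])
    weights = masks.mean(axis=1)  # proximity: fraction of words kept
    surrogate = Ridge(alpha=1.0).fit(masks, y, sample_weight=weights)
    return sorted(zip(words, surrogate.coef_),
                  key=lambda pair: -abs(pair[1]))
</pre>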
<p>
Bio: Sameer Singh is an Assistant Professor of Computer Science at the University of California, Irvine. He is working on large-scale and interactive machine learning applied to information extraction and natural language processing. Until recently, Sameer was a Postdoctoral Research Associate at the University of Washington. He received his PhD from the University of Massachusetts, Amherst in 2014, during which he also interned at Microsoft Research, Google Research, and Yahoo! Labs on massive-scale machine learning. He was selected as a DARPA Riser, was awarded the Adobe Research Data Science Award, won the grand prize in the Yelp dataset challenge, has been awarded the Yahoo! Key Scientific Challenges fellowship, and was a finalist for the Facebook PhD fellowship. Sameer has published more than 30 peer-reviewed papers at top-tier machine learning and natural language processing conferences and workshops.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>17 Mar 2017</td>
<td align=left valign=top>Kuan Liu (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs17_Mar_2017');">
Heterogeneous Attribute Embedding and Sequence Modeling for Recommendation with Implicit Feedback
</a><br>
<span id=abs17_Mar_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Incorporating implicit feedback into a recommender system is a challenging problem due to sparse and noisy observations. I will present our approaches that exploit heterogeneous attributes and sequence properties within the observations. We build a neural network framework to embed heterogeneous attributes in an end-to-end fashion, and apply the framework to three sequence-based models. Our methods achieve significant improvements on four large-scale datasets compared to state-of-the-art baseline models (30% to 90% relative increase in NDCG). Experimental results show that attribute embedding and sequence modeling both lead to improvements and, further, that our novel output attribute layer plays a crucial role. I will conclude with our exploratory studies that investigate why sequence modeling works well in recommendation systems and advocate its use for large-scale recommendation tasks.
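<p>
For reference, the evaluation metric quoted above (NDCG) can be computed per user as in this small sketch (assuming NumPy; binary or graded relevances both work):
<pre>
import numpy as np

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: `ranked_relevances` holds the true relevance of each
    recommended item, in the order the model ranked them."""
    rel = np.asarray(ranked_relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    ideal = np.sort(np.asarray(ranked_relevances, dtype=float))[::-1][:k]
    idcg = float((ideal * discounts).sum())
    return dcg / idcg if idcg > 0 else 0.0
</pre>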
<p>
Bio:
Kuan Liu is a fifth-year Ph.D. student at USC/ISI working with Prof. Prem Natarajan. Before that, he received a bachelor's degree from Tsinghua University with a major in Computer Science. His research interests include machine learning, large-scale optimization, deep learning, and their applications to recommender systems and network analysis.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>10 Mar 2017</td>
<td align=left valign=top>He He (Stanford)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs10_Mar_2017');">
Learning agents that interact with humans
</a><br>
<span id=abs10_Mar_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> The future of virtual assistants, self-driving cars, and smart homes requires intelligent agents that work intimately with users. Instead of passively following orders given by users, an interactive agent must actively collaborate with people through communication, coordination, and user-adaptation. In this talk, I will present our recent work towards building agents that interact with humans. First, we propose a symmetric collaborative dialogue setting in which two agents, each with some private knowledge, must communicate in natural language to achieve a common goal. We present a human-human dialogue dataset that poses new challenges to existing models, and propose a neural model with dynamic knowledge graph embedding. Second, we study the user-adaptation problem in quizbowl - a competitive, incremental question-answering game. We show that explicitly modeling different human behaviors leads to more effective policies that exploit sub-optimal players. I will conclude by discussing opportunities and open questions in learning interactive agents.
<p>
He He is a post-doc at Stanford University, working with Percy Liang. Prior to Stanford, she earned her Ph.D. in Computer Science at the University of Maryland, College Park, advised by Hal Daumé III and Jordan Boyd-Graber. Her interests are at the interface of machine learning and natural language processing. She develops algorithms that acquire information dynamically and do inference incrementally, with an emphasis on problems in natural language processing. She has worked on dependency parsing, simultaneous machine translation, question answering, and more recently dialogue systems.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>07 Mar 2017</td>
<td align=left valign=top>Alessandro Achille (UCLA)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs07_Mar_2017');">
Information Dropout: Learning Optimal Representations Through Noisy Computation
</a><br>
<span id=abs07_Mar_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 11:00 am - 12:00 pm<br>
<b>Location:</b> 6th Floor Conference Room [689]<br>
<b>Abstract:</b> The cross-entropy loss commonly used in deep learning is closely related to the defining properties of optimal representations, but does not enforce some of the key properties. We show that this can be solved by adding a regularization term, which is in turn related to injecting multiplicative noise in the activations of a Deep Neural Network, a special case of which is the common practice of dropout. We show that our regularized loss function can be efficiently minimized using Information Dropout, a generalization of dropout rooted in information theoretic principles that automatically adapts to the data and can better exploit architectures of limited capacity. When the task is the reconstruction of the input, we show that our loss function yields a Variational Autoencoder as a special case, thus providing a link between representation learning, information theory and variational inference. Finally, we prove that we can promote the creation of disentangled representations simply by enforcing a factorized prior, a fact that has been observed empirically in recent work. Our experiments validate the theoretical intuitions behind our method, and we find that information dropout achieves a comparable or better generalization performance than binary dropout, especially on smaller models, since it can automatically adapt the noise to the structure of the network, as well as to the test sample.
arXiv: https://arxiv.org/abs/1611.01353
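<p>
A minimal sketch of the mechanism (assuming PyTorch; a simplified log-normal noise variant, not the authors' exact formulation or penalty):
<pre>
import torch
import torch.nn as nn

class InformationDropout(nn.Module):
    """Multiplicative log-normal noise whose scale is predicted per unit,
    plus an (illustrative) regularizer that penalizes low-noise units."""
    def __init__(self, dim, max_alpha=0.7):
        super().__init__()
        self.alpha_layer = nn.Linear(dim, dim)  # noise scale from input
        self.max_alpha = max_alpha

    def forward(self, x):
        if not self.training:           # noiseless activations at test time
            return x, x.new_zeros(())
        alpha = self.max_alpha * torch.sigmoid(self.alpha_layer(x))
        eps = torch.randn_like(x)
        noise = torch.exp(alpha * eps)  # log-noise ~ N(0, alpha^2)
        penalty = -torch.log(alpha / self.max_alpha + 1e-8).mean()
        return x * noise, penalty       # add `penalty` to the task loss
</pre>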
<p>
Bio: Alessandro Achille is a PhD student in Computer Science at UCLA, working with Prof. Stefano Soatto. He focuses on variational inference, representation learning, and their applications to deep learning and computer vision. Before coming to UCLA, he obtained a Master's degree in Pure Math at the Scuola Normale Superiore in Pisa, where he studied model theory and algebraic topology with Prof. Alessandro Berarducci.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>03 Mar 2017</td>
<td align=left valign=top>Lili Mou (Peking University)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs03_Mar_2017');">
Coupling distributed and symbolic execution for natural language queries
</a><br>
<span id=abs03_Mar_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> In this talk, Lili will introduce his work "Coupling distributed and symbolic execution for natural language queries," which was done during his internship at Huawei Technologies (Hong Kong), supervised by Dr. Zhengdong Lu. The study proposes a unified perspective of neural and symbolic execution for semantic parsing, and shows how we can make use of both neural and symbolic worlds.
<p>
Lili Mou received his BS degree in computer science from Peking University in 2012. He is now a Ph.D. student, supervised by Profs. Zhi Jin, Ge Li, and Lu Zhang. His recent research interests include deep learning applied to natural language processing as well as programming language processing. He has publications at top conferences like AAAI, ACL, CIKM, COLING, EMNLP, IJCAI, and INTERSPEECH.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>23 Feb 2017</td>
<td align=left valign=top>Nanyun Peng (Johns Hopkins)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs23_Feb_2017');">
Representation Learning with Joint Models for Information Extraction
</a><br>
<span id=abs23_Feb_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> There is abundant knowledge out there carried in the form of natural language texts, such as social media posts, scientific research literature, medical records, etc., which grows at an astonishing rate. Yet this knowledge is mostly inaccessible to computers and overwhelming for human experts to absorb. Information extraction (IE) processes raw texts to produce machine understandable structured information, thus dramatically increasing the accessibility of knowledge through search engines, interactive AI agents, and medical research tools. However, traditional IE systems assume abundant human annotations for training high quality machine learning models, which is impractical when trying to deploy IE systems to a broad range of domains, settings and languages. In this talk, I will present how to leverage the distributional statistics of characters and words, the annotations for other tasks and other domains, and the linguistics and problem structures, to combat the problem of inadequate supervision, and conduct information extraction with scarce human annotations.
<p>
Nanyun Peng is a PhD candidate in the Department of Computer Science at Johns Hopkins University, affiliated with the Center for Language and Speech Processing and advised by Dr. Mark Dredze. She is broadly interested in Natural Language Processing, Machine Learning, and Information Extraction. Her research focuses on using deep learning for information extraction with scarce human annotations. Nanyun is the recipient of the Johns Hopkins University 2016 Fred Jelinek Fellowship. She has completed two research internships at IBM T.J. Watson Research Center, and Microsoft Research Redmond. She holds a master's degree in Computer Science and BAs in Computational Linguistics and Economics, all from Peking University.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>10 Feb 2017</td>
<td align=left valign=top>Yonatan Bisk (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs10_Feb_2017');">
The Limits of Unsupervised Syntax and the Importance of Grounding in Language Acquisition
</a><br>
<span id=abs10_Feb_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 6th Floor Conference Room [689]<br>
<b>Abstract:</b> The future of self-driving cars, personal robots, smart homes, and intelligent assistants hinges on our ability to communicate with computers. The failures and miscommunications of Siri-style systems are untenable and become more problematic as machines become more pervasive and are given more control over our lives. Despite the creation of massive proprietary datasets to train dialogue systems, these systems still fail at the most basic tasks. Further, their reliance on big data is problematic. First, successes in English cannot be replicated in most of the 6,000+ languages of the world. Second, while big data has been a boon for supervised training methods, many of the most interesting tasks will never have enough labeled data to actually achieve our goals. It is, therefore, important that we build systems which can learn from naturally occurring data and grounded, situated interactions.
<p>
In this talk, I will discuss work from my thesis on the unsupervised acquisition of syntax which harnesses unlabeled text in over a dozen languages. This exploration leads us to novel insights into the limits of semantics-free language learning. Having isolated these stumbling blocks, I’ll then present my recent work on language grounding where we attempt to learn the meaning of several linguistic constructions via interaction with the world.
<p>
Yonatan Bisk’s research focuses on Natural Language Processing from naturally occurring data (unsupervised and weakly supervised data). He is a postdoc researcher with Daniel Marcu at USC’s Information Sciences Institute. Previously, he received his PhD from the University of Illinois at Urbana-Champaign under Julia Hockenmaier and his BS from the University of Texas at Austin.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>03 Feb 2017</td>
<td align=left valign=top>Melissa Roemmele (USC/ICT)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs03_Feb_2017');">
Recurrent Neural Networks for Narrative Prediction
</a><br>
<span id=abs03_Feb_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Narrative prediction involves predicting ‘what happens next’ in a story. This task has a long history in AI research but is now getting more recognition in the NLP community. In this talk I’ll describe three different evaluation schemes for narrative prediction, one of which (the Story Cloze Test) is the shared task for this year’s LSDSem workshop at EACL. I’ll present my ongoing efforts to develop Recurrent Neural Network-based models that succeed on these evaluation frameworks, and discuss the particular challenges posed by each of them.
<p>
Bio: I’m a PhD candidate at USC’s Institute for Creative Technologies advised by Andrew Gordon in the Narrative Group. My thesis research explores machine learning approaches to automatically generating text-based stories. I’m interested in using this research to stimulate people’s creativity in writing. More broadly, I’m excited by any opportunity to use automated analysis of text data to give people new insights and ideas.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>20 Jan 2017</td>
<td align=left valign=top>Jonathan May (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs20_Jan_2017');">
How I Learned to Stop Worrying and Love Evaluations (and Keep Worrying)
</a><br>
<span id=abs20_Jan_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 6th Floor Large Conference Room [689]<br>
<b>Abstract:</b> Bake-offs, shared tasks, evaluations: these are names for short,
high-stress periods in many CS researchers' lives where their
algorithms and models are exposed to unseen data, often with
reputations and funding on the line. Evaluations are sometimes
perceived to be the bane of much of our work lives. We
grouse about metrics, procedures, glitches, and all the
time "wasted" chasing scores, rather than doing Real
Science (TM). In this talk I will argue that despite valid criticisms
of the approach, coordinated evaluation is a net benefit to NLP
research and has led to accomplishments that might not have otherwise
arisen. This argument will frame a more in-depth discussion of several
pieces of recent evaluation-grounded work: rapid generation of
translation and information extraction for low-resource surprise
languages (DARPA LORELEI) and organization of SemEval shared
tasks in semantic parsing and generation.
<p>
Jonathan May is a Research Assistant Professor at the University of
Southern California's Information Sciences Institute
(USC/ISI). Previously, he was a research scientist at SDL Research
(formerly Language Weaver) and a scientist at Raytheon BBN
Technologies. He received a Ph.D. in Computer Science from the
University of Southern California in 2010 and a BSE and MSE in
Computer Science Engineering and Computer and Information Science,
respectively, from the University of Pennsylvania in 2001. Jon's
research interests include automata theory, natural language
processing, machine translation, and machine learning.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>10 Jan 2017</td>
<td align=left valign=top>David Chiang (Notre Dame)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs10_Jan_2017');">
Speech-to-Translation Alignment for Documentation of Endangered Languages
</a><br>
<span id=abs10_Jan_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> I will give an overview of our project on speech-to-translation alignment for the documentation of endangered languages, focusing on the pieces that my student, Antonios Anastasopoulos, and I have been most involved in. Our work is based on the premise that spoken language resources are more readily annotated with translations than with transcriptions. A first step towards making such data interpretable would be to automatically align spoken words with their translations. I'll present a neural attentional model (Duong et al., NAACL 2016) and a latent-variable generative model (Anastasopoulos and Chiang, EMNLP 2016) for this task.
<p>
David Chiang (PhD, University of Pennsylvania, 2004) is an associate professor in the Department of Computer Science and Engineering at the University of Notre Dame. His research is on computational models for learning human languages, particularly how to translate from one language to another. His work on applying formal grammars and machine learning to translation has been recognized with two best paper awards (at ACL 2005 and NAACL HLT 2009). He has received research grants from DARPA, CIA, NSF, and Google, has served on the executive board of NAACL and the editorial board of Computational Linguistics and JAIR, and is currently on the editorial board of Transactions of the ACL.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>06 Jan 2017</td>
<td align=left valign=top>Kenton Murray (Notre Dame)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs06_Jan_2017');">
Learning Neural Network Structures for Natural Language
</a><br>
<span id=abs06_Jan_2017 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> In recent years, deep learning has had a huge impact on natural language processing, surpassing the performance of many other statistical and machine learning methods. One of the many promises of deep learning is that features are learned implicitly and that there is no need to manually engineer features for good performance. However, neural network performance is highly dependent on network architecture and selection of hyper-parameters. In many ways, architecture engineering has supplanted feature engineering in NLP tasks. In this talk, I will focus on two ways neural network structures can be learned while concurrently training models. First, I'll present a regularization scheme for learning the number of neurons in a neural language model during training (Murray and Chiang 2015) and show how it can be used in a Machine Translation task. Then, I'll move on to a Visual Question Answering task where denotations are selected by executing a probabilistic program that models non-determinism with neural networks (Murray and Krishnamurthy 2016).
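<p>
The first idea lends itself to a short sketch. The snippet below (assuming PyTorch) shows a generic group-sparsity penalty in that spirit - it is not the exact regularizer of Murray and Chiang (2015), but it illustrates how a loss term can drive whole hidden units to zero so the layer width is effectively learned during training:
<pre>
import torch

def group_sparsity_penalty(weight, lam=1e-4):
    """weight: (hidden_dim, input_dim); one group = one unit's fan-in.
    Penalizing the L2 norm of each row encourages entire rows (and thus
    entire neurons) to shrink to zero, so they can be pruned afterwards."""
    return lam * weight.norm(dim=1).sum()

# loss = task_loss + group_sparsity_penalty(model.hidden.weight)
# after training, drop units i where model.hidden.weight[i].norm() ~ 0
</pre>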
<p>
Kenton Murray is a PhD student in the Natural Language Processing Lab at the University of Notre Dame's Computer Science and Engineering Department working with David Chiang. His research is on neural methods for human languages, particularly machine translation and question answering. Prior to Notre Dame, he was a Research Associate at the Qatar Computing Research Institute (QCRI) and received a Master's in Language Technologies from Carnegie Mellon University and a Bachelor's in Computer Science from Princeton University.
<p>
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>09 Dec 2016</td>
<td align=left valign=top>Radu Soricut (Google)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs09_Dec_2016');">
Multimodal Machine Comprehension: Tasks and Approaches
</a><br>
<span id=abs09_Dec_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> The ability of computer models to achieve genuine understanding of information as presented to humans (text, images, etc.) is a long-standing goal of Artificial Intelligence. Along the way towards this goal, the research community has proposed solving tasks such as machine reading comprehension and computer image understanding. In this talk, we introduce two new tasks that can help us move closer to the goal. First, we present a multi-choice reading comprehension task, for which the goal is to understand a text passage and choose the correct summarizing sentence from among several options. Second, we present a multi-modal understanding task, posed as a combined vision-language comprehension challenge: identifying the most suitable text describing a visual scene, given several similar options. We present several baseline and competitive learning approaches based on neural network architectures, illustrating the utility of the proposed tasks in advancing both image and language comprehension. We also present human evaluation results, which inform a performance upper-bound on these tasks, and quantify the remaining gap between computer systems and human performance (spoiler alert: we are not there yet).
<p>
Radu Soricut is a Staff Research Scientist in the Research and Machine Intelligence group at Google. Radu has a PhD in Computer Science from the University of Southern California, and has been with Google since 2012. His main areas of interest are natural language understanding, multilingual processing, natural language generation (from multimodal inputs), and general machine learning techniques for solving these problems. Radu has published extensively in these areas in top-tier peer-reviewed conferences and journals, and won the Best Paper Award at the North American Chapter of the Association for Computational Linguistics (NAACL) conference in 2015. Radu's current project looks at bridging natural language understanding and generation using neural techniques, in the context of Google's focus on making natural language an effective way of interacting with the world and the technology around us.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>02 Dec 2016</td>
<td align=left valign=top>Yejin Choi (UW)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs02_Dec_2016');">
Procedural Language and Knowledge
</a><br>
<span id=abs02_Dec_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Various types of how-to knowledge are encoded in natural language instructions: from setting up a tent, to preparing a dish for dinner, to executing biology lab experiments. These types of instructions are based on procedural language, which poses unique challenges. For example, verbal arguments are commonly elided when they can be inferred from context, e.g., "bake for 30 minutes", not specifying bake what and where. Entities frequently merge and split, e.g., "vinegar" and "oil" merging into "dressing", creating challenges for reference resolution. And disambiguation often requires world knowledge, e.g., the implicit location argument of "stir frying" is the "stove". In this talk, I will present our recent approaches to interpreting and composing cooking recipes that aim to address these challenges.
In the first part of the talk, I will present an unsupervised approach to interpreting recipes as action graphs, which define what actions should be performed on which objects and in what order. Our work demonstrates that it is possible to recover action graphs without having access to gold labels, virtual environments or simulations. The key insight is to rely on the redundancy across different variations of similar instructions, which provides the learning bias to infer various types of background knowledge, such as the typical sequence of actions applied to an ingredient, or how a combination of ingredients (e.g., "flour", "milk", "eggs") becomes a new entity (e.g., "wet mixture").
In the second part of the talk, I will present an approach to composing new recipes given a target dish name and a set of ingredients. The key challenge is to maintain global coherence while generating a goal-oriented text. We propose a Neural Checklist Model that attains global coherence by storing and updating a checklist of the agenda (e.g., an ingredient list) with paired attention mechanisms for tracking what has been already mentioned and what needs to be yet introduced. This model also achieves strong performance on dialogue system response generation. I will conclude the talk by discussing the challenges in modeling procedural language and acquiring the necessary background knowledge, pointing to avenues for future research.
<p>
Bio:
Yejin Choi is an assistant professor in the Computer Science & Engineering Department at the University of Washington. Her recent research focuses on language grounding, integrating language and vision, and modeling nonliteral meaning in text. She was among the IEEE’s AI Top 10 to Watch in 2015 and a co-recipient of the Marr Prize at ICCV 2013. Her work on detecting deceptive reviews, predicting literary success, and learning to interpret connotation has been featured by numerous media outlets, including NBC News New York, NPR, the New York Times, and Bloomberg Businessweek. She received her Ph.D. in Computer Science at Cornell University.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>18 Nov 2016</td>
<td align=left valign=top>Ramesh R Manuvinakurike (USC/ICT)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs18_Nov_2016');">
Incremental spoken dialogue system for reference resolution in images
</a><br>
<span id=abs18_Nov_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> In this talk, I will describe our ongoing effort in the
development of Eve, a state-of-the-art spoken dialogue agent for
incremental reference resolution in images. Incrementality is central
to developing naturally conversing spoken dialogue systems: it makes
conversations more natural and efficient compared to non-incremental
alternatives. Eve's performance was found to be comparable to human
performance, and she comfortably outperforms alternative
non-incremental architectures. However, building such a system is not
trivial. It needs high-performance architectures and dialogue
components (ASR, dialogue policies, language understanding, etc.). I
will also speak about future plans for enhancing Eve's capabilities.
Finally, I will take a slight detour and explore a different
word-level natural language understanding model for reference
resolution in images in a dialogue setting.
<p>
Bio: Ramesh Manuvinakurike is a Ph.D. student at the USC Institute for
Creative Technologies working with Prof. David DeVault and Prof.
Kallirroi Georgila. He is interested in developing conversational
systems and has built several such systems. His work with his
colleagues on the agent Eve won the Best Paper award at SIGDIAL 2015.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>28 Oct 2016</td>
<td align=left valign=top>Yu Su (UCSB)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs28_Oct_2016');">
Learning from Zero: Recent Advances in Bootstrapping Semantic Parsers using Crowdsourcing
</a><br>
<span id=abs28_Oct_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Semantic parsing, which parses natural language into formal languages, has been applied to a wide range of structured data like relational databases, knowledge bases, and web tables. To learn a semantic parser for a new domain, the first challenge is always how to collect training data. While data collection using crowdsourcing has become a common practice in NLP, it's a particularly challenging and interesting problem when it comes to semantic parsing, and is still in its early stages. Given a domain and a formal language, how can we generate meaningful logical forms in a configurable way? How can we design the annotation task so that crowdsourcing workers, who do not understand formal languages, can handle it with ease? How can we exploit the compositional nature of formal languages to optimize the crowdsourcing process? In this talk I will introduce some recent advances in this direction, and present some preliminary answers to the above questions. The covered works mainly concern knowledge bases, but we will also cover some ongoing work on web APIs.
<p>
Yu Su is a fifth-year PhD candidate in the Computer Science Department at UCSB, advised by Professor Xifeng Yan. Before that, he received a bachelor's degree from Tsinghua University in 2012, with a major in Computer Science. He is interested in the interplay between language and formal meaning representations, including problems like semantic parsing, continuous knowledge representation, and natural language generation. He also enjoys applying deep learning to these problems.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>21 Oct 2016</td>
<td align=left valign=top>Marjan Ghazvininejad and Yonatan Bisk (USC/ISI)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs21_Oct_2016');">
EMNLP practice talk: 1) Generating Topical Poetry & 2) Unsupervised Neural Hidden Markov Models
</a><br>
<span id=abs21_Oct_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 6th Floor Large Conference Room [689]<br>
<b>Abstract:</b> 1) In this talk I describe Hafez, a program that generates any number of distinct poems on a user-supplied topic. Poems obey rhythmic and rhyme constraints. I describe the poetry-generation algorithm, give experimental data concerning its parameters, and show its generality with respect to language and poetic form.
2) In this work, we present the first results for neuralizing an unsupervised Hidden Markov Model. We evaluate our approach on tag induction. Our approach outperforms existing generative models and is competitive with the state of the art, though with a simpler model that is easily extended to include additional context.
<p>
Marjan Ghazvininejad is a PhD student at ISI working with Prof. Kevin Knight.
Yonatan Bisk is a Postdoc at ISI working with Prof. Daniel Marcu.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>14 Oct 2016</td>
<td align=left valign=top>Xing Shi (USC)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs14_Oct_2016');">
EMNLP practice talk: Understanding Neural Machine Translation: length control and syntactic structure
</a><br>
<span id=abs14_Oct_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 6th Floor Large Conference Room [689]<br>
<b>Abstract:</b> Neural Machine Translation is powerful, but we know little about the black box. We conduct the following two investigations to gain a better understanding: First, we investigate how neural, encoder-decoder translation systems output target strings of appropriate lengths, finding that a collection of hidden units learns to explicitly implement this functionality. Second, we investigate whether a neural, encoder-decoder translation system learns syntactic information on the source side as a by-product of training. We propose two methods to detect whether the encoder has learned local and global source syntax. A fine-grained analysis of the syntactic structure learned by the encoder reveals which kinds of syntax are learned and which are missing.
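<p>
The probing methodology can be sketched simply (assuming scikit-learn and NumPy; a generic diagnostic probe, not the authors' exact analysis): collect decoder hidden states at each time step, test how well a linear read-out predicts the number of tokens still to be produced, and inspect which units carry the signal:
<pre>
import numpy as np
from sklearn.linear_model import LinearRegression

def probe_length_units(hidden_states, remaining, top_n=5):
    """hidden_states: (num_steps, hidden_dim) array of decoder states;
    remaining: (num_steps,) tokens left to generate at each step."""
    probe = LinearRegression().fit(hidden_states, remaining)
    r2 = probe.score(hidden_states, remaining)  # quality of linear read-out
    top_units = np.argsort(-np.abs(probe.coef_))[:top_n]
    return r2, top_units
</pre>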
<p>
Bio: Xing Shi is a PhD student at ISI working with Prof. Kevin Knight.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>26 Sep 2016</td>
<td align=left valign=top>Andrea Gagliano (UC Berkeley)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs26_Sep_2016');">
Poetry at the Metaphorical Intersection
</a><br>
<span id=abs26_Sep_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> This talk will discuss a technique to create figurative relationships using Mikolov et al.’s word vectors. Drawing on existing work on figurative language, we start with a pair of words and use the intersection of word vector similarity sets to blend the distinct semantic spaces of the two words. We conduct preliminary quantitative and qualitative observations to compare the use of this novel intersection method with the standard word vector addition method for the purpose of supporting the generation of figurative language. To showcase this technique, we use it to write computer-generated sonnets.
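<p>
The intersection method can be sketched in a few lines (assuming gensim word2vec vectors; a reconstruction of the idea, not the speaker's code):
<pre>
from gensim.models import KeyedVectors

def figurative_intersection(vectors, word_a, word_b, topn=100):
    """Blend two semantic spaces: keep words that appear in the
    most-similar sets of both seeds, ranked by combined similarity."""
    near_a = {w for w, _ in vectors.most_similar(word_a, topn=topn)}
    near_b = {w for w, _ in vectors.most_similar(word_b, topn=topn)}
    return sorted(near_a & near_b,
                  key=lambda w: vectors.similarity(word_a, w)
                              + vectors.similarity(word_b, w),
                  reverse=True)

# vectors = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)
# figurative_intersection(vectors, "ocean", "grief")[:10]
</pre>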
<p>
Bio: Andrea Gagliano is a master's student at UC Berkeley's School of Information and the Berkeley Center for New Media. Her research explores the use of computation for creativity - both tools to support creative practices and the generation of creative works. Recently, she has been focusing on natural language processing, working on poetry and metaphor generation.
<p>
Previously, Andrea received her BS in Mathematics and BA in Business Administration from the University of Washington in 2013. During her studies, she spent time with the Creative Writing department studying poetry.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>19 Sep 2016</td>
<td align=left valign=top>Burr Settles (Duolingo)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs19_Sep_2016');">
Duolingo: Improving Language Learning and Assessment with Data
</a><br>
<span id=abs19_Sep_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> Duolingo is a language education platform with more than 150 million students worldwide. Our flagship learning app is the #1 way to learn a language online, and is the most-downloaded education app for both Android and iOS devices. It is also completely free. In this talk, I will describe the Duolingo system and several empirical projects, which mix machine learning with computational linguistics and psychometrics to improve learning, engagement, and even language proficiency assessment through our products.
<p>
Burr Settles is a scientist, engineer, and head of research at Duolingo: the most widely used education application in the world, teaching 20 languages to more than 150 million users worldwide. He is also the principal developer of the Duolingo English Test: a computer-adaptive proficiency exam that aims to disrupt and democratize the global certification marketplace through highly accessible mobile technology. Before joining Duolingo, he earned a PhD in computer sciences at the University of Wisconsin-Madison, and then worked as a postdoctoral research scientist at Carnegie Mellon University, where his work spanned machine learning, natural language processing, and computational social science. His 2012 book Active Learning is now the standard text on learning algorithms that are adaptive, curious, or exploratory (if you will). Burr gets around by bike and (among other things) plays guitar in the pop band delicious pastries.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>16 Sep 2016</td>
<td align=left valign=top>Zachary Chase Lipton (UCSD)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs16_Sep_2016');">
Efficient Exploration for Dialog Policy Learning with BBQ Networks & Replay Buffer Spiking
</a><br>
<span id=abs16_Sep_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 1:30 pm - 2:30 pm<br>
<b>Location:</b> 6th Floor Large Conference Room [689]<br>
<b>Abstract:</b> When rewards are sparse and efficient exploration essential, deep Q-learning with ϵ-greedy exploration tends to fail. This poses problems for otherwise promising domains such as task-oriented dialog systems, where the primary reward signal, indicating successful completion, typically occurs only at the end of each episode but depends on the entire sequence of utterances. A poor agent encounters such successful dialogs rarely, and a random agent may never stumble upon a successful outcome in reasonable time. We present two techniques that significantly improve the efficiency of exploration for deep Q-learning agents in dialog systems. First, we demonstrate that exploration by Thompson sampling, using Monte Carlo samples from a Bayes-by-Backprop neural network, yields marked improvement over standard DQNs with Boltzmann or ϵ-greedy exploration. Second, we show that spiking the replay buffer with a small number of successes, as are easy to harvest for dialog tasks, can make Q-learning feasible when it might otherwise fail catastrophically.
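<p>
The second technique is easy to picture in code. A minimal sketch (plain Python; the transition format is illustrative): before Q-learning begins, seed the experience replay buffer with a handful of successful dialogs - e.g., harvested from a simple rule-based agent - so the learner sees non-zero reward from the very first updates:
<pre>
import random
from collections import deque

def spiked_replay_buffer(successful_dialogs, capacity=50000):
    """Each dialog is a list of (state, action, reward, next_state, done)
    transitions from an episode that ended in task success."""
    buffer = deque(maxlen=capacity)
    for dialog in successful_dialogs:
        buffer.extend(dialog)
    return buffer

def sample_batch(buffer, batch_size=32):
    return random.sample(list(buffer), min(batch_size, len(buffer)))
</pre>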
<p>
Bio:
I am a graduate student in the Artificial Intelligence Group at the University of California, San Diego on leave for two quarters at Microsoft Research Redmond. I work on machine learning, focusing on deep learning methods and applications. In particular, I work on modeling sequential data with recurrent neural networks and sequential decision-making processes with deep reinforcement learning. I'm especially interested in research impacting medicine and natural language processing. Recently, in Learning to Diagnose with LSTM RNNs, we trained LSTM RNNs to accurately predict patient diagnoses using only lightly processed time series of sensor readings in the pediatric ICU. Before coming to UCSD, I completed a Bachelor of Arts with a joint major in Mathematics and Economics at Columbia University. Then, I worked in New York City as a jazz musician. I have interned with Amazon's Core Machine Learning team and Microsoft Research's Deep Learning Team.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>09 Sep 2016</td>
<td align=left valign=top>Nada Aldarrab (USC)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs09_Sep_2016');">
How We Cracked the "Borg" Cipher + First Steps Towards Deciphering from Images
</a><br>
<span id=abs09_Sep_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> European libraries are filled with undeciphered historical manuscripts from the 16th-18th centuries. These documents are enciphered with classical methods, which puts their contents out of the reach of historians who are interested in the history of that era.
In this talk, we show how we automatically cracked a 400-page book from the 17th century. We also describe a system aimed at deciphering from camera-phone images. We show initial results for different ciphers.
<p>
Bio:
Nada is a graduate student at USC, working on her thesis under the supervision of Prof. Kevin Knight. She is currently working on the decipherment of historical documents (a joint project with Uppsala University, Sweden). Her research interests include natural language processing, machine learning, decipherment, and machine translation.
<br>
</font>
</span>
</td></tr><tr class="speakerItem" border=0 >
<td align=left valign=top>26 Aug 2016</td>
<td align=left valign=top>Ke Tran (ISI Intern)</td>
<td align=left valign=top>
<a onMouseOver="window.status='View abstract'; return true" onMouseOut="window.status=' '; return true" href="javascript:exp_coll('abs26_Aug_2016');">
Unsupervised learning of linguistic structures with deep neural networks
</a><br>
<span id=abs26_Aug_2016 style="display:none;">
<font size=-1>
<b>Time:</b> 3:00 pm - 4:00 pm<br>
<b>Location:</b> 11th Floor Large Conference Room [1135]<br>
<b>Abstract:</b> We present a general framework for unsupervised learning that combines probabilistic graphical models with the power of deep nets. We employ a neuralized expectation maximization algorithm for learning. We apply this framework to unsupervised sequence tagging and show some interesting results.
<p>