data/xml/2020.coling.xml (2 changes: 1 addition & 1 deletion)
@@ -3769,7 +3769,7 @@
</paper>
<paper id="310">
<title>Federated Learning for Spoken Language Understanding</title>
-<author><first>Zhiqi</first><last>Huang</last></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last></author>
<author><first>Fenglin</first><last>Liu</last></author>
<author><first>Yuexian</first><last>Zou</last></author>
<pages>3467–3478</pages>
data/xml/2021.acl.xml (2 changes: 1 addition & 1 deletion)
@@ -7151,7 +7151,7 @@ The source code has been made available at \url{https://github.com/liam0949/DCLO
</paper>
<paper id="509">
<title><fixed-case>G</fixed-case>host<fixed-case>BERT</fixed-case>: Generate More Features with Cheap Operations for <fixed-case>BERT</fixed-case></title>
-<author><first>Zhiqi</first><last>Huang</last></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last></author>
<author><first>Lu</first><last>Hou</last></author>
<author><first>Lifeng</first><last>Shang</last></author>
<author><first>Xin</first><last>Jiang</last></author>
data/xml/2022.nlp4convai.xml (2 changes: 1 addition & 1 deletion)
@@ -160,7 +160,7 @@
</paper>
<paper id="11">
<title><fixed-case>MTL</fixed-case>-<fixed-case>SLT</fixed-case>: Multi-Task Learning for Spoken Language Tasks</title>
-<author><first>Zhiqi</first><last>Huang</last></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last></author>
<author><first>Milind</first><last>Rao</last></author>
<author><first>Anirudh</first><last>Raju</last></author>
<author><first>Zhe</first><last>Zhang</last></author>
data/xml/2023.emnlp.xml (2 changes: 1 addition & 1 deletion)
@@ -6795,7 +6795,7 @@
<title>Enhancing Code-Switching for Cross-lingual <fixed-case>SLU</fixed-case>: A Unified View of Semantic and Grammatical Coherence</title>
<author><first>Zhihong</first><last>Zhu</last></author>
<author><first>Xuxin</first><last>Cheng</last></author>
-<author><first>Zhiqi</first><last>Huang</last></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last></author>
<author><first>Dongsheng</first><last>Chen</last></author>
<author><first>Yuexian</first><last>Zou</last></author>
<pages>7849-7856</pages>
data/xml/2023.findings.xml (6 changes: 3 additions & 3 deletions)
@@ -13165,7 +13165,7 @@
<title>Towards Unified Spoken Language Understanding Decoding via Label-aware Compact Linguistics Representations</title>
<author><first>Zhihong</first><last>Zhu</last><affiliation>Peking University</affiliation></author>
<author><first>Xuxin</first><last>Cheng</last><affiliation>Peking University</affiliation></author>
<author orcid="0000-0003-1126-1217"><first>Zhiqi</first><last>Huang</last><affiliation>Tencent AI Lab</affiliation></author>
<author orcid="0000-0003-1126-1217" id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last><affiliation>Tencent AI Lab</affiliation></author>
<author><first>Dongsheng</first><last>Chen</last><affiliation>Peking University</affiliation></author>
<author orcid="0000-0001-9999-6140"><first>Yuexian</first><last>Zou</last><affiliation>Peking University</affiliation></author>
<pages>12523-12531</pages>
@@ -21565,7 +21565,7 @@
</paper>
<paper id="533">
<title><fixed-case>MCLF</fixed-case>: A Multi-grained Contrastive Learning Framework for <fixed-case>ASR</fixed-case>-robust Spoken Language Understanding</title>
-<author><first>Zhiqi</first><last>Huang</last></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last></author>
<author><first>Dongsheng</first><last>Chen</last></author>
<author><first>Zhihong</first><last>Zhu</last></author>
<author><first>Xuxin</first><last>Cheng</last></author>
@@ -24927,7 +24927,7 @@
<author><first>Yifeng</first><last>Xie</last></author>
<author><first>Zhihong</first><last>Zhu</last></author>
<author><first>Xuxin</first><last>Cheng</last></author>
-<author><first>Zhiqi</first><last>Huang</last></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last></author>
<author><first>Dongsheng</first><last>Chen</last></author>
<pages>11858-11864</pages>
<abstract>Spoken Language Understanding (SLU), a crucial component of task-oriented dialogue systems, has consistently garnered attention from both academic and industrial communities. Although incorporating syntactic information into models has the potential to enhance the comprehension of user utterances and yield impressive results, its application in SLU systems remains largely unexplored. In this paper, we propose a carefully designed model termed Syntax-aware attention (SAT) to enhance SLU, where attention scopes are constrained based on relationships within the syntactic structure. Experimental results on three datasets show that our model achieves substantial improvements and excellent performance. Moreover, SAT can be integrated into other BERT-based language models to further boost their performance.</abstract>
data/xml/2024.acl.xml (4 changes: 2 additions & 2 deletions)
@@ -3991,7 +3991,7 @@
<author><first>Liming</first><last>Liang</last></author>
<author><first>Yuxin</first><last>Xie</last></author>
<author><first>Zhichang</first><last>Wang</last></author>
-<author><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
<author><first>Yuexian</first><last>Zou</last><affiliation>Peking University</affiliation></author>
<pages>5235-5246</pages>
<abstract>Spoken language understanding (SLU) inevitably suffers from error propagation from automatic speech recognition (ASR) in actual scenarios. Some recent works attempt to alleviate this issue through contrastive learning. However, they (1) sample negative pairs incorrectly in pre-training; (2) only focus on implicit metric learning while neglecting explicit erroneous predictions; (3) treat manual and ASR transcripts indiscriminately. In this paper, we propose a novel framework termed <tex-math>\textbf{PCAD}</tex-math>, which can calibrate bias and errors and achieve adaptive-balanced decoupling training. Specifically, PCAD utilizes a prototype-based loss to aggregate label and prediction priors and calibrate bias and error-prone semantics for better inter-class discrimination and intra-class consistency. We theoretically analyze the effect of this loss on robustness enhancement. Further, we leverage a teacher-student model for asymmetric decoupling training between different transcripts and formulate a novel gradient-sensitive exponential moving averaging (GS-EMA) algorithm for adaptive balance of accuracy and robustness. Experiments on three datasets show that PCAD significantly outperforms existing approaches and achieves new state-of-the-art performance.</abstract>
@@ -12450,7 +12450,7 @@
<author><first>Xuxin</first><last>Cheng</last></author>
<author><first>Zhanpeng</first><last>Chen</last></author>
<author><first>Xianwei</first><last>Zhuang</last></author>
-<author><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
<author><first>Yuexian</first><last>Zou</last><affiliation>Peking University</affiliation></author>
<pages>153-160</pages>
<abstract>Zero-shot cross-lingual spoken language understanding (SLU) can promote the globalization application of dialog systems, which has attracted increasing attention. While current code-switching based cross-lingual SLU frameworks have shown promising results, they (i) predominantly utilize contrastive objectives to model hard alignment, which may disrupt the inherent structure within sentences of each language; and (ii) focus optimization objectives solely on the original sentences, neglecting the relation between original sentences and code-switched sentences, which may hinder contextualized embeddings from further alignment. In this paper, we propose a novel framework dubbed REPE (short for Representation-Level and Prediction-Level Alignment), which leverages both code-switched and original sentences to achieve multi-level alignment. Specifically, REPE introduces optimal transport to facilitate soft alignment between the representations of code-switched and original sentences, thereby preserving structural integrity as much as possible. Moreover, REPE adopts multi-view learning to enforce consistency regularization between the prediction of the two sentences, aligning them into a more refined language-invariant space. Based on this, we further incorporate a self-distillation layer to boost the robustness of REPE. Extensive experiments on two benchmarks across ten languages demonstrate the superiority of the proposed REPE framework.</abstract>
data/xml/2024.emnlp.xml (4 changes: 2 additions & 2 deletions)
@@ -10236,7 +10236,7 @@
</paper>
<paper id="736">
<title>Language Concept Erasure for Language-invariant Dense Retrieval</title>
-<author><first>Zhiqi</first><last>Huang</last></author>
+<author id="zhiqi-huang-umass"><first>Zhiqi</first><last>Huang</last></author>
<author orcid="0000-0001-7913-8632"><first>Puxuan</first><last>Yu</last></author>
<author><first>Shauli</first><last>Ravfogel</last></author>
<author orcid="0000-0003-0132-5694"><first>James</first><last>Allan</last><affiliation>University of Massachusetts, Amherst</affiliation></author>
@@ -13491,7 +13491,7 @@
<author><first>Zhanpeng</first><last>Chen</last></author>
<author><first>Zhihong</first><last>Zhu</last><affiliation>Tencent</affiliation></author>
<author><first>Xianwei</first><last>Zhuang</last></author>
-<author><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
<author><first>Yuexian</first><last>Zou</last><affiliation>Peking University</affiliation></author>
<pages>17554-17567</pages>
<abstract>Multimodal intent detection is designed to leverage diverse modalities for a comprehensive understanding of user intentions in real-world scenarios, thus playing a critical role in modern task-oriented dialogue systems. Existing methods have made great progress in modal alignment and fusion, however, two vital limitations are neglected: (I) close entanglement of multimodal semantics with modal structures; (II) insufficient learning of the causal effects of semantic and modality-specific information on the final predictions under the end-to-end training fashion. To alleviate the above limitations, we introduce the Dual-oriented Disentangled Network with Counterfactual Intervention (DuoDN). DuoDN addresses key limitations in current systems by effectively disentangling and utilizing modality-specific and multimodal semantic information. The model consists of a Dual-oriented Disentangled Encoder that decouples semantics-oriented and modality-oriented representations, alongside a Counterfactual Intervention Module that applies causal inference to understand causal effects by injecting confounders. Experiments on three benchmark datasets demonstrate DuoDN’s superiority over existing methods, with extensive analysis validating its advantages.</abstract>
data/xml/2024.findings.xml (2 changes: 1 addition & 1 deletion)
@@ -18122,7 +18122,7 @@
<author><first>Zhihong</first><last>Zhu</last><affiliation>Tencent</affiliation></author>
<author><first>Xianwei</first><last>Zhuang</last></author>
<author><first>Zhanpeng</first><last>Chen</last></author>
-<author><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
+<author id="zhiqi-huang"><first>Zhiqi</first><last>Huang</last><affiliation>Tencent Game</affiliation></author>
<author><first>Yuexian</first><last>Zou</last><affiliation>Peking University</affiliation></author>
<pages>14868-14879</pages>
<abstract>As a crucial task in the task-oriented dialogue systems, spoken language understanding (SLU) has garnered increasing attention. However, errors from automatic speech recognition (ASR) often hinder the performance of understanding. To tackle this problem, we propose MoE-SLU, an ASR-Robust SLU framework based on the mixture-of-experts technique. Specifically, we first introduce three strategies to generate additional transcripts from clean transcripts. Then, we employ the mixture-of-experts technique to weigh the representations of the generated transcripts, ASR transcripts, and the corresponding clean manual transcripts. Additionally, we also regularize the weighted average of predictions and the predictions of ASR transcripts by minimizing the Jensen-Shannon Divergence (JSD) between these two output distributions. Experiment results on three benchmark SLU datasets demonstrate that our MoE-SLU achieves state-of-the-art performance. Further model analysis also verifies the superiority of our method.</abstract>