IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization (2024.07.03)
Ahmed Frikha, Nassim Walha, K. K. Nakka, Ricardo Mendes, Xue Jiang, etc
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs (2024.06.28)
Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, etc
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding (2024.06.27)
Tao Zhang, Xiangtai Li, Hao Fei, Haobo Yuan, Shengqiong Wu, etc
Adversarial Search Engine Optimization for Large Language Models (2024.06.26)
Fredrik Nestaas, Edoardo Debenedetti, F. Tramèr
VideoLLM-online: Online Video Large Language Model for Streaming Video (2024.06.17)
Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, etc
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs (2024.06.14)
Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation (2024.06.10)
Peize Sun, Yi Jiang, Shoufa Chen, Shilong Zhang, Bingyue Peng, etc
Patrick Haller, Lena S. Bolliger, Lena A. Jäger
PaCE: Parsimonious Concept Engineering for Large Language Models (2024.06.06)
Jinqi Luo, Tianjiao Ding, Kwan Ho Ryan Chan, D. Thaker, Aditya Chattopadhyay, etc
Yuan 2.0-M32: Mixture of Experts with Attention Router (2024.05.28)
Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, etc . - 【arXiv.org】
Andreas Bucher, Birgit Schenk, Mateusz Dolata, Gerhard Schwabe . - 【arXiv.org】
Measuring Impacts of Poisoning on Model Parameters and Embeddings for Large Language Models of Code (2024.05.19)
Aftab Hussain, Md Rafiqul Islam Rabin, Mohammad Amin Alipour . - 【arXiv.org】
Ayan Banerjee, Aranyak Maity, Payal Kamboj, Sandeep K. S. Gupta . - 【arXiv.org】
Lyumanshan Ye, Jiandong Jiang, Danni Chang, Pengfei Liu . - 【arXiv.org】
UniDM: A Unified Framework for Data Manipulation with Large Language Models (2024.05.10)
Yichen Qian, Yongyi He, Rong Zhu, Jintao Huang, Zhijian Ma, etc . - 【Conference on Machine Learning and Systems】
FlockGPT: Guiding UAV Flocking with Linguistic Orchestration (2024.05.09)
Artem Lykov, Sausar Karaf, Mikhail Martynov, Valerii Serpiva, A. Fedoseev, etc . - 【arXiv.org】
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning (2024.05.09)
Shibo Jie, Yehui Tang, Ning Ding, Zhi-Hong Deng, Kai Han, etc . - 【arXiv.org】
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (2024.05.07)
Zhihong Shao, Damai Dai, Daya Guo, Bo Liu, Zihan Wang . - 【arXiv.org】
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving (2024.05.07)
Yujun Lin, Haotian Tang, Shang Yang, Zhekai Zhang, Guangxuan Xiao, etc . - 【arXiv.org】
S. Bhattacharyya . - 【Journal of Science and Technology Policy Management】
FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems (2024.05.03)
Yashar Deldjoo . - 【arXiv.org】
Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo (2024.05.03)
Nakul Rampal, Kaiyu Wang, Matthew Burigana, Lingxiang Hou, Juri Al-Johani, etc . - 【arXiv.org】
What matters when building vision-language models? (2024.05.03)
Hugo Laurençon, Léo Tronchon, Matthieu Cord, Victor Sanh . - 【arXiv.org】
Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT (2024.05.03)
Patrick Krauss, Jannik Hösch, C. Metzner, Andreas K. Maier, Peter Uhrig, etc . - 【arXiv.org】
Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows (2024.05.02)
Jasmine Y. Shih, Vishal Mohanty, Yannis Katsis, Hariharan Subramonyam . - 【CHI Extended Abstracts】
NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance (2024.05.01)
Huan-Yi Su, Ke Wu, Yu-Hao Huang, Wu-Jun Li . - 【arXiv.org】
Is Bigger Edit Batch Size Always Better? - An Empirical Study on Model Editing with Llama-3 (2024.05.01)
Junsang Yoon, Akshat Gupta, G. Anumanchipalli . - 【arXiv.org】
Re-Thinking Inverse Graphics With Large Language Models (2024.04.23)
Peter Kulits, Haiwen Feng, Weiyang Liu, Victoria Abrevaya, Michael J. Black . - 【arXiv.org】
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models (2024.04.23)
Aidan Z. H. Yang, Sophia Kolak, Vincent J. Hellendoorn, Ruben Martins, Claire Le Goues . - 【arXiv.org】
Quantifying Multilingual Performance of Large Language Models Across Languages (2024.04.17)
Zihao Li, Yucheng Shi, Zirui Liu, Fan Yang, Ninghao Liu, etc . - 【arXiv.org】
Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding (2024.04.17)
Zezhong Fan, Xiaohan Li, Kaushiki Nag, Chenhao Fang, Topojoy Biswas, etc . - 【The Web Conference】
LLMorpheus: Mutation Testing using Large Language Models (2024.04.15)
Frank Tip, Jonathan Bell, Max Schäfer . - 【arXiv.org】
Generating consistent PDDL domains with Large Language Models (2024.04.11)
Pavel Smirnov, F. Joublin, A. Ceravola, Michael Gienger . - 【arXiv.org】
Manipulating Large Language Models to Increase Product Visibility (2024.04.11)
Aounon Kumar, Himabindu Lakkaraju
High-Dimension Human Value Representation in Large Language Models (2024.04.11)
Samuel Cahyawijaya, Delong Chen, Yejin Bang, Leila Khalatbari, Bryan Wilie, etc
MetaCheckGPT -- A Multi-task Hallucination Detector Using LLM Uncertainty and Meta-models (2024.04.10)
Rahul Mehta, Andrew Hoblitzell, Jack O'Keefe, Hyeju Jang, Vasudeva Varma . - 【International Workshop on Semantic Evaluation】
From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications (2024.04.10)
Yongqiang Ma, Lizhi Qin, Jiawei Liu, Yangyang Kang, Yue Zhang, etc
LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding (2024.04.08)
Chuwei Luo, Yufan Shen, Zhaoqing Zhu, Qi Zheng, Zhi Yu, etc
Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models (2024.04.08)
Yutao Ouyang, Jinhan Li, Yunfei Li, Zhongyu Li, Chao Yu, etc
Topic-based Watermarks for LLM-Generated Text (2024.04.02)
Alexander Nemecek, Yuzhou Jiang, Erman Ayday
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks (2024.04.02)
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference (2024.03.29)
Jovan Stojkovic, Esha Choukse, Chaojie Zhang, Íñigo Goiri, Josep Torrellas . - 【arXiv.org】
LUQ: Long-text Uncertainty Quantification for LLMs (2024.03.29)
Caiqi Zhang, Fangyu Liu, Marco Basaldella, Nigel Collier . - 【arXiv.org】
Gecko: Versatile Text Embeddings Distilled from Large Language Models (2024.03.29)
Jinhyuk Lee, Zhuyun Dai, Xiaoqi Ren, Blair Chen, Daniel Cer, etc . - 【arXiv.org】
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models (2024.03.28)
Piotr Molenda, Adian Liusie, Mark J. F. Gales . - 【arXiv.org】
MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model (2024.03.27)
Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, etc . - 【arXiv.org】
Comp4D: LLM-Guided Compositional 4D Scene Generation (2024.03.25)
Dejia Xu, Hanwen Liang, N. Bhatt, Hezhen Hu, Hanxue Liang, etc
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? (2024.03.21)
Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, etc
Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs (2024.03.20)
Zhihong Sun, Chen Lyu, Bolun Li, Yao Wan, Hongyu Zhang, etc
Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model (2024.03.20)
Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, etc
Towards Robots That Know When They Need Help: Affordance-Based Uncertainty for Large Language Model Planners (2024.03.19)
James F. Mullen, Dinesh Manocha
ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference (2024.03.15)
Hyungjun Oh, Kihong Kim, Jaemin Kim, Sungkyun Kim, Junyeol Lee, etc
ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning (2024.03.14)
Ahmed Masry, Mehrad Shahmohammadi, Md. Rizwan Parvez, Enamul Hoque, Shafiq R. Joty
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference (2024.03.14)
Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski, David Tarjan, E. M. Ponti
Towards Proactive Interactions for In-Vehicle Conversational Assistants Utilizing Large Language Models (2024.03.14)
Huifang Du, Xuejing Feng, Jun Ma, Meng Wang, Shiyu Tao, etc
Simple and Scalable Strategies to Continually Pre-train Large Language Models (2024.03.13)
Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats L. Richter, Quentin Anthony, etc
LG-Traj: LLM Guided Pedestrian Trajectory Prediction (2024.03.12)
Pranav Singh Chib, Pravendra Singh
Big City Bias: Evaluating the Impact of Metropolitan Size on Computational Job Market Abilities of Language Models (2024.03.12)
Charlie Campanella, R. Goot . - 【NLP4HR】
InfiCoder-Eval: Systematically Evaluating the Question-Answering Capabilities of Code Large Language Models (2024.03.11)
Linyi Li, Shijie Geng, Zhenwen Li, Yibo He, Hao Yu, etc
Naming, Describing, and Quantifying Visual Objects in Humans and LLMs (2024.03.11)
Alberto Testoni, Juell Sprott, Sandro Pezzelle
LLM4Decompile: Decompiling Binary Code with Large Language Models (2024.03.08)
Hanzhuo Tan, Qi Luo, Jing Li, Yuqun Zhang
Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs (2024.03.08)
Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference (2024.03.07)
Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, etc
SaulLM-7B: A pioneering Large Language Model for Law (2024.03.06)
Pierre Colombo, Telmo Pessoa Pires, Malik Boudiaf, Dominic Culver, Rui Melo, etc
KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection (2024.03.04)
Yuexin Li, Chengyu Huang, Shumin Deng, Mei Lin Lock, Tri Cao, etc
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models (2024.02.29)
Chen Qian, Jie Zhang, Wei Yao, Dongrui Liu, Zhen-fei Yin, etc
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers (2024.02.29)
Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, etc
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models (2024.02.29)
Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George Cristian-Muraru, etc
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World (2024.02.29)
Weiyun Wang, Yiming Ren, Hao Luo, Tiantong Li, Chenxiang Yan, etc
LeMo-NADe: Multi-Parameter Neural Architecture Discovery with LLMs (2024.02.28)
Md Hafizur Rahman, Prabuddha Chakraborty
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (2024.02.27)
Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, etc . - 【arXiv.org】
Fujian Jia, Xin Liu, Lixi Deng, Jiwen Gu, Chunchao Pu, etc . - 【arXiv.org】
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs (2024.02.23)
Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, M. Crouse, etc
Genie: Generative Interactive Environments (2024.02.23)
Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, etc
Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs (2024.02.22)
Aaditya K. Singh, DJ Strouse . - 【arXiv.org】
Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs (2024.02.21)
Xiaoxia Li, Siyuan Liang, Jiyi Zhang, Hansheng Fang, Aishan Liu, etc . - 【arXiv.org】
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages (2024.02.19)
Yuan Zhang, Yile Wang, Zijun Liu, Shuo Wang, Xiaolong Wang, etc . - 【arXiv.org】
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs (2024.02.12)
Víctor Gallego . - 【arXiv.org】
FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs (2024.02.08)
Eun Cheol Choi, Emilio Ferrara . - 【arXiv.org】
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models (2024.02.08)
Peng Gao, Renrui Zhang, Chris Liu, Longtian Qiu, Siyuan Huang, etc . - 【arXiv.org】
On the Convergence of Zeroth-Order Federated Tuning in Large Language Models (2024.02.08)
Zhenqing Ling, Daoyuan Chen, Liuyi Yao, Yaliang Li, Ying Shen . - 【arXiv.org】
Large Language Model Meets Graph Neural Network in Knowledge Distillation (2024.02.08)
Shengxiang Hu, Guobing Zou, Song Yang, Yanglan Gan, Bofeng Zhang, etc . - 【arXiv.org】
Panacea: Pareto Alignment via Preference Adaptation for LLMs (2024.02.03)
Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Qingfu Zhang, etc . - 【arXiv.org】
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? (2024.01.31)
Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, etc . - 【arXiv.org】
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model (2024.01.29)
Xiao-wen Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, etc . - 【arXiv.org】
True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning (2024.01.25)
Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, etc . - 【arXiv.org】
ChatQA: Building GPT-4 Level Conversational QA Models (2024.01.18)
Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Chankyu Lee, etc . - 【arXiv.org】
Beyond Reference-Based Metrics: Analyzing Behaviors of Open LLMs on Data-to-Text Generation (2024.01.18)
Zdeněk Kasner, Ondřej Dušek . - 【arXiv.org】
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models (2024.01.11)
Damai Dai, Chengqi Deng, Chenggang Zhao, R. Xu, Huazuo Gao, etc . - 【arXiv.org】
Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection (2024.01.08)
G. Fatouros, Konstantinos Metaxas, John Soldatos, D. Kyriazis . - 【Social Science Research Network】
Instruct-Imagen: Image Generation with Multi-modal Instruction (2024.01.03)
Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, etc . - 【arXiv.org】
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones (2023.12.28)
Zhengqing Yuan, Zhaoxu Li, Lichao Sun . - 【arXiv.org】
MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices (2023.12.28)
Xiangxiang Chu, Limeng Qiao, Xinyang Lin, Shuang Xu, Yang Yang, etc . - 【arXiv.org】
Generative AI for Math: Part I - MathPile: A Billion-Token-Scale Pretraining Corpus for Math (2023.12.28)
Zengzhi Wang, Rui Xia, Pengfei Liu . - 【arXiv.org】
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation (2023.12.20)
Zhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, etc . - 【arXiv.org】
A mathematical perspective on Transformers (2023.12.17)
Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet
Mathematical discoveries from program search with large language models (2023.12.14)
Bernardino Romera-Paredes, M. Barekatain, Alexander Novikov, Matej Balog, M. P. Kumar, etc . - 【Nature】
LMDrive: Closed-Loop End-to-End Driving with Large Language Models (2023.12.12)
Hao Shao, Yuxuan Hu, Letian Wang, Steven L. Waslander, Yu Liu, etc . - 【arXiv.org】
LLM360: Towards Fully Transparent Open-Source LLMs (2023.12.11)
Zhengzhong Liu, Aurick Qiao, W. Neiswanger, Hongyi Wang, Bowen Tan, etc
From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3" (2023.12.11)
Takahide Yoshida, A. Masumori, Takashi Ikegami
Control Risk for Potential Misuse of Artificial Intelligence in Science (2023.12.11)
Jiyan He, Weitao Feng, Yaosen Min, Jingwei Yi, Kunsheng Tang, etc
Sequential Modeling Enables Scalable Learning for Large Vision Models (2023.12.01)
Yutong Bai, Xinyang Geng, K. Mangalam, Amir Bar, Alan Yuille, etc
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers (2023.11.27)
Yawar Siddiqui, A. Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, etc . - 【arXiv.org】
Minimizing Factual Inconsistency and Hallucination in Large Language Models (2023.11.23)
I. Muneeswaran, Shreya Saxena, Siva Prasad, M. V. S. Prakash, Advaith Shankar, etc . - 【arXiv.org】
Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents (2023.11.20)
Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, etc . - 【arXiv.org】
An Embodied Generalist Agent in 3D World (2023.11.18)
Jiangyong Huang, Silong Yong, Xiaojian Ma, Xiongkun Linghu, Puhao Li, etc . - 【arXiv.org】
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning (2023.11.17)
Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, S. Azadi, etc . - 【arXiv.org】
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding (2023.11.14)
Peng Jin, Ryuichi Takanobu, Caiwan Zhang, Xiaochun Cao, Li Yuan . - 【arXiv.org】
SpectralGPT: Spectral Foundation Model (2023.11.13)
D. Hong, Bing Zhang, Xuyang Li, Yuxuan Li, Chenyu Li, etc . - 【arXiv.org】
Social Motion Prediction with Cognitive Hierarchies (2023.11.08)
Wentao Zhu, Jason Qin, Yuke Lou, Hang Ye, Xiaoxuan Ma, etc . - 【arXiv.org】
Pre-training LLMs using human-like development data corpus (2023.11.08)
Khushi Bhardwaj, Raj Sanjay Shah, Sashank Varma . - 【arXiv.org】
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration (2023.11.07)
Qinghao Ye, Haiyang Xu, Jiabo Ye, Mingshi Yan, Anwen Hu, etc . - 【arXiv.org】
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation (2023.11.06)
Rusheb Shah, Quentin Feuillade--Montixi, Soroush Pour, Arush Tagade, Stephen Casper, etc . - 【arXiv.org】
Ziya2: Data-centric Learning is All LLMs Need (2023.11.06)
Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, Xiaojun Wu, etc . - 【arXiv.org】
Levels of AGI: Operationalizing Progress on the Path to AGI (2023.11.04)
Meredith Ringel Morris, Jascha Narain Sohl-Dickstein, Noah Fiedel, T. Warkentin, Allan Dafoe, etc . - 【arXiv.org】
PILL: Plug Into LLM with Adapter Expert and Attention Gate (2023.11.03)
Fangyuan Zhang, Tingting Liang, Zhengyuan Wu, Yuyu Yin . - 【arXiv.org】
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation (2023.11.02)
Yufei Wang, Zhou Xian, Feng Chen, Tsun-Hsuan Wang, Yian Wang, etc . - 【arXiv.org】
TopicGPT: A Prompt-based Topic Modeling Framework (2023.11.02)
Chau Minh Pham, Alexander Miserlis Hoyle, Simeng Sun, Mohit Iyyer . - 【arXiv.org】
ChipNeMo: Domain-Adapted LLMs for Chip Design (2023.10.31)
Mingjie Liu, Teodor-Dumitru Ene, Robert Kirby, Chris Cheng, Nathaniel Pinckney, etc . - 【arXiv.org】
Narratron: Collaborative Writing and Shadow-playing of Children Stories with Large Language Models (2023.10.29)
Yubo Zhao, Xiying Bao . - 【Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology】
CodeFusion: A Pre-trained Diffusion Model for Code Generation (2023.10.26)
Mukul Singh, J. Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, etc
GraphGPT: Graph Instruction Tuning for Large Language Models (2023.10.19)
Jiabin Tang, Yuhao Yang, Wei Wei, Lei Shi, Lixin Su, etc . - 【arXiv.org】
Creative Robot Tool Use with Large Language Models (2023.10.19)
Mengdi Xu, Peide Huang, Wenhao Yu, Shiqi Liu, Xilun Zhang, etc . - 【arXiv.org】
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models (2023.10.18)
Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, etc . - 【arXiv.org】
Llemma: An Open Language Model For Mathematics (2023.10.16)
Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, etc . - 【arXiv.org】
BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation (2023.10.16)
Ji Qi, Kaixuan Ji, Jifan Yu, Duokang Wang, Bin Xu, etc . - 【arXiv.org】
JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning (2023.10.16)
Issey Sukeda, Masahiro Suzuki, Hiroki Sakaji, Satoshi Kodera . - 【arXiv.org】
Table-GPT: Table-tuned GPT for Diverse Table Tasks (2023.10.13)
Peng Li, Yeye He, Dror Yashar, Weiwei Cui, Song Ge, etc . - 【arXiv.org】
MemGPT: Towards LLMs as Operating Systems (2023.10.12)
Charles Packer, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, etc
Ferret: Refer and Ground Anything Anywhere at Any Granularity (2023.10.11)
Haoxuan You, Haotian Zhang, Zhe Gan, Xianzhi Du, Bowen Zhang, etc
Understanding the Effects of RLHF on LLM Generalisation and Diversity (2023.10.10)
Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, etc
Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading (2023.10.08)
Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz . - 【arXiv.org】
xVal: A Continuous Number Encoding for Large Language Models (2023.10.04)
Siavash Golkar, Mariel Pettee, Michael Eickenberg, Alberto Bietti, M. Cranmer, etc . - 【arXiv.org】
How FaR Are Large Language Models From Agents with Theory-of-Mind? (2023.10.04)
Pei Zhou, Aman Madaan, Srividya Pranavi Potharaju, Aditya Gupta, Kevin R. McKee, etc
MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens (2023.10.03)
Kaizhi Zheng, Xuehai He, Xin Eric Wang . - 【arXiv.org】
PB-LLM: Partially Binarized Large Language Models (2023.09.29)
Yuzhang Shang, Zhihang Yuan, Qiang Wu, Zhen Dong . - 【arXiv.org】
GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond (2023.09.28)
Shen Zheng, Yuyu Zhang, Yijie Zhu, Chenguang Xi, Pengyang Gao, etc . - 【arXiv.org】
Chatmap: Large Language Model Interaction with Cartographic Data (2023.09.28)
Eren Unlu . - 【arXiv.org】
Integration of Large Language Models within Cognitive Architectures for Autonomous Robots (2023.09.26)
Miguel Ángel González Santamarta, F. J. Lera, Ángel Manuel Guerrero Higueras, Vicente Matellán Olivera . - 【arXiv.org】
Effective Distillation of Table-based Reasoning Ability from LLMs (2023.09.22)
Bohao Yang, Chen Tang, Kangning Zhao, Chenghao Xiao, Chenghua Lin . - 【arXiv.org】
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs (2023.09.22)
Justin Chih-Yao Chen, Swarnadeep Saha, Mohit Bansal . - 【arXiv.org】
Chain-of-Verification Reduces Hallucination in Large Language Models (2023.09.20)
S. Dhuliawala, M. Komeili, Jing Xu, Roberta Raileanu, Xian Li, etc . - 【arXiv.org】
Kosmos-2.5: A Multimodal Literate Model (2023.09.20)
Tengchao Lv, Yupan Huang, Jingye Chen, Lei Cui, Shuming Ma, etc . - 【arXiv.org】
DreamLLM: Synergistic Multimodal Comprehension and Creation (2023.09.20)
Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, etc . - 【arXiv.org】
SwitchGPT: Adapting Large Language Models for Non-Text Outputs (2023.09.14)
Xinyu Wang, Bohan Zhuang, Qi Wu . - 【arXiv.org】
NExT-GPT: Any-to-Any Multimodal LLM (2023.09.11)
Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-Seng Chua . - 【arXiv.org】
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting (2023.09.08)
Griffin Adams, Alexander R. Fabbri, Faisal Ladhak, Eric Lehman, Noémie Elhadad . - 【arXiv.org】
Large Language Models as Optimizers (2023.09.07)
Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, etc
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models (2023.09.07)
Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James R. Glass, etc . - 【arXiv.org】
YaRN: Efficient Context Window Extension of Large Language Models (2023.08.31)
Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole . - 【arXiv.org】
MVDream: Multi-view Diffusion for 3D Generation (2023.08.31)
Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, etc . - 【arXiv.org】
FedLogic: Interpretable Federated Multi-Domain Chain-of-Thought Prompt Selection for Large Language Models (2023.08.29)
Pengwei Xing, Songtao Lu, Han Yu . - 【arXiv.org】
PE-MED: Prompt Enhancement for Interactive Medical Image Segmentation (2023.08.26)
Ao Chang, Xing Tao, Xin Yang, Yuhao Huang, Xinrui Zhou, etc . - 【arXiv.org】
DARWIN Series: Domain Specific Large Language Models for Natural Science (2023.08.25)
Tong Xie, Yuwei Wan, Wei Huang, Yufei Zhou, Yixuan Liu, etc . - 【arXiv.org】
ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation (2023.08.22)
Jianghao Lin, Rongjie Shan, Chenxu Zhu, Kounianhua Du, Bo Chen, etc . - 【arXiv.org】
SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding (2023.08.21)
Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, etc . - 【arXiv.org】
Giraffe: Adventures in Expanding Context Lengths in LLMs (2023.08.21)
Arka Pal, Deep Karkhanis, Manley Roberts, S. Dooley, Arvind Sundararajan, etc . - 【arXiv.org】
ExpeL: LLM Agents Are Experiential Learners (2023.08.20)
Andrew Zhao, Daniel Huang, Quentin Xu, Matthieu Lin, Y. Liu, etc . - 【arXiv.org】
Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes (2023.08.17)
Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao . - 【arXiv.org】
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation (2023.08.14)
Patrick Fernandes, Daniel Deutsch, M. Finkelstein, Parker Riley, André F. T. Martins, etc . - 【arXiv.org】
Accelerating LLM Inference with Staged Speculative Decoding (2023.08.08)
Benjamin Spector, Chris Re . - 【arXiv.org】
Shepherd: A Critic for Language Model Generation (2023.08.08)
Tianlu Wang, Ping Yu, Xiaoqing Tan, Sean O'Brien, Ramakanth Pasunuru, etc . - 【arXiv.org】
AgentBench: Evaluating LLMs as Agents (2023.08.07)
Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, etc . - 【arXiv.org】
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models (2023.08.03)
Zheng Yuan, Hongyi Yuan, Cheng Li, Guanting Dong, Chuanqi Tan, etc . - 【arXiv.org】
Advancing Beyond Identification: Multi-bit Watermark for Language Models (2023.08.01)
Kiyoon Yoo, W. Ahn, N. Kwak . - 【arXiv.org】
A Private Watermark for Large Language Models (2023.07.30)
Aiwei Liu, Leyi Pan, Xuming Hu, Shuang Li, Lijie Wen, etc . - 【arXiv.org】
Robust Distortion-free Watermarks for Language Models (2023.07.28)
Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, Percy Liang . - 【arXiv.org】
Publisher Correction: Large language models encode clinical knowledge (2023.07.27)
K. Singhal, Shekoofeh Azizi, Tao Tu, S. S. Mahdavi, Jason Wei, etc . - 【Nature】
Med-Flamingo: a Multimodal Medical Few-shot Learner (2023.07.27)
Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, C. Zakka, etc . - 【arXiv.org】
CARTIER: Cartographic lAnguage Reasoning Targeted at Instruction Execution for Robots (2023.07.21)
Nikhil Kakodkar, D. Rivkin, Bobak H. Baghi, F. Hogan, Gregory Dudek . - 【arXiv.org】
ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning (2023.07.18)
Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Hao-Ran Wei, etc . - 【arXiv.org】
TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT (2023.07.17)
Liangyu Zha, Junlin Zhou, Liyao Li, Rui Wang, Qingyi Huang, etc . - 【arXiv.org】
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots (2023.07.16)
Gelei Deng, Yi Liu, Yuekang Li, Kailong Wang, Ying Zhang, etc
Self-consistency for open-ended generations (2023.07.11)
Siddhartha Jain, Xiaofei Ma, Anoop Deoras, Bing Xiang . - 【arXiv.org】
LongNet: Scaling Transformers to 1,000,000,000 Tokens (2023.07.05)
Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, etc . - 【arXiv.org】
Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation (2023.07.04)
Jian Guan, Minlie Huang . - 【Annual Meeting of the Association for Computational Linguistics】
Math Agents: Computational Infrastructure, Mathematical Embedding, and Genomics (2023.07.04)
M. Swan, Takashi Kido, Eric Roland, R. P. D. Santos . - 【arXiv.org】
Conformer LLMs - Convolution Augmented Large Language Models (2023.07.02)
Prateek Verma . - 【arXiv.org】
Inferring the Goals of Communicating Agents from Actions and Instructions (2023.06.28)
Lance Ying, Tan Zhi-Xuan, Vikash K. Mansinghka, J. Tenenbaum . - 【arXiv.org】
Kosmos-2: Grounding Multimodal Large Language Models to the World (2023.06.26)
Zhiliang Peng, Wenhui Wang, Li Dong, Y. Hao, Shaohan Huang, etc . - 【arXiv.org】
AudioPaLM: A Large Language Model That Can Speak and Listen (2023.06.22)
Paul K. Rubenstein, Chulayuth Asawaroengchai, D. Nguyen, Ankur Bapna, Zalán Borsos, etc . - 【arXiv.org】
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models (2023.06.14)
Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Kaifeng Bi, Xiaotao Gu, etc . - 【arXiv.org】
XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models (2023.06.13)
Omkar Thawakar, Abdelrahman M. Shaker, Sahal Shaji Mullappilly, Hisham Cholakkal, R. Anwer, etc . - 【arXiv.org】
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena (2023.06.09)
Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, etc . - 【arXiv.org】
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance (2023.06.08)
Qianqian Xie, Weiguang Han, Xiao Zhang, Yanzhao Lai, Min Peng, etc . - 【arXiv.org】
ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory (2023.06.06)
Chenxu Hu, Jie Fu, Chenzhuang Du, Simian Luo, J. Zhao, etc . - 【arXiv.org】
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only (2023.06.01)
Guilherme Penedo, Quentin Malartic, Daniel Hesslow, Ruxandra-Aimée Cojocaru, Alessandro Cappelli, etc . - 【arXiv.org】
Baselines for Identifying Watermarked Large Language Models (2023.05.29)
Leonard Tang, Gavin Uberti, Tom Shlomi . - 【arXiv.org】
Undetectable Watermarks for Language Models (2023.05.25)
Miranda Christ, S. Gunn, Or Zamir . - 【IACR Cryptology ePrint Archive】
Mitigating Temporal Misalignment by Discarding Outdated Facts (2023.05.24)
Michael J.Q. Zhang, Eunsol Choi
Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective (2023.05.24)
Guhao Feng, Yuntian Gu, Bohang Zhang, Haotian Ye, Di He, etc
Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering (2023.05.24)
Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan
Context-Aware Transformer Pre-Training for Answer Sentence Selection (2023.05.24)
Luca Di Liello, Siddhant Garg, Alessandro Moschitti
Gorilla: Large Language Model Connected with Massive APIs (2023.05.24)
Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
Visual Programming for Text-to-Image Generation and Evaluation (2023.05.24)
Jaemin Cho, Abhay Zala, Mohit Bansal
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model (2023.05.24)
Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, etc
LMs with a Voice: Spoken Language Modeling beyond Speech Tokens (2023.05.24)
Eliya Nachmani, Alon Levkovitch, Julian Salazar, Chulayuth Asawaroengchai, Soroosh Mariooryad, etc
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator (2023.05.24)
Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, etc
CSTS: Conditional Semantic Textual Similarity (2023.05.24)
Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak S. Murahari, Victoria Graf, etc
STAR: Boosting Low-Resource Event Extraction by Structure-to-Text Data Generation with Large Language Models (2023.05.24)
Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, etc
Contrastive Learning of Sentence Embeddings from Scratch (2023.05.24)
Junlei Zhang, Zhenzhong Lan, Junxian He
Meta-Learning Online Adaptation of Language Models (2023.05.24)
Nathan J. Hu, Eric Mitchell, Christopher D. Manning, Chelsea Finn
Who Wrote this Code? Watermarking for Code Generation (2023.05.24)
Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, etc
Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering (2023.05.24)
Jiajie Zhang, Shulin Cao, Tingjia Zhang, Xin Lv, Jiaxin Shi, etc
Understanding Arithmetic Reasoning in Language Models using Causal Mediation Analysis (2023.05.24)
Alessandro Stolfo, Yonatan Belinkov, Mrinmaya Sachan
Active Learning for Natural Language Generation (2023.05.24)
Yotam Perlitz, Ariel Gera, Michal Shmueli-Scheuer, Dafna Sheinwald, Noam Slonim, etc
SmartTrim: Adaptive Tokens and Parameters Pruning for Efficient Vision-Language Models (2023.05.24)
Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Ming Liu, Bing Qin
How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives (2023.05.24)
Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, Barbara Plank
ChatAgri: Exploring Potentials of ChatGPT on Cross-linguistic Agricultural Text Classification (2023.05.24)
Biao Zhao, Weiqiang Jin, Javier Del Ser, Guang Yang
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models (2023.05.24)
Gen Luo, Yiyi Zhou, Tianhe Ren, Shengxin Chen, Xiaoshuai Sun, etc
Unlocking Temporal Question Answering for Large Language Models Using Code Execution (2023.05.24)
Xingxuan Li, Liying Cheng, Qingyu Tan, Hwee Tou Ng, Shafiq Joty, etc
Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation (2023.05.24)
Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin
Injecting Knowledge into Biomedical Pre-trained Models via Polymorphism and Synonymous Substitution (2023.05.24)
Hongbo Zhang, Xiang Wan, Benyou Wang
LLMDet: A Large Language Models Detection Tool (2023.05.24)
Kangxi Wu, Liang Pang, Huawei Shen, Xueqi Cheng, Tat-Seng Chua
The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning (2023.05.24)
Jingyuan Qi, Zhiyang Xu, Ying Shen, Minqian Liu, Di Jin, etc
Reasoning with Language Model is Planning with World Model (2023.05.24)
Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, etc
Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers (2023.05.24)
Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, etc
Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality (2023.05.24)
Tanay Dixit, Fei Wang, Muhao Chen
OverPrompt: Enhancing ChatGPT Capabilities through an Efficient In-Context Learning Approach (2023.05.24)
Jiazheng Li, Runcong Zhao, Yulan He, Lin Gui
MMNet: Multi-Mask Network for Referring Image Segmentation (2023.05.24)
Yichen Yan, Xingjian He, Wenxuan Wan, Jing Liu
Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks (2023.05.24)
Abhinav Rao, Sachin Vashistha, Atharva Naik, Somak Aditya, Monojit Choudhury
Editing Commonsense Knowledge in GPT (2023.05.24)
Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, etc
Cross-lingual Data Augmentation for Document-grounded Dialog Systems in Low Resource Languages (2023.05.24)
Qi Gou, Zehua Xia, Wen-Hau Du
Trade-Offs Between Fairness and Privacy in Language Modeling (2023.05.24)
Cleo Matzken, Steffen Eger, Ivan Habernal
Frugal Prompting for Dialog Models (2023.05.24)
Bishal Santra, Sakya Basak, Abhinandan De, Manish Gupta, Pawan Goyal
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection (2023.05.24)
Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, etc
PIVOINE: Instruction Tuning for Open-world Information Extraction (2023.05.24)
Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, etc
Text encoders are performance bottlenecks in contrastive vision-language models (2023.05.24)
Amita Kamath, Jack Hessel, Kai-Wei Chang
Privacy Implications of Retrieval-Based Language Models (2023.05.24)
Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen
Interpretable by Design Visual Question Answering (2023.05.24)
Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, D. Roth
Leveraging GPT-4 for Automatic Translation Post-Editing (2023.05.24)
Vikas Raunak, Amr Sharaf, Hany Hassan Awadallah, Arul Menezes
CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering (2023.05.24)
Weiqi Wang, Tianqing Fang, Wenxuan Ding, Baixuan Xu, Xin Liu, etc
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers (2023.05.24)
Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, D. Pan
Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation (2023.05.24)
Prashant Krishnan, Zilong Wang, Yangkun Wang, Jingbo Shang
Machine Reading Comprehension using Case-based Reasoning (2023.05.24)
Dung Thai, Dhruv Agarwal, Mudit Chaudhary, R. Das, M. Zaheer, etc
Debiasing Made State-of-the-art: Revisiting the Simple Seed-based Weak Supervision for Text Classification (2023.05.24)
Chengyu Dong, Zihan Wang, Jingbo Shang
Text Conditional Alt-Text Generation for Twitter Images (2023.05.24)
Nikita Srivatsan, Sofia Samaniego, Omar Florez, Taylor Berg-Kirkpatrick
SSD-2: Scaling and Inference-time Fusion of Diffusion Language Models (2023.05.24)
Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, Marjan Ghazvininejad
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning (2023.05.24)
Ahmed Masry, Parsa Kavehzadeh, Xuan Long Do, Enamul Hoque, Shafiq Joty
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding (2023.05.24)
Weijia Shi, Xiaochuang Han, M. Lewis, Yulia Tsvetkov, Luke Zettlemoyer, etc
In-Context Demonstration Selection with Cross Entropy Difference (2023.05.24)
Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, etc
GlobalBench: A Benchmark for Global Progress in Natural Language Processing (2023.05.24)
Y. Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, etc
The student becomes the master: Matching GPT3 on Scientific Factual Error Correction (2023.05.24)
Dhananjay Ashok, Atharva Kulkarni, Hai Pham, Barnabás Póczos
PruMUX: Augmenting Data Multiplexing with Model Compression (2023.05.24)
Yushan Su, Vishvak S. Murahari, Karthik Narasimhan, Kai Li
Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts (2023.05.24)
Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, etc
A Causal View of Entity Bias in (Large) Language Models (2023.05.24)
Fei Wang, Wenjie Mo, Yiwei Wang, Wenxuan Zhou, Muhao Chen
Emergent inabilities? Inverse scaling over the course of pretraining (2023.05.24)
James A. Michaelov, B. Bergen
Ishani Mondal, Michelle Yuan, N Anandhavelu, Aparna Garimella, Francis Ferraro, etc
Reinforcement Learning finetuned Vision-Code Transformer for UI-to-Code Generation (2023.05.24)
Davit Soselia, Khalid Saifullah, Tianyi Zhou
KNN-LM Does Not Improve Open-ended Text Generation (2023.05.24)
Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun Manjunatha, etc
Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations (2023.05.24)
Wenting Zhao, Justin T. Chiu, Claire Cardie, Alexander M. Rush
Language Models with Rationality (2023.05.23)
Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schütze, etc
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models (2023.05.23)
Leonardo Ranaldi, Elena Sofia Ruzzetti, Davide Venditti, Dario Onorati, Fabio Massimo Zanzotto
Question Answering as Programming for Solving Time-Sensitive Questions (2023.05.23)
Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, etc
PaD: Program-aided Distillation Specializes Large Models in Reasoning (2023.05.23)
Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xingwei Long, Bowen Zhou
Aligning Large Language Models through Synthetic Feedback (2023.05.23)
Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, etc
LogicLLM: Exploring Self-supervised Logic-enhanced Training for Large Language Models (2023.05.23)
Fangkai Jiao, Zhiyang Teng, Shafiq Joty, Bosheng Ding, Aixin Sun, etc
Masked Path Modeling for Vision-and-Language Navigation (2023.05.23)
Zi-Yi Dou, Feng Gao, Nanyun Peng . - 【arXiv.org】
ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models (2023.05.23)
Z. Chen, Kun Zhou, Beichen Zhang, Zheng Gong, Wayne Xin Zhao, etc
DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules (2023.05.22)
Yanchen Liu, William Held, Diyi Yang
Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision (2023.05.22)
Yucheng Cai, Hong Liu, Zhijian Ou, Y. Huang, Junlan Feng
Sentence Representations via Gaussian Embedding (2023.05.22)
Shohei Yoda, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda
LM-Switch: Lightweight Language Model Conditioning in Word Embedding Space (2023.05.22)
Chi Han, Jialiang Xu, Manling Li, Y. Fung, Chenkai Sun, etc
MacLaSa: Multi-Aspect Controllable Text Generation via Efficient Sampling from Compact Latent Space (2023.05.22)
Hanxing Ding, Liang Pang, Z. Wei, Huawei Shen, Xueqi Cheng, etc
Enhancing Cross-lingual Natural Language Inference by Soft Prompting with Multilingual Verbalizer (2023.05.22)
Shuang Li, Xuming Hu, Aiwei Liu, Yawen Yang, Fukun Ma, etc
A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches (2023.05.22)
Zihan Wang, Tianle Wang, Dheeraj Mekala, Jingbo Shang
Keeping Up with the Language Models: Robustness-Bias Interplay in NLI Data and Models (2023.05.22)
Ioana Baldini, Chhavi Yadav, Payel Das, K. Varshney
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis (2023.05.22)
Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng, Yang You
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance (2023.05.22)
Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, etc
InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT (2023.05.22)
Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuo Wang, etc
Making Language Models Better Tool Learners with Execution Feedback (2023.05.22)
Shuofei Qiao, Honghao Gui, Huajun Chen, Ningyu Zhang
GPT-SW3: An Autoregressive Language Model for the Nordic Languages (2023.05.22)
Ariel Ekgren, Amaru Cuba Gyllensten, F. Stollenwerk, Joey Öhman, Tim Isbister, etc
ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination (2023.05.22)
Dongfang Li, Jindi Yu, Baotian Hu, Zhenran Xu, Min Zhang
Infor-Coef: Information Bottleneck-based Dynamic Token Downsampling for Compact and Efficient language model (2023.05.21)
Wenxin Tan
Contrastive Learning with Logic-driven Data Augmentation for Logical Reasoning over Text (2023.05.21)
Qiming Bao, Alex Yuxuan Peng, Zhenyun Deng, Wanjun Zhong, Neset Tan, etc
Retrieving Texts based on Abstract Descriptions (2023.05.21)
Shauli Ravfogel, Valentina Pyatkin, Amir D. N. Cohen, Avshalom Manevich, Yoav Goldberg
Pruning Pre-trained Language Models with Principled Importance and Self-regularization (2023.05.21)
Siyu Ren, Kenny Q. Zhu
Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers (2023.05.21)
Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, etc
Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs (2023.05.20)
Yatin Nandwani, Vineet Kumar, Dinesh Raghu, Sachindra Joshi, L. Lastras
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning (2023.05.20)
Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
LogiCoT: Logical Chain-of-Thought Instruction-Tuning Data Collection with GPT-4 (2023.05.20)
Hanmeng Liu, Zhiyang Teng, Leyang Cui, Chaoli Zhang, Qiji Zhou, etc
Self-QA: Unsupervised Knowledge Guided Language Model Alignment (2023.05.19)
Xuanyu Zhang, Qing Yang
SelfzCoT: a Self-Prompt Zero-shot CoT from Semantic-level to Code-level for a Better Utilization of LLMs (2023.05.19)
IokTong Lei, ZhiDong Deng . - 【arXiv.org】
Self-Agreement: A Framework for Fine-tuning Language Models to Find Agreement among Diverse Opinions (2023.05.19)
Shiyao Ding, Takayuki Ito . - 【arXiv.org】
BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases (2023.05.19)
Xin Liu, Muhammad Khalifa, Lu Wang
STOAT: Structured Data to Analytical Text With Controls (2023.05.19)
Deepanway Ghosal, Preksha Nema, A. Raghuveer . - 【arXiv.org】
Decouple knowledge from parameters for plug-and-play language modeling (2023.05.19)
Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan . - 【arXiv.org】
Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona (2023.05.19)
Yihong Tang, Bo Wang, Miao Fang, Dongming Zhao, Kun Huang, etc . - 【arXiv.org】
XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters (2023.05.19)
Xuanyu Zhang, Qing Yang, Dongliang Xu
Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning (2023.05.19)
Mustafa Safa Ozdayi, Charith S. Peris, Jack G. M. FitzGerald, Christophe Dupuy, Jimit Majmudar, etc . - 【arXiv.org】
RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought (2023.05.19)
Tianci Xue, Ziqi Wang, Zhenhailong Wang, Chi Han, Pengfei Yu, etc . - 【arXiv.org】
LLM Itself Can Read and Generate CXR Images (2023.05.19)
Suhyeon Lee, Won Jun Kim, Jong-Chul Ye . - 【arXiv.org】
Post Hoc Explanations of Language Models Can Improve Language Models (2023.05.19)
Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, etc . - 【arXiv.org】
Federated Foundation Models: Privacy-Preserving and Collaborative Learning for Large Models (2023.05.19)
Sixing Yu, J. P. Muñoz, A. Jannesari . - 【arXiv.org】
Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning (2023.05.19)
Po-Nien Kung, Nanyun Peng . - 【arXiv.org】
AutoTrial: Prompting Language Models for Clinical Trial Design (2023.05.19)
Zifeng Wang, Cao Xiao, Jimeng Sun . - 【arXiv.org】
Democratized Diffusion Language Model (2023.05.18)
Nikita Balagansky, Daniil Gavrilov . - 【arXiv.org】
Ahead-of-Time P-Tuning (2023.05.18)
Daniil Gavrilov, Nikita Balagansky . - 【arXiv.org】
SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation (2023.05.18)
Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng . - 【arXiv.org】
How does the task complexity of masked pretraining objectives affect downstream performance? (2023.05.18)
Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa . - 【arXiv.org】
Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings (2023.05.18)
Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, etc . - 【arXiv.org】
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval (2023.05.18)
Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, etc . - 【arXiv.org】
Efficient Prompting via Dynamic In-Context Learning (2023.05.18)
Wangchunshu Zhou, Yuchen Jiang, Ryan Cotterell, Mrinmaya Sachan . - 【arXiv.org】
LIMA: Less Is More for Alignment (2023.05.18)
Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, etc . - 【arXiv.org】
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities (2023.05.18)
Dong Zhang, Shimin Li, Xin Zhang, Jun Zhan, P. Wang, etc . - 【arXiv.org】
The Web Can Be Your Oyster for Improving Large Language Models (2023.05.18)
Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, J. Nie, etc . - 【arXiv.org】
TOME: A Two-stage Approach for Model-based Retrieval (2023.05.18)
Ruiyang Ren, Wayne Xin Zhao, J. Liu, Huaqin Wu, Ji-rong Wen, etc . - 【arXiv.org】
When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario (2023.05.17)
Chengcheng Han, Liqing Cui, Renyu Zhu, J. Wang, Nuo Chen, etc . - 【arXiv.org】
Emergent and Predictable Memorization in Large Language Models (2023.04.21)
Stella Rose Biderman, Usvsn Sai Prashanth, Lintang Sutawika, Hailey Schoelkopf, Quentin G. Anthony, etc
Improving Multiparty Interactions with a Robot Using Large Language Models (2023.04.19)
Prasanth Murali, Ian Steenstra, Hye Sun Yun, Ameneh Shamekhi, T. Bickmore . - 【CHI Extended Abstracts】
Large Language Models Can Be Used to Estimate the Latent Positions of Politicians (2023.03.21)
Patrick Y. Wu, Joshua A. Tucker, Jonathan Nagler, Solomon Messing
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks (2023.03.01)
Kai-Wei Chang, Yu-Kai Wang, Hua Shen, Iu-thing Kang, W. Tseng, etc . - 【ArXiv】
Soft Prompt Guided Joint Learning for Cross-Domain Sentiment Analysis (2023.03.01)
Jingli Shi, Weihua Li, Quan-wei Bai, Yi Yang, Jianhua Jiang . - 【ArXiv】
EvoPrompting: Language Models for Code-Level Neural Architecture Search (2023.02.28)
Angelica Chen, David Dohan, David R. So . - 【ArXiv】
Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, C. Endres, Thorsten Holz, etc . - 【ArXiv】
Grimm in Wonderland: Prompt Engineering with Midjourney to Illustrate Fairytales (2023.02.17)
M. Ruskov . - 【ArXiv】
LabelPrompt: Effective Prompt-based Learning for Relation Classification (2023.02.16)
W. Zhang, Xiaoning Song, Zhenhua Feng, Tianyang Xu, Xiaojun Wu . - 【ArXiv】
Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition (2023.02.16)
Minsu Kim, Hyungil Kim, Y. Ro . - 【ArXiv】
Prompting for Multimodal Hateful Meme Classification (2023.02.08)
Rui Cao, R. Lee, Wen-Haw Chong, Jing Jiang . - 【Conference on Empirical Methods in Natural Language Processing】
Toxicity Detection with Generative Prompt-based Inference (2022.05.24)
Yau-Shian Wang, Y. Chang . - 【ArXiv】
Learning to Transfer Prompts for Text Generation (2022.05.03)
Junyi Li, Tianyi Tang, J. Nie, Ji-rong Wen, Wayne Xin Zhao . - 【North American Chapter of the Association for Computational Linguistics】
RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction (2022.03.17)
Yew Ken Chia, Lidong Bing, Soujanya Poria, Luo Si . - 【Findings】
QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition (2022.03.03)
Andy T. Liu, Wei Xiao, Henghui Zhu, Dejiao Zhang, Shang-Wen Li, etc . - 【ArXiv】
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts (2022.02.02)
Stephen H. Bach, Victor Sanh, Zheng Xin Yong, Albert Webson, Colin Raffel, etc . - 【Annual Meeting of the Association for Computational Linguistics】
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems (2021.10.15)
Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung . - 【ArXiv】
SentiPrompt: Sentiment Knowledge Enhanced Prompt-Tuning for Aspect-Based Sentiment Analysis (2021.09.17)
Chengxi Li, Feiyu Gao, Jiajun Bu, Lu Xu, Xiang Chen, etc . - 【ArXiv】
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting (2021.08.31)
Xiang Chen, Lei Li, Shumin Deng, Chuanqi Tan, Changliang Xu, etc . - 【International Conference on Computational Linguistics】
Program Synthesis with Large Language Models (2021.08.16)
Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, H. Michalewski, etc . - 【ArXiv】
Evaluating Large Language Models Trained on Code (2021.07.07)
Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde, etc . - 【ArXiv】
KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction (2021.04.15)
Xiang Chen, Ningyu Zhang, Xin Xie, Shumin Deng, etc . - 【The Web Conference】
Language Models as Knowledge Bases? (2019.09.01)
Fabio Petroni, Tim Rocktäschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, etc . - 【Conference on Empirical Methods in Natural Language Processing】
Leveraging Commonsense Knowledge from Large Language Models for Task and Motion Planning
Yan Ding, Xiaohan Zhang
AdaPrompt: Adaptive Prompt-based Finetuning for Relation Extraction
Xiang Chen, Xin Xie, Ningyu Zhang, Jiahuan Yan, Shumin Deng, etc
Mingshu Zhai, Jiaao He, Zixuan Ma, Zan Zong, Runqing Zhang, etc . - 【USENIX Annual Technical Conference】
T. Santhi, K. Srinivasan . - 【IEEE Transactions on Learning Technologies】