Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders


🌞 Paper Overview


πŸ‘€ Overview

LLM Recommenders utilize LLMs to make recommendations. In this survey, we further concentrate on LLMs as the final decision maker. That is, let $\mathcal{U}$ be the set of users, $\mathcal{I}$ the set of items, and $\mathcal{M}$ the set of meta information; the LLM Recommender $\mathbb{L}$ then makes the final recommendation decision ($\mathcal{R}$):

$$\mathbb{L}: \mathcal{U} \times \mathcal{I} \times \mathcal{M} \times f(\mathcal{U}, \mathcal{I}, \mathcal{M} ) \rightarrow \mathcal{R}.$$

where $f$ denotes the augmentation map, which can be any non-LLM techniques designed to improve the performance of the LLM Recommender $\mathbb{L}$.

With the growing interest and parallel development in both pure LLM-based approaches and those augmented with non-LLM techniques, it is crucial to systematically understand the different aspects of both scenarios. Therefore, we categorize LLM Recommenders into Pure and Augmented approaches based on whether the augmentation map $f$ is the zero map or not.
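To make the distinction concrete, the following is a minimal Python sketch (not the paper's implementation; `build_pure_prompt`, `build_augmented_prompt`, and `cf_retrieve_candidates` are hypothetical names) of how a pure recommender prompts the LLM from user, item, and meta information alone, while an augmented recommender also feeds in the output of a non-LLM augmentation map $f$:

```python
# Minimal illustrative sketch (not the paper's implementation).
# `cf_retrieve_candidates` is a hypothetical non-LLM augmentation map f.

def build_pure_prompt(user_history, item_titles):
    """Pure LLM Recommender: the prompt is built only from U, I, M."""
    history = ", ".join(user_history)
    candidates = "\n".join(f"- {t}" for t in item_titles)
    return (
        f"The user has interacted with: {history}.\n"
        f"Candidate items:\n{candidates}\n"
        "Recommend the most suitable item."
    )

def build_augmented_prompt(user_history, item_titles, cf_retrieve_candidates):
    """Augmented LLM Recommender: f(U, I, M) narrows or enriches the prompt."""
    retrieved = cf_retrieve_candidates(user_history, item_titles)  # non-LLM map f
    return build_pure_prompt(user_history, retrieved)
```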

πŸŒ• Pure LLM Recommenders

Pure LLM Recommenders refer to methods that leverage the capabilities of LLMs to perform recommendation tasks. These methods can be further categorized into classes such as Naive Embedding Utilization, Naive Pretrained LM Finetuning, Instruction Tuning, Model Architectural Adaptations, Reflect-and-Rethink, and Others.

βœ… Naive Embedding Utilization

Naive Embedding Utilization refers to methods that directly leverage the final hidden state or aggregated embeddings produced by LLMs for recommendation tasks.

| Venue | Code | Paper |
| --- | --- | --- |
| CIKM'19 | Code | BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer |
| ACM'23 | Code | One Model for All: Large Language Models are Domain-Agnostic Recommendation Systems |
| KDD'23 | Code | Text Is All You Need: Learning Language Representations for Sequential Recommendation |
| RecSys'23 | Code | Leveraging Large Language Models for Sequential Recommendation |
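As a rough illustration of this category (an assumed setup rather than any single paper's pipeline; the encoder checkpoint name is a placeholder), one can mean-pool the final hidden states of a pretrained LM and score candidates against the user's history embedding:

```python
# Illustrative sketch of naive embedding utilization: encode item texts,
# mean-pool the final hidden states, and score candidates by dot product
# against the mean of the user's history embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: any encoder LM works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def embed(texts):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state    # (B, T, H)
    mask = inputs["attention_mask"].unsqueeze(-1)     # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)       # mean pooling over tokens

history_emb = embed(["item title A", "item title B"]).mean(0)
candidate_emb = embed(["candidate 1", "candidate 2", "candidate 3"])
scores = candidate_emb @ history_emb                  # higher score = better match
```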

βœ… Naive Pretrained LM Finetuning

Naive Pretrained LM Finetuning refers to approaches that formulate recommendation as a natural language task and directly fine-tune pretrained language models.

| Venue | Code | Paper |
| --- | --- | --- |
| RecSys'22 | Code | Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) |
| CIKM'23 | Code | Prompt Distillation for Efficient LLM-based Recommendation |
| NAACL'24 | None | Aligning Large Language Models with Recommendation Knowledge |
| ACL'24 | Code | RDRec: Rationale Distillation for LLM-based Recommendation |
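A toy example of this formulation (the template below is illustrative and only loosely follows P5-style prompts) turns an interaction sequence into an input/target text pair for standard seq2seq fine-tuning:

```python
# Simplified, assumed input/target formatting for fine-tuning a pretrained LM
# on sequential recommendation; the exact templates in the papers differ.
def to_text_pair(user_id, history_ids, next_item_id):
    source = (
        f"User_{user_id} has purchased items "
        + ", ".join(f"item_{i}" for i in history_ids)
        + ". Predict the next item the user will purchase."
    )
    target = f"item_{next_item_id}"
    return source, target

src, tgt = to_text_pair(42, [101, 205, 307], 512)
# `src`/`tgt` would then feed a standard seq2seq fine-tuning loop.
```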

βœ… Instruction Tuning

Instruction tuning adapts LLMs to recommendation tasks by expressing them as instructional prompts.

| Venue | Code | Paper |
| --- | --- | --- |
| RecSys'23 | Code | TALLRec: An Effective and Efficient Tuning Framework to Align Large Language Model with Recommendation |
| ECIR'24 | Code | GenRec: Large Language Model for Generative Recommendation |
| ACM'25 | Code | A Bi-Step Grounding Paradigm for Large Language Models in Recommendation Systems |
| arXiv'23 | None | Do LLMs Understand User Preferences? Evaluating LLMs On User Rating Prediction |
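For intuition, a single instruction-tuning sample might look like the following; the instruction/input/output layout loosely follows TALLRec, but the exact wording is our own illustration:

```python
# A toy instruction-tuning sample (illustrative wording, not taken from any paper).
sample = {
    "instruction": (
        "Given the user's liked and disliked items, decide whether the "
        "user will enjoy the target item. Answer with Yes or No."
    ),
    "input": (
        "Liked: The Matrix, Inception. Disliked: Titanic. "
        "Target item: Interstellar."
    ),
    "output": "Yes",
}
# Many such samples are used to fine-tune the LLM with a standard
# supervised instruction-tuning objective.
```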

βœ… Model Architectural Adaptations

In addition to standard applications of LLMs, numerous studies have proposed novel architectural adaptations of LLM backbones, specifically designed for recommendation systems.

| Venue | Code | Paper |
| --- | --- | --- |
| arXiv'24 | None | Rethinking Large Language Model Architectures for Sequential Recommendations |
| Inf. Process. Manag.'25 | Code | Sequential recommendation by reprogramming pretrained transformer |
| arXiv'25 | None | MoLoRec: A Generalizable and Efficient Framework for LLM-Based Recommendation |

βœ… Reflect-and-Rethink

Reflect-and-Rethink methods go beyond standard supervised learning by reflecting on outputs, refining prompts, or interpreting user intent to guide prompt design.

| Venue | Code | Paper |
| --- | --- | --- |
| SIGIR'24 | Code | Large Language Models are Learnable Planners for Long-Term Recommendation |
| AAAI'25 | None | Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation |
| SIGIR'24 | Code | MACRec: a Multi-Agent Collaboration Framework for Recommendation |
| CIKM'24 | Code | RecPrompt: A Self-tuning Prompting Framework for News Recommendation Using Large Language Models |
| SIGIR'24 | Code | Large Language Models for Intent-Driven Session Recommendations |
| ACM'24 | None | Recommendation as Instruction Following: A Large Language Model Empowered Recommendation Approach |
| CIKM'23 | Code | Large Language Models as Zero-Shot Conversational Recommenders |
| SIGIR'24 | Code | Retrieval-Augmented Conversational Recommendation with Prompt-based Semi-Structured Natural Language State Tracking |
| ACL'25 | Code | iAgent: LLM Agent as a Shield between User and Recommender Systems |
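The common pattern can be sketched as a generic reflect-and-refine loop; this is an assumed abstraction rather than any specific paper's algorithm, and `llm` and `evaluate` are placeholder callables:

```python
# Generic reflect-and-rethink loop (assumed sketch). `llm` maps a prompt
# string to a text completion; `evaluate` scores a recommendation output.
def reflect_and_rethink(llm, prompt, evaluate, max_rounds=3):
    """Iteratively critique the LLM's own recommendation and refine the prompt."""
    best_output, best_score = None, float("-inf")
    for _ in range(max_rounds):
        output = llm(prompt)
        score = evaluate(output)          # e.g., validation hit rate or a critic score
        if score > best_score:
            best_output, best_score = output, score
        critique = llm(
            f"Recommendation: {output}\n"
            "Critique this recommendation and suggest how to improve the prompt."
        )
        prompt = f"{prompt}\n\nReflection from a previous attempt:\n{critique}"
    return best_output
```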

βœ… Others

The remaining methods focus on designing suitable training objectives, metadata summarization, data-essence extraction, and more.

| Venue | Code | Paper |
| --- | --- | --- |
| RecSys'24 | None | CALRec: Contrastive Alignment of Generative LLMs for Sequential Recommendation |
| arXiv'24 | Code | Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation |
| WWW'24 | Code | Collaborative Large Language Model for Recommender Systems |
| arXiv'24 | Code | Harnessing Large Language Models for Text-Rich Sequential Recommendation |
| WWW'25 | None | LLM4Rerank: LLM-based Auto-Reranking Framework for Recommendations |
| KDD'24 | Code | Bridging Items and Language: A Transition Paradigm for Large Language Model-Based Recommendation |
| SIGIR'24 | Code | Data-efficient Fine-tuning for LLM-based Recommendation |

πŸŒ“ Augmented LLM Recommenders

Augmented LLM Recommenders refer to methods that enhance LLM Recommenders by incorporating non-LLM techniques. These methods can be further categorized into Semantic Identifiers Augmentation, Collaborative Modality Augmentation, Prompts Augmentation, and Retrieve-and-Rerank.

βœ… Semantic Identifiers Augmentation

Semantic Identifiers (or Semantic IDs) augmentation methods represent user or item IDs as implicit semantic sequences with the help of auxiliary coding techniques.

| Venue | Code | Paper |
| --- | --- | --- |
| SIGIR-AP'23 | Code | How to Index Item IDs for Recommendation Foundation Models |
| NeurIPS'23 | None | Recommender Systems with Generative Retrieval |
| ICDE'24 | Code | Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation |
| arXiv'24 | None | Unifying Generative and Dense Retrieval for Sequential Recommendation |
| CIKM'24 | Code | Learnable Item Tokenization for Generative Recommendation |
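A heavily simplified sketch of semantic-ID assignment is shown below; generative-retrieval approaches typically train a learned quantizer such as an RQ-VAE, whereas here two rounds of residual k-means merely stand in for it:

```python
# Simplified semantic-ID assignment (assumed sketch; not any paper's tokenizer).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
item_embs = rng.normal(size=(1000, 64))       # pretend these are item text embeddings

codes = []
residual = item_embs
for level in range(2):                        # two code levels -> ID like (c1, c2)
    km = KMeans(n_clusters=32, n_init=10, random_state=0).fit(residual)
    codes.append(km.labels_)
    residual = residual - km.cluster_centers_[km.labels_]

semantic_ids = list(zip(codes[0], codes[1]))  # each item becomes a short token sequence
# The LLM then generates these code tokens instead of raw numeric item IDs.
```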

βœ… Collaborative Modality Augmentation

Collaborative Modality Augmentation methods seek to align collaborative information with language, usually by projecting embeddings derived from traditional collaborative models into the language space.

| Venue | Code | Paper |
| --- | --- | --- |
| ICDE'25 | Code | CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation |
| SIGIR'24 | Code | LLaRA: Large Language-Recommendation Assistant |
| NeurIPS'24 | Code | Customizing Language Models with Instance-wise LoRA for Sequential Recommendation |
| KDD'24 | Code | Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System |
| SIGIR'24 | None | Integrating Large Language Models into Recommendation via Mutual Augmentation and Adaptive Aggregation |
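A minimal sketch of the common recipe, assuming a frozen collaborative-filtering model and a frozen LLM with only a small projector trained (dimensions and names are illustrative):

```python
# Assumed sketch of collaborative-modality augmentation: project a pretrained
# CF item embedding into the LLM's token-embedding space so it can be spliced
# into the prompt as a "soft token".
import torch
import torch.nn as nn

cf_dim, llm_hidden = 64, 4096                 # e.g., matrix-factorization dim -> LLM hidden size

projector = nn.Sequential(                    # trainable adapter; LLM and CF model stay frozen
    nn.Linear(cf_dim, llm_hidden),
    nn.GELU(),
    nn.Linear(llm_hidden, llm_hidden),
)

cf_item_emb = torch.randn(1, cf_dim)          # embedding from a pretrained CF model
soft_token = projector(cf_item_emb)           # (1, llm_hidden), used as an extra token
# `soft_token` is inserted alongside the text-token embeddings of the prompt.
```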

βœ… Prompts Augmentation

Prompts Augmentation methods utilize non-LM techniques to improve the quality of prompts.

| Venue | Code | Paper |
| --- | --- | --- |
| ACM'25 | Code | Reinforced Prompt Personalization for Recommendation with Large Language Models |
| WWW'25 | Code | Collaborative Retrieval for Large Language Model-based Conversational Recommender Systems |

βœ… Retrieve-and-Rerank

Retrieve-and-Rerank methods first retrieve top-ranked candidates using non-LM techniques, and then apply LLMs to rerank them for final recommendation.

| Venue | Code | Paper |
| --- | --- | --- |
| arXiv'23 | Code | Zero-Shot Next-Item Recommendation using Large Pretrained Language Models |
| PGAI@CIKM'23 | Code | LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking |
| arXiv'23 | None | PALR: Personalization Aware LLMs for Recommendation |
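An assumed end-to-end sketch of the two stages, with a popularity counter standing in for the non-LLM retriever and `llm` as a placeholder completion function:

```python
# Illustrative retrieve-and-rerank pipeline (assumed sketch): a cheap non-LLM
# retriever shortlists candidates, then the LLM reranks them.
from collections import Counter

def retrieve_by_popularity(all_interactions, seen, k=20):
    """Stand-in retriever: most popular items the user has not seen."""
    popularity = Counter(item for _, item in all_interactions)
    return [item for item, _ in popularity.most_common() if item not in seen][:k]

def rerank_with_llm(llm, user_history, candidates):
    prompt = (
        f"The user recently interacted with: {', '.join(user_history)}.\n"
        f"Candidates: {', '.join(candidates)}.\n"
        "Rank the candidates from most to least relevant, one per line."
    )
    return llm(prompt).splitlines()
```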

:octocat: Experiment

Although many benchmarks exist for recommender systems, there remains a lack of comprehensive comparisons between Pure and Augmented LLM Recommenders under consistent, fair, and modern evaluation settings. To fill this gap, we design a unified experimental framework and use it to systematically assess the performance of both categories. Details of the benchmark datasets can be found in the paper and in the Benchmark Formulation section below. The following are the results of existing representative papers.

For results discussion, please also refer to the paper.

πŸŒ‹ The Challenge of LLM Recommenders

βœ… Distribution Gap between Recommendation and Language Semantics

The goal of recommendation systems is to provide accurate suggestions based on collaborative information, such as user-item interaction patterns. To achieve this, it is essential for recommenders to effectively model users' underlying behavior. LLMs, trained on vast text corpora, are expected to implicitly encode some aspects of such patterns. However, recent research has shown that directly leveraging the implicit collaborative knowledge within LLMs remains a challenge.

Even with exhaustive tuning, LLM Recommenders may still be influenced by the pretrained language semantics. This can prevent LLMs from faithfully capturing the true collaborative semantics.

βœ… Echo Chamber Effects

The echo chamber effect refers to a situation in which individuals are predominantly exposed to information that reinforces their preexisting beliefs, often due to selective exposure, algorithmic filtering, or even the underlying social biases inherent in LLMs. In recommender systems, this can result in users repeatedly receiving a narrow range of items, irrespective of their current intent.

βœ… Position Bias

Position bias refers to the tendency of an LLM's judgment of an item's relevance or importance to depend on where the item appears in the prompt's candidate list, even though the output should ideally be invariant under permutations of that list. In recommendation systems, especially in zero-shot prompting scenarios, whether the ground-truth item is recommended is significantly affected by its position within the candidate set.
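A simple way to probe this (an assumed diagnostic, not from the paper) is to shuffle the candidate list repeatedly and track which slot the LLM's choice comes from:

```python
# Small, assumed position-bias probe. `llm_pick` is a placeholder that
# returns the chosen candidate string given a prompt.
import random
from collections import Counter

def probe_position_bias(llm_pick, user_history, candidates, trials=10, seed=0):
    rng = random.Random(seed)
    picked_positions = Counter()
    for _ in range(trials):
        order = candidates[:]
        rng.shuffle(order)
        prompt = (
            f"User history: {', '.join(user_history)}.\n"
            f"Candidates: {', '.join(order)}.\n"
            "Pick exactly one candidate."
        )
        choice = llm_pick(prompt)
        if choice in order:
            picked_positions[order.index(choice)] += 1
    return picked_positions   # a heavy skew toward early slots suggests position bias
```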

🍣 Future Direction

Cold-Start and Cross-Domain Generalizability are long-standing challenges in recommendation. LLM Recommenders offer a promising solution due to their ability to understand rich textual metadata, and a current line of work attempts to exploit this. Although recent approaches tackle these challenges, opportunities for enhancement remain.

βœ… Cold-Start Issue

  • The remaining unsolved issue 1: The conditional-probability objective of the decoder tends to overfit to items seen during training, significantly reducing its ability to generate cold-start items.

  • The remaining unsolved issue 2: Whether incorporating collaborative signals degrades performance, since collaborative-filtering-based methods tend to suffer more in cold-start scenarios.

βœ… Cross-Domain Generalizability

  • The remaining unsolved issue: This direction remains largely underexplored, with relatively few studies addressing and analyzing the issue.

:shipit: Benchmark Formulation

The dataset preprocessing methods for the experiments can be reproduced by following the instructions below.

πŸ’½ Dataset Download

To avoid discrepancies caused by different random.seed mechanisms across Python versions or environments, we store the datasets we used for the naive numerical IDs setting and the reranking setting on Google Drive. Notice that one might still need to preprocess other information from Dataset Preparation.

πŸ”¦ Dataset Preparation

cd prepare_dataset
sh download_data.sh
sh prepare_dataset.sh

This will create ./data/ in the main directory with the corresponding downloaded datasets.

  • Dataset Explanation
{
  "preprocessed_*.train.json": "Training dataset for sequential recsys.",
  "preprocessed_*.valid.json": "Validation dataset for sequential recsys.",
  "preprocessed_*.test.json":  "Testing dataset for sequential recsys.",
  "preprocessed_meta_*.json":  "Filtered item meta data. (only left those in train + valid + test)",
  "preprocessed_review_*.json":  "Train + Valid + Test."
}
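A minimal loading sketch for these files (the dataset name in the path is a placeholder; substitute the dataset you actually downloaded):

```python
# Assumed loading snippet for the preprocessed splits listed above.
import json

with open("data/preprocessed_Video_Games.train.json") as f:   # hypothetical dataset name
    train = json.load(f)

print(type(train), len(train))   # inspect the structure before building your pipeline
```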

πŸ”† Naive Numerical IDs Assignment

We assign each user and item a random, unique naive numerical ID. To preprocess it, you can run

cd prepare_dataset
sh random_hashing.sh

This will create a Random_* folder inside ./data/ with random naive numerical IDs for users and items.

  • Dataset Explanation
{
  "user_item_hash_table.json": "The table between naive numerical IDs and the original user_id or parent_asin.",
  "meta.json": "Meta data of the given item.",
  "review_*.json": "Review data for the [train / valid / test] scope.",
  "review.json": "All review data (train + valid + test)."
}
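For illustration only, the ID mapping could be built roughly as follows (the actual logic lives in random_hashing.sh; function and file names here are hypothetical):

```python
# Assumed sketch of random, unique numerical ID assignment.
import json
import random

def assign_random_ids(original_ids, seed=0):
    ids = list(range(len(original_ids)))
    random.Random(seed).shuffle(ids)          # fixed seed -> reproducible mapping
    return {orig: new for orig, new in zip(original_ids, ids)}

user_table = assign_random_ids(["A1B2", "C3D4", "E5F6"])
with open("user_item_hash_table.example.json", "w") as f:   # hypothetical output path
    json.dump(user_table, f, indent=2)
```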

πŸ’‘ Other Recommendation Tasks

Besides sequential recommendation, other recommendation tasks include reranking, binary, rating, explanation, conversational, and more. We further provide the dataset setup for the reranking task.

βœ… Dataset for Reranking Task

The reranking task aims to recommend items from a set of candidates. For LLM Recommenders, the candidate set usually contains n items: 1 positive item and n-1 randomly sampled, non-interacted negative items. In our construction, we set n=20.

cd prepare_dataset && sh prepare_random_negative.sh
  • Dataset Explanation
{
  "random_numerical or original_ids": "The key is the user and the corresponding list is the items candidate pool.",
  "label_random_numerical or label_original_ids": "The key is the user and the corresponding value is the positive item."
}
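The candidate pools can be thought of as being built roughly like this (an illustrative sketch with hypothetical names; n=20 as in the construction above):

```python
# Assumed sketch of the candidate-pool construction for reranking:
# 1 positive item plus n-1 randomly sampled items the user never interacted with.
import random

def build_candidate_pool(positive_item, all_items, interacted, n=20, seed=0):
    rng = random.Random(seed)
    negatives = [i for i in all_items if i not in interacted]
    pool = rng.sample(negatives, n - 1) + [positive_item]
    rng.shuffle(pool)                         # avoid always placing the positive item last
    return pool

pool = build_candidate_pool("item_42", [f"item_{i}" for i in range(1000)], {"item_1", "item_42"})
```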

πŸ“— Citations

If you find our survey and this repository beneficial for your research, please kindly cite our paper.

@misc{huang2025augmentnotcomparativestudy,
      title={Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders},
      author={Wei-Hsiang Huang and Chen-Wei Ke and Wei-Ning Chiu and Yu-Xuan Su and Chun-Chun Yang and Chieh-Yuan Cheng and Yun-Nung Chen and Pu-Jen Cheng},
      year={2025},
      eprint={2505.23053},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2505.23053},
}
