
AAAI'26, AP2O-Coder: Adaptively Progressive Preference Optimization for Reducing Compilation and Runtime Errors in LLM-Generated Code


Introduction

This is the official implementation of our paper AP2O-Coder: Adaptively Progressive Preference Optimization for Reducing Compilation and Runtime Errors in LLM-Generated Code, accepted at AAAI'26.

Figures: Adaptive Progressive Preference Optimization, Coding Error Reduction, Data Efficiency, Reward Curves

Requirements

  • deepspeed 0.17.2
  • python 3.11.11
  • torch 2.7.0
  • trl 0.14.0
  • transformers 4.51.3
  • vllm 0.9.2
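
A minimal installation sketch, assuming a fresh Python 3.11.11 environment with pip available (CUDA-specific wheel selection for torch, deepspeed, and vllm is not covered here):

pip install deepspeed==0.17.2 torch==2.7.0 trl==0.14.0 transformers==4.51.3 vllm==0.9.2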

Usage

To initiate the preference data self-generation and preference optimization processes, use the following command:

sh pipe-qwen2.5-coder.sh
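
The script above drives the full pipeline. For orientation only, the sketch below shows a generic preference-optimization step with trl's DPOTrainer, where error-free generations serve as chosen samples and generations with compilation or runtime errors serve as rejected samples. The model name, dataset contents, and hyperparameters are illustrative assumptions; this is not the repository's actual training code and does not include AP2O-Coder's adaptive progressive scheduling.

from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "Qwen/Qwen2.5-Coder-1.5B-Instruct"  # illustrative choice; any causal LM works
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Toy preference pair: code that runs cleanly is "chosen",
# code that raises an error at runtime is "rejected".
train_dataset = Dataset.from_dict({
    "prompt": ["Write a Python function that sums a list of numbers."],
    "chosen": ["def total(xs):\n    return sum(xs)\n"],
    "rejected": ["def total(xs):\n    return sum(x)\n"],  # NameError at runtime
})

args = DPOConfig(output_dir="dpo-sketch", per_device_train_batch_size=1, num_train_epochs=1)
trainer = DPOTrainer(model=model, args=args, train_dataset=train_dataset, processing_class=tokenizer)
trainer.train()

AP2O-Coder's contribution lies in how such error-typed preference data is generated and progressively scheduled; see the paper and the pipeline script above for the actual procedure.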
