Discovering Test-Time Training on Traditional ML Models - Course Project of Data Mining in THU 2024 Fall
Scale the correct model on the correct data distribution.
This repository contains the course project for Data Mining at Tsinghua University, fall semester 2024. It implements two features:
- Test-Time Training on XGBoost Models: The repository supports test-time training on XGBoost models in classification and regression tasks. The codebase provides evidence that test-time training can improve traditional ML models as inference compute is scaled (see the figure below). An empirical insight: the gain may be most pronounced on high-dimensional or long-tailed data.
- LLM-based Many-shot ICL in Classification and Regression Tasks: The repository supports LLM-based many-shot in-context learning (ICL) in classification and regression tasks, with batched prompting. Rather than targeting the best performance, the codebase uses it to interpret the decision-making process in these traditional ML tasks.
└── project-dir/
├── KNN_FS_LLM_code
├── XGBoost_code
├── preprocess
├── requirements.txt
├── report.md
└── rrl-DM_HW
preprocess/
File | Summary |
---|---|
analyze_*.py | Analyzes data by generating descriptive statistics. |
preprocess_data*.py | Facilitates data preprocessing for various datasets by loading, cleaning, and normalizing features. |
clean_data*.py | Cleans data for various datasets. |
visualization.py | Visualizes dataset features. |
XGBoost_code/
File | Summary |
---|---|
experiment_ttt.py | Runs test-time training on XGBoost with $k$NN-retrieved training samples. |
experiment.py | Runs experiments with plain XGBoost models (no test-time training). |
args.py | Facilitates user-defined configurations for XGBoost model training. |
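The idea behind `experiment_ttt.py` (retrieve the $k$ nearest training samples for each test point, then train on them before predicting) can be sketched as below. This is a minimal illustration and not the repository's implementation: the names `knn_indices`, `fit_mean`, and `predict_with_ttt` are hypothetical, Euclidean distance is an assumption, and the mean predictor is only a dependency-free stand-in for an XGBoost model. Whether the actual script trains a fresh model or continues training a base model is not stated here, so the sketch simply fits a fresh local model.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Predictor = Callable[[Vector], float]


def knn_indices(query: Vector, train_x: Sequence[Vector], k: int) -> List[int]:
    """Indices of the k training points nearest to `query` (Euclidean distance)."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(query, train_x[i]))
    return ranked[:k]


def fit_mean(xs: Sequence[Vector], ys: Sequence[float]) -> Predictor:
    """Hypothetical stand-in learner: always predicts the mean neighbor target.

    In practice this would be replaced by a function that trains an XGBoost
    model on (xs, ys) and returns its prediction function.
    """
    mean = sum(ys) / len(ys)
    return lambda _x: mean


def predict_with_ttt(
    query: Vector,
    train_x: Sequence[Vector],
    train_y: Sequence[float],
    k: int,
    fit: Callable[[Sequence[Vector], Sequence[float]], Predictor] = fit_mean,
) -> float:
    """Fit a fresh local model on the k retrieved neighbors, then predict the query."""
    idx = knn_indices(query, train_x, k)
    local_model = fit([train_x[i] for i in idx], [train_y[i] for i in idx])
    return local_model(query)
```

Because one model is fitted per test point, inference cost grows with the number of test points and with `k`, which is the sense in which this approach spends extra compute at inference time.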
KNN_FS_LLM_code/
File | Summary |
---|---|
experiment.py | Runs and evaluates many-shot in-context learning with gpt-4o-mini in classification and regression tasks. |
experiment_knn.py | Runs experiments with plain $k$NN models. |
args.py | Facilitates user-defined configurations for many-shot in-context learning with LLMs. |
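Many-shot ICL with batched prompting amounts to placing many labeled examples in one prompt and asking the model to answer several queries in a single call. The sketch below shows one way to build such a prompt and parse the reply. It is illustrative only: the prompt wording, the `<index>: <output>` reply format, and the helper names `format_row`, `build_batched_prompt`, and `parse_batched_answers` are all assumptions rather than the repository's actual code, and no LLM API is called.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, object]


def format_row(features: Row) -> str:
    """Serialize one tabular row as 'name=value' pairs."""
    return ", ".join(f"{name}={value}" for name, value in features.items())


def build_batched_prompt(
    shots: Sequence[Tuple[Row, object]],
    queries: Sequence[Row],
    task: str = "classification",
) -> str:
    """Build one prompt holding all labeled shots and a numbered batch of queries."""
    lines = [f"You are solving a tabular {task} task.", "Labeled examples:"]
    for features, label in shots:
        lines.append(f"Input: {format_row(features)} -> Output: {label}")
    lines.append("Answer every query, one per line, as '<index>: <output>'.")
    for i, features in enumerate(queries, start=1):
        lines.append(f"Query {i}: {format_row(features)}")
    return "\n".join(lines)


def parse_batched_answers(reply: str, n_queries: int) -> List[Optional[str]]:
    """Map '<index>: <output>' lines back to queries; unanswered slots stay None."""
    answers: List[Optional[str]] = [None] * n_queries
    for line in reply.splitlines():
        head, sep, tail = line.partition(":")
        head = head.strip()
        if sep and head.isdigit() and 1 <= int(head) <= n_queries:
            answers[int(head) - 1] = tail.strip()
    return answers
```

Keeping the unanswered slots as `None` makes it easy to detect queries the model skipped and resend only those.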
Download datasets from the following sources:
And unzip them in the root directory of the project as follows:
└── project-dir/
├── breast_cancer_elvira_data/
├── bank_marketing_data/
├── boston_housing_data/
...
- Dependencies can be installed using the following command:
pip install -r requirements.txt
- Data preprocessing can be performed using the following command:
bash preprocess/preprocess.sh
- To try test-time training on XGBoost models, execute the following command:
bash XGBoost_code/run_exp.sh
- To try the LLM-based many-shot ICL in classification and regression tasks, execute the following command:
bash KNN_FS_LLM_code/run_exp.sh
- S. Moro, R. Laureano and P. Cortez. Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology. In P. Novais et al. (Eds.), Proceedings of the European Simulation and Modelling Conference - ESM'2011, pp. 117-121, Guimarães, Portugal, October, 2011. EUROSIS.
- Zhuo Wang, Wei Zhang, Ning Liu, and Jianyong Wang. 2024. Scalable rule-based representation learning for interpretable classification. In Proceedings of the 35th International Conference on Neural Information Processing Systems (NIPS '21). Curran Associates Inc., Red Hook, NY, USA, Article 2332, 30479–30491.
- Text Classification via Large Language Models (Sun et al., Findings of EMNLP 2023)
- Batch Prompting: Efficient Inference with Large Language Model APIs (Cheng et al., EMNLP 2023)
- Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, and Moritz Hardt. 2020. Test-time training with self-supervision for generalization under distribution shifts. In Proceedings of the 37th International Conference on Machine Learning (ICML'20), Vol. 119. JMLR.org, Article 856, 9229–9248.
- Lu, H., Sun, S., Xie, Y., Zhang, L., Yang, X., and Yan, J., “Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach”, arXiv e-prints, arXiv:2403.00250, 2024.