This is a Python application for evaluating how well LLMs handle the order dependency problem in multiple-choice questions, drawing heavily on the paper Large Language Models Are Not Robust Multiple Choice Selectors by Zheng et al. It is structured so that additional models can be supported with little extra work.
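As a rough illustration of the problem being measured (this is a sketch, not the project's actual code), the snippet below asks the same question under every ordering of its options and reports whether the model's selected answer content changes. The ask callable and the helper names are hypothetical placeholders.

```python
# Illustrative sketch: probe order dependency by re-asking the same
# multiple-choice question with its options permuted.
from itertools import permutations

def build_prompt(question: str, options: list[str]) -> str:
    """Format a question with options labeled A, B, C, ..."""
    labels = "ABCDEFGH"
    lines = [question] + [f"{labels[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

def is_order_dependent(ask, question: str, options: list[str]) -> bool:
    """Return True if the model's chosen *content* changes across orderings.

    `ask` is any callable that maps a prompt string to a single-letter answer.
    """
    picks = set()
    for perm in permutations(options):
        letter = ask(build_prompt(question, list(perm))).strip()[0].upper()
        picks.add(perm["ABCDEFGH".index(letter)])  # map the letter back to its content
    return len(picks) > 1
```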
The easiest way to install the project is to use pip or uv to install the wheel file in dist/. Alternatively, you can clone the repo, use uv or any other pyproject.toml-compatible tool to install the dependencies into a local virtual environment, and run it from there.
Once installed, you can run the application with either uv run run_analysis or run_analysis. Outputs are saved to the outputs directory.
Usage: run_analysis [OPTIONS]

Options:
  --model_name TEXT     Model name
  --data_limit INTEGER  Number of questions to use
  --random BOOLEAN      Whether to use random questions
  --help                Show this message and exit.
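For example, run_analysis --model_name gpt-4o-mini --data_limit 50 --random true evaluates 50 randomly sampled questions (the model name and values here are purely illustrative).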
To access the Hugging Face dataset, authenticate your account with the Hugging Face CLI (huggingface-cli login).
To access the OpenAI API, put your OpenAI API key in a local .env file.
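Since the key is read from a .env file, the application presumably loads it with something like python-dotenv; the variable name OPENAI_API_KEY and the code below are assumptions, not the project's actual source.

```python
# Sketch of the expected setup (the OPENAI_API_KEY variable name is an assumption).
import os
from dotenv import load_dotenv   # provided by the python-dotenv package
from openai import OpenAI

load_dotenv()                    # reads key=value pairs from .env into the environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])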
This project could be extended in many ways: supporting more models, more MCQ datasets, and more evaluation metrics, such as the standard deviation of per-option recall as a measure of recall balance. It could also be extended to handle more complex questions, such as those that require multiple reasoning steps to solve, and to evaluate chain-of-thought prompting on them.
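As a sketch of what such a metric could look like, the function below computes the standard deviation of per-option recall (RStd in Zheng et al.) from (gold, predicted) option-ID pairs; the function name and data layout are illustrative, not part of the project.

```python
# Illustrative metric: standard deviation of per-option recall.
# A low value means recall is balanced across option IDs (A, B, C, ...).
from statistics import pstdev

def recall_std(records: list[tuple[str, str]]) -> float:
    """records: (gold_option_id, predicted_option_id) pairs, e.g. ("A", "B")."""
    option_ids = sorted({gold for gold, _ in records})
    recalls = []
    for oid in option_ids:
        relevant = [(g, p) for g, p in records if g == oid]
        hits = sum(1 for g, p in relevant if p == g)
        recalls.append(hits / len(relevant))
    return pstdev(recalls)
```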
Adding tests, which the project currently lacks, would also be a valuable improvement.