BugDetectionBench

A benchmark dataset of real-world code review comments, designed to evaluate automated code review software/agents.

I originally built this project to create an evaluation dataset for a code review bot company. It provides a set of tools to scrape GitHub pull request review comments, identify potential bug reports, and classify them by difficulty. The resulting dataset can be used to train and evaluate bug-detection models, or to study code review practices.

Workflow

The process is broken down into several steps, managed by different scripts:

  1. Scrape Data: The scraper.ts script fetches PR review comments from GitHub based on a search query.
  2. Review Bugs: The raw comments are reviewed by an LLM to determine whether they represent valid bug reports. This is handled by review-bugs.ts (a sketch of such a call follows this list).
  3. Assess Difficulty: Bugs that pass the review are then assessed for difficulty (Easy, Medium, Hard) using review-missing-difficulty.ts.
  4. Analyze Data: You can get statistics on the collected data using analyze-bugs.ts.
  5. Extract Data: Finally, you can extract subsets of the data, for example, all "easy" bugs, using scripts like extract-easy-bugs.ts.
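
For illustration, here is a minimal sketch of the kind of LLM call step 2 might make, using the OpenAI Node SDK. The model name, prompt, and helper name are assumptions, not the exact code in review-bugs.ts.

    import OpenAI from "openai";

    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Hypothetical helper: ask the model whether a review comment reports a bug.
    async function isValidBugReport(comment: string): Promise<boolean> {
      const response = await openai.chat.completions.create({
        model: "gpt-4o-mini", // assumed model; the scripts may use another
        messages: [
          { role: "system", content: "Answer with exactly 'yes' or 'no'." },
          { role: "user", content: `Does this code review comment report a bug?\n\n${comment}` },
        ],
      });
      return response.choices[0].message.content?.trim().toLowerCase() === "yes";
    }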

File Structure

Core Scripts

These are the main scripts that drive the workflow.

  • scraper.ts: Scrapes GitHub for PR review comments that might contain bug reports and saves the results to bugs.json (a search sketch follows this list).
  • review-bugs.ts: Uses an LLM to review the comments in bugs.json and determine whether each one is a valid bug report.
  • analyze-bugs.ts: Provides statistics on the dataset, such as the number of bugs, review pass rate, and difficulty breakdown.
  • review-missing-difficulty.ts: Finds bugs in bugs.json that are missing a difficulty assessment and uses an LLM to classify them.
  • extract-easy-bugs.ts: Extracts all bugs marked as "easy" from bugs.json and saves them to easy-bugs.json.
  • extract-medium-bugs.ts: Extracts a specified number of bugs marked as "medium" from bugs.json and saves them to medium-bugs.json.
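
As a rough sketch, scraper.ts presumably drives something like the GitHub search API below via @octokit/rest; the exact query string, page size, and library choice are assumptions.

    import { Octokit } from "@octokit/rest";

    const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

    async function main() {
      // Hypothetical query: merged PRs whose discussion mentions a bug.
      const { data } = await octokit.rest.search.issuesAndPullRequests({
        q: 'is:pr is:merged "bug" in:comments',
        per_page: 50,
      });
      for (const pr of data.items) {
        console.log(pr.html_url); // candidate PRs to scan for review comments
      }
    }

    main().catch(console.error);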

Utility Scripts

These scripts are used for more specific or one-off tasks.

  • review-single.ts: Reviews a single bug, given a link to the comment (a link-parsing sketch follows this list).
  • review-single-difficulty.ts: Assesses the difficulty for a single bug.
  • generate-balanced-sample.ts: Generates a smaller, balanced sample of bugs from different repositories.
  • remove-*.ts: Various scripts for cleaning up the data (e.g., remove-failed-reviews.ts, remove-self-comments.ts).
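
Since review-single.ts is driven by a comment link, it presumably parses GitHub's review-comment URL format. A hypothetical parser; the function name and return shape are assumptions:

    // Hypothetical parser for review-comment links of the form
    // https://github.com/OWNER/REPO/pull/123#discussion_r456789
    function parseCommentLink(link: string) {
      const match = link.match(
        /github\.com\/([^/]+)\/([^/]+)\/pull\/(\d+)#discussion_r(\d+)/
      );
      if (!match) throw new Error(`Unrecognized comment link: ${link}`);
      const [, owner, repo, pullNumber, commentId] = match;
      return { owner, repo, pullNumber: Number(pullNumber), commentId: Number(commentId) };
    }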

Data Files

  • bugs.json: The main data file containing all the scraped comments and their metadata, including review status and difficulty (a sketch of the record shape follows this list).
  • easy-bugs.json: A subset of bugs.json containing only the bugs classified as "easy".
  • medium-bugs.json: A subset of bugs.json containing bugs classified as "medium".
  • scanned_prs.txt: A log file that keeps track of the PRs that have already been scanned to avoid duplicate work.
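
To make the data files concrete, here is a plausible TypeScript shape for one bugs.json entry. Every field name below is an assumption; the actual schema may differ.

    // Hypothetical shape of one bugs.json entry.
    interface BugRecord {
      commentUrl: string;  // link to the PR review comment
      repo: string;        // "owner/name" of the source repository
      body: string;        // the review comment text
      isValidBug?: boolean;                     // set by review-bugs.ts
      difficulty?: "easy" | "medium" | "hard";  // set by review-missing-difficulty.ts
    }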

Setup

  1. Install dependencies:

    npm install
  2. Configure credentials: Create a .env file in the root of the project and add your GitHub and OpenAI API keys (a loading sketch follows these steps):

    GITHUB_TOKEN=your_github_token
    OPENAI_API_KEY=your_openai_api_key
    
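Assuming the scripts load credentials with the dotenv package (an assumption, not confirmed by the repository), the startup check would look roughly like:

    import "dotenv/config"; // loads .env into process.env

    if (!process.env.GITHUB_TOKEN || !process.env.OPENAI_API_KEY) {
      throw new Error("Set GITHUB_TOKEN and OPENAI_API_KEY in .env");
    }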

Usage

You can run the scripts using npx ts-node.

  • Scrape for new bugs:

    npx ts-node scraper.ts
  • Review all bugs:

    npx ts-node review-bugs.ts
  • Assess difficulty for unassessed bugs:

    npx ts-node review-missing-difficulty.ts
  • Analyze the dataset:

    npx ts-node analyze-bugs.ts
  • Extract all easy bugs:

    npx ts-node extract-easy-bugs.ts
  • Extract 300 medium bugs (the count is passed as a command-line argument; see the sketch after this list):

    npx ts-node extract-medium-bugs.ts 300
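
The count passed to extract-medium-bugs.ts is presumably read from process.argv; a hedged sketch, where the default value is an assumption:

    // Hypothetical argument handling: `npx ts-node extract-medium-bugs.ts 300` → count = 300.
    const raw = process.argv[2];
    const count = raw ? Number(raw) : 100; // the default of 100 is an assumption
    if (!Number.isInteger(count) || count <= 0) {
      throw new Error(`Expected a positive bug count, got "${raw}"`);
    }
    console.log(`Extracting ${count} medium bugs...`);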

Future Work

  • Add support for scraping GitLab and other platforms.
  • Improve the accuracy of the bug identification model.
  • Create a web interface for easier data exploration.
