Large Language Model as a Teaching Assistant (TA)👩‍💻

This repository contains Python scripts that use OpenAI's GPT models to automatically grade CS 329 Computational Linguistics coursework: short-answer quiz responses and project proposal reports in PDF format. The grading supports flexible scoring with configurable granularity so that student responses are assessed consistently.

Short Answer Quiz

Features

  • Supports dynamic scoring with adjustable granularity.
  • Uses OpenAI's GPT model (e.g., gpt-4o) for grading.
  • Assigns a score of zero to missing or irrelevant responses.
  • Provides explanations for deductions when the full score is not given.
  • CSV output for easy integration with grading systems.

Usage

1. Prepare Input Files

Ensure the following JSON files are ready:

  • text_processing.json (gold answers for each quiz question)
  • student_answer.json (student responses to be graded)
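
The exact JSON schema is defined by grade.py; the shapes below are assumptions for illustration only (question-keyed gold answers and a per-student map of answers). A minimal sketch that writes both files in that assumed format:

import json

# Hypothetical shapes -- the real schema is whatever grade.py expects.
gold = {
    "What is NLP?": "Natural Language Processing is ...",
    "Define Tokenization": "Splitting text into tokens and sentences ...",
}
students = {
    "student_001": {
        "What is NLP?": "NLP stands for ...",
        "Define Tokenization": "Breaking text into tokens.",
    }
}

with open("text_processing.json", "w") as f:
    json.dump(gold, f, indent=2)
with open("student_answer.json", "w") as f:
    json.dump(students, f, indent=2)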

2. Run the Grading Script

Use the following command to run the grading process:

python grade.py \
  --api_key YOUR_OPENAI_API_KEY \
  --max_score 2.0 \
  --num_questions 10 \
  --gpt_model gpt-4o \
  --granularity 0.1 \
  --gold_file text_processing.json \
  --student_file student_answer.json \
  --output_csv graded_results.csv
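
Under the hood, grading a single answer amounts to prompting the chosen GPT model with the gold answer, the student answer, and the scoring constraints. The sketch below is an illustration using the openai Python package (v1+ client); grade.py's actual prompt, model call, and response parsing may differ.

from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")

def grade_answer(question, gold, student, max_score=2.0, granularity=0.1, model="gpt-4o"):
    """Ask the model for a score between 0 and max_score at the given granularity (illustrative only)."""
    prompt = (
        f"Question: {question}\n"
        f"Gold answer: {gold}\n"
        f"Student answer: {student}\n"
        f"Grade the student answer from 0 to {max_score} in steps of {granularity}. "
        f"Reply with the score, then a brief explanation if points were deducted."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(grade_answer("What is NLP?", "Natural Language Processing is ...", "NLP is ..."))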

3. Output

The script generates a CSV file (graded_results.csv) with the following columns:

  • Question: The quiz question
  • Student Answer: The student's response
  • Grade: The assigned score
  • Explanation: If applicable, a brief reason for score deductions

Example Output

Question,Student Answer,Grade,Explanation
"What is NLP?","Natural Language Processing is ...",1.0,""
"Define Tokenization","Breaking text into tokens.",0.8,"Lacks mention of sentence splitting."

Notes

  • The script ensures scores adhere to the predefined valid score set.
  • If GPT returns an invalid or missing grade, the score defaults to 0.0.
  • Full-credit responses do not include explanations.
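
To make the first two notes concrete: with max_score 2.0 and granularity 0.1, only the scores 0.0, 0.1, ..., 2.0 are accepted, and anything else the model returns falls back to 0.0. A rough sketch of that check (not the script's actual code):

def valid_scores(max_score=2.0, granularity=0.1):
    """Enumerate the allowed scores: 0.0, 0.1, ..., max_score."""
    steps = int(round(max_score / granularity))
    return [round(i * granularity, 2) for i in range(steps + 1)]

def sanitize(raw_grade, max_score=2.0, granularity=0.1):
    """Default to 0.0 when the model returns a missing or out-of-set grade."""
    try:
        grade = round(float(raw_grade), 2)
    except (TypeError, ValueError):
        return 0.0
    return grade if grade in valid_scores(max_score, granularity) else 0.0

print(valid_scores())      # [0.0, 0.1, ..., 2.0]
print(sanitize("1.7"))     # 1.7
print(sanitize(None))      # 0.0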

Project Proposal Report in PDF Format

Features

  • Supports automated grading of PDF-based project proposals.

  • Uses OpenAI’s GPT model (e.g., gpt-4o) for rubric-based evaluation.

  • Provides category-wise score breakdowns with brief justifications.

  • Extracts proposal text from PDF using PyMuPDF (see the sketch after this list).

  • Generates both structured CSV and raw GPT outputs for transparency and reproducibility.

  • Skips already processed files to prevent redundant evaluation.
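
Text extraction with PyMuPDF (imported as fitz) is straightforward; a minimal sketch of pulling the full text out of one proposal (the file name is a placeholder):

import fitz  # PyMuPDF

def extract_text(pdf_path):
    """Concatenate the plain text of every page in the PDF."""
    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)

print(extract_text("proposals_pdf/example_proposal.pdf")[:500])  # preview the first 500 characters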

Usage

1. Prepare Input Folder

Place all project proposal PDFs inside a folder (e.g., proposals_pdf/).

2. Run the Evaluation Script

Use the following command:

python grade_proposals.py

Ensure you have set your OpenAI API key inside the script (openai.api_key = "") or via an environment variable (see the sketch below).
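
A minimal sketch of the environment-variable option, assuming the conventional OPENAI_API_KEY variable name (export it in your shell before running the script):

import os
import openai

# Read the key from the environment instead of hard-coding it in the script.
openai.api_key = os.environ["OPENAI_API_KEY"]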

3. Output

The script produces:

  • gpt_evaluations.csv: A structured table with the columns file_name, the individual rubric scores (Header, Abstract, etc.), Total, Explanation, and ProposalSummary
  • gpt_raw_outputs.txt: Raw GPT output for each evaluated proposal
  • error_log.txt: Log of any PDF files that failed to process
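
Because already processed files are skipped (see Features above), re-running the script on the same folder should only evaluate new PDFs. A plausible sketch of that check, assuming it is based on the file_name column of gpt_evaluations.csv:

import csv
import os

def already_processed(csv_path="gpt_evaluations.csv"):
    """Return the set of file names that already have a row in the results CSV."""
    if not os.path.exists(csv_path):
        return set()
    with open(csv_path, newline="") as f:
        return {row["file_name"] for row in csv.DictReader(f)}

done = already_processed()
todo = [f for f in os.listdir("proposals_pdf") if f.endswith(".pdf") and f not in done]
print(f"{len(todo)} proposals left to grade")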

Notes

  • The evaluation rubric is defined in the GPT prompt and can be edited there to customize the grading criteria.
