A fine-tuned language model for generating game dialogue in the style of Horizon Dawn. The demo walks through the dialogue generation process with different character and scene inputs.
This project uses a fine-tuned GPT-2 small model (124M parameters) to generate game dialogue for different scenes and characters. It includes:
- Data processing for game dialogues
- Model fine-tuning on a dataset of game dialogues
- A FastAPI web API for serving the model
- A simple web interface for generating dialogue
```
HorizonDawn-Dialogue-Generator/
├── api/
│   ├── main.py                    # FastAPI application
│   └── dialogue_routes.py         # API routes for dialogue generation
├── data/
│   ├── raw/                       # Raw JSON dialogue files
│   ├── processed/                 # Processed CSV files
│   └── process_data.py            # Data processing script
├── models/
│   ├── train.py                   # Training script
│   ├── dialogue_generator_small/  # Smaller model checkpoint
│   └── dialogue_generator_full/   # Full model checkpoint
├── web/
│   ├── static/                    # CSS, JS files
│   └── templates/                 # HTML templates
├── requirements.txt               # Project dependencies
└── run.py                         # Script to run the application
```
- Python 3.8+
- PyTorch
- Transformers
- FastAPI
- Uvicorn
1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/HorizonDawn-Dialogue-Generator.git
   cd HorizonDawn-Dialogue-Generator
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   ```

3. Install the required packages:

   ```bash
   pip install -r requirements.txt
   ```
1. Place your JSON dialogue files in the `data/raw/` directory.
2. Run the processing script:

   ```bash
   cd data
   python process_data.py
   ```

This will create structured CSV files in `data/processed/`.
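As a rough illustration of what `process_data.py` might do, the sketch below flattens raw JSON dialogue files into one CSV. The record schema (`scene`, `character`, `line` keys) is an assumption for illustration; the real script handles multiple JSON formats.

```python
import csv
import json
from pathlib import Path


def process_raw_dialogues(raw_dir="raw", out_path="processed/dialogues.csv"):
    """Flatten raw JSON dialogue files into a single CSV.

    Assumes each JSON file holds a list of records with 'scene',
    'character', and 'line' keys (hypothetical schema).
    Returns the number of rows written.
    """
    rows = []
    for json_file in sorted(Path(raw_dir).glob("*.json")):
        records = json.loads(json_file.read_text())
        for rec in records:
            rows.append({
                "scene": rec.get("scene", ""),
                "character": rec.get("character", ""),
                "line": rec.get("line", ""),
            })

    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["scene", "character", "line"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A CSV with one row per dialogue line keeps the training script simple: each row can be turned into a single training example.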
- Train the full model:

  ```bash
  cd models
  python train.py
  ```
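Since the model is fine-tuned with causal language modeling, each processed row has to be rendered as plain text first. A small sketch of how training examples might be formatted (the exact format used by `train.py` is an assumption, but it mirrors the inference prompt "Generate dialogue for scene '...':"):

```python
def build_training_example(scene, turns):
    """Format one scene's dialogue as a single text string for
    causal-LM fine-tuning.

    `turns` is a list of (character, line) pairs. The prompt line
    matches the format used at inference time, so the model learns
    to continue it with speaker-attributed dialogue.
    """
    prompt = f"Generate dialogue for scene '{scene}':"
    body = "\n".join(f"{character}: {line}" for character, line in turns)
    return f"{prompt}\n{body}"
```

These strings would then be tokenized and passed to a standard causal-LM training loop (e.g. Hugging Face `Trainer` with GPT-2).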
1. Start the FastAPI server:

   ```bash
   python run.py
   ```

   Or run it directly with uvicorn:

   ```bash
   uvicorn api.main:app --reload
   ```

2. Access the web interface by opening your browser and navigating to:

   ```
   http://localhost:8000
   ```

3. Access the API documentation at:

   ```
   http://localhost:8000/docs
   ```
You can make POST requests to the API endpoint:

```bash
curl -X POST "http://localhost:8000/api/generate_dialogue" \
  -H "Content-Type: application/json" \
  -d '{"scene":"Forest Encounter", "character":"Aloy", "length":200}'
```

You can also query the model directly with a testing script (example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate_dialogue(scene_name, model_path="models/dialogue_generator_full"):
    # Load model and tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)

    # Create prompt
    prompt = f"Generate dialogue for scene '{scene_name}':"

    # Generate text
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=200,
        temperature=0.8,
        do_sample=True,
        top_p=0.92,
        no_repeat_ngram_size=2,
    )

    # Decode and return
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return result


# Example usage
dialogue = generate_dialogue("Forest Encounter")
print(dialogue)
```
**Small model (`dialogue_generator_small`)**

- Base: GPT-2 (124M parameters)
- Training: 2 epochs on 20 examples
- Use case: Quick testing and development

**Full model (`dialogue_generator_full`)**

- Base: GPT-2 (124M parameters)
- Training: 5 epochs on 100 examples
- Use case: Production-ready dialogue generation
- Input: "Generate dialogue for scene 'Forest Encounter':"
- Output: [Model-generated dialogue based on the scene prompt]
- The model is trained using causal language modeling
- Data processing handles multiple JSON formats for flexibility
- Compatible with Apple Silicon's MPS acceleration
- Handles dialogue formatting with proper speaker attribution
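To make use of MPS acceleration on Apple Silicon alongside CUDA and CPU fallbacks, device selection might look like the following sketch (a common PyTorch pattern, not necessarily the exact code in this project):

```python
import torch


def pick_device():
    """Choose the best available device: CUDA, Apple Silicon MPS, or CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # The MPS backend exists on Apple Silicon with recent PyTorch builds;
    # guard the attribute access so this also runs on older versions.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")
```

Both the model and its input tensors must be moved to the selected device (e.g. `model.to(device)` and `inputs.to(device)`) before calling `generate`.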
- Improve the web interface for easier dialogue generation
- Add support for larger models (GPT-2 Medium/Large)
- Expand training dataset with more diverse dialogue examples
This project is licensed under the MIT License. See the LICENSE file for more details.
