A production-ready AI helpdesk assistant with RAG (Retrieval-Augmented Generation), streaming responses, and source citations.
Built with Next.js 14, TypeScript, and Tailwind CSS. Features a clean chat interface, intelligent document retrieval, and support for multiple LLM providers.
- **RAG Pipeline**: BM25-inspired retrieval with relevance scoring
- **Streaming Chat**: Real-time token streaming for responsive UX
- **Source Citations**: Every answer includes clickable source references
- **Guardrails**: Refuses to answer out-of-scope questions
- **Modern UI**: Responsive chat interface built with Tailwind CSS
- **Admin Upload**: Web interface to upload new documentation files
- **Evaluation API**: Built-in testing for retrieval quality
- **Multi-Provider**: Support for OpenAI, Anthropic, Google Gemini, or a mock LLM
- **Auto-Indexing**: Automatic index rebuild on file upload
- **Security**: Input sanitization, XSS prevention, no secret logging
```shell
# Install dependencies
npm install

# Set up environment variables
cp .env.local.example .env.local

# Start development server
npm run dev
```

The app will be available at http://localhost:3000.
Create a `.env.local` file with the following:

```shell
# LLM Provider (mock, openai, anthropic, or gemini)
LLM_PROVIDER=mock

# OpenAI Configuration (if using the openai provider)
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4o-mini

# Anthropic Configuration (if using the anthropic provider)
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ANTHROPIC_MODEL=claude-3-haiku-20240307

# Google Gemini Configuration (if using the gemini provider)
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_MODEL=gemini-1.5-flash
```

Note: The mock provider works without any API keys and is perfect for testing.
- Navigate to http://localhost:3000
- Type your question in the input box
- Press Enter or click Send
- Watch the response stream in real-time
- Click on source citations to see which documents were used
Example Questions:
- "What are the pricing tiers?"
- "How do I get an API key?"
- "Can I get a refund after 20 days?"
- "What's included in the Pro plan?"
Upload new documentation files:
- Navigate to http://localhost:3000/admin
- Select one or more markdown (.md) files
- Click "Upload Files"
- The index will automatically rebuild
Test retrieval quality:

```shell
curl http://localhost:3000/api/eval
```

This runs a suite of test questions and reports:
- Which sources were retrieved for each question
- Relevance scores
- Pass/fail status
- Overall pass rate
```
helpdesk-ai/
├── app/
│   ├── api/
│   │   ├── chat/route.ts          # Streaming chat endpoint
│   │   ├── admin/upload/route.ts  # File upload endpoint
│   │   └── eval/route.ts          # Evaluation endpoint
│   ├── admin/page.tsx             # Admin upload interface
│   ├── layout.tsx                 # Root layout
│   ├── page.tsx                   # Main chat page
│   └── globals.css                # Global styles
├── components/
│   └── Chat.tsx                   # Chat UI component
├── lib/
│   ├── retriever.ts               # RAG retrieval logic
│   └── llm.ts                     # LLM provider abstraction
├── data/
│   ├── pricing.md                 # Knowledge base files
│   ├── refunds.md
│   └── getting-started.md
└── README.md
```
- Indexing: Markdown files are split into paragraphs and indexed on startup
- Retrieval: The user query is scored against all snippets using a BM25-inspired algorithm
- Context Building: Top-k relevant snippets are assembled into context
- Generation: LLM generates answer using only the provided context
- Citation: Sources are returned alongside the response
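The steps above can be sketched end to end. `Snippet`, `buildContext`, and the naive term-overlap `score` below are illustrative stand-ins, not the actual exports of `lib/retriever.ts`:

```typescript
// Illustrative sketch of the retrieve → context → cite steps above.
// Names are hypothetical; the real retriever uses BM25-inspired scoring.
type Snippet = { source: string; text: string };

// Naive term-overlap score, standing in for the real scoring function.
function score(query: string, text: string): number {
  const body = text.toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term && body.includes(term)).length;
}

// Assemble the top-k matching snippets into a context string plus sources.
function buildContext(
  query: string,
  index: Snippet[],
  topK = 3
): { context: string; sources: string[] } {
  const ranked = index
    .map((s) => ({ ...s, score: score(query, s.text) }))
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
  return {
    context: ranked.map((s) => `[${s.source}]\n${s.text}`).join("\n\n"),
    sources: ranked.map((s) => s.source),
  };
}
```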
The retriever uses an enhanced BM25-inspired scoring function with precision filters:
- Term Frequency: How often query terms appear in the snippet
- Document Length Normalization: Adjusts for snippet length
- Phrase Matching Boost: Extra weight for exact phrase matches
- Relevance Threshold (2.0): Filters out weak matches below minimum score
- Score Gap Filter (90%): Only includes results within 90% of top score to prevent irrelevant citations
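The two precision filters can be expressed as a post-scoring stage. The 2.0 threshold and 90% gap come from the list above; the function name is hypothetical:

```typescript
// Post-scoring filter stage implementing the two precision filters
// described above; filterResults is an illustrative name.
type Scored = { source: string; score: number };

function filterResults(
  results: Scored[],
  minScore = 2.0, // relevance threshold
  gapRatio = 0.9  // keep only results within 90% of the top score
): Scored[] {
  const strong = [...results]
    .sort((a, b) => b.score - a.score)
    .filter((r) => r.score >= minScore);
  if (strong.length === 0) return [];
  const cutoff = strong[0].score * gapRatio;
  return strong.filter((r) => r.score >= cutoff);
}
```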
The system uses dual-mode prompting with strict guardrails:
When Context Exists:
- Answer ONLY from provided documentation
- Cite sources explicitly
- Never infer or make up information
- Explain if context doesn't fully answer the question
When No Context Found:
- Explicitly refuse to answer from general knowledge
- List available documentation topics (Pricing, Getting Started, Refunds)
- Ask what the user would like to know
- Never hallucinate or guess
This ensures both mock and real LLMs (OpenAI, Anthropic, etc.) refuse out-of-scope questions.
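One way to express this dual-mode prompting is a single prompt builder. The wording below is an illustrative paraphrase of the rules above, not the exact prompt shipped in `lib/llm.ts`:

```typescript
// Sketch of the dual-mode system prompt; buildSystemPrompt is a
// hypothetical name and the shipped prompt text may differ.
function buildSystemPrompt(context: string | null): string {
  if (context && context.trim().length > 0) {
    // Context mode: answer strictly from documentation, with citations.
    return [
      "Answer ONLY from the documentation below.",
      "Cite sources explicitly and never infer missing information.",
      "If the context does not fully answer the question, say so.",
      "",
      "--- DOCUMENTATION ---",
      context,
    ].join("\n");
  }
  // Refusal mode: no context was retrieved for this question.
  return [
    "No relevant documentation was found.",
    "Do NOT answer from general knowledge.",
    "List the available topics (Pricing, Getting Started, Refunds)",
    "and ask what the user would like to know.",
  ].join("\n");
}
```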
For this small knowledge base (3 docs, ~500 words each):
- BM25 Advantages: No API calls, instant indexing, deterministic, explainable scores
- Embeddings Trade-off: Would add API dependency and latency for minimal quality gain
- Scalability: For 100+ documents, embeddings would be recommended
Why streaming responses:

- Better UX: Users see responses immediately, not after 5-10 seconds
- Perceived Performance: The app feels faster even if total time is similar
- Engagement: Users stay engaged while waiting
Why server-side LLM calls:

- Security: API keys are never exposed to the client
- Control: Full control over retrieval and prompt construction
- Flexibility: Easy to swap providers or add caching
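A minimal sketch of such a server-side streaming route, in the App Router style of `app/api/chat/route.ts`. The hard-coded tokens are a placeholder, not the project's actual `lib/llm.ts` interface:

```typescript
// Minimal server-side streaming route sketch; a real provider would
// yield tokens as they arrive, and API keys never leave the server.
export async function POST(req: Request): Promise<Response> {
  const { messages } = await req.json(); // validated & retrieved against in the real route
  void messages;
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    start(controller) {
      // Placeholder token source standing in for the LLM stream.
      for (const token of ["Hello", ", ", "world"]) {
        controller.enqueue(encoder.encode(token));
      }
      controller.close();
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```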
Test the sample prompts from the requirements:

```
# In-scope questions (should cite sources)
"What are the pricing tiers and what's included?"
"How do I get an API key to start?"
"Can I get a refund after 20 days?"

# Out-of-scope question (should refuse)
"Do you ship hardware devices?"
```

Run the evaluation suite:

```shell
curl http://localhost:3000/api/eval | jq
# Expected output: 5-6 out of 6 tests passing
```

To add unit tests for the retriever:

```shell
npm install --save-dev jest @types/jest ts-jest
```

Create `lib/retriever.test.ts`:
```typescript
import { retrieve, buildIndexFromDataDir } from './retriever';

describe('Retriever', () => {
  const index = buildIndexFromDataDir();

  test('retrieves pricing for pricing query', () => {
    const results = retrieve('pricing tiers', index);
    expect(results[0].source).toBe('pricing.md');
  });

  test('returns empty for irrelevant query', () => {
    const results = retrieve('xyz123abc', index);
    expect(results.length).toBe(0);
  });
});
```

Security measures implemented:

- Input sanitization (XSS prevention)
- Content-Type headers to prevent MIME sniffing
- API keys kept server-side only
- No logging of sensitive data
- File upload validation (markdown only)
- Request size limits (500-character input cap)
- Prompt injection prevention via context isolation
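A possible shape for the sanitization and size-limit items above (illustrative; the project's real implementation may differ):

```typescript
// Illustrative input sanitizer covering the 500-character limit and
// basic XSS hygiene listed above; not the project's exact code.
function sanitizeInput(raw: string, maxLen = 500): string {
  return raw
    .slice(0, maxLen)                 // enforce the request size limit
    .replace(/<[^>]*>/g, "")          // strip HTML tags to blunt XSS
    .replace(/[\u0000-\u001f]/g, " ") // drop control characters
    .trim();
}
```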
For production deployment, add:
- Rate limiting (e.g., 10 requests/minute per IP)
- Authentication for admin panel
- CORS configuration
- Request logging and monitoring
- Error tracking (Sentry, etc.)
- Index Build: ~5ms for 3 documents
- Retrieval: ~2ms per query
- First Token: ~200ms (mock), ~800ms (OpenAI)
- Full Response: 2-5 seconds, depending on length
- Cache frequently asked questions
- Pre-compute embeddings for hybrid search
- Add Redis for distributed caching
- Implement request deduplication
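The first item could start as a simple in-memory TTL cache (a sketch; names are hypothetical and not part of the codebase):

```typescript
// Hypothetical in-memory TTL cache for frequently asked questions.
const answerCache = new Map<string, { answer: string; expires: number }>();

function getCachedAnswer(
  question: string,
  compute: () => string, // runs retrieval + LLM on a cache miss
  ttlMs = 60_000
): string {
  const key = question.trim().toLowerCase(); // normalize near-duplicate questions
  const hit = answerCache.get(key);
  if (hit && hit.expires > Date.now()) return hit.answer; // cache hit
  const answer = compute();
  answerCache.set(key, { answer, expires: Date.now() + ttlMs });
  return answer;
}
```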
```shell
# Install Vercel CLI
npm i -g vercel

# Deploy
vercel

# Set environment variables in the Vercel dashboard
```

```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]
```

Remember to set environment variables in your deployment platform:
- `LLM_PROVIDER`
- `OPENAI_API_KEY` (if using OpenAI)
- `ANTHROPIC_API_KEY` (if using Anthropic)
- `GEMINI_API_KEY` (if using Gemini)
Edit `lib/llm.ts` and add a new provider case:

```typescript
if (provider === 'your-provider') {
  // Implement streaming logic
}
```

Simply drop `.md` files into the `/data` directory and restart the server, or use the admin upload interface.
Edit `lib/retriever.ts` to adjust:
- Scoring algorithm parameters (k1, b)
- Number of results (topK)
- Relevance threshold
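For reference, those knobs might be grouped like this (illustrative names and defaults, not the actual contents of `lib/retriever.ts`; `k1` and `b` defaults are the conventional BM25 values):

```typescript
// Illustrative tuning knobs; names and defaults are assumptions,
// except the 2.0 threshold and 90% gap documented above.
const retrieverConfig = {
  k1: 1.2,                 // BM25 term-frequency saturation
  b: 0.75,                 // document-length normalization strength
  topK: 3,                 // maximum snippets returned per query
  relevanceThreshold: 2.0, // minimum score for inclusion
  scoreGapRatio: 0.9,      // keep results within 90% of the top score
};
```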
Stream chat responses with RAG.
Request:
```json
{
  "messages": [
    { "role": "user", "content": "What are the pricing tiers?" }
  ]
}
```

Response: A text stream with citations appended.
Upload markdown files to knowledge base.
Request: `multipart/form-data` with a `files` field

Response:

```json
{
  "success": true,
  "uploaded": 2,
  "message": "Successfully uploaded 2 file(s)"
}
```

Run the evaluation suite.
Response:

```json
{
  "summary": {
    "total": 6,
    "passed": 5,
    "passRate": "83.3%"
  },
  "results": [...]
}
```

Rebuild the search index.
Response:

```json
{
  "success": true,
  "snippets": 15
}
```