
Agentic RAG Restaurant Chatbot

An agentic AI chatbot that answers restaurant menu questions (with answers tailored to allergens) using both structured (SQL) and unstructured (vector) data.

This project demonstrates:

  • Agentic reasoning and tool orchestration
  • SQL + Vector DB integration

Running the application:

Run the following commands from the root folder after cloning the repository:

$ cd src
$ export GOOGLE_API_KEY=<shared over email>
$ docker compose up --build -d

Docker Compose may take some time to finish on the first run. It starts the system's components, described in the Architecture section, as Docker containers.
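For reference, the compose topology implied by the container names below can be sketched roughly like this (service names, images, and build paths here are assumptions for illustration, not the actual src/docker-compose.yml):

```yaml
# Hypothetical sketch of the six services started by docker compose.
services:
  restaurant-bot:          # ADK agent + dev UI, exposed on port 9090
    build: ./restaurant_agent
    ports: ["9090:9090"]
    depends_on: [mcp-mysql-tool, mcp-weaviate-tool]
  mcp-mysql-tool:          # MCP service wrapping MySQL
    build: ./mcp_tools/mysql_tool
    depends_on: [db]
  mcp-weaviate-tool:       # MCP service wrapping Weaviate
    build: ./mcp_tools/weaviate_tool
    depends_on: [weaviate]
  db:                      # MySQL with the menu/allergen tables
    image: mysql:8
  weaviate:                # vector database
    image: semitechnologies/weaviate
  inference:               # embedding model server (all-MiniLM-L6-v2)
    image: semitechnologies/transformers-inference:sentence-transformers-all-MiniLM-L6-v2
```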

$ docker ps --format "{{.Names}}"

src-restaurant-bot-1
src-mcp-weaviate-tool-1
src-mcp-mysql-tool-1
src-weaviate-1
src-db-1
src-inference-1

The application runs on port 9090 and can be accessed via the Google ADK UI at: http://localhost:9090/dev-ui/?app=restaurant_agent

Select restaurant_agent from the dropdown if it is not already selected.

To stop the containers:

$ docker compose down -v

Sample Queries

Queries that should trigger SQL

  • Which dishes contain gluten?
  • What vegetarian dishes are priced below ₹200?

Queries that should trigger Weaviate

  • How is Paneer Tikka prepared?
  • Which dishes may have cross-contamination risks?

Queries that should trigger both

  • Is Paneer Tikka safe for someone with a nut allergy?
  • Which dishes are unsafe for someone with a dairy allergy?

High-Level Architecture

A high-level architecture diagram showing the execution flow is included in the repository (Architecture diagram).

The system consists of the following components:

Components

Restaurant Agent (Google ADK)

  • Understands user intent
  • Chooses correct tool(s)
  • Synthesizes final answer

See src/restaurant_agent/agent.py for agent code.
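As a toy illustration of the routing decision (the real agent delegates tool selection to the LLM; search_weaviate is a hypothetical tool name, and the keyword lists are made up):

```python
def pick_tools(query: str) -> set[str]:
    """Toy sketch of tool routing: structured filters go to SQL, free-text
    preparation/safety questions go to vector search, and allergy-safety
    questions need both. The real agent lets the LLM make this decision."""
    q = query.lower()
    tools = set()
    # Structured attributes (price, ingredients, allergen tables) -> SQL.
    if any(w in q for w in ("price", "priced", "contain", "vegetarian", "allergy")):
        tools.add("query_mysql")
    # Preparation notes and cross-contamination warnings -> vector search.
    if any(w in q for w in ("prepared", "preparation", "cross-contamination", "safe")):
        tools.add("search_weaviate")
    return tools or {"query_mysql"}
```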

SQL MCP Service (Structured Data)

Database: MySQL
Purpose: Canonical truth for menu data and allergens

Tables

Menu_Items(item_id, name, price, is_veg, spice_level, ingredients)
Allergens(allergen_id, name)
Menu_Allergens(item_id, allergen_id, notes)
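To make the schema concrete, here is a minimal sketch using SQLite as a stand-in for MySQL (the sample rows are invented), including the join that answers "Which dishes contain gluten?":

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL schema; column lists follow the
# tables above, and the sample rows are made up for illustration.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Menu_Items(item_id INTEGER PRIMARY KEY, name TEXT, price REAL,
                        is_veg INTEGER, spice_level TEXT, ingredients TEXT);
CREATE TABLE Allergens(allergen_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Menu_Allergens(item_id INTEGER, allergen_id INTEGER, notes TEXT);
INSERT INTO Menu_Items VALUES (1, 'Naan', 60, 1, 'mild', 'flour, yogurt');
INSERT INTO Allergens VALUES (1, 'gluten');
INSERT INTO Menu_Allergens VALUES (1, 1, 'wheat flour base');
""")

# "Which dishes contain gluten?" as a three-way join.
rows = con.execute("""
    SELECT m.name FROM Menu_Items m
    JOIN Menu_Allergens ma ON ma.item_id = m.item_id
    JOIN Allergens a ON a.allergen_id = ma.allergen_id
    WHERE a.name = 'gluten'
""").fetchall()
print(rows)  # [('Naan',)]
```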

Key Features

  1. get_schema: exposed to the agent for fetching the database schema
  2. query_mysql: runs a query and returns the results
  3. Validates queries and allows only read-only statements

See src/mcp_tools/mysql_tool/mcp-service-sql.py for code.
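A minimal sketch of what such a read-only guard might look like (the actual checks in mcp-service-sql.py may differ):

```python
import re

# Allow only statements that begin with a read verb...
READ_ONLY = re.compile(r"^\s*(SELECT|SHOW|DESCRIBE|EXPLAIN)\b", re.IGNORECASE)
# ...and reject anything containing a write/DDL keyword anywhere.
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|GRANT)\b",
    re.IGNORECASE,
)

def is_read_only(sql: str) -> bool:
    """Conservative read-only check: both conditions must hold, so a
    stacked statement like 'SELECT 1; DROP TABLE ...' is also rejected."""
    return bool(READ_ONLY.match(sql)) and not FORBIDDEN.search(sql)
```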

Weaviate MCP Service (Unstructured Data)

Database: Weaviate
Purpose: Contains preparation notes and some metadata about dishes
Schema

class: MenuItemNotes
properties:
- item_id (int): link to the SQL primary key
- name (text)
- notes (text): chef notes, cross-contamination warnings
- spice_level (text)
- is_veg (boolean)
- price (number)

Key Features

  1. Chef preparation notes can be extracted from txt or PDF files and loaded into Weaviate for search
  2. Uses hybrid search with alpha=0.5
  3. Embedding model: text2vec-transformers with "sentence-transformers/all-MiniLM-L6-v2"
  4. Used for checking cross-contamination

See src/mcp_tools/weaviate_tool/mcp-service-weaviate.py for code.
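In hybrid search, alpha blends the vector-similarity score with the keyword (BM25) score; alpha=0.5 weights them equally. A toy illustration of the idea (Weaviate normalizes and fuses scores internally, so this is conceptual only):

```python
def hybrid_score(vector_score: float, keyword_score: float,
                 alpha: float = 0.5) -> float:
    """Conceptual hybrid ranking: alpha=1 is pure vector search,
    alpha=0 is pure keyword (BM25), and 0.5 weights both equally."""
    return alpha * vector_score + (1 - alpha) * keyword_score
```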

Evaluations

Here are some evaluations I tried. I am also preparing a presentation and can explain more in the demo.

  1. Ragas Library for aggregate statistics

I used ragas library to generate aggregate statistics for my agentic RAG system.

The script eval/helper_scripts/generate_ragas_dataset.py was used to generate the dataset for Ragas in eval/eval_scripts/ragas_dataset.json. Two sample metric runs using two different models can be found in eval/metrics/ragas-metrics.txt.

  2. Custom Script for Tool Usage Evaluation
  • For tool evaluation, I wrote a custom script, eval/helper_scripts/gen_tool_dataset.py, that records the tools used in a sample run. It calculates tool precision and recall, as I didn't find ADK's tool-trajectory metrics useful for my use case. Sample results are in eval/metrics/tool-metrics.txt.
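The idea behind set-based tool precision and recall can be sketched as follows (this is illustrative, not the actual script):

```python
def tool_precision_recall(expected: set[str], actual: set[str]) -> tuple[float, float]:
    """Precision = fraction of called tools that were expected;
    recall = fraction of expected tools that were actually called.
    Empty sets are treated as perfect scores for that side."""
    tp = len(expected & actual)
    precision = tp / len(actual) if actual else 1.0
    recall = tp / len(expected) if expected else 1.0
    return precision, recall
```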
  3. Using ADK eval
  • Go to the root directory and execute the following commands to run adk eval on the golden dataset (the Docker containers should be running, since the eval uses the MySQL and Weaviate containers):
$ cd src
$ pip install -r requirements.txt
$ adk eval restaurant_agent restaurant_agent/golden_data_set.evalset.json --config_file_path=restaurant_agent/test_config.json --print_detailed_results

Sample results are in eval/adk/adk_results.txt. Note that I faced issues running adk eval on a larger dataset of 12 queries (src/restaurant_agent/golden_data_set.evalset.json); for now, the directory contains results for a dataset of 2 queries.

Possible Improvements & Future Enhancements

  1. Finer-grained Tool Evaluation: The system can be extended to evaluate tool arguments in addition to tool selection (e.g., validating the structure and semantics of generated SQL or vector-search parameters). While Google ADK provides built-in tool evaluation, tool_trajectory_avg_score uses strict exact-match scoring, which assigns a score of 0 even for minor, semantically equivalent mismatches. This made it less effective for evaluating practical agent behavior, motivating more flexible, custom evaluation metrics alongside ADK's rubric_based_tool_use_quality_v1.
  2. Multi-Agent Architecture: The current design uses a single LLM agent to handle planning, decision-making, tool execution, and response synthesis. A potential improvement is to adopt a multi-agent architecture, where:
  • a planning agent decomposes user intent,

  • an execution agent handles tool selection and calls, and

  • a synthesis agent aggregates results and generates the final response. This would also allow the use of different LLMs (lightweight vs. heavyweight) for different responsibilities, improving efficiency, cost control, and scalability.

  3. Prompt Optimization: The system prompt was developed through iterative experimentation to enforce correct tool usage, reduce hallucinations, and prevent internal leakage. Once agent behavior is sufficiently stable, the prompt can be simplified or shortened through further experimentation, improving runtime efficiency while preserving reliability.
  4. Conversational Context Awareness: The current system processes each user query independently. As a future enhancement, maintaining and passing relevant chat history (conversation context) to the LLM could improve accuracy and user experience.
