🇩🇪 German Tutor 🇩🇪

German Tutor started as an AI-powered German language learning assistant that helps users improve their German vocabulary, sentence structure, and grammar.

Now it is a multi-lingual language learning assistant that can also be used as a general assistant. It uses speech recognition, large language models (LLMs), text-to-speech (TTS), and RAG (retrieval-augmented generation) to provide corrections, explanations, and up-to-date answers.

Examples

1. Speaking German

2. Asking a question in German

3. Asking a question in English

4. Session termination (with end phrase)

Latest Model: `German Tutor V3.1`

German Tutor V3.1 is rebuilt around a LangGraph ReAct pipeline with full session memory.

Memory example

V3.1 updates:

Text mode: the assistant can now be used entirely from the terminal, no microphone, no wake word required. Toggle between text and audio mode with toggle_text_mode in config.yaml.
LangGraph ReAct pipeline: the LLM now runs as a proper ReAct agent, it reasons, decides whether to call a tool, receives the result, and loops until it's ready to respond.
ReAct pipeline: separated into a react_agent node (LLM reasoning) and a retriever_agent node (tool execution), connected via LangGraph's conditional edges.
Session memory: conversation history is persisted across turns using LangGraph's MemorySaver checkpointer, the model remembers everything said earlier in the session.
TTS interruption: TTS now runs in a background thread and can be interrupted mid-speech by pressing the enter key (in both text and audio modes).

Previous: `German Tutor V3.0`

German Tutor V3 introduced multi-language support and general assistant capabilities.

V3.0 updates:

RAG integration for up-to-date answers using live web search.
Modular and organized codebase for easier maintenance and customization.
All options, including language settings, can be modified in the config.yaml file.

V3.0 major improvements:

Faster and more accurate STT: now using faster-whisper with configurable model sizes (replacing sound_recognition).
Real-time TTS: mpv + edge-tts for faster synthesis without temporary files (previous method still available if needed).
LLM upgrade: openai/gpt-oss-120b from Groq (default and recommended), offering more free daily API calls. Users can choose any other Groq LLM by changing the model in the config.yaml file.
Improved TUI for a smoother user experience.

New RAG Feature

German Tutor V3 now supports two RAG modes (retrieval-augmented generation):

Online RAG (tavily_rag.py): live web search via Tavily AI, good for current events, up-to-date grammar references, and anything not in your local books.
Offline RAG (offline_rag.py): searches a local vector database built from your own books/documents, works without internet and is faster for static reference material.

The ReAct agent decides which tool to use (or neither) based on the question.

Here's a visual comparison of RAG vs no RAG:

1. Without RAG

2. With online RAG

3. With offline RAG

Features & Complete Architecture

NOTE: Anything with an asterisk* can be customized in the .yaml file.

┌─────────────────────────────────────────────────────────────┐
│                       USER INPUT                            │
│       (German, any other language, or any question)         │
└──────────────┬──────────────────────────┬───────────────────┘
               │                          │
    toggle_text_mode: False    toggle_text_mode: True
               │                          │
               ↓                          ↓
┌──────────────────────────┐  ┌───────────────────────────────┐
│   AUDIO MODE             │  │   TEXT MODE                   │
│ - Wake word*: "Jarvis"   │  │ - Type directly in terminal   │
│ - Record until silence   │  │ - Press Enter to send         │
│ - Whisper STT            │  │ - Press Enter to stop TTS     │
└──────────────┬───────────┘  └───────────────┬───────────────┘
               │                              │
               ↓                              │
┌──────────────────────────┐                  │
│  SPEECH-TO-TEXT          │                  │
│  (Faster-Whisper)        │                  │
│  - Model*: tiny → large  │                  │
│  - Language*: auto/manual│                  │
│  - Output: USER TEXT     │                  │
└──────────────┬───────────┘                  │
               │                              │
               └──────────────┬───────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────────┐
│              LANGGRAPH ReAct PIPELINE (with session memory)         │
│                                                                     │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  react_agent node (LLM)                                     │   │
│   │   - Receives full conversation history (MemorySaver)        │   │
│   │   - Reasons about the input                                 │   │
│   │   - Decides: answer directly OR call a tool                 │   │
│   └───────────────────┬─────────────────┬───────────────────────┘   │
│            tool call? │                 │ no → final answer         │
│                       ↓                 ↓                           │
│   ┌───────────────────────────┐    ┌─────────────────────────────┐  │
│   │  retriever_agent node     │    │  END → response to user     │  │
│   │  Tool options:            │    └─────────────────────────────┘  │
│   │  - Tavily web search      │                                     │
│   │  - Offline book search    │                                     │
│   └──────────┬────────────────┘                                     │
│              │ tool result loops back to react_agent                │
│              └──────────────────────────────────────────────────────┘
└─────────────────────────────┬───────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                TEXT-TO-SPEECH (Edge-TTS + mpv)              │
│   - Runs in background thread (non-blocking)                │
│   - Interruptible mid-speech                                │
└─────────────────────────────┬───────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                AUDIO PLAYBACK → Loop or Exit                │
│            (using end phrases like: close, bye)             │
└─────────────────────────────────────────────────────────────┘

File Structure

German-Tutor/
│
├── german_tutor_V3.py            # main entry point
│
├── MODEL_3/                       
│   ├── graph.py                  # LangGraph pipeline (ReAct loop + memory)
│   ├── config.yaml
│   │
│   ├── audio/              
│   │   ├── wake_word.py        
│   │   ├── audio_io.py  
│   │   ├── stt.py  
│   │   ├── tts.py           
│   │   └── end_phrase.py      
│   │
│   ├── LLM/              
│   │   ├── react_agent.py        # ReAct agent node + AgentState
│   │   ├── response_formatter.py         
│   │   └── prompt_templates.py 
│   │
│   ├── RAG/                       
│   │   ├── tavily_rag.py         # live web search tool
│   │   └── offline_rag.py        # local book search tool
│   │
│   └── experiments/ 
│
├── README.md                 
│
└── Archived Models/             # contains versions 1 and 2

Getting Started

Absolute requirements:

faster-whisper
edge-tts
groq
langchain-groq
langgraph
rich
tavily
chromadb

Only required for audio mode:

pvporcupine
pyaudio

For the best performance, install:

mpv (if not possible, then ffmpeg, but it will be slower)

You will also need access keys for:

groq → GROQ_API_KEY
pvporcupine → PORCUPINE_ACCESS_KEY
tavily -> TAVILY_API_KEY

Add them to a .env file.

License

MIT License See LICENSE for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🇩🇪 German Tutor 🇩🇪

Examples

1. Speaking German

2. Asking a question in German

3. Asking a question in English

4. Session termination (with end phrase)

Latest Model: `German Tutor V3.1`

Memory example

Previous: `German Tutor V3.0`

New RAG Feature

1. Without RAG

2. With online RAG

3. With offline RAG

Features & Complete Architecture

File Structure

Getting Started

Absolute requirements:

Only required for audio mode:

For the best performance, install:

You will also need access keys for:

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 65 Commits
Archived Models		Archived Models
MODEL_3		MODEL_3
imgs		imgs
LICENSE		LICENSE
README.md		README.md
german_tutor_V3.py		german_tutor_V3.py

Folders and files

Latest commit

History

Repository files navigation

🇩🇪 German Tutor 🇩🇪

Examples

1. Speaking German

2. Asking a question in German

3. Asking a question in English

4. Session termination (with end phrase)

Latest Model: German Tutor V3.1

Memory example

Previous: German Tutor V3.0

New RAG Feature

1. Without RAG

2. With online RAG

3. With offline RAG

Features & Complete Architecture

File Structure

Getting Started

Absolute requirements:

Only required for audio mode:

For the best performance, install:

You will also need access keys for:

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Latest Model: `German Tutor V3.1`

Previous: `German Tutor V3.0`

Packages