A video creation tool that generates videos from character descriptions and a topic using Hermes LLM, Nano Banana, and VEO 3.1, with an interactive Telegram approval workflow.
- Intelligent Plot Generation: Uses Hermes LLM to create logical, engaging plots and scene breakdowns
- High-Quality Visuals: Generates images with Nano Banana (Imagen 3) and videos with VEO 3.1
- Interactive Approval: Telegram bot workflow for approving/rejecting generated content
- Iterative Refinement: Analyzes feedback and regenerates only problematic scenes
- Professional Prompting: Research-based prompt engineering for optimal results
- Automated Stitching: Seamlessly combines scenes into final video
┌─────────────────┐
│   User Input    │
│     via CLI     │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────────────┐
│          Workflow Orchestrator          │
├─────────────────────────────────────────┤
│  1. Plot Generation (Hermes LLM)        │
│  2. Image Generation (Nano Banana)      │
│  3. Telegram Approval Workflow          │
│  4. Video Generation (VEO 3.1)          │
│  5. Video Stitching (FFmpeg)            │
│  6. Iterative Feedback Loop             │
└─────────────────────────────────────────┘
- Python 3.9+
- FFmpeg installed on your system
- Google Cloud account with Vertex AI enabled
- Hermes API key
- Nano Banana (AI Studio) API key
- Telegram Bot Token and Chat ID
- Clone the repository:
git clone <repository-url>
cd video_maker
- Install dependencies:
pip install -r requirements.txt
- Install FFmpeg (if not already installed):
# Ubuntu/Debian
sudo apt-get install ffmpeg
# macOS
brew install ffmpeg
# Windows
# Download from https://ffmpeg.org/download.html
- Set up Google Cloud credentials:
# Download service account JSON from Google Cloud Console
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
- Configure environment variables:
cp .env.example .env
# Edit .env with your API keys and configuration
Edit the .env file with your credentials:
# API Keys
HERMES_API_KEY=your_hermes_key
NANO_BANANA_API_KEY=your_nano_banana_key
VEO_PROJECT_ID=your_google_cloud_project_id
VEO_LOCATION=us-central1
TELEGRAM_BOT_TOKEN=your_telegram_bot_token
TELEGRAM_CHAT_ID=your_telegram_chat_id
# Google Cloud credentials (optional)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# Video Settings
DEFAULT_VIDEO_LENGTH=30
MAX_RETRIES=3
FRAME_GENERATION_MODE=first
- Create a bot with @BotFather
- Get your bot token
- Start a chat with your bot
- Get your chat ID from @userinfobot
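The .env settings and Telegram credentials above could be read and checked at startup roughly as follows. This is a minimal sketch, not the project's actual src/config.py: `Settings`, `missing_keys`, and `load_settings` are hypothetical names, and the fallback defaults simply mirror the example values in the .env listing.

```python
import os
from dataclasses import dataclass

# Required variable names, taken from the .env listing above.
REQUIRED_KEYS = (
    "HERMES_API_KEY",
    "NANO_BANANA_API_KEY",
    "VEO_PROJECT_ID",
    "TELEGRAM_BOT_TOKEN",
    "TELEGRAM_CHAT_ID",
)


@dataclass(frozen=True)
class Settings:  # hypothetical container, not the project's own class
    veo_location: str
    default_video_length: int
    max_retries: int
    frame_generation_mode: str


def missing_keys(env):
    """Return the required keys that are absent or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def load_settings(env=os.environ):
    """Check required keys, then parse optional settings (assumed defaults)."""
    missing = missing_keys(env)
    if missing:
        raise ValueError("Missing configuration: " + ", ".join(missing))
    return Settings(
        veo_location=env.get("VEO_LOCATION", "us-central1"),
        default_video_length=int(env.get("DEFAULT_VIDEO_LENGTH", "30")),
        max_retries=int(env.get("MAX_RETRIES", "3")),
        frame_generation_mode=env.get("FRAME_GENERATION_MODE", "first"),
    )
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.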
python main.py \
--characters "a brave knight in shining armor" "a wise old wizard" \
--topic "an epic quest to find a magical artifact" \
--length 30 \
--style "cinematic fantasy"
- --characters, -c: Character descriptions (can specify multiple)
- --topic, -t: What the video is about (required)
- --length, -l: Video length in seconds (default: 30)
- --style, -s: Video style (optional)
- --validate-config: Validate configuration and exit
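For readers who want to see how these options map onto a parser, here is a minimal `argparse` sketch. It is an assumption about how main.py might be wired, not a copy of it; `build_parser` and `parse_args` are hypothetical names, and the rule that `--topic` may be omitted with `--validate-config` is inferred from the "validate and exit" description.

```python
import argparse


def build_parser():
    """Build a parser with the options documented above."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Generate a video from character descriptions and a topic.",
    )
    parser.add_argument("--characters", "-c", nargs="+", default=[],
                        help="Character descriptions (one or more)")
    parser.add_argument("--topic", "-t", help="What the video is about")
    parser.add_argument("--length", "-l", type=int, default=30,
                        help="Video length in seconds")
    parser.add_argument("--style", "-s", default=None, help="Video style")
    parser.add_argument("--validate-config", action="store_true",
                        help="Validate configuration and exit")
    return parser


def parse_args(argv=None):
    """Parse argv; require --topic unless only validating the configuration."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.validate_config and not args.topic:
        parser.error("--topic is required unless --validate-config is given")
    return args
```

Because `--characters` uses `nargs="+"`, each quoted description becomes one list element, which matches the multi-character examples.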
Fantasy Adventure:
python main.py \
-c "a brave warrior princess" "a mischievous dragon" \
-t "becoming unlikely friends during a festival" \
-l 45 \
-s "animated, whimsical"
Documentary Style:
python main.py \
-c "a marine biologist" \
-t "discovering a new species in the deep ocean" \
-l 60 \
-s "documentary, National Geographic style"
Sci-Fi:
python main.py \
-c "a space explorer" "an alien ambassador" \
-t "first contact with an alien civilization" \
-l 40 \
-s "cinematic sci-fi, dramatic lighting"
The system uses Hermes LLM to create:
- Engaging plot summary
- Detailed scene breakdown
- Character actions and cinematography
- Audio descriptions
For each scene:
- Generates optimized image prompts using Hermes
- Creates images with Nano Banana (Imagen 3)
- Sends images to Telegram for approval
- Regenerates rejected images until all are approved
Video generation:
- Converts approved images to videos using VEO 3.1
- Uses research-based prompting for best quality
- Includes audio generation
- Maintains character and scene consistency
Video stitching:
- Combines all scene videos
- Normalizes resolution and frame rate
- Creates seamless final video
Iterative feedback loop:
- Sends final video via Telegram
- Analyzes user feedback with Hermes
- Regenerates only problematic scenes
- Re-stitches and repeats until approved
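The stitching step (combining scene clips while normalizing resolution and frame rate) could be expressed as a single FFmpeg invocation. The sketch below only builds the command line; it is not taken from src/video/stitcher.py. `build_stitch_command` is a hypothetical helper, 1280x720 at 24 fps are placeholder values, and it assumes every clip carries an audio stream, which the concat filter requires when `a=1`.

```python
def build_stitch_command(clips, output, width=1280, height=720, fps=24):
    """Return an ffmpeg argv list that normalizes and concatenates clips.

    Assumption: every input has exactly one video and one audio stream.
    Note that scale=W:H stretches the frame; it does not preserve aspect ratio.
    """
    if not clips:
        raise ValueError("at least one clip is required")

    cmd = ["ffmpeg", "-y"]
    for clip in clips:
        cmd += ["-i", str(clip)]

    chains = []
    concat_inputs = ""
    for i in range(len(clips)):
        # Bring each video stream to the same size, frame rate, and pixel aspect.
        chains.append(f"[{i}:v]scale={width}:{height},fps={fps},setsar=1[v{i}]")
        concat_inputs += f"[v{i}][{i}:a]"
    # Join the normalized video streams and the original audio streams.
    chains.append(f"{concat_inputs}concat=n={len(clips)}:v=1:a=1[v][a]")

    cmd += ["-filter_complex", ";".join(chains),
            "-map", "[v]", "-map", "[a]", str(output)]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; keeping command construction separate from execution makes it testable without FFmpeg installed.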
video_maker/
├── main.py                        # Entry point
├── requirements.txt               # Python dependencies
├── .env.example                   # Example configuration
├── README.md                      # This file
├── src/
│   ├── __init__.py
│   ├── config.py                  # Configuration management
│   ├── workflow.py                # Main orchestrator
│   ├── api/
│   │   ├── hermes_client.py       # Hermes LLM integration
│   │   ├── nano_banana_client.py  # Image generation
│   │   └── veo_client.py          # Video generation
│   ├── telegram/
│   │   └── bot.py                 # Telegram approval bot
│   └── video/
│       └── stitcher.py            # Video processing
├── output/                        # Generated content
│   ├── images/
│   ├── videos/
│   └── final/
└── temp/                          # Temporary files
The system uses research-based VEO 3.1 prompting:
- Detailed cinematography specifications
- Precise lighting descriptions (e.g., "soft wrap", "hard rim")
- Camera movements (dolly, tracking, crane shots)
- Character consistency across scenes
- Audio and dialogue integration
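One way to keep these elements explicit is to assemble each scene prompt from named fields. The sketch below is purely illustrative: `ScenePrompt` and its template wording are hypothetical and are not the project's prompts or any official VEO guidance. Reusing the same `subject` string across scenes is one simple way to approximate character consistency.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:  # hypothetical structure, for illustration only
    subject: str
    action: str
    camera: str = "slow dolly-in"
    lighting: str = "soft wrap key light with a hard rim"
    style: str = "cinematic"
    audio: str = ""

    def render(self):
        """Join the fields into a single prompt string."""
        parts = [
            f"{self.style} shot of {self.subject}, {self.action}.",
            f"Camera: {self.camera}.",
            f"Lighting: {self.lighting}.",
        ]
        if self.audio:
            parts.append(f"Audio: {self.audio}.")
        return " ".join(parts)


knight = "a brave knight in shining armor"
prompt = ScenePrompt(subject=knight, action="drawing a sword",
                     camera="low-angle tracking shot",
                     audio="wind howling").render()
```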
Smart scene regeneration:
- Keeps approved scenes unchanged
- Only regenerates problematic parts
- Maintains plot coherence
- Preserves good elements
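The "keep what was approved, redo only what was rejected" behaviour can be sketched as a small loop. This is an assumption about the control flow, not the code in src/workflow.py: `regenerate_until_approved` is a hypothetical name, and `generate` and `review` stand in for the model call and the Telegram approval step.

```python
def regenerate_until_approved(scenes, generate, review, max_rounds=3):
    """Regenerate only rejected scenes until all pass or rounds run out.

    scenes:   ordered list of scene ids
    generate: callable(scene_id, attempt) -> asset
    review:   callable(assets_dict) -> set of rejected scene ids
    Returns (assets, unresolved_scene_ids).
    """
    assets = {}
    pending = list(scenes)
    for attempt in range(1, max_rounds + 1):
        # Approved scenes are never touched again; only pending ones are rebuilt.
        for scene_id in pending:
            assets[scene_id] = generate(scene_id, attempt)
        rejected = review(assets)
        pending = [scene_id for scene_id in scenes if scene_id in rejected]
        if not pending:
            break
    return assets, pending
```

Returning the unresolved ids lets the caller decide what to do when `max_rounds` (compare `MAX_RETRIES` in the configuration) is exhausted.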
- Automatic retry with exponential backoff
- Multiple generation attempts for failed scenes
- Graceful degradation
- Detailed error logging
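Retry with exponential backoff, as listed above, generally looks like the following. This is a generic sketch rather than the project's implementation: `retry_with_backoff` and its parameters are hypothetical, and the small random jitter is an addition that the README does not mention.

```python
import random
import time


def retry_with_backoff(func, max_retries=3, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt and try again.

    Re-raises the last exception once max_retries retries have been used.
    The sleep argument is injectable so the logic can be tested instantly.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

In real use it would be better to catch only the exception types that indicate a transient failure (timeouts, rate-limit responses) instead of `Exception`.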
FFmpeg not found:
# Install ffmpeg for your system
sudo apt-get install ffmpeg # Ubuntu/Debian
brew install ffmpeg # macOS
API Authentication errors:
- Verify all API keys in .env
- Check Google Cloud credentials path
- Ensure VEO 3.1 is enabled in your GCP project
Telegram bot not responding:
- Verify bot token is correct
- Check that you've started a chat with the bot
- Confirm chat ID is accurate
Video generation fails:
- VEO 3.1 has an 8-second maximum per clip, so longer videos must be split into multiple scenes
- Check internet connection
- Verify GCP quotas and billing
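Given the 8-second per-clip limit, a requested video length has to be divided across several clips. The arithmetic can be sketched as below; `plan_clip_durations` is a hypothetical helper, it assumes whole-second lengths, and whether the API accepts every duration it produces is not something this README states.

```python
import math

MAX_CLIP_SECONDS = 8  # per-clip limit mentioned in the troubleshooting list


def plan_clip_durations(total_seconds, max_clip=MAX_CLIP_SECONDS):
    """Split total_seconds into the fewest clips, each at most max_clip long."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    count = math.ceil(total_seconds / max_clip)
    base, extra = divmod(total_seconds, count)
    # The first `extra` clips get one additional second so the sum is exact.
    return [base + 1] * extra + [base] * (count - extra)


print(plan_clip_durations(30))  # [8, 8, 7, 7]
```

For the default 30-second video this yields four clips, which also stays within the 5-8 second range suggested for faster generation.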
Test your configuration:
python main.py --validate-config
Be aware of rate limits:
- VEO 3.1: Check Google Cloud quotas
- Nano Banana: AI Studio limits
- Hermes: API-specific limits
The system includes automatic retry logic and delays between requests.
- Parallel Processing: Images are generated in batches
- Caching: Approved content is saved and reused
- Smart Regeneration: Only problematic scenes are redone
- Scene Length: Shorter scenes (5-8s) generate faster
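Batched parallel generation, as mentioned in the first tip, might look like this with the standard library. The sketch is an assumption about the approach, not the project's code: `generate_in_batches` is a hypothetical name, `generate` stands in for the image API call, and a batch size of 4 is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_in_batches(prompts, generate, batch_size=4):
    """Run generate(prompt) for every prompt, batch_size at a time.

    Threads suit this workload because the calls are network-bound.
    Results come back in the same order as the prompts.
    """
    results = []
    for start in range(0, len(prompts), batch_size):
        batch = prompts[start:start + batch_size]
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            # Executor.map preserves input order even if calls finish out of order.
            results.extend(pool.map(generate, batch))
    return results
```

Finishing one batch before starting the next caps the number of concurrent requests, which helps stay under the rate limits noted above.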
Contributions welcome! Areas for improvement:
- Additional video generation models
- More transition effects
- Advanced scene planning
- Voice-over generation
- Music integration
MIT License - see LICENSE file for details
- Hermes LLM by Nous Research for intelligent prompting
- Nano Banana (Imagen 3) by Google for image generation
- VEO 3.1 by Google for video generation
- FFmpeg for video processing
- Research on VEO 3.1 prompting best practices (2025)
For issues and questions:
- Check the troubleshooting section
- Verify configuration with --validate-config
- Review logs in output directories
- Open an issue on GitHub
Happy video making! 🎬