A modern web application for transcribing audio files using OpenAI's Whisper API
🎵 Convert your audio files to text with AI-powered transcription
🌐 Beautiful web interface accessible from any device
⚡ Fast, secure, and optimized for large files
This application provides a responsive web interface that replaces the original tkinter GUI.
- 🎤 Audio Transcription: Convert audio files to text using OpenAI's Whisper API
- 🌐 Web Interface: Modern, responsive web UI accessible from any browser
- 📁 Drag & Drop: Easy file upload with drag and drop support
- 📱 Mobile Friendly: Responsive design that works on all devices
- 💾 Download Transcripts: Save transcription results as text files
- 🔒 Secure: File validation and secure handling
- ⚡ Fast: Optimized for quick transcription processing
- ✂️ Audio Trimming: Option to trim audio to first 10 minutes to save API costs
- 🗜️ Smart Compression: Automatically compresses large files to meet Whisper's 25MB limit
- 📦 Intelligent Chunking: Splits very large files into manageable chunks for processing
Supported audio formats:
- MP3
- WAV
- M4A
- FLAC
- OGG
- WEBM
- MP4
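The format list above maps naturally onto an extension check. The following is a minimal sketch only: the names `ALLOWED_EXTENSIONS` and `allowed_file` are assumptions for illustration and may not match what `web_app.py` actually defines.

```python
import os

# Assumed to mirror the supported formats listed above; the real
# application may define this set differently.
ALLOWED_EXTENSIONS = {"mp3", "wav", "m4a", "flac", "ogg", "webm", "mp4"}


def allowed_file(filename: str) -> bool:
    """Return True if the filename ends in one of the supported extensions."""
    _, ext = os.path.splitext(filename)
    # splitext keeps the leading dot, so strip it before the lookup.
    return ext.lower().lstrip(".") in ALLOWED_EXTENSIONS
```

An extension check like this only inspects the name, not the audio data, so a corrupted or mislabeled file can still fail later during transcription.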
Requirements:
- Python 3.7 or higher
- OpenAI API key
Quick installation on Windows: double-click install_windows.bat and follow the prompts.
For detailed Windows instructions, see README_WINDOWS.md or WINDOWS_INSTALL.md.
- Clone or download the project files
- Install dependencies:
  pip install -r requirements.txt
- Set up your OpenAI API key:
  export OPENAI_API_KEY="your-api-key-here"
  Or create a .env file in the project directory:
  OPENAI_API_KEY=your-api-key-here
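To illustrate how the two options for supplying the key can coexist, here is a hedged, standard-library-only sketch that prefers the environment variable and falls back to a `.env` file. The helper names `read_env_file` and `get_api_key` are hypothetical; the application itself may rely on a package such as python-dotenv instead.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_file: str = ".env") -> Optional[str]:
    """Prefer the process environment, then fall back to the .env file."""
    return os.environ.get("OPENAI_API_KEY") or read_env_file(env_file).get("OPENAI_API_KEY")
```

If `get_api_key()` returns `None`, neither source provided a key.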
- Start the web server:
  python web_app.py
- Open your browser and navigate to:
  http://localhost:5000
- Upload an audio file:
  - Click the upload area to browse for files
  - Or drag and drop an audio file directly onto the upload area
  - File size limit: Up to 100MB (automatically optimized for Whisper's 25MB limit)
- Transcribe:
  - Click the "Transcribe Audio" button
  - Wait for the transcription to complete
  - View the results in the transcript area
- Audio Trimming (Optional):
  - Check "Trim to first 10 minutes" to save API costs
  - Useful for long audio files where you only need the beginning
  - Automatically converts trimmed audio to WAV format for optimal processing
- File Optimization (Automatic):
  - Compression: Large files are automatically compressed to MP3 format
  - Chunking: Very large files are split into 10-minute chunks
  - Quality Optimization: Audio is optimized for Whisper (16kHz, mono)
  - Smart Processing: Only applies optimization when needed
- Download:
  - Click "Download Transcript" to save the transcription as a text file
  - Use "Clear" to start over with a new file
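The trimming and chunking behavior described in the steps above can be sketched as plain time arithmetic. This is an illustration under assumptions, not the application's actual code: the function names are hypothetical, the 25MB limit is taken as 25 * 1024 * 1024 bytes, the audio duration is assumed to be known already (for example from an audio library), and no audio is decoded here.

```python
from typing import List, Tuple

WHISPER_LIMIT_BYTES = 25 * 1024 * 1024  # assumed byte value of the 25MB limit
CHUNK_SECONDS = 10 * 60                 # 10-minute chunks
TRIM_SECONDS = 10 * 60                  # "Trim to first 10 minutes"


def needs_optimization(size_bytes: int) -> bool:
    """Only files above the Whisper limit need compression or chunking."""
    return size_bytes > WHISPER_LIMIT_BYTES


def plan_segments(duration_seconds: float, trim: bool = False) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds for each piece to transcribe."""
    if duration_seconds <= 0:
        return []
    if trim:
        # Keep only the beginning of the recording.
        return [(0.0, min(duration_seconds, TRIM_SECONDS))]
    segments = []
    start = 0.0
    while start < duration_seconds:
        end = min(start + CHUNK_SECONDS, duration_seconds)
        segments.append((start, end))
        start = end
    return segments
```

For a 25-minute recording, `plan_segments(1500)` yields two full 10-minute segments followed by a 5-minute remainder; with `trim=True` it yields a single segment covering the first 10 minutes.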
Project structure:
transcriber/
├── web_app.py           # Main Flask application
├── requirements.txt     # Python dependencies
├── templates/
│   └── index.html       # Web interface template
├── uploads/             # Temporary file storage (created automatically)
└── README.md            # This file
The application can be configured by setting environment variables:
- OPENAI_API_KEY: Your OpenAI API key (required)
- SECRET_KEY: Flask secret key (optional, defaults to a development key)
- MAX_CONTENT_LENGTH: Maximum file upload size (default: 50MB)
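One way these variables could be read at startup is sketched below. The function `load_config`, the placeholder development key, and the assumption that `MAX_CONTENT_LENGTH` is given in bytes are all illustrative; the README does not show how `web_app.py` actually parses them.

```python
import os
from typing import Mapping

DEFAULT_MAX_CONTENT_LENGTH = 50 * 1024 * 1024  # 50MB default stated above
DEV_SECRET_KEY = "dev-secret-key"  # placeholder; the real default is not shown


def load_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Build a settings dict from environment variables, applying defaults."""
    api_key = environ.get("OPENAI_API_KEY")
    if not api_key:
        # The only required setting; fail early with a clear message.
        raise RuntimeError("OpenAI API key not configured")
    return {
        "OPENAI_API_KEY": api_key,
        "SECRET_KEY": environ.get("SECRET_KEY", DEV_SECRET_KEY),
        # Assumption: the value is expressed in bytes.
        "MAX_CONTENT_LENGTH": int(
            environ.get("MAX_CONTENT_LENGTH", DEFAULT_MAX_CONTENT_LENGTH)
        ),
    }
```

Passing the environment in as a mapping keeps the function easy to exercise with a plain dict instead of the real process environment.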
Security features:
- File type validation
- Secure filename handling
- Automatic cleanup of uploaded files
- Maximum file size limits
- Input sanitization
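As an illustration of automatic cleanup, the sketch below stores an upload under a server-generated temporary name and removes it when processing ends, even if an error occurs. The `temporary_upload` helper is hypothetical and uses only the standard library; the application's own cleanup logic may differ.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def temporary_upload(
    data: bytes, suffix: str = ".mp3", directory: Optional[str] = None
) -> Iterator[str]:
    """Write upload bytes to a temp file and delete it once the block exits."""
    # mkstemp picks the file name, so no client-supplied name reaches the disk.
    fd, path = tempfile.mkstemp(suffix=suffix, dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        # Runs on success and on error, so uploads never linger.
        if os.path.exists(path):
            os.remove(path)
```

A caller would wrap the transcription step in `with temporary_upload(file_bytes) as path:` and pass `path` to whatever does the processing.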
Troubleshooting:
- "OpenAI API key not configured"
  - Make sure you've set the OPENAI_API_KEY environment variable
  - Or create a .env file with your API key
- "Invalid file type"
  - Ensure you're uploading a supported audio format
  - Check that the file extension is correct
- "File too large"
  - The maximum file size is 100MB
  - Files are automatically compressed and chunked if needed
  - Consider using the trim option for very long files
- "Transcription fails"
  - Check your internet connection
  - Verify your OpenAI API key is valid and has sufficient credits
  - Ensure the audio file is not corrupted
  - Large files may take longer to process due to compression/chunking
If you encounter issues:
- Check the browser's developer console for error messages
- Verify your OpenAI API key is working
- Try with a smaller audio file first
To run in development mode:
python web_app.py
The application will run on http://localhost:5000 with debug mode enabled.
For production deployment:
- Set a proper SECRET_KEY
- Use a production WSGI server (e.g., Gunicorn)
- Configure proper logging
- Set up HTTPS
- Consider using a reverse proxy (e.g., Nginx)
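For the SECRET_KEY item above, a random value can be produced with Python's standard `secrets` module. This is a small sketch; the helper name `generate_secret_key` is an assumption, and 32 bytes is a common choice rather than a project requirement.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a random hex string suitable for use as a Flask SECRET_KEY."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print once, then store the value in the SECRET_KEY environment variable.
    print(generate_secret_key())
```

Generate the key once and keep it stable across restarts, since changing it invalidates existing signed sessions.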
Example with Gunicorn:
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 web_app:app

This project is open source and available under the MIT License.
Contributions are welcome! Please feel free to submit a Pull Request.
