Unofficial No Such Thing As A Fish episode transcripts.
Run npm install
Run npm start
Open http://localhost:5173/?deployed=true to load all assets from remote hosts. (Try this first.)
OR
Open http://localhost:5173/ to use local assets.
TODO: Add instructions for creating database with migrations.sql
Run python -m venv venv
Run source venv/bin/activate
Run pip install -r requirements.txt
Run npm run convert 146
Run npm run convert
Warning: This will take a long time
NOTE: The first time this script is run, it needs to download the Whisper model, which requires local_files_only to be temporarily set to False. After this, the option can be changed back to True.
In whisper.py, change model_size to your preferred model. See available models.
NOTE: By default this uses the large-v2 Whisper model. On an M1 Mac with 64GB of RAM this transcribes at about 1x speed, meaning an hour-long episode takes about an hour to transcribe.
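So, as of 8 February 2025, transcribing every episode at 1x speed would take about 18 days of continuous processing:
select sum(duration) from episodes
-- 1555237
1,555,237 seconds
÷ 60 (seconds per minute)
÷ 60 (minutes per hour)
÷ 24 (hours per day)
-----------------------
≈ 18.0 days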
The good news is that changing to the medium.en, small.en, or tiny.en model increases this speed dramatically, though accuracy goes down. For example, small.en transcribes at about 3x speed.
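To make the estimate concrete, here is a minimal Python sketch of the same arithmetic. The total duration and the 1x and 3x speed figures come from this README; the function name transcription_days is made up for illustration and is not part of the project.

```python
def transcription_days(total_audio_seconds: float, speed_factor: float) -> float:
    """Wall-clock days needed to transcribe audio at `speed_factor` times real time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    seconds_per_day = 60 * 60 * 24
    return total_audio_seconds / speed_factor / seconds_per_day


# sum(duration) from the episodes table, as of 8 February 2025
TOTAL_SECONDS = 1_555_237

# Approximate speeds quoted above for an M1 Mac with 64GB of RAM
for model, speed in [("large-v2", 1.0), ("small.en", 3.0)]:
    days = transcription_days(TOTAL_SECONDS, speed)
    print(f"{model}: about {days:.1f} days")
```

On other hardware the speed factors will differ, so treat the results as rough planning numbers only.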
The other good news is that the convert script is idempotent: you can kill it (Ctrl + C) and restart it at any time, and it will pick back up after the last fully transcribed episode. You can safely run it over and over without creating any duplicates.
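The resume behaviour described above can be illustrated with a small Python sketch. This is not the project's actual implementation, which may track progress differently (for example in its database). It assumes one output file per episode, and the names convert_all and transcribe are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_all(episode_ids, out_dir: Path, transcribe) -> list:
    """Transcribe each episode once, skipping any that already have output."""
    converted = []
    for episode_id in episode_ids:
        final_path = out_dir / f"{episode_id}.txt"
        if final_path.exists():
            continue  # fully transcribed in an earlier run
        partial_path = out_dir / f"{episode_id}.txt.partial"
        partial_path.write_text(transcribe(episode_id))
        # Rename only after the whole transcript is written, so an
        # interrupted run never leaves a half-finished final file.
        partial_path.replace(final_path)
        converted.append(episode_id)
    return converted


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    fake_transcribe = lambda episode_id: f"transcript for episode {episode_id}"
    first_run = convert_all([1, 2, 3], out, fake_transcribe)
    second_run = convert_all([1, 2, 3], out, fake_transcribe)
    print(first_run, second_run)  # [1, 2, 3] []
```

The second run does no work because every episode already has a finished output file, which is the property that makes repeated runs safe.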
NOTE: This script also downloads the audio file and album art for every episode. As of 8 February 2025 this amounts to 568 episodes, ~24.2GB of audio, and ~190MB of images.
Run npm run split
Needs rclone and jq installed.
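A quick way to confirm both tools are available before running the scripts is to look them up on PATH. The sketch below uses Python's standard shutil.which; the helper name missing_tools is made up for illustration and is not part of this repository.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(["rclone", "jq"])
if missing:
    print("Missing required tools: " + ", ".join(missing))
else:
    print("rclone and jq are both installed")
```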
Run npm run sync