Aggregation Package for Transcription and Diarization

whisper-asr-colab is an aggregation package for speech-to-text transcription and speaker diarization. It ships with an example notebook that runs on Google Colab.

The main functions of this package are as follows:

Speech-to-text (transcription), powered by faster-whisper
Diarization, powered by pyannote-audio
Online audio downloading, powered by yt-dlp
Writing diarization results in docx format, powered by python-docx
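
To make the combination of the first two functions more concrete, the sketch below shows one common way to attach speaker labels to transcribed segments: each segment gets the speaker whose diarization turn overlaps it the most. This is not taken from this package's source. `Segment`, `Turn`, and `assign_speakers` are hypothetical names used only for illustration, and the timestamps are made-up sample data.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed span of audio (times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarization turn: one speaker talking from start to end."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Pair each segment with the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg))
    return labeled


# Made-up sample data standing in for transcription and diarization output.
segments = [
    Segment(0.0, 2.5, "Hello, thanks for joining."),
    Segment(2.5, 5.0, "Happy to be here."),
    Segment(9.0, 10.0, "(background noise)"),
]
turns = [
    Turn(0.0, 2.6, "SPEAKER_00"),
    Turn(2.6, 5.2, "SPEAKER_01"),
]

labeled = assign_speakers(segments, turns)
for speaker, seg in labeled:
    print(f"[{seg.start:5.1f}-{seg.end:5.1f}] {speaker}: {seg.text}")
```

Segments that overlap no speaker turn keep the `UNKNOWN` label, so they are not silently attributed to the nearest speaker.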

Usage

Open whisper_asr_colab.ipynb on Google Colab or use the modules as shown below.

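from whisper_asr_colab.worker import Worker
from whisper_asr_colab.audio import Audio

# Input audio: a local file path or a URL
audio = "audiofile.m4a"
# Whisper model size
model_size = "turbo"
# Hugging Face access token (the pyannote diarization models are gated)
hf_token = "your hf token"

worker = Worker(
    audio=Audio.from_path_or_url(audio),
    model_size=model_size,
    hugging_face_token=hf_token,
)

# Start processing
worker.run()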
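
Rather than pasting the token into a script, it can be read from the environment. The helper below is a minimal sketch of that idea, not part of this package: `read_hf_token` is a hypothetical name, and using `HF_TOKEN` as the variable name is an assumption (it is the name the Hugging Face Hub client itself recognizes).

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the Hugging Face token stored in `env_var`, or raise if it is unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Hugging Face token."
        )
    return token
```

With this in place, `hf_token = read_hf_token()` can replace the hard-coded string before the `Worker` is constructed.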
