This tool makes Microsoft Teams meeting transcripts easier to read and analyse in tools such as NVivo, QualCoder, or (slightly unconventionally) Obsidian via the obsidian-chat-view plugin.
It processes .vtt transcripts downloaded from Microsoft Teams/Stream, merges adjacent blocks from the same speaker, and outputs a clean, formatted text file.
Speaker names can optionally be renamed and assigned prefixes, and the output format is customisable via a template.
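The merging step described above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual implementation: the `(speaker, speech, timestamp)` tuple layout and the `merge_adjacent` helper are hypothetical, and keeping the timestamp of the first block in each run is an assumption.

```python
from itertools import groupby

# Hypothetical cue records: (speaker, speech, start_timestamp)
cues = [
    ("John Smith", "Hello,", "00:00:10"),
    ("John Smith", "I am the interviewer.", "00:00:11"),
    ("Jane Doe", "Nice.", "00:00:13"),
]


def merge_adjacent(cues):
    """Merge consecutive cues spoken by the same speaker."""
    merged = []
    # groupby only groups *adjacent* items with equal keys, which is
    # exactly the behaviour wanted here.
    for speaker, group in groupby(cues, key=lambda cue: cue[0]):
        group = list(group)
        speech = " ".join(text for _, text, _ in group)
        # Assumption: a merged block keeps the timestamp of its first cue.
        merged.append((speaker, speech, group[0][2]))
    return merged


for block in merge_adjacent(cues):
    print(block)
```

Because only adjacent blocks are merged, a speaker who talks again later in the meeting still gets a separate entry.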
No installation required — run it once-off with uvx:
uvx teams-transcript-formatter transcript.vtt
Install from PyPI:
pip install teams-transcript-formatter
# or
uv tool install teams-transcript-formatter
After installation, teams-transcript-formatter will be available on your PATH:
teams-transcript-formatter transcript.vtt
If you want to make changes to the source code, you can clone the repository and install it in editable mode:
git clone https://github.com/jmarshrossney/teams-transcript-formatter
cd teams-transcript-formatter
uv sync
The teams-transcript-formatter script takes one or more .vtt files and prints the formatted output to stdout. To save the output to .txt files instead (with the naming convention <original_stem>_formatted.txt), use the -o flag to specify an output directory.
# Basic: keep original speaker names, default formatting
teams-transcript-formatter transcript.vtt
# Rename speakers (e.g. for an interview)
teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=< " \
transcript.vtt
# Custom output format
teams-transcript-formatter \
--rename "John Smith=JS" --rename "Jane Doe=JD" \
--template "{speaker}: {speech} [{timestamp}]" \
transcript.vtt
Run teams-transcript-formatter -h for full guidance, including shell completion.
| Flag | Description |
|---|---|
| --rename | Map original speaker names to display names: "OriginalName=DisplayName". Repeat for each speaker. |
| --prefix | Assign a prefix to each display name: "DisplayName=>". Repeat for each speaker. |
| --template | Python format string for output. Placeholders: {prefix}, {speaker}, {speech}, {timestamp}. |
| -o, --output | Directory to save .txt files. If not given, prints to stdout. |
| --force | Overwrite existing output files instead of refusing. |
| -q, --quiet | Suppress all non-error output. |
| --version | Show the version and exit. |
| -h, --help | Show the help message and exit. |
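Since --template is a Python format string, the way renaming, prefixes, and the template combine can be sketched with `str.format`. The dictionaries and the `render` helper below are hypothetical stand-ins for the parsed flag values, not the tool's internals; the sketch assumes prefixes are looked up by display name after renaming, and that a speaker without a prefix gets an empty string.

```python
# Hypothetical parsed values of --rename, --prefix and --template
renames = {"John Smith": "Interviewer", "Jane Doe": "Student"}
prefixes = {"Interviewer": "> ", "Student": "< "}
template = "{prefix}{speaker}: {speech} [{timestamp}]"


def render(original_name, speech, timestamp):
    """Format one merged block using the rename, prefix and template settings."""
    # Unmapped speakers keep their original name.
    speaker = renames.get(original_name, original_name)
    # Prefixes are keyed on the display name; default to no prefix.
    prefix = prefixes.get(speaker, "")
    return template.format(
        prefix=prefix, speaker=speaker, speech=speech, timestamp=timestamp
    )


print(render("John Smith", "Hello, I am the interviewer.", "00:00:10"))
# > Interviewer: Hello, I am the interviewer. [00:00:10]
```

A template may omit any placeholder: `str.format` ignores unused keyword arguments, so `"[{timestamp}] {speaker}: {speech}"` works without `{prefix}`.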
Say we have a Teams transcript file named transcript.vtt:
$ head -11 transcript.vtt
WEBVTT
91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-0
00:00:10.087 --> 00:00:13.130
<v John Smith>Hello, I am the interviewer.</v>
91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-1
00:00:13.130 --> 00:00:16.270
<v Jane Doe>Nice. I am the student being interviewed,
and I have many things to say.</v>
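For readers curious about the file structure, the sketch below pulls the speaker, speech, and start time out of cues like those above using a regular expression. It is a rough illustration under stated assumptions, not the tool's parser: the `CUE` pattern and `parse_cues` helper are hypothetical, every cue is assumed to contain exactly one `<v Name>...</v>` voice tag, and the start time is truncated to whole seconds to match the output shown in the examples.

```python
import re

sample = """WEBVTT

91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-0
00:00:10.087 --> 00:00:13.130
<v John Smith>Hello, I am the interviewer.</v>

91b3f3c3-44c6-4a8b-8c0a-add105d816bd/32-1
00:00:13.130 --> 00:00:16.270
<v Jane Doe>Nice. I am the student being interviewed,
and I have many things to say.</v>
"""

CUE = re.compile(
    # Timing line: capture the start time without milliseconds.
    r"(\d{2}:\d{2}:\d{2})\.\d{3} --> \S+\s*\n"
    # Voice tag: capture the speaker name and the (possibly multi-line) speech.
    r"<v ([^>]+)>(.*?)</v>",
    re.DOTALL,
)


def parse_cues(text):
    """Return a list of (speaker, speech, start_time) tuples."""
    cues = []
    for start, speaker, speech in CUE.findall(text):
        # Collapse line breaks inside a cue into single spaces.
        cues.append((speaker, " ".join(speech.split()), start))
    return cues


for cue in parse_cues(sample):
    print(cue)
```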
No flags — original speaker names, default template, print to stdout.
$ teams-transcript-formatter transcript.vtt
John Smith | Hello, I am the interviewer. | 00:00:10
Jane Doe | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
Map original names to display names with --rename.
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
Interviewer | Hello, I am the interviewer. | 00:00:10
Student | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
Combine --rename with --prefix to visually distinguish speakers. Prefixes are keyed on the display name (after renaming).
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=< " \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
> Interviewer | Hello, I am the interviewer. | 00:00:10
< Student | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
Control the output format with --template. Available placeholders: {prefix}, {speaker}, {speech}, {timestamp}.
$ teams-transcript-formatter \
--rename "John Smith=JS" --rename "Jane Doe=JD" \
--template "[{timestamp}] {speaker}: {speech}" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
[00:00:10] JS: Hello, I am the interviewer.
[00:00:13] JD: Nice. I am the student being interviewed, and I have many things to say.
All three flags together: rename, prefix, and template.
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=< " \
--template "{prefix}{speaker}: {speech} [{timestamp}]" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
> Interviewer: Hello, I am the interviewer. [00:00:10]
< Student: Nice. I am the student being interviewed, and I have many things to say. [00:00:13]
Pass an empty value to --prefix to suppress the prefix for a given speaker.
$ teams-transcript-formatter \
--rename "John Smith=Interviewer" --rename "Jane Doe=Student" \
--prefix "Interviewer=> " --prefix "Student=" \
-o . transcript.vtt
$ head -3 transcript_formatted.txt
> Interviewer | Hello, I am the interviewer. | 00:00:10
Student | Nice. I am the student being interviewed, and I have many things to say. | 00:00:13
Speaker names can be replaced using the --rename flag. All other redactions of sensitive and identifiable information must be performed before running this script.
Tip: the auto-generated transcripts can be edited in-situ using the Microsoft Stream app.
Remember to delete the original transcripts after running this script!
There are some fairly simple additions that would make this more generally useful:
- Handle meetings with >2 participants
- User can configure how names are handled
- Configure the output format, e.g. using a template
- Handle Zoom meetings
- Output to different file formats (realistically, .docx would probably be the most useful to folks)
Suggestions for improvements are welcome. Contributions even more so! Just open an issue or pull request.