thomasmol/whisper-diarization

⚡️ Blazing fast audio transcription with speaker diarization | Whisper Large V3 Turbo | word & sentence level timestamps | prompt

Input
Configure the inputs for the AI model.

Either provide: a Base64-encoded audio file,

Or provide: a direct audio file URL,

Or an audio file.

Vocabulary: provide names, acronyms, and loanwords in a list. Use punctuation for best accuracy.

Language of the spoken words as a language code such as 'en'. Leave empty to auto-detect the language.

Translate the speech into English.

Number of speakers (1 to 50). Leave empty to autodetect.
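
The inputs described so far can be collected into a single payload. The following is a minimal sketch, not the model's confirmed schema: the helper `build_input` is hypothetical, the key names (`file_url`, `file_string`, `prompt`, `language`, `translate`, `num_speakers`) are assumptions chosen to mirror the field descriptions, and the 1 to 50 range is read from the slider bounds shown on the page.

```python
import base64
from typing import Optional


def build_input(
    file_url: Optional[str] = None,
    file_bytes: Optional[bytes] = None,
    prompt: Optional[str] = None,
    language: Optional[str] = None,
    translate: bool = False,
    num_speakers: Optional[int] = None,
) -> dict:
    """Assemble an input payload. All key names are assumptions."""
    # The page lists the audio sources as alternatives, so require exactly one.
    if (file_url is None) == (file_bytes is None):
        raise ValueError("provide exactly one of file_url or file_bytes")

    payload: dict = {"translate": translate}
    if file_url is not None:
        payload["file_url"] = file_url
    else:
        # Base64-encode raw audio bytes into an ASCII string.
        payload["file_string"] = base64.b64encode(file_bytes).decode("ascii")

    # Optional fields are omitted when empty so the model can auto-detect.
    if prompt:
        payload["prompt"] = prompt
    if language:
        payload["language"] = language
    if num_speakers is not None:
        # Assumed bounds, taken from the 1 and 50 shown next to the field.
        if not 1 <= num_speakers <= 50:
            raise ValueError("num_speakers must be between 1 and 50")
        payload["num_speakers"] = num_speakers
    return payload
```

A payload built this way could then be passed to a client call such as `replicate.run(model_ref, input=payload)` from the `replicate` Python package, where `model_ref` is the model name, possibly with a version suffix. Check the model's published schema before relying on any of the key names.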

Group segments from the same speaker that are less than 2 seconds apart.
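
To illustrate what this grouping option describes, here is a minimal sketch of merging consecutive segments from the same speaker when the gap between them is under 2 seconds. It is not the model's actual implementation: the function `group_segments`, the segment fields (`speaker`, `start`, `end`, `text`), and the assumption that segments arrive sorted by start time are all hypothetical.

```python
from typing import Dict, List


def group_segments(segments: List[Dict], max_gap: float = 2.0) -> List[Dict]:
    """Merge consecutive same-speaker segments separated by less than max_gap seconds.

    Assumes segments are sorted by start time (an assumption of this sketch).
    """
    grouped: List[Dict] = []
    for seg in segments:
        last = grouped[-1] if grouped else None
        if (
            last is not None
            and last["speaker"] == seg["speaker"]
            and seg["start"] - last["end"] < max_gap
        ):
            # Extend the previous segment instead of starting a new one.
            last["end"] = seg["end"]
            last["text"] = f'{last["text"]} {seg["text"]}'.strip()
        else:
            # Copy so the caller's input list is not mutated.
            grouped.append(dict(seg))
    return grouped
```

Only consecutive segments are merged, so a short interjection from another speaker splits a group even when the original speaker resumes within 2 seconds.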
