Other existing approaches frequently use smaller, more closely matched audio-to-text training datasets,(^reference-1) (^reference-2)(^reference-3) or use broad but unsupervised audio pre-training.(^reference-4)(^reference-5)(^reference-6) Because Whisper was trained on a large and diverse dataset and was not fine-tuned to any specific one, it does not beat models that specialize in LibriSpeech performance, a famously competitive benchmark in speech recognition. However, when we measure Whisper's performance across many diverse datasets, we find it is much more robust and makes 50% fewer errors than those models.
About a third of Whisper's audio dataset is non-English, and the model is alternately given the task of transcribing in the original language or translating to English. We find this approach is particularly effective at learning speech-to-text translation, and it outperforms the supervised state of the art on CoVoST2 English translation in the zero-shot setting.