Article·AI Engineering & Research·Jun 13, 2024

Breaking Brahmic: How OpenAI's Text Cleaning Hides Whisper's True Word Error Rate for Many South Asian Languages

OpenAI's open-source speech-to-text model Whisper reports low word error rates for many South Asian languages. But those numbers don't hold up in practice.

By Ross O'Connell

Data Scientist
