This is an automatic speech recognition (ASR) model for Sidama (Sidaamu Afoo), an Afroasiatic language native to Ethiopia. It is fine-tuned from Wav2Vec2-BERT 2.0 using the Ethio speech corpus.
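from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC
import torch
import torchaudio

# Load the fine-tuned processor and CTC model from the Hub
processor = Wav2Vec2BertProcessor.from_pretrained("badrex/w2v-bert-2.0-sidama-asr")
model = Wav2Vec2BertForCTC.from_pretrained("badrex/w2v-bert-2.0-sidama-asr")

# Load the audio, downmix to mono, and resample to the rate the feature extractor expects
audio, sr = torchaudio.load("audio.wav")
target_sr = processor.feature_extractor.sampling_rate
audio = torchaudio.functional.resample(audio.mean(dim=0), sr, target_sr)
inputs = processor(audio.numpy(), sampling_rate=target_sr, return_tensors="pt")

# Greedy decoding: pick the most likely token per frame, then let the tokenizer collapse them
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)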
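The argmax followed by batch_decode in the snippet above amounts to greedy CTC decoding: consecutive repeated tokens are merged and blank tokens are dropped. The following is a minimal sketch of that collapse step in plain Python, not the tokenizer's actual implementation. The function name `ctc_greedy_collapse`, the toy vocabulary, and the blank id of 0 are all assumptions made for illustration; the real model's vocabulary and blank id differ.

```python
def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Merge consecutive repeated ids, then drop the blank id.

    blank_id=0 is an assumption for this sketch, not the model's real blank id.
    """
    collapsed = []
    previous = None
    for token_id in frame_ids:
        # Keep a token only when it differs from the previous frame
        # and is not the blank symbol.
        if token_id != previous and token_id != blank_id:
            collapsed.append(token_id)
        previous = token_id
    return collapsed


# Hypothetical toy vocabulary, for illustration only.
vocab = {0: "<blank>", 1: "a", 2: "f", 3: "o"}

# Per-frame argmax ids; the blank between the last two 3s keeps both "o"s.
frames = [1, 1, 0, 2, 2, 0, 3, 3, 0, 3]
ids = ctc_greedy_collapse(frames)
text = "".join(vocab[i] for i in ids)
print(text)
```

A blank between two identical tokens is what allows a doubled letter to survive the merge; without it the two frames would be collapsed into one.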
Performance varies with accents, dialects, and recording quality.
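To see how much performance varies on your own recordings, one common approach is to compute the word error rate (WER) between a reference transcript and the model output. The sketch below is a plain-Python word-level edit distance; the function name `word_error_rate` and the placeholder strings are assumptions for illustration, and nothing here reproduces an evaluation reported for this model.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Placeholder tokens, not real Sidama text.
wer = word_error_rate("a b c d", "a x c")
print(wer)
```

Comparing WER across subsets of your data (for example, by speaker or by recording device) gives a rough picture of where the model is weakest.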
@misc{w2v_bert_ethiopian_asr,
author = {Badr M. Abdullah},
title = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
year = {2025},
url = {https://huggingface.co/badrex/w2v-bert-2.0-sidama-asr}
}
Base model
facebook/w2v-bert-2.0