This is an Automatic Speech Recognition (ASR) model for Amharic, one of the official languages of Ethiopia. It is fine-tuned from Wav2Vec2-BERT 2.0 using the Ethio speech corpus.
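from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC
import torch
import torchaudio

# Load the fine-tuned processor and model from the Hugging Face Hub
processor = Wav2Vec2BertProcessor.from_pretrained("badrex/w2v-bert-2.0-amharic-asr")
model = Wav2Vec2BertForCTC.from_pretrained("badrex/w2v-bert-2.0-amharic-asr")

# Load the audio, mix it down to mono, and resample to the 16 kHz the model expects
audio, sr = torchaudio.load("audio.wav")
audio = audio.mean(dim=0)
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)

inputs = processor(audio.numpy(), sampling_rate=16000, return_tensors="pt")

# Greedy CTC decoding: pick the most likely token at each frame, then decode
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)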
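The decoding step in the snippet above (torch.argmax followed by batch_decode) is greedy CTC decoding. The sketch below illustrates roughly what that collapse does to a sequence of per-frame token IDs: repeated consecutive IDs are merged, then the blank token is dropped. It is an illustration only, not the library's implementation. The function name ctc_greedy_collapse, the example IDs, and the blank ID of 0 are all assumptions; in Transformers CTC models the blank is typically the tokenizer's padding token, whose actual ID depends on the vocabulary.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Collapse per-frame IDs the way greedy CTC decoding does.

    blank_id=0 is an assumption for illustration; check the real
    tokenizer's pad token ID before relying on it.
    """
    # groupby merges runs of identical consecutive IDs into one entry;
    # the filter then removes the blank token.
    return [tok for tok, _ in groupby(frame_ids) if tok != blank_id]


# Hypothetical frame-level argmax output (not real model output)
frames = [0, 7, 7, 0, 7, 3, 3, 0, 0, 5]
print(ctc_greedy_collapse(frames))
```

Note that the blank between the two runs of 7 keeps them as separate tokens, which is how CTC represents a genuinely repeated character.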
Performance varies with accents, dialects, and recording quality.
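One way to check how much performance varies on your own recordings is to compute word error rate (WER) against reference transcripts. The following is a minimal sketch, assuming whitespace tokenization and no text normalization; the function name word_error_rate and the placeholder strings are hypothetical and not part of this model's tooling. No WER figures for this model are implied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Placeholder strings standing in for a reference and a model transcription
print(word_error_rate("a b c d", "a x c"))
```

For Amharic, a character error rate computed the same way over characters instead of words may also be informative, since whitespace tokenization is a simplifying assumption here.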
@misc{w2v_bert_ethiopian_asr,
author = {Badr M. Abdullah},
title = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
year = {2025},
url = {https://huggingface.co/badrex/w2v-bert-2.0-amharic-asr}
}
Base model: facebook/w2v-bert-2.0