Automatic Speech Recognition for Sidama 🇪🇹

Model Description

This is an Automatic Speech Recognition (ASR) model for Sidama (Sidaamu Afoo), an Afroasiatic language native to Ethiopia. It is fine-tuned from Wav2Vec2-BERT 2.0 on the Ethio speech corpus.

  • Developed by: Badr al-Absi
  • Model type: Speech Recognition (ASR)
  • Languages: Sidama
  • License: CC-BY-4.0
  • Finetuned from: facebook/w2v-bert-2.0

🎧 Direct Use

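from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC
import torchaudio, torch

# Load the processor (feature extractor + tokenizer) and the CTC model
processor = Wav2Vec2BertProcessor.from_pretrained("badrex/w2v-bert-2.0-sidama-asr")
model = Wav2Vec2BertForCTC.from_pretrained("badrex/w2v-bert-2.0-sidama-asr")

# torchaudio returns a (channels, samples) tensor and the file's sample rate
audio, sr = torchaudio.load("audio.wav")
audio = audio.mean(dim=0)  # downmix to mono

# The feature extractor expects 16 kHz input, so resample if needed
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)
    sr = 16000

inputs = processor(audio.numpy(), sampling_rate=sr, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: most likely token per frame, then collapse to text
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]

print(transcription)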
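
The argmax and batch_decode steps above amount to greedy CTC decoding: take the most likely token for each frame, merge consecutive repeats, then drop the blank token. The following is a minimal sketch of that collapse rule on a toy example. The vocabulary, the blank id, and the function name `ctc_greedy_collapse` are hypothetical and chosen for illustration; they are not this model's tokenizer, which also handles details such as word delimiters.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, id_to_char, blank_id=0):
    """Collapse per-frame argmax ids into text: merge repeats, then drop blanks."""
    # groupby merges consecutive duplicates, e.g. [3, 3, 0, 3] -> [3, 0, 3]
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return "".join(id_to_char[i] for i in merged if i != blank_id)


# Hypothetical toy vocabulary, for illustration only (id 0 plays the blank role)
toy_vocab = {0: "<pad>", 1: "a", 2: "f", 3: "o", 4: " "}
frames = [1, 1, 0, 2, 2, 0, 3, 0, 3, 3]
print(ctc_greedy_collapse(frames, toy_vocab))  # prints "afoo"
```

Note that the blank between the two runs of `3` is what lets a doubled letter survive the merge step; without it, the repeats would collapse into a single character.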

🔧 Downstream Use

  • Voice assistants
  • Accessibility tools
  • Research baselines

🚫 Out-of-Scope Use

  • Languages other than Sidama
  • High-stakes deployments without human review
  • Noisy audio without further fine-tuning
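
One possible way to route uncertain transcriptions to a human reviewer is to score each utterance by the average of its highest per-frame softmax probability. This is a rough sketch under stated assumptions: the function names and the 0.8 threshold are hypothetical, and this score is a crude, uncalibrated proxy rather than a reliability guarantee.

```python
import math


def mean_frame_confidence(logits):
    """Average, over frames, of the largest softmax probability in each frame."""
    if not logits:
        raise ValueError("logits must contain at least one frame")
    confidences = []
    for frame in logits:
        peak = max(frame)
        # Subtract the peak for numerical stability; the top probability is
        # exp(0) / sum(exp(x - peak)), which simplifies to 1 / sum(...)
        total = sum(math.exp(x - peak) for x in frame)
        confidences.append(1.0 / total)
    return sum(confidences) / len(confidences)


def needs_review(logits, threshold=0.8):
    """Flag an utterance for human review (threshold is an assumed value)."""
    return mean_frame_confidence(logits) < threshold
```

With the Direct Use code, the nested list for one utterance could be obtained with `logits[0].tolist()`. Any threshold should be chosen by checking it against manually verified transcriptions.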

⚠️ Risks & Limitations

Performance varies with accents, dialects, and recording quality.
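
Because performance depends on the speakers and recordings involved, it can help to measure word error rate (WER) on a small set of your own transcribed audio before relying on the model. Below is a minimal sketch, assuming plain whitespace tokenization and no text normalization. The function name `word_error_rate` and the placeholder strings are illustrative and not part of this model's tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed ref words and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens stand in for real reference and model transcripts
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))  # prints 0.5
```

In the example, one substitution and one deletion against four reference words give a WER of 0.5. Scores can exceed 1.0 when the hypothesis contains many inserted words.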

📌 Citation

@misc{w2v_bert_ethiopian_asr,
  author = {Badr M. Abdullah},
  title = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
  year = {2025},
  url = {https://huggingface.co/badrex/w2v-bert-2.0-sidama-asr}
}