bumstern/segmentation_model_russian_data

bumstern committed on Jun 17, 2023

Commit dc3781c · 1 parent: b1dd720

Update README.md

Files changed (1)

README.md +44 -0

@@ -1,3 +1,47 @@
 ---
 license: mit
+language:
+- ru
+library_name: pyannote-audio
+tags:
+- code
 ---
+# Segmentation model
+This model was trained on AMI-MixHeadset and my own synthetic dataset of Russian speech.
+
+Training time: 5 hours on an RTX 3060.
+
+This model can be used as the segmentation model in the diarization pipeline from [pyannote/speaker-diarization](https://huggingface.co/pyannote/speaker-diarization).
+
+| Benchmark | DER% |
+| --------- |------|
+| [AMI (*headset mix,*](https://groups.inf.ed.ac.uk/ami/corpus/) [*only_words*)](https://github.com/BUTSpeechFIT/AMI-diarization-setup) | 38.8 |
+## Usage example
+```python
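+import torch
+import yaml
+from pyannote.audio import Model
+from pyannote.audio.pipelines import SpeakerDiarization
+
+# Load the fine-tuned segmentation model (a pickled model object) onto the CPU.
+# Note: PyTorch 2.6+ defaults to weights_only=True; loading a full pickled model there requires weights_only=False.
+segm_model = torch.load('model/segm_model.pth', map_location=torch.device('cpu'))
+
+# pyannote/embedding is a gated model, so a Hugging Face access token is required.
+embed_model = Model.from_pretrained("pyannote/embedding", use_auth_token='ACCESS_TOKEN_GOES_HERE')
+
+# Assemble the diarization pipeline around the two models.
+diar_pipeline = SpeakerDiarization(
+    segmentation=segm_model,
+    segmentation_batch_size=16,
+    clustering="AgglomerativeClustering",
+    embedding=embed_model
+)
+
+# Load the pipeline hyperparameters from the YAML config and apply them.
+with open('model/config.yaml', 'r') as f:
+    diar_config = yaml.safe_load(f)
+diar_pipeline.instantiate(diar_config)
+
+# Run diarization; the result is a pyannote.core.Annotation of speaker turns.
+annotation = diar_pipeline('audio.wav')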
+```
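
The usage example in the README stops at `annotation = diar_pipeline('audio.wav')`. As a follow-up, here is a minimal sketch of how the resulting speaker turns might be written out as RTTM records and summarized per speaker. It is not part of the commit. The helpers `to_rttm_lines` and `speaking_time`, the `Turn` tuple type, and the example turns are all hypothetical. The sketch works on plain `(start, end, speaker)` tuples so it runs without pyannote installed. With pyannote, such tuples could be collected from `annotation.itertracks(yield_label=True)`.

```python
from typing import Dict, Iterable, List, Tuple

# (start_seconds, end_seconds, speaker_label): a plain-tuple stand-in for diarization output.
Turn = Tuple[float, float, str]


def to_rttm_lines(turns: Iterable[Turn], uri: str = "audio") -> List[str]:
    """Format speaker turns as RTTM 'SPEAKER' records, sorted by start time."""
    lines = []
    for start, end, speaker in sorted(turns):
        duration = end - start
        if duration <= 0:
            continue  # skip empty or inverted turns
        lines.append(
            f"SPEAKER {uri} 1 {start:.3f} {duration:.3f} <NA> <NA> {speaker} <NA> <NA>"
        )
    return lines


def speaking_time(turns: Iterable[Turn]) -> Dict[str, float]:
    """Sum the duration of all turns per speaker label."""
    totals: Dict[str, float] = {}
    for start, end, speaker in turns:
        totals[speaker] = totals.get(speaker, 0.0) + max(0.0, end - start)
    return totals


# Hypothetical turns; with pyannote these could be collected as
# [(seg.start, seg.end, label) for seg, _, label in annotation.itertracks(yield_label=True)]
example_turns = [
    (0.0, 2.5, "SPEAKER_00"),
    (2.5, 4.0, "SPEAKER_01"),
    (4.0, 6.0, "SPEAKER_00"),
]

for line in to_rttm_lines(example_turns, uri="audio"):
    print(line)
print(speaking_time(example_turns))
```

Working on plain tuples keeps the formatting logic independent of the pipeline, so it can be checked without downloading any models.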