Zipformer: A faster and better encoder for automatic speech recognition Paper • 2310.11230 • Published Oct 17, 2023
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context Paper • 2309.08105 • Published Sep 15, 2023
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation Paper • 2211.00508 • Published Oct 31, 2022
Blank-regularized CTC for Frame Skipping in Neural Transducer Paper • 2305.11558 • Published May 19, 2023
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization Paper • 2409.00819 • Published Sep 1, 2024
Delay-penalized CTC implemented based on Finite State Transducer Paper • 2305.11539 • Published May 19, 2023
k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning Paper • 2411.17100 • Published Nov 26, 2024
SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation Paper • 2411.18138 • Published Nov 27, 2024
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM Paper • 2406.06571 • Published Jun 3, 2024
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations Paper • 2510.25955 • Published Oct 29, 2025
SPEAR encoders Collection The SPEAR encoder models (https://arxiv.org/abs/2510.25955) • 5 items • Updated Nov 3 • 1