Audio
  stabilityai/stable-audio-open-small • Text-to-Audio • Updated May 27, 2025 • 2.06k • 243
  ONNX Model Explorer 🔍 • Running • Featured • 86 • Explore ONNX models interactively
  microsoft/VibeVoice-1.5B • Text-to-Speech • 3B • Updated 10 days ago • 328k • 2.2k
  nvidia/audio-flamingo-3 • Audio-Text-to-Text • Updated Nov 28, 2025 • 770 • 139
Play-Ground
  Inference Playground 🔋 • Running on CPU Upgrade • 246 • Customize theme based on user preference
OCR
  SkalskiP/paligemma2_latex_ocr_v5 • Updated Dec 11, 2024 • 2
  nanonets/Nanonets-OCR-s • Image-Text-to-Text • 4B • Updated Jun 20, 2025 • 27.6k • 1.58k
  nvidia/NVIDIA-Nemotron-Parse-v1.1 • Image-Text-to-Text • Updated 4 days ago • 110k • 136
Multimode
  microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 232k • 1.57k
  ByteDance/Sa2VA-8B • Image-Text-to-Text • 8B • Updated Sep 8, 2025 • 1.07k • 65
Speako
  ibm-granite/granite-speech-3.2-8b • Automatic Speech Recognition • 8B • Updated Apr 16, 2025 • 161 • 85
  ByteDance/MegaTTS3 • Text-to-Speech • Updated Apr 4, 2025 • 112 • 415
  Demo 🚀 • Running • 2 • Transcribe and translate audio/video files into text
  nvidia/audio-flamingo-3-hf • Audio-Text-to-Text • 8B • Updated 5 days ago • 50.8k • 164