Article Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek huggingface • Jan 27 • 45
The Smol Training Playbook 📚 The secrets to building world-class LLMs • 3.19k
Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective Paper • 2503.01933 • Published Mar 3, 2025 • 13 • 3
Shakti-VLMs: Scalable Vision-Language Models for Enterprise AI Paper • 2502.17092 • Published Feb 24, 2025 • 3 • 2
Samba-asr state-of-the-art speech recognition leveraging structured state-space models Paper • 2501.02832 • Published Jan 6, 2025 • 8
Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
Article Open-R1: a fully open reproduction of DeepSeek-R1 +1 eliebak, lvwerra, lewtun • Jan 28, 2025 • 889
SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments Paper • 2410.11331 • Published Oct 15, 2024 • 8 • 4