Chunjiang-Intelligence/DeepSeek-v4-Fable Text Generation • 149B • Updated about 18 hours ago • 1.77k • 138
SmolLM3 pretraining datasets Collection • Datasets used in SmolLM3 pretraining • 15 items • Updated Aug 12, 2025 • 53
The Smol Training Playbook 📚 The secrets to building world-class LLMs • Featured • Running on CPU Upgrade • 3.22k
incredible45/Gutenberg-BookCorpus-Cleaned-Data-English Viewer • Updated Apr 10, 2025 • 51.4k • 587 • 15