nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8 • Text Generation • 50B • Updated Oct 15, 2025 • 19.2k • 24
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 • Text Generation • 50B • Updated Oct 15, 2025 • 98.7k • 226
FFN Fusion: Rethinking Sequential Computation in Large Language Models • Paper • 2503.18908 • Published Mar 24, 2025 • 19
nvidia/Llama-3_3-Nemotron-Super-49B-v1 • Text Generation • 50B • Updated Oct 15, 2025 • 24.7k • 320
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs • Paper • 2411.19146 • Published Nov 28, 2024 • 17