NVIDIA Releases Improved Pretraining Dataset: Preserves High Value Math & Code, and Augments with Multi-Lingual
nvidia • Aug 18, 2025