Collection of Quantized Models for MoE
Krishna Teja Chitty-Venkata
AI & ML interests
LLM Optimization, Neural Architecture Search, Quantization, Pruning
Recent Activity
Updated a model about 4 hours ago: inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-quantized.w4a16
Updated a model about 22 hours ago: inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4
Updated a model 1 day ago: inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8-dynamic