Missing safetensors files 44-46 (model cannot load)
Issue
The model.safetensors.index.json references 46 safetensors shards (model-00001 to model-00046), but only 43 files are present in the repository (model-00001 to model-00043).
Missing files:
- model-00044-of-00046.safetensors
- model-00045-of-00046.safetensors
- model-00046-of-00046.safetensors
Impact
These files contain layers 73-77 and the lm_head weights (216 parameters total). Without them, the model fails to load with:
ValueError: Missing 216 parameters:
lm_head.biases,
lm_head.scales,
lm_head.weight,
model.layers.73.input_layernorm.weight,
model.layers.73.mlp.gate.e_score_correction_bias,
...
Environment
- Mac Studio M3 Ultra 512GB
- mlx-lm 0.30.7
- Downloaded via: hf download inferencerlabs/GLM-5-MLX-4.8bit --local-dir ~/models/GLM-5-MLX
- Download completed successfully (52/52 files), but the 3 safetensors shards are simply not in the repo.
Request
Please re-upload the missing 3 safetensors files to make the model usable. Thank you!
It's still currently uploading. Will remove the notice from the model card once it's done.
Update: It works now! (Probably the first GLM-5 MLX deployment on Mac Studio)
The missing file model-00046-of-00046.safetensors has been uploaded. Thank you!
I was so eager to run GLM-5 locally that I started downloading while you were still uploading. Here's what happened:
The Journey
- Downloaded 43/46 files: the model failed to load ("Missing 216 parameters")
- Reported the issue: it turns out files 44-46 hadn't been uploaded yet
- Wrote a monitoring script to poll HuggingFace every 3 minutes and auto-download the moment file 046 appears
- Finally ran it: mlx_lm.generate output the first tokens from GLM-5 on Apple Silicon MLX!
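For anyone in the same spot, here is a minimal sketch of that kind of polling loop. It is not the script I actually ran: `wait_for_file` and its parameters are hypothetical names, and the repo listing is passed in as a callable so the loop does not depend on any particular client library.

```python
import time
from typing import Callable, Iterable, Optional


def wait_for_file(
    list_files: Callable[[], Iterable[str]],
    target: str,
    interval_s: float = 180.0,          # 3 minutes, as in the post
    max_polls: Optional[int] = None,    # None means poll until the file shows up
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll list_files() until target appears; return False if max_polls runs out."""
    polls = 0
    while True:
        if target in set(list_files()):
            return True
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
        sleep(interval_s)


# Assumed wiring (requires the huggingface_hub package and network access):
#   from huggingface_hub import list_repo_files
#   repo = "inferencerlabs/GLM-5-MLX-4.8bit"
#   wait_for_file(lambda: list_repo_files(repo), "model-00046-of-00046.safetensors")
# Once it returns True, re-running the hf download command fetches the new shards.
```

Passing the listing function in keeps the loop easy to try offline with a fake listing before pointing it at the real repo.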
Results
Hardware: Mac Studio M3 Ultra 512GB
Framework: mlx-lm 0.30.7
Speed: 17.9 tokens/sec
Peak Memory: 449 GB / 512 GB
The model runs beautifully. 449GB peak memory fits within the 512GB unified memory with ~63GB headroom for the OS.
Lessons Learned
- Don't download a model while the author is still uploading
- hf download reports "52/52 complete" even when safetensors shards are missing from the repo: it only downloads what exists, with no integrity check against the index file
- Apple Silicon unified memory is a game-changer for running 400GB+ models locally
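A quick check against the index file would have caught this before trying to load the model. The sketch below assumes the usual sharded-checkpoint layout, where model.safetensors.index.json has a "weight_map" object mapping each parameter name to its shard filename; `missing_shards` is a hypothetical helper name, not part of any library.

```python
import json
from pathlib import Path


def missing_shards(model_dir):
    """Return shard filenames listed in the index but absent from model_dir."""
    model_dir = Path(model_dir)
    index_path = model_dir / "model.safetensors.index.json"
    index = json.loads(index_path.read_text())
    # Assumption: "weight_map" maps parameter name -> shard filename.
    referenced = set(index["weight_map"].values())
    return sorted(name for name in referenced if not (model_dir / name).is_file())
```

Running `missing_shards` on the local download directory after the download finishes should return an empty list if every referenced shard is present. It only checks that the files exist, not that their contents are intact.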
I might be the first person to successfully run GLM-5 via MLX on a Mac Studio. Got "tricked" by the upload timing, but honestly, I'm thrilled to be an early adopter!
Thanks again @inferencerlabs for the MLX conversion. This is exactly what the Apple Silicon community needs.