Missing safetensors files 44-46 (model cannot load)

by Neo2025new - opened 3 days ago

Neo2025new

3 days ago

Issue

The model.safetensors.index.json references 46 safetensors shards (model-00001 to model-00046), but only 43 files are present in the repository (model-00001 to model-00043).

Missing files:

model-00044-of-00046.safetensors
model-00045-of-00046.safetensors
model-00046-of-00046.safetensors

Impact

These files contain layers 73-77 and the lm_head weights (216 parameters total). Without them, the model fails to load with:

ValueError: Missing 216 parameters:
lm_head.biases,
lm_head.scales,
lm_head.weight,
model.layers.73.input_layernorm.weight,
model.layers.73.mlp.gate.e_score_correction_bias,
...

Environment

Mac Studio M3 Ultra 512GB
mlx-lm 0.30.7
Downloaded via hf download inferencerlabs/GLM-5-MLX-4.8bit --local-dir ~/models/GLM-5-MLX
Download completed successfully (52/52 files), but the 3 safetensors shards are simply not in the repo.

Request

Please re-upload the missing 3 safetensors files to make the model usable. Thank you!

inferencerlabs

Owner 3 days ago

It's still uploading. Will remove the notice from the model card once it's done.

Neo2025new

3 days ago

Update: It works now! 🎉 (Probably the first GLM-5 MLX deployment on Mac Studio)

The last missing shard, model-00046-of-00046.safetensors, has been uploaded. Thank you!

I was so eager to run GLM-5 locally that I started downloading while you were still uploading 😅. Here's what happened:

The Journey

Downloaded 43/46 files → model failed to load ("Missing 216 parameters")
Reported the issue → turns out files 44-46 hadn't been uploaded yet
Wrote a monitoring script that polls Hugging Face every 3 minutes and auto-downloads shard 46 the moment it appears
Finally ran it → mlx_lm.generate output the first tokens from GLM-5 on Apple Silicon MLX!
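
For anyone who ends up in the same spot, here is a rough sketch of what such a polling script could look like. It is not the exact script I ran: the helper names wait_for_file and download_when_ready are made up for illustration, and it assumes the huggingface_hub package is installed (HfApi.list_repo_files and hf_hub_download are real calls from that package). The repo id, shard name, local directory and 3-minute interval are the ones from this thread.

```python
import os
import time
from typing import Callable, Iterable, Optional


def wait_for_file(
    list_files: Callable[[], Iterable[str]],
    filename: str,
    interval_s: float = 180.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll list_files() until filename shows up.

    Returns True once the file is listed, or False if max_polls
    attempts pass without it appearing (None means poll forever).
    """
    polls = 0
    while True:
        if filename in set(list_files()):
            return True
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
        sleep(interval_s)


def download_when_ready() -> None:
    """Hypothetical wiring for this thread's repo; needs network access."""
    # Imported here so wait_for_file itself has no third-party dependency.
    from huggingface_hub import HfApi, hf_hub_download

    repo_id = "inferencerlabs/GLM-5-MLX-4.8bit"
    target = "model-00046-of-00046.safetensors"
    local_dir = os.path.expanduser("~/models/GLM-5-MLX")

    api = HfApi()
    if wait_for_file(lambda: api.list_repo_files(repo_id), target):
        hf_hub_download(repo_id=repo_id, filename=target, local_dir=local_dir)
```

Calling download_when_ready() blocks until the shard is listed in the repo and then fetches just that one file. The listing function and the sleep function are passed in as arguments so the polling logic can be tried out without touching the network.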

Results

Hardware: Mac Studio M3 Ultra 512GB
Framework: mlx-lm 0.30.7
Speed: 17.9 tokens/sec
Peak Memory: 449 GB / 512 GB

The model runs beautifully. The 449 GB peak fits within the 512 GB of unified memory, leaving ~63 GB of headroom for the OS.

Lessons Learned

Don't download a model while the author is still uploading 😂
hf download reports "52/52 complete" even when safetensors shards are missing from the repo: it downloads only the files that exist and does not check them against model.safetensors.index.json
Apple Silicon unified memory is a game-changer for running 400GB+ models locally
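
Since hf download does not do this check, a small script can do it after the download finishes. The sketch below is only an illustration (missing_shards is a name I made up): it reads the "weight_map" section of model.safetensors.index.json, which maps each parameter name to the shard file that holds it, and lists every shard that is referenced but not present in the local directory. It uses only the Python standard library.

```python
import json
from pathlib import Path


def missing_shards(model_dir: str) -> list[str]:
    """Return shard filenames listed in the index but absent from model_dir."""
    root = Path(model_dir).expanduser()
    index = json.loads((root / "model.safetensors.index.json").read_text())
    # "weight_map" maps each parameter name to the shard file that stores it.
    expected = set(index["weight_map"].values())
    return sorted(name for name in expected if not (root / name).is_file())
```

Running missing_shards("~/models/GLM-5-MLX") right after my first download should have listed the three shards 44-46 instead of leaving me to find out from the mlx-lm load error. An empty list means every shard the index references is on disk; it does not verify that the files themselves are complete or uncorrupted.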

I might be the first person to successfully run GLM-5 via MLX on a Mac Studio. Got "tricked" by the upload timing, but honestly — I'm thrilled to be an early adopter!

Thanks again @inferencerlabs for the MLX conversion. This is exactly what the Apple Silicon community needs. 🍎
