Missing safetensors files 44-46 (model cannot load)

#2
by Neo2025new - opened

Issue

The model.safetensors.index.json references 46 safetensors shards (model-00001 to model-00046), but only 43 files are present in the repository (model-00001 to model-00043).

Missing files:

  • model-00044-of-00046.safetensors
  • model-00045-of-00046.safetensors
  • model-00046-of-00046.safetensors
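A mismatch like this can be caught before trying to load the model. The following is a minimal sketch, not part of the original report: it assumes the index file uses the usual sharded-checkpoint layout with a top-level "weight_map" object mapping parameter names to shard filenames, and the function names are hypothetical.

```python
import json
from pathlib import Path


def referenced_shards(weight_map):
    """Return the sorted, de-duplicated shard filenames named in a weight_map."""
    return sorted(set(weight_map.values()))


def missing_shards(weight_map, present_files):
    """Return shard filenames the index references that are not in present_files."""
    present = set(present_files)
    return [name for name in referenced_shards(weight_map) if name not in present]


def check_local_dir(model_dir):
    """Compare model.safetensors.index.json against the shards in model_dir.

    Assumes the index has a top-level "weight_map" key (hypothetical helper).
    """
    model_dir = Path(model_dir).expanduser()
    index = json.loads((model_dir / "model.safetensors.index.json").read_text())
    present = [p.name for p in model_dir.glob("*.safetensors")]
    return missing_shards(index["weight_map"], present)
```

Calling check_local_dir("~/models/GLM-5-MLX") on the download described here would be expected to return the three filenames listed above; an empty list means every shard the index references is on disk.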

Impact

These files contain layers 73-77 and the lm_head weights (216 parameters total). Without them, the model fails to load with:

ValueError: Missing 216 parameters:
lm_head.biases,
lm_head.scales,
lm_head.weight,
model.layers.73.input_layernorm.weight,
model.layers.73.mlp.gate.e_score_correction_bias,
...
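To see which parameters a set of absent shards would take with them, the same index can be read in the other direction. This is a rough sketch under the same "weight_map" assumption; the helper names are made up for illustration, and the toy map in the usage note is not the real GLM-5 index.

```python
from collections import defaultdict


def params_by_shard(weight_map):
    """Group parameter names by the shard file that stores them."""
    grouped = defaultdict(list)
    for param, shard in weight_map.items():
        grouped[shard].append(param)
    return {shard: sorted(params) for shard, params in grouped.items()}


def params_lost(weight_map, missing):
    """Return the sorted parameter names stored in any of the missing shards."""
    missing = set(missing)
    return sorted(p for p, shard in weight_map.items() if shard in missing)
```

For example, with a toy map {"lm_head.weight": "s2", "model.embed_tokens.weight": "s1"}, params_lost(toy, ["s2"]) returns ["lm_head.weight"]. Run against the real index with the three missing shard names, the length of the result should match the count in the ValueError if the loader's accounting is the same.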

Environment

  • Mac Studio M3 Ultra 512GB
  • mlx-lm 0.30.7
  • Downloaded via hf download inferencerlabs/GLM-5-MLX-4.8bit --local-dir ~/models/GLM-5-MLX
  • Download completed successfully (52/52 files) but the 3 safetensors shards are simply not in the repo.

Request

Please re-upload the missing 3 safetensors files to make the model usable. Thank you!

It's still currently uploading. Will remove the notice from the model card once it's done.

Update: It works now! 🎉 (Probably the first GLM-5 MLX deployment on Mac Studio)

The missing file model-00046-of-00046.safetensors has been uploaded. Thank you!

I was so eager to run GLM-5 locally that I started downloading while you were still uploading 😅. Here's what happened:

The Journey

  1. Downloaded 43/46 files → model failed to load ("Missing 216 parameters")
  2. Reported the issue → turns out files 44-46 hadn't been uploaded yet
  3. Wrote a monitoring script to poll HuggingFace every 3 minutes and auto-download as soon as shard 46 appeared
  4. Finally ran it → mlx_lm.generate output the first tokens from GLM-5 on Apple Silicon MLX!
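The monitoring script from step 3 isn't shown in this thread, so the following is only a sketch of how such a poller might look. The listing function is passed in as a parameter (an assumption made here so the loop stays testable offline); the function and parameter names are hypothetical.

```python
import time


def wait_for_files(list_files, required, interval_s=180, max_polls=None,
                   sleep=time.sleep):
    """Poll list_files() until every name in required is present.

    list_files: zero-argument callable returning an iterable of filenames.
    Returns the number of polls made; raises TimeoutError after max_polls.
    """
    required = set(required)
    polls = 0
    while True:
        available = set(list_files())
        polls += 1
        if required <= available:
            return polls
        if max_polls is not None and polls >= max_polls:
            still_missing = sorted(required - available)
            raise TimeoutError(f"still missing after {polls} polls: {still_missing}")
        # Default of 180 s matches the 3-minute interval mentioned above.
        sleep(interval_s)
```

One way to wire this up, assuming the huggingface_hub package is installed, would be to pass lambda: list_repo_files("inferencerlabs/GLM-5-MLX-4.8bit") as list_files and re-run the hf download command once the function returns.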

Results

Hardware: Mac Studio M3 Ultra 512GB
Framework: mlx-lm 0.30.7
Speed: 17.9 tokens/sec
Peak Memory: 449 GB / 512 GB

The model runs beautifully. 449GB peak memory fits within the 512GB unified memory with ~63GB headroom for the OS.

Lessons Learned

  • Don't download a model while the author is still uploading 😂
  • hf download reports "52/52 complete" even when safetensors shards are missing from the repo: it only downloads what exists and does no integrity check against the index file
  • Apple Silicon unified memory is a game-changer for running 400GB+ models locally

I might be the first person to successfully run GLM-5 via MLX on a Mac Studio. Got "tricked" by the upload timing, but honestly, I'm thrilled to be an early adopter!

Thanks again @inferencerlabs for the MLX conversion. This is exactly what the Apple Silicon community needs. 🍎
