🧠 GLM-4.5-Air-HS (QLoRA Fine-Tuned on Hyperswitch Corpus)

This repository contains QLoRA fine-tuning checkpoints for GLM-4.5-Air, trained on the Hyperswitch Rust repository.

🗂️ Repository Structure

GLM-4.5-Air-HS/
├── checkpoints/
│   ├── checkpoint-500/
│   ├── checkpoint-1000/
│   ├── ...
│   ├── checkpoint-7000/
│   └── final-checkpoint/  # step 7800
└── Final Model/            # merged adapter (coming soon)

⚙️ Training Overview

Step	Eval Loss	Perplexity	Notes
0	2.82	16.78	Baseline before training
1000	2.06	7.82	Early stabilization
2000	1.37	3.95	Major improvement
3000	1.21	3.35	Smooth convergence
4000	1.15	3.16	Stable
5000	1.06	2.89	Optimal zone begins
6000	0.93	2.52	Strong generalization
7000	0.88	2.41	Best overall checkpoint

✅ Observation: Eval loss and perplexity decrease at every evaluated step from 0 to 7000, which suggests stable QLoRA convergence with no sign of overfitting on the validation set.
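The perplexity column can be cross-checked against the eval loss. The sketch below assumes perplexity is the exponential of the mean cross-entropy loss in nats, which is the usual convention but is not stated in this README; the helper names are illustrative. Small differences from the table are expected because the listed losses are rounded to two decimals.

```python
import math

# (step, eval_loss) pairs copied from the table above.
EVAL_LOSS = [
    (0, 2.82), (1000, 2.06), (2000, 1.37), (3000, 1.21),
    (4000, 1.15), (5000, 1.06), (6000, 0.93), (7000, 0.88),
]


def perplexity(loss: float) -> float:
    """Assumed convention: perplexity = exp(mean cross-entropy loss in nats)."""
    return math.exp(loss)


def is_strictly_decreasing(values) -> bool:
    """True if every value is smaller than the one before it."""
    return all(b < a for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    for step, loss in EVAL_LOSS:
        print(f"step {step:>5}: loss={loss:.2f}  perplexity={perplexity(loss):.2f}")
    losses = [loss for _, loss in EVAL_LOSS]
    print("monotonic decrease:", is_strictly_decreasing(losses))
```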

🧮 Configuration Summary

# GLM-4.5-Air QLoRA Training Configuration
optimization:
  micro_batch_per_gpu: 1
  grad_accum_steps: 16
  learning_rate: 5e-6
  weight_decay: 0.01
  warmup_ratio: 0.08
  betas: [0.9, 0.95]
  eps: 1e-8

lora:
  r: 64
  lora_alpha: 128
  target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
  lora_dropout: 0.05
  bias: "none"

quantization:
  load_in_4bit: true
  bnb_4bit_compute_dtype: "float32"
  bnb_4bit_use_double_quant: true
  bnb_4bit_quant_type: "nf4"

logging:
  log_interval: 10
  eval_interval: 1000
  checkpoint_interval: 1000
  max_checkpoints: 8

memory:
  use_cpu_offload: true
  use_deepspeed_zero3: true
  gradient_checkpointing: true
  use_cache: false
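Two quantities follow directly from the configuration above: the effective batch size per optimizer step and the LoRA scaling factor. The sketch below is illustrative only. The GPU count is not given in this README, so `num_gpus` is a hypothetical parameter, and the scaling formula assumes the standard LoRA convention of `lora_alpha / r` (not a rank-stabilized variant).

```python
def effective_batch_size(micro_batch_per_gpu: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples consumed per optimizer step under plain data parallelism."""
    return micro_batch_per_gpu * grad_accum_steps * num_gpus


def lora_scaling(lora_alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return lora_alpha / r


if __name__ == "__main__":
    # Values from the configuration; num_gpus=1 is an assumption for illustration.
    print("effective batch size:", effective_batch_size(1, 16, num_gpus=1))
    print("LoRA scaling factor:", lora_scaling(128, 64))
```

With `micro_batch_per_gpu: 1`, gradient accumulation is what sets the batch size, so the effective batch grows linearly with however many GPUs were actually used.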

🧩 Model Objective

Base Model: GLM-4.5-Air
Fine-Tuning Mode: QLoRA (4-bit, rank 64)
Data: 12,000 training + 1,500 validation datapoints
Precision: 4-bit NF4 quantization with double quantization; compute dtype float32
Optimizer: AdamW (β₁=0.9, β₂=0.95, ε=1e-8)

QLoRA trains only the low-rank adapter weights (~0.1% of parameters) and keeps the quantized base weights frozen, which lowers the risk of catastrophic forgetting.
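To see why the trainable fraction is small, note that a LoRA adapter on a weight matrix of shape `d_out x d_in` adds `r * (d_in + d_out)` parameters. The sketch below uses hypothetical layer dimensions, not the real GLM-4.5-Air shapes, so the printed ratio is illustrative and is not a check of the ~0.1% figure. The whole-model fraction is lower than the per-matrix ratio because only the attention projections are adapted.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix (bias ignored)."""
    return d_in * d_out


if __name__ == "__main__":
    # Hypothetical square projection; the real model dimensions are not given here.
    d_in, d_out, r = 4096, 4096, 64
    added = lora_params(d_in, d_out, r)
    frozen = full_params(d_in, d_out)
    print(f"adapter params: {added:,}")
    print(f"frozen params:  {frozen:,}")
    print(f"per-matrix ratio: {added / frozen:.4%}")
```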

Maintained by: Juspay AI Research
Contributor: Mynampati Sri Ranganadha Avinash (@ashx098)

Model tree for juspay/GLM-4.5-Air-HS: fine-tuned from base model zai-org/GLM-4.5-Air.