COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression
Abstract
COMPOT is a training-free compression framework for Transformer models that uses sparse dictionary learning with orthogonal dictionaries and closed-form updates to achieve better quality-compression trade-offs than traditional low-rank methods.
Post-training compression of Transformer models commonly relies on truncated singular value decomposition (SVD). However, enforcing a single shared subspace can degrade accuracy even at moderate compression. Sparse dictionary learning provides a more flexible union-of-subspaces representation, but existing approaches typically rely on costly iterative alternation between dictionary and coefficient updates. We propose COMPOT (Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers), a training-free compression framework that uses a small calibration dataset to estimate a sparse weight factorization. COMPOT employs orthogonal dictionaries that enable closed-form Procrustes updates for the dictionary and analytical single-step sparse coding for the coefficients, eliminating iterative optimization. To handle heterogeneous layer sensitivity under a global compression budget, COMPOT further introduces a one-shot dynamic allocation strategy that adaptively redistributes layer-wise compression rates. Extensive experiments across diverse architectures and tasks show that COMPOT consistently delivers a superior quality-compression trade-off over strong low-rank and sparse baselines, while remaining fully compatible with post-training quantization for extreme compression. Code is available at https://github.com/mts-ai/COMPOT.
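To make the two closed-form steps concrete, here is a minimal, uncalibrated NumPy sketch: the dictionary is updated via orthogonal Procrustes (SVD of W Cᵀ), and, because the dictionary is orthogonal, the sparse coefficients reduce to a hard-thresholded projection Dᵀ W. The function names (`procrustes_update`, `sparse_code`, `compot_like_factorize`) and the per-column sparsity `k` are illustrative, not taken from the paper's code; the actual method additionally weights the reconstruction objective with calibration activations and allocates per-layer compression rates, which this sketch omits.

```python
import numpy as np

def procrustes_update(W, C):
    """Closed-form solution of argmin_D ||W - D @ C||_F subject to D^T D = I."""
    U, _, Vt = np.linalg.svd(W @ C.T, full_matrices=False)
    return U @ Vt

def sparse_code(W, D, k):
    """With an orthogonal D, least-squares coefficients are D^T W; keep the k
    largest-magnitude entries per column and zero out the rest."""
    C = D.T @ W
    drop = np.argpartition(np.abs(C), -k, axis=0)[:-k]  # indices of the smallest entries
    np.put_along_axis(C, drop, 0.0, axis=0)
    return C

def compot_like_factorize(W, k, seed=0):
    """One sparse-coding / Procrustes round starting from a random orthogonal D.
    This is an illustrative, calibration-free stand-in for the paper's procedure."""
    rng = np.random.default_rng(seed)
    D, _ = np.linalg.qr(rng.standard_normal((W.shape[0], W.shape[0])))
    C = sparse_code(W, D, k)
    D = procrustes_update(W, C)
    C = sparse_code(W, D, k)
    return D, C

if __name__ == "__main__":
    W = np.random.default_rng(1).standard_normal((64, 256))  # toy weight matrix
    D, C = compot_like_factorize(W, k=16)
    err = np.linalg.norm(W - D @ C) / np.linalg.norm(W)
    print(f"relative reconstruction error: {err:.3f}")
```

Because D is orthogonal, both steps are exact closed-form solutions of their subproblems, which is what lets the method avoid the iterative inner loops of conventional sparse dictionary learning.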
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression (2026)
- Zero Sum SVD: Balancing Loss Sensitivity for Low Rank LLM Compression (2026)
- SAES-SVD: Self-Adaptive Suppression of Accumulated and Local Errors for SVD-based LLM Compression (2026)
- Don't be so Stief! Learning KV Cache low-rank approximation over the Stiefel manifold (2026)
- More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization (2025)
- Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs (2026)
- CORP: Closed-Form One-shot Representation-Preserving Structured Pruning for Vision Transformers (2026)