Qwen3-8B-FreeLM-LoRA

Implementation of the paper "Free(): Learning to Forget in Malloc-Only Reasoning Models".

Reasoning models enhance problem-solving by scaling test-time compute, yet they face a critical paradox: excessive thinking tokens often degrade performance rather than improve it. We attribute this to a fundamental architectural flaw: standard LLMs operate as "malloc-only" engines, continuously accumulating valid and redundant steps alike without a mechanism to prune obsolete information. To break this cycle, we propose Free()LM, a model that introduces an intrinsic self-forgetting capability via the Free-Module, a plug-and-play LoRA adapter. By iteratively switching between reasoning and cleaning modes, Free()LM dynamically identifies and prunes useless context chunks, maintaining a compact and noise-free state.
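The alternation between reasoning and cleaning described above can be sketched as a simple control loop. This is a minimal illustration, not the released implementation: the function `free_lm_loop` and the three callables (`reason_step`, `select_useless`, `is_done`) are hypothetical names, and the toy stubs stand in for real model calls.

```python
from typing import Callable, List


def free_lm_loop(
    question: str,
    reason_step: Callable[[str, List[str]], str],
    select_useless: Callable[[str, List[str]], List[int]],
    is_done: Callable[[str], bool],
    max_rounds: int = 8,
) -> List[str]:
    """Alternate reasoning and cleaning; return the surviving context chunks.

    All three callables are hypothetical placeholders for model calls.
    """
    context: List[str] = []
    for _ in range(max_rounds):
        # Reasoning mode: produce one new chunk from the current context.
        chunk = reason_step(question, context)
        context.append(chunk)
        if is_done(chunk):
            break
        # Cleaning mode: drop the chunks flagged as no longer useful.
        drop = set(select_useless(question, context))
        context = [c for i, c in enumerate(context) if i not in drop]
    return context


# Toy stubs (assumptions for illustration only, no model involved).
script = iter(["try x=1 (dead end)", "try x=2 works", "ANSWER: 2"])


def toy_reason(question: str, context: List[str]) -> str:
    return next(script)


def toy_clean(question: str, context: List[str]) -> List[int]:
    return [i for i, c in enumerate(context) if "dead end" in c]


def toy_done(chunk: str) -> bool:
    return chunk.startswith("ANSWER")


final_context = free_lm_loop("solve for x", toy_reason, toy_clean, toy_done)
print(final_context)
```

In this toy run the dead-end chunk is pruned before the next reasoning round, so the final context keeps only the useful step and the answer.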

Extensive experiments show that Free()LM provides consistent improvements across all model scales (8B to 685B). It achieves a 3.3% average improvement over top-tier reasoning baselines, even establishing a new SOTA on IMOanswerBench using DeepSeek V3.2-Speciale. Most notably, in long-horizon tasks where the standard Qwen3-235B-A22B model suffers a total collapse (0% accuracy), Free()LM restores performance to ~50%. Our findings suggest that sustainable intelligence requires the freedom to forget as much as the power to think.

Resources

Base Model	Method	Checkpoint
Qwen3-8B	Free()LM	🤗 ldsjmdy/Qwen3-8B-FreeLM-LoRA
Qwen3-30B-A3B-Thinking-2507	Free()LM	🤗 ldsjmdy/Qwen3-30B-A3B-Thinking-2507-FreeLM-LoRA
Qwen3-235B-A22B-Thinking-2507	Free()LM	🤗 ldsjmdy/Qwen3-235B-A3B-Thinking-2507-FreeLM-LoRA

Train/Eval Data: 🤗 ldsjmdy/FreeLM

Performance

Performance of Qwen3 models. We report pass@1 (p@1) computed over 8 rollouts, along with the average number of response tokens (#Token). In the Average columns, bracketed values give the absolute change in p@1 and the relative change in #Token (blue indicates improvement, red indicates regression).
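As a rough illustration of the quantities named in the caption above, the sketch below computes pass@1 as the mean correctness over each problem's rollouts, plus the absolute and relative changes. It assumes pass@1 is averaged first per problem and then across problems; the helper names and all numbers are made up for the example and are not results from the paper.

```python
from typing import Dict, List


def pass_at_1(rollouts: Dict[str, List[bool]]) -> float:
    """Mean per-problem fraction of correct rollouts, as a percentage."""
    per_problem = [sum(r) / len(r) for r in rollouts.values()]
    return 100.0 * sum(per_problem) / len(per_problem)


def absolute_change(new: float, base: float) -> float:
    """Difference in percentage points (used here for p@1)."""
    return new - base


def relative_change(new: float, base: float) -> float:
    """Percentage change relative to the baseline (used here for #Token)."""
    return 100.0 * (new - base) / base


# Hypothetical data: 2 problems, 8 rollouts each.
toy_rollouts = {
    "problem_a": [True] * 6 + [False] * 2,   # 6 of 8 correct
    "problem_b": [True] * 2 + [False] * 6,   # 2 of 8 correct
}
toy_p1 = pass_at_1(toy_rollouts)
toy_token_delta = relative_change(8000.0, 10000.0)
print(toy_p1, toy_token_delta)
```

A negative relative change in #Token means shorter responses than the baseline.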

Usage

For detailed usage instructions and source code, please refer to our FreeLM repository.
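For orientation only, a LoRA checkpoint like the one listed above can typically be attached to its base model with the `transformers` and `peft` libraries. The sketch below makes several assumptions: that the base weights are the `Qwen/Qwen3-8B` repository, that the adapter loads as a standard PEFT LoRA, and that `load_free_lm` is a hypothetical helper name. It does not reproduce the reasoning/cleaning inference pipeline, which lives in the project's source code.

```python
BASE_MODEL_ID = "Qwen/Qwen3-8B"                # assumption: base weights repo
ADAPTER_ID = "ldsjmdy/Qwen3-8B-FreeLM-LoRA"    # checkpoint listed in Resources


def load_free_lm(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (hypothetical helper)."""
    # Imported lazily so this module can be defined without the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model
```

Calling `load_free_lm()` downloads both repositories, so it needs network access and enough memory for an 8B model.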

Citation

If you find FreeLM useful for your research, please cite our paper:

@misc{zheng2026freelearningforgetmalloconly,
      title={Free(): Learning to Forget in Malloc-Only Reasoning Models}, 
      author={Yilun Zheng and Dongyang Ma and Tian Liang and Jiahao Xu and Xinting Huang and Lijie Chen and Haitao Mi and Yan Wang},
      year={2026},
      eprint={2602.08030},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2602.08030}, 
}