<div align="center">
<h1>Qwen3-8B-FreeLM-LoRA</h1>
</div>
<div align="center">
[](https://arxiv.org/abs/your_paper_link)
[](https://huggingface.co/collections/your_collection)
[](https://github.com/TemporaryLoRA/FreeLM)
[](./LICENSE)
</div>
Implementation of paper [Free(): Learning to Forget in Malloc-Only Reasoning Models]()
Reasoning models enhance problem-solving by scaling test-time compute, yet they face a critical paradox: excessive thinking tokens often degrade performance rather than improve it. We attribute this to a fundamental architectural flaw: standard LLMs operate as **"malloc-only" engines**, continuously accumulating valid and redundant steps alike, with no mechanism to prune obsolete information. To resolve this paradox, we propose **Free()LM**, a model that introduces an intrinsic self-forgetting capability via the **Free-Module**, a plug-and-play LoRA adapter. By iteratively switching between reasoning and cleaning modes, Free()LM dynamically identifies and prunes useless context chunks, maintaining a compact and noise-free state.
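
The reason-then-clean alternation can be pictured with a toy sketch. Everything below is hypothetical scaffolding for intuition only: in Free()LM the pruning decision is made by the trained Free-Module (a LoRA adapter), not by a hand-written rule like the one used here.

```python
# Toy illustration of the reason/clean loop described above.
# A trivial heuristic stands in for the Free-Module's learned judgment.

def reason_step(context, step_id):
    """Reasoning mode: append one new context chunk (stand-in for generation)."""
    useful = step_id % 2 == 0  # pretend odd-numbered steps turn out to be dead ends
    context.append({"id": step_id, "text": f"step-{step_id}", "useful": useful})
    return context

def free_step(context):
    """Cleaning mode: prune chunks judged obsolete, keeping the state compact."""
    return [chunk for chunk in context if chunk["useful"]]

context = []
for step in range(6):
    context = reason_step(context, step)
    if (step + 1) % 3 == 0:  # periodically switch to cleaning mode
        context = free_step(context)

print([c["id"] for c in context])  # only the useful chunks remain: [0, 2, 4]
```

The point of the interleaving is that the working context never grows monotonically: cleaning passes keep its size bounded even as reasoning continues.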
Extensive experiments show that Free()LM provides consistent improvements across all model scales (8B to 685B). It achieves a 3.3% average improvement over top-tier reasoning baselines, even establishing a new **SOTA** on IMOanswerBench using DeepSeek V3.2-Speciale.
Most notably, in long-horizon tasks where the standard Qwen3-235B-A22B model suffers a total collapse (0% accuracy), Free()LM restores performance to **~50%**. Our findings suggest that sustainable intelligence requires the freedom to forget as much as the power to think.

## Resources
| Base Model | Method | Checkpoint |
| :--- | :---: | :--- |
| **Qwen3-8B** | Free()LM | [🤗 ldsjmdy/Qwen3-8B-FreeLM-LoRA](https://huggingface.co/ldsjmdy/Qwen3-8B-FreeLM-LoRA) |
| **Qwen3-30B-A3B-Thinking-2507** | Free()LM | [🤗 ldsjmdy/Qwen3-30B-A3B-Thinking-2507-FreeLM-LoRA](https://huggingface.co/ldsjmdy/Qwen3-30B-A3B-Thinking-2507-FreeLM-LoRA) |
| **Qwen3-235B-A22B-Thinking-2507** | Free()LM | [🤗 ldsjmdy/Qwen3-235B-A3B-Thinking-2507-FreeLM-LoRA](https://huggingface.co/ldsjmdy/Qwen3-235B-A3B-Thinking-2507-FreeLM-LoRA) |
- Train/Eval Data: [🤗 ldsjmdy/FreeLM](https://huggingface.co/datasets/ldsjmdy/FreeLM)
## Performance

> Performance of Qwen3 models. We report pass@1 (p@1) performance computed over 8 rollouts, along with the average number of response tokens (#Token). For the Average columns, brackets represent the absolute change for p@1 and the relative change for #Token (where blue indicates improvement and red indicates regression).
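
As a reminder of how the metric is computed, pass@1 over n rollouts is simply the fraction of a problem's rollouts that are correct, averaged over the benchmark. A minimal sketch (the rollout results below are made up purely for illustration):

```python
# pass@1 over n rollouts: per-problem fraction of correct rollouts,
# then the mean over problems. Data here is illustrative only.

def pass_at_1(rollouts):
    """rollouts: list of per-problem lists of booleans (one entry per rollout)."""
    per_problem = [sum(r) / len(r) for r in rollouts]
    return sum(per_problem) / len(per_problem)

results = [
    [True, True, False, True, True, True, True, False],      # 6/8 correct
    [False, False, True, False, False, False, True, False],  # 2/8 correct
]
print(f"pass@1 = {pass_at_1(results):.3f}")  # (6/8 + 2/8) / 2 = 0.500
```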
## Usage
For detailed usage instructions and source code, please refer to our [FreeLM](https://github.com/TemporaryLoRA/FreeLM/tree/main) repository.
## Citation
If you find `FreeLM` useful for your research, please cite our paper:
```bibtex
```