<div align="center">

<h1>Qwen3-8B-FreeLM-LoRA</h1>

</div>

<div align="center">

[![Paper](https://img.shields.io/badge/Paper-ArXiv-b31b1b.svg)](https://arxiv.org/abs/your_paper_link)
[![Hugging Face Collections](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Collection-blue)](https://huggingface.co/collections/your_collection)
[![GitHub stars](https://img.shields.io/github/stars/TemporaryLoRA/FreeLM.svg?colorA=orange&colorB=orange&logo=github)](https://github.com/TemporaryLoRA/FreeLM)
[![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](./LICENSE)

</div>

Implementation of the paper [Free(): Learning to Forget in Malloc-Only Reasoning Models]()

Reasoning models enhance problem-solving by scaling test-time compute, yet they face a critical paradox: excessive thinking tokens often degrade performance rather than improve it. We attribute this to a fundamental architectural flaw: standard LLMs operate as **"malloc-only" engines**, accumulating valid and redundant steps alike with no mechanism to prune obsolete information. To break this cycle, we propose **Free()LM**, a model that introduces an intrinsic self-forgetting capability via the **Free-Module**, a plug-and-play LoRA adapter. By iteratively switching between reasoning and cleaning modes, Free()LM dynamically identifies and prunes useless context chunks, maintaining a compact, noise-free state.

Extensive experiments show that Free()LM delivers consistent improvements across all model scales (8B to 685B). It achieves a 3.3% average improvement over top-tier reasoning baselines, even establishing a new **SOTA** on IMOanswerBench with DeepSeek V3.2-Speciale.
Most notably, on long-horizon tasks where the standard Qwen3-235B-A22B model suffers a total collapse (0% accuracy), Free()LM restores performance to **~50%**. Our findings suggest that sustainable intelligence requires the freedom to forget as much as the power to think.
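
As a toy illustration only (the function names here are hypothetical stand-ins, not the released API — the actual mechanism lives in the Free-Module LoRA adapter), the alternating reasoning/cleaning loop can be sketched as:

```python
# Toy sketch of the alternating reasoning/cleaning loop described above.
# `reason_step` and `find_obsolete` stand in for the model's two modes;
# all names are hypothetical illustrations, not the released API.
def free_loop(problem, reason_step, find_obsolete, max_rounds=8):
    context = [problem]  # running chain of reasoning chunks
    for _ in range(max_rounds):
        # Reasoning mode: extend the context with one new step.
        step, done = reason_step(context)
        context.append(step)
        if done:
            break
        # Cleaning mode: free() the chunks flagged as obsolete
        # (index 0, the problem statement, is never pruned).
        obsolete = find_obsolete(context)
        context = [c for i, c in enumerate(context)
                   if i == 0 or i not in obsolete]
    return context
```

The point of the sketch is the interleaving: pruning happens between reasoning steps, so later steps condition on a compact context rather than the full accumulated trace.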

![Figure1_new](https://cdn-uploads.huggingface.co/production/uploads/6734a0fe3ed65dd196e40cfa/hhMi4OjxiSTTvZr7yVfnh.png)

## Resources

| Base Model | Method | Checkpoint |
| :--- | :---: | :--- |
| **Qwen3-8B** | Free()LM | [🤗 ldsjmdy/Qwen3-8B-FreeLM-LoRA](https://huggingface.co/ldsjmdy/Qwen3-8B-FreeLM-LoRA) |
| **Qwen3-30B-A3B-Thinking-2507** | Free()LM | [🤗 ldsjmdy/Qwen3-30B-A3B-Thinking-2507-FreeLM-LoRA](https://huggingface.co/ldsjmdy/Qwen3-30B-A3B-Thinking-2507-FreeLM-LoRA) |
| **Qwen3-235B-A3B-Thinking-2507** | Free()LM | [🤗 ldsjmdy/Qwen3-235B-A3B-Thinking-2507-FreeLM-LoRA](https://huggingface.co/ldsjmdy/Qwen3-235B-A3B-Thinking-2507-FreeLM-LoRA) |

- Train/Eval Data: [🤗 ldsjmdy/FreeLM](https://huggingface.co/datasets/ldsjmdy/FreeLM)

## Performance

![image](https://cdn-uploads.huggingface.co/production/uploads/6734a0fe3ed65dd196e40cfa/v8ynVFM5Wg00fZ7MVeu0I.png)
> Performance of Qwen3 models. We report pass@1 (p@1) computed over 8 rollouts, along with the average number of response tokens (#Token). In the Average columns, brackets show the absolute change for p@1 and the relative change for #Token (blue indicates improvement, red indicates regression).

## Usage

For detailed usage instructions and source code, please see the [FreeLM repository](https://github.com/TemporaryLoRA/FreeLM/tree/main).

## Citation

If you find `FreeLM` useful for your research, please cite our paper:

```bibtex
```