# Update Summary: Using LM Student in Comparison
## ✅ Changes Completed
Updated `compare_strategies.py` to use **LM Student (DistilBERT)** instead of MockStudentAgent for all three strategies:
1. **Random Strategy** - Now uses LM Student
2. **Progressive Strategy** - Now uses LM Student
3. **Teacher Strategy** - Now uses LM Student
## 🔧 Technical Changes
### 1. Added LM Student Import
- Added path to `student_agent_dev` directory
- Imports `StudentAgent` from `student_agent.py` as `LMStudentAgent`
- Falls back to `MockStudentAgent` if import fails
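
The import-with-fallback pattern described above could look roughly like the sketch below. The directory location, the `load_student_class` helper, and the placeholder `MockStudentAgent` class are assumptions made for illustration; the actual code in `compare_strategies.py` may be organized differently.

```python
import importlib
import sys
from pathlib import Path

# Assumed location of the LM student code; adjust to the real repository layout.
STUDENT_DIR = Path.cwd().parent / "student_agent_dev"


class MockStudentAgent:
    """Placeholder standing in for the project's real mock student."""


def load_student_class(module_name="student_agent", search_dir=STUDENT_DIR):
    """Return (student_class, using_lm), preferring the LM student."""
    # Make the student directory importable before trying the import.
    if str(search_dir) not in sys.path:
        sys.path.insert(0, str(search_dir))
    try:
        module = importlib.import_module(module_name)
        return module.StudentAgent, True
    except (ImportError, AttributeError) as exc:
        # Covers a missing module as well as a missing dependency inside it.
        print(f"Could not import LM Student: {exc}")
        return MockStudentAgent, False


StudentClass, USE_LM_STUDENT = load_student_class()
```

Returning a flag alongside the class lets the caller decide which constructor arguments to pass, since a mock student presumably does not accept the LM-specific settings.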
### 2. Updated All Three Strategy Functions
- `train_strategy_random()` - Uses LM Student
- `train_strategy_progressive()` - Uses LM Student
- `train_strategy_teacher()` - Uses LM Student
### 3. LM Student Configuration
All strategies use:
```python
student = LMStudentAgent(
    learning_rate=5e-5,             # LM fine-tuning learning rate
    retention_constant=80.0,        # Slower forgetting
    device='cpu',                   # CPU for compatibility
    max_length=256,                 # Max tokens
    gradient_accumulation_steps=4   # Stability
)
```
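
To illustrate what `gradient_accumulation_steps=4` usually means, here is a framework-free sketch on a one-parameter model. It assumes the setting has its conventional meaning (average the gradients of several examples, then apply one update); how `StudentAgent` implements it internally is not shown in this summary, and the toy model and function names are hypothetical.

```python
def grad(w, x, y):
    """Gradient of the squared error (w * x - y) ** 2 with respect to w."""
    return 2.0 * x * (w * x - y)


def train_with_accumulation(w, examples, learning_rate, accumulation_steps):
    """Apply one weight update per `accumulation_steps` examples."""
    accumulated = 0.0
    for step, (x, y) in enumerate(examples, start=1):
        # Scale each gradient so the accumulated value is an average.
        accumulated += grad(w, x, y) / accumulation_steps
        if step % accumulation_steps == 0:
            w -= learning_rate * accumulated
            accumulated = 0.0
    # Gradients from an incomplete final group are discarded in this sketch.
    return w


examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_after = train_with_accumulation(0.0, examples, learning_rate=0.01, accumulation_steps=4)
```

Averaging over four examples before each update smooths out the noise of any single example, which is the "stability" the configuration comment refers to.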
### 4. Fallback Support
If LM Student cannot be imported, automatically falls back to MockStudentAgent.
## How to Run
```bash
cd teacher_agent_dev
# Quick test (50 iterations)
python compare_strategies.py --iterations 50 --deterministic
# Full comparison (500 iterations - will take longer with LM)
python compare_strategies.py --iterations 500 --deterministic
```
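
For readers unfamiliar with the two flags, the sketch below shows one way a script could parse `--iterations` and `--deterministic` and use the latter to fix the random seed. The default value, the seed, and the helper names are assumptions; the real `compare_strategies.py` may handle these options differently, and a run involving DistilBERT would likely need to seed its other random sources as well.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse the two command-line options used in the commands above."""
    parser = argparse.ArgumentParser(description="Compare training strategies (sketch).")
    # The default of 500 is an assumption, not taken from the real script.
    parser.add_argument("--iterations", type=int, default=500,
                        help="number of training iterations per strategy")
    parser.add_argument("--deterministic", action="store_true",
                        help="fix the random seed for reproducible runs")
    return parser.parse_args(argv)


def seed_if_deterministic(args, seed=42):
    """Seed Python's RNG when --deterministic is set (seed value is hypothetical)."""
    if args.deterministic:
        random.seed(seed)


args = parse_args(["--iterations", "50", "--deterministic"])
seed_if_deterministic(args)
```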
## ⚠️ Performance Notes
**LM Student is much slower** than MockStudentAgent because:
- Each `answer()` call runs DistilBERT inference
- Each `learn()` call fine-tunes DistilBERT (forward + backward pass)
- Memory decay calculations add further overhead
**Expected runtime:**
- MockStudentAgent: ~30 seconds for 500 iterations
- LM Student: ~15-30 minutes for 500 iterations
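
Converting these estimates to per-iteration costs makes the gap easier to see. The short calculation below uses only the approximate figures quoted above, so its results are rough estimates rather than measurements.

```python
ITERATIONS = 500

mock_total_s = 30                  # ~30 seconds for MockStudentAgent
lm_total_s = (15 * 60, 30 * 60)    # ~15-30 minutes for LM Student, in seconds

# Average wall-clock time per iteration.
mock_per_iter = mock_total_s / ITERATIONS                # 0.06 s
lm_per_iter = tuple(t / ITERATIONS for t in lm_total_s)  # 1.8 s to 3.6 s

# How many times slower the LM Student run is overall.
slowdown = tuple(t / mock_total_s for t in lm_total_s)   # 30x to 60x

print(f"Mock: {mock_per_iter:.2f} s/iter")
print(f"LM:   {lm_per_iter[0]:.1f}-{lm_per_iter[1]:.1f} s/iter")
print(f"Slowdown: {slowdown[0]:.0f}x-{slowdown[1]:.0f}x")
```

At roughly 2 to 4 seconds per iteration, the 50-iteration quick test is the more practical choice for checking that the setup works.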
## What to Expect
With LM Student:
- **More realistic learning**: Actual neural network learning vs simple skill tracking
- **Slower convergence**: LM needs more examples to learn patterns
- **Different results**: LM behavior differs from mock student
- **Memory decay**: Ebbinghaus forgetting curve affects LM predictions
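
The memory-decay point can be made concrete with the exponential form commonly used for the Ebbinghaus forgetting curve, retention = exp(-t / S). The sketch assumes `retention_constant` plays the role of S and that elapsed time is measured in whatever unit the student uses internally; neither is confirmed by this summary, and the `retention` function is a hypothetical helper.

```python
import math


def retention(elapsed, retention_constant=80.0):
    """Fraction of a skill retained after `elapsed` time units (assumed exponential decay)."""
    return math.exp(-elapsed / retention_constant)


# A larger retention constant means slower forgetting.
for constant in (20.0, 80.0):
    values = ", ".join(f"{retention(t, constant):.2f}" for t in (0, 20, 80))
    print(f"S={constant:>4}: retention at t=0, 20, 80 -> {values}")
```

Under this assumption, a constant of 80 leaves about 37% retention after 80 time units, whereas a constant of 20 reaches that level after only 20.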
## ✅ Verification
The code is ready to run. When you execute it:
1. You'll see `✅ Using LM Student (DistilBERT)` if the import succeeds
2. Or `⚠️ Could not import LM Student` if the `transformers` library is missing
3. All three strategies will use the same student type
## Next Steps
Run the comparison and analyze results:
- Does the teacher strategy still outperform random/progressive?
- How does LM learning differ from mock student?
- What patterns emerge with real neural network learning?