# f9237b85afe320e346c08794a374f118
This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [fr-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 1.0682
- Data Size: 1.0
- Epoch Runtime: 420.0688
- Bleu: 8.8443
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
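The total batch sizes listed above follow from the per-device sizes multiplied by the device count. A minimal sketch of that arithmetic, using the values from this list:

```python
# Per-device settings and device count, copied from the hyperparameters above.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

# Effective (total) batch sizes across all GPUs.
total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```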
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 3.3130 | 0 | 27.7994 | 0.5982 |
| No log | 1.0 | 1000 | 2.7566 | 0.0078 | 36.8077 | 1.6539 |
| No log | 2.0 | 2000 | 2.5586 | 0.0156 | 38.8126 | 1.7336 |
| No log | 3.0 | 3000 | 2.3754 | 0.0312 | 45.6193 | 2.1060 |
| 0.0935 | 4.0 | 4000 | 2.2032 | 0.0625 | 58.2706 | 2.7375 |
| 2.3525 | 5.0 | 5000 | 2.0270 | 0.125 | 84.7273 | 3.2737 |
| 0.1298 | 6.0 | 6000 | 1.8329 | 0.25 | 130.1406 | 4.0081 |
| 0.1659 | 7.0 | 7000 | 1.6288 | 0.5 | 230.0070 | 4.9057 |
| 1.5884 | 8.0 | 8000 | 1.4209 | 1.0 | 425.6913 | 6.1489 |
| 1.4317 | 9.0 | 9000 | 1.3147 | 1.0 | 435.2061 | 6.7954 |
| 1.3169 | 10.0 | 10000 | 1.2400 | 1.0 | 426.8136 | 7.2651 |
| 1.2384 | 11.0 | 11000 | 1.1938 | 1.0 | 447.4447 | 7.6365 |
| 1.1637 | 12.0 | 12000 | 1.1566 | 1.0 | 444.3004 | 7.8984 |
| 1.1049 | 13.0 | 13000 | 1.1355 | 1.0 | 429.0359 | 8.0949 |
| 1.0587 | 14.0 | 14000 | 1.1126 | 1.0 | 429.8473 | 8.1326 |
| 0.997 | 15.0 | 15000 | 1.1031 | 1.0 | 422.7297 | 8.2998 |
| 0.9601 | 16.0 | 16000 | 1.0837 | 1.0 | 428.8351 | 8.4219 |
| 0.8953 | 17.0 | 17000 | 1.0751 | 1.0 | 432.1069 | 8.4233 |
| 0.888 | 18.0 | 18000 | 1.0752 | 1.0 | 428.3797 | 8.4970 |
| 0.8423 | 19.0 | 19000 | 1.0645 | 1.0 | 414.9656 | 8.6346 |
| 0.799 | 20.0 | 20000 | 1.0649 | 1.0 | 429.7402 | 8.6474 |
| 0.7798 | 21.0 | 21000 | 1.0522 | 1.0 | 445.6019 | 8.6459 |
| 0.7559 | 22.0 | 22000 | 1.0573 | 1.0 | 427.1178 | 8.7030 |
| 0.7149 | 23.0 | 23000 | 1.0628 | 1.0 | 434.0112 | 8.7722 |
| 0.6874 | 24.0 | 24000 | 1.0620 | 1.0 | 426.1050 | 8.7406 |
| 0.6829 | 25.0 | 25000 | 1.0682 | 1.0 | 420.0688 | 8.8443 |
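The Data Size column appears to follow a doubling curriculum: the training fraction starts near 1/128 at epoch 1 and doubles every epoch until it is capped at the full dataset from epoch 8 onward. A minimal sketch of that apparent schedule (inferred from the table above, not a documented training option):

```python
def data_fraction(epoch: int) -> float:
    """Apparent data-size schedule read off the results table:
    0 at epoch 0, then 1/128 doubling each epoch, capped at 1.0."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2 ** (epoch - 1) / 128)

# Reproduces the Data Size column: 0.0078, 0.0156, 0.0312, 0.0625,
# 0.125, 0.25, 0.5, then 1.0 from epoch 8 onward.
print([round(data_fraction(e), 4) for e in range(9)])
```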
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1