# f6e53d4528a3ca2713896281ee674c88
This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [fr-ru] dataset. It achieves the following results on the evaluation set:
- Loss: 1.3180
- Data Size: 1.0
- Epoch Runtime: 96.8225
- Bleu: 11.7898
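For quick use, a minimal inference sketch follows. The repo id is taken from the card title; the "translate French to Russian:" task prefix is an assumption (T5 checkpoints are commonly prompted this way, but the card does not document the preprocessing).

```python
# Minimal inference sketch. Assumptions: the checkpoint is published under the
# repo id from the card title, and training used T5's usual task prefix.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/f6e53d4528a3ca2713896281ee674c88"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate French to Russian: Le ciel est bleu."  # prefix is an assumption
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```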
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
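The card leaves this section blank. As a starting point, here is a hedged sketch of loading the dataset named in the header; how this run split it into train and evaluation sets is not documented, so the split below is an assumption.

```python
# Hedged sketch: opus_books ships a single "train" split, so an evaluation
# split has to be carved out manually. The 10% test fraction is an assumption,
# not taken from this card.
from datasets import load_dataset

ds = load_dataset("Helsinki-NLP/opus_books", "fr-ru")
ds = ds["train"].train_test_split(test_size=0.1, seed=42)
print(ds["train"][0]["translation"])  # {"fr": "...", "ru": "..."}
```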
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a hedged reconstruction in code follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
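For readers who want to reproduce the setup, the list above maps onto standard transformers training arguments roughly as follows; output_dir is a placeholder, and the 4-GPU layout comes from the launcher (e.g. torchrun or accelerate), not from these arguments.

```python
# Hedged reconstruction of the hyperparameters listed above. With 4 GPUs and a
# per-device batch size of 8, the launcher yields the total train/eval batch
# size of 32 reported in the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-fr-ru",  # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumption: needed to compute BLEU during eval
)
```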
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.3306 | 0 | 7.9301 | 0.1390 |
| No log | 1 | 204 | 2.1799 | 0.0078 | 8.4599 | 0.2169 |
| No log | 2 | 408 | 2.0878 | 0.0156 | 9.4211 | 0.6286 |
| No log | 3 | 612 | 2.0095 | 0.0312 | 12.8142 | 0.7854 |
| No log | 4 | 816 | 1.9499 | 0.0625 | 15.7047 | 0.9487 |
| No log | 5 | 1020 | 1.8639 | 0.125 | 21.4848 | 1.4115 |
| 0.1617 | 6 | 1224 | 1.7845 | 0.25 | 30.4647 | 2.2086 |
| 1.8586 | 7 | 1428 | 1.6954 | 0.5 | 49.4104 | 3.1995 |
| 1.7346 | 8 | 1632 | 1.5843 | 1.0 | 95.0287 | 5.0825 |
| 1.6211 | 9 | 1836 | 1.5146 | 1.0 | 93.4640 | 6.0987 |
| 1.5591 | 10 | 2040 | 1.4639 | 1.0 | 94.7307 | 6.9579 |
| 1.4686 | 11 | 2244 | 1.4286 | 1.0 | 94.1672 | 7.8988 |
| 1.4076 | 12 | 2448 | 1.3924 | 1.0 | 93.2586 | 8.3769 |
| 1.3523 | 13 | 2652 | 1.3680 | 1.0 | 92.6734 | 8.8596 |
| 1.2930 | 14 | 2856 | 1.3558 | 1.0 | 94.5879 | 9.2092 |
| 1.2544 | 15 | 3060 | 1.3414 | 1.0 | 93.3163 | 9.8379 |
| 1.1830 | 16 | 3264 | 1.3226 | 1.0 | 91.5752 | 10.1714 |
| 1.1611 | 17 | 3468 | 1.3159 | 1.0 | 95.1535 | 10.4638 |
| 1.1272 | 18 | 3672 | 1.3185 | 1.0 | 89.5321 | 10.8573 |
| 1.0795 | 19 | 3876 | 1.2992 | 1.0 | 90.8814 | 10.8660 |
| 1.0368 | 20 | 4080 | 1.3040 | 1.0 | 92.0609 | 11.1123 |
| 1.0001 | 21 | 4284 | 1.2934 | 1.0 | 93.6321 | 10.9969 |
| 0.9653 | 22 | 4488 | 1.3072 | 1.0 | 97.6585 | 11.5471 |
| 0.9493 | 23 | 4692 | 1.2985 | 1.0 | 93.9973 | 11.5275 |
| 0.9054 | 24 | 4896 | 1.3071 | 1.0 | 95.7446 | 11.7489 |
| 0.8932 | 25 | 5100 | 1.3180 | 1.0 | 96.8225 | 11.7898 |
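The Data Size column appears to report the fraction of the training set seen in each epoch, roughly doubling from 1/128 until the full dataset is used from epoch 8 onward; the headline numbers at the top of the card correspond to the final epoch (25). The BLEU column can be reproduced with the evaluate library's sacrebleu metric, though the exact metric configuration of this run is an assumption.

```python
# Hedged sketch of the BLEU computation; predictions/references are toy examples.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Небо голубое."]   # decoded model outputs
references = [["Небо синее."]]    # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```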
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1