f6e53d4528a3ca2713896281ee674c88

This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [fr-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3180
  • Data Size: 1.0
  • Epoch Runtime: 96.8225
  • Bleu: 11.7898

Model description

More information needed

Intended uses & limitations

More information needed
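The card ships no usage snippet. A minimal inference sketch, assuming the checkpoint is published under the repo id shown in the title and that inputs use a T5-style task prefix (the exact prefix used during fine-tuning is not documented here, so the one below is a guess):

```python
from transformers import pipeline

# Repo id taken from the card title; adjust to the actual Hub path if it differs.
translator = pipeline(
    "translation",
    model="contemmcm/f6e53d4528a3ca2713896281ee674c88",
)

# T5 checkpoints usually expect a task prefix; the prefix used in this
# fine-tune is undocumented, so "translate French to Russian: " is an assumption.
out = translator("translate French to Russian: Bonjour, comment allez-vous ?")
print(out[0]["translation_text"])
```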

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
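The training script itself is not part of this card; as a sketch only, the hyperparameters above map onto the standard Transformers `Seq2SeqTrainingArguments` roughly as follows (field names are the stock Transformers ones, the output directory is hypothetical, and anything not listed above is left at its default):

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-fr-ru",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # required to compute BLEU during evaluation
)
```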

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 2.3306          | 0         | 7.9301        | 0.1390  |
| No log        | 1     | 204  | 2.1799          | 0.0078    | 8.4599        | 0.2169  |
| No log        | 2     | 408  | 2.0878          | 0.0156    | 9.4211        | 0.6286  |
| No log        | 3     | 612  | 2.0095          | 0.0312    | 12.8142       | 0.7854  |
| No log        | 4     | 816  | 1.9499          | 0.0625    | 15.7047       | 0.9487  |
| No log        | 5     | 1020 | 1.8639          | 0.125     | 21.4848       | 1.4115  |
| 0.1617        | 6     | 1224 | 1.7845          | 0.25      | 30.4647       | 2.2086  |
| 1.8586        | 7     | 1428 | 1.6954          | 0.5       | 49.4104       | 3.1995  |
| 1.7346        | 8.0   | 1632 | 1.5843          | 1.0       | 95.0287       | 5.0825  |
| 1.6211        | 9.0   | 1836 | 1.5146          | 1.0       | 93.4640       | 6.0987  |
| 1.5591        | 10.0  | 2040 | 1.4639          | 1.0       | 94.7307       | 6.9579  |
| 1.4686        | 11.0  | 2244 | 1.4286          | 1.0       | 94.1672       | 7.8988  |
| 1.4076        | 12.0  | 2448 | 1.3924          | 1.0       | 93.2586       | 8.3769  |
| 1.3523        | 13.0  | 2652 | 1.3680          | 1.0       | 92.6734       | 8.8596  |
| 1.293         | 14.0  | 2856 | 1.3558          | 1.0       | 94.5879       | 9.2092  |
| 1.2544        | 15.0  | 3060 | 1.3414          | 1.0       | 93.3163       | 9.8379  |
| 1.183         | 16.0  | 3264 | 1.3226          | 1.0       | 91.5752       | 10.1714 |
| 1.1611        | 17.0  | 3468 | 1.3159          | 1.0       | 95.1535       | 10.4638 |
| 1.1272        | 18.0  | 3672 | 1.3185          | 1.0       | 89.5321       | 10.8573 |
| 1.0795        | 19.0  | 3876 | 1.2992          | 1.0       | 90.8814       | 10.8660 |
| 1.0368        | 20.0  | 4080 | 1.3040          | 1.0       | 92.0609       | 11.1123 |
| 1.0001        | 21.0  | 4284 | 1.2934          | 1.0       | 93.6321       | 10.9969 |
| 0.9653        | 22.0  | 4488 | 1.3072          | 1.0       | 97.6585       | 11.5471 |
| 0.9493        | 23.0  | 4692 | 1.2985          | 1.0       | 93.9973       | 11.5275 |
| 0.9054        | 24.0  | 4896 | 1.3071          | 1.0       | 95.7446       | 11.7489 |
| 0.8932        | 25.0  | 5100 | 1.3180          | 1.0       | 96.8225       | 11.7898 |
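The Bleu column is a corpus-level score; the card does not say how it was computed (most likely sacrebleu inside the evaluation loop). As a simplified illustration of what the metric measures, here is a minimal single-reference sentence BLEU using the standard geometric mean of 1–4-gram precisions with a brevity penalty; it is a sketch, not the scorer used for this card:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU (0-100 scale)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum((cand & ref).values())      # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0                                # any zero precision -> score 0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return 100.0 * bp * geo_mean

print(bleu("le chat dort sur le tapis".split(),
           "le chat dort sur le tapis".split()))  # identical -> 100.0
```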

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1