1fa145fa8b67260aa444d6a25f73ece2

This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [it-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0124
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 29.7277
  • Bleu: 3.2004
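
For quick use, below is a minimal inference sketch. The repository id is taken from the card title; the translation direction and the `translate Italian to Dutch:` task prefix are assumptions modeled on T5's usual convention and may need to match whatever prefix was used during fine-tuning.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repository id assumed from the card title; the task prefix below is a
# guess and should be adjusted to match the one used during fine-tuning.
model_id = "contemmcm/1fa145fa8b67260aa444d6a25f73ece2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate Italian to Dutch: Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```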

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
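
The card names Helsinki-NLP/opus_books [it-nl] as the corpus, but the split and preprocessing used for training and evaluation are not documented. A minimal sketch for inspecting the raw dataset:

```python
from datasets import load_dataset

# opus_books configs are language pairs; "it-nl" matches the pair named
# at the top of this card. The dataset ships a single "train" split,
# so any train/eval division is left to the training script.
ds = load_dataset("Helsinki-NLP/opus_books", "it-nl")
print(ds)

example = ds["train"][0]
print(example["translation"]["it"], "->", example["translation"]["nl"])
```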

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
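
As an illustration, these settings map onto transformers' `Seq2SeqTrainingArguments` roughly as follows; this is a sketch, not the published training script.

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative mapping of the reported hyperparameters. The multi-GPU setup
# (distributed_type: multi-GPU, num_devices: 4) comes from the launcher
# (e.g. torchrun or accelerate), which is what turns the per-device batch
# size of 8 into the total batch size of 32 reported above.
args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-it-nl",  # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumption: generation is needed to score BLEU
)
```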

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 6.3244          | 0         | 2.6261        | 0.1038 |
| No log        | 1     | 58   | 5.4778          | 0.0078    | 3.3770        | 0.1298 |
| No log        | 2     | 116  | 4.1332          | 0.0156    | 4.0611        | 0.1329 |
| No log        | 3     | 174  | 3.4639          | 0.0312    | 6.3753        | 0.1487 |
| No log        | 4     | 232  | 3.1592          | 0.0625    | 7.3335        | 0.4491 |
| No log        | 5     | 290  | 2.9883          | 0.125     | 10.2179       | 0.5884 |
| 0.315         | 6     | 348  | 2.8525          | 0.25      | 15.1703       | 0.5602 |
| 0.4281        | 7     | 406  | 2.6831          | 0.5       | 20.7401       | 0.9285 |
| 2.0256        | 8.0   | 464  | 2.5202          | 1.0       | 33.3593       | 1.0862 |
| 2.6656        | 9.0   | 522  | 2.4199          | 1.0       | 31.5950       | 1.4794 |
| 2.5431        | 10.0  | 580  | 2.3434          | 1.0       | 30.9194       | 1.5444 |
| 2.4782        | 11.0  | 638  | 2.2809          | 1.0       | 31.4412       | 1.8279 |
| 2.3984        | 12.0  | 696  | 2.2351          | 1.0       | 31.5773       | 1.9576 |
| 2.262         | 13.0  | 754  | 2.1898          | 1.0       | 31.0741       | 2.1180 |
| 2.2022        | 14.0  | 812  | 2.1611          | 1.0       | 32.1944       | 2.3627 |
| 2.1468        | 15.0  | 870  | 2.1368          | 1.0       | 31.0629       | 2.3570 |
| 2.0988        | 16.0  | 928  | 2.1098          | 1.0       | 31.1569       | 2.5070 |
| 2.0671        | 17.0  | 986  | 2.0967          | 1.0       | 30.5432       | 2.7344 |
| 2.0133        | 18.0  | 1044 | 2.0732          | 1.0       | 30.4927       | 2.8833 |
| 1.9444        | 19.0  | 1102 | 2.0581          | 1.0       | 30.2351       | 2.9093 |
| 1.8903        | 20.0  | 1160 | 2.0422          | 1.0       | 30.6885       | 3.0504 |
| 1.8688        | 21.0  | 1218 | 2.0334          | 1.0       | 30.6839       | 2.9917 |
| 1.8279        | 22.0  | 1276 | 2.0312          | 1.0       | 29.5904       | 3.0427 |
| 1.7933        | 23.0  | 1334 | 2.0237          | 1.0       | 31.0217       | 3.1269 |
| 1.7654        | 24.0  | 1392 | 2.0120          | 1.0       | 31.0672       | 3.1205 |
| 1.7111        | 25.0  | 1450 | 2.0115          | 1.0       | 31.6891       | 3.1282 |
| 1.6784        | 26.0  | 1508 | 2.0074          | 1.0       | 29.6392       | 3.1184 |
| 1.6338        | 27.0  | 1566 | 2.0056          | 1.0       | 32.5466       | 3.1139 |
| 1.6191        | 28.0  | 1624 | 2.0007          | 1.0       | 30.7501       | 3.2825 |
| 1.5961        | 29.0  | 1682 | 2.0035          | 1.0       | 30.3288       | 3.2258 |
| 1.5631        | 30.0  | 1740 | 2.0008          | 1.0       | 30.8249       | 3.2848 |
| 1.5464        | 31.0  | 1798 | 2.0033          | 1.0       | 30.2752       | 3.1802 |
| 1.4867        | 32.0  | 1856 | 2.0124          | 1.0       | 29.7277       | 3.2004 |
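
Although num_epochs was set to 50, the log ends at epoch 32, which suggests an early-stopping criterion on the validation set. The Bleu column is corpus-level BLEU on the evaluation set; the card does not state which implementation produced it. A common setup scores decoded predictions with the evaluate library's sacrebleu metric, roughly as follows (metric choice and post-processing are assumptions):

```python
import evaluate

# sacrebleu is the usual choice for translation; the card does not confirm
# which BLEU variant produced the numbers in the table above.
metric = evaluate.load("sacrebleu")
predictions = ["De kat slaapt op de bank."]   # decoded model outputs
references = [["De kat slaapt op de bank."]]  # one list of references per prediction
result = metric.compute(predictions=predictions, references=references)
print(result["score"])  # corpus BLEU on a 0-100 scale
```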

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1