f6e53d4528a3ca2713896281ee674c88

This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [fr-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3180
  • Data Size: 1.0
  • Epoch Runtime: 96.8225
  • Bleu: 11.7898

Model description

More information needed

Intended uses & limitations

More information needed
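The card ships no usage snippet. A minimal inference sketch, assuming the checkpoint is published under the repo id shown in the title and that inputs use a T5-style task prefix (the exact prefix used during fine-tuning is not documented here, so the one below is a guess):

```python
from transformers import pipeline

# Repo id taken from the card title; adjust to the actual Hub path if it differs.
translator = pipeline(
    "translation",
    model="contemmcm/f6e53d4528a3ca2713896281ee674c88",
)

# T5 checkpoints usually expect a task prefix; the prefix used in this
# fine-tune is undocumented, so "translate French to Russian: " is an assumption.
out = translator("translate French to Russian: Bonjour, comment allez-vous ?")
print(out[0]["translation_text"])
```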

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
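The training script itself is not part of this card; as a sketch only, the hyperparameters above map onto the standard Transformers `Seq2SeqTrainingArguments` roughly as follows (field names are the stock Transformers ones, the output directory is hypothetical, and anything not listed above is left at its default):

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-large-opus-books-fr-ru",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # required to compute BLEU during evaluation
)
```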

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 2.3306          | 0         | 7.9301        | 0.1390  |
| No log        | 1     | 204  | 2.1799          | 0.0078    | 8.4599        | 0.2169  |
| No log        | 2     | 408  | 2.0878          | 0.0156    | 9.4211        | 0.6286  |
| No log        | 3     | 612  | 2.0095          | 0.0312    | 12.8142       | 0.7854  |
| No log        | 4     | 816  | 1.9499          | 0.0625    | 15.7047       | 0.9487  |
| No log        | 5     | 1020 | 1.8639          | 0.125     | 21.4848       | 1.4115  |
| 0.1617        | 6     | 1224 | 1.7845          | 0.25      | 30.4647       | 2.2086  |
| 1.8586        | 7     | 1428 | 1.6954          | 0.5       | 49.4104       | 3.1995  |
| 1.7346        | 8.0   | 1632 | 1.5843          | 1.0       | 95.0287       | 5.0825  |
| 1.6211        | 9.0   | 1836 | 1.5146          | 1.0       | 93.4640       | 6.0987  |
| 1.5591        | 10.0  | 2040 | 1.4639          | 1.0       | 94.7307       | 6.9579  |
| 1.4686        | 11.0  | 2244 | 1.4286          | 1.0       | 94.1672       | 7.8988  |
| 1.4076        | 12.0  | 2448 | 1.3924          | 1.0       | 93.2586       | 8.3769  |
| 1.3523        | 13.0  | 2652 | 1.3680          | 1.0       | 92.6734       | 8.8596  |
| 1.293         | 14.0  | 2856 | 1.3558          | 1.0       | 94.5879       | 9.2092  |
| 1.2544        | 15.0  | 3060 | 1.3414          | 1.0       | 93.3163       | 9.8379  |
| 1.183         | 16.0  | 3264 | 1.3226          | 1.0       | 91.5752       | 10.1714 |
| 1.1611        | 17.0  | 3468 | 1.3159          | 1.0       | 95.1535       | 10.4638 |
| 1.1272        | 18.0  | 3672 | 1.3185          | 1.0       | 89.5321       | 10.8573 |
| 1.0795        | 19.0  | 3876 | 1.2992          | 1.0       | 90.8814       | 10.8660 |
| 1.0368        | 20.0  | 4080 | 1.3040          | 1.0       | 92.0609       | 11.1123 |
| 1.0001        | 21.0  | 4284 | 1.2934          | 1.0       | 93.6321       | 10.9969 |
| 0.9653        | 22.0  | 4488 | 1.3072          | 1.0       | 97.6585       | 11.5471 |
| 0.9493        | 23.0  | 4692 | 1.2985          | 1.0       | 93.9973       | 11.5275 |
| 0.9054        | 24.0  | 4896 | 1.3071          | 1.0       | 95.7446       | 11.7489 |
| 0.8932        | 25.0  | 5100 | 1.3180          | 1.0       | 96.8225       | 11.7898 |
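The Bleu column is a corpus-level score; the card does not say how it was computed (most likely sacrebleu inside the evaluation loop). As a simplified illustration of what the metric measures, here is a minimal single-reference sentence BLEU using the standard geometric mean of 1–4-gram precisions with a brevity penalty; it is a sketch, not the scorer used for this card:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU (0-100 scale)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum((cand & ref).values())      # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0                                # any zero precision -> score 0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return 100.0 * bp * geo_mean

print(bleu("le chat dort sur le tapis".split(),
           "le chat dort sur le tapis".split()))  # identical -> 100.0
```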

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1