davanstrien/fineweb-swe_latn-quality-transformer

@@ -19,17 +19,17 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [EuroBERT/EuroBERT-210m](https://huggingface.co/EuroBERT/EuroBERT-210m) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6033
-- F1: 0.6325
-- Accuracy: 0.6910
-- Confusion Matrix: 26 44
-11 97
-- High Precision: 0.7027
-- High Recall: 0.3714
-- High F1: 0.4860
-- Low Precision: 0.6879
-- Low Recall: 0.8981
-- Low F1: 0.7791
 ## Model description
@@ -57,19 +57,31 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 3
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | F1     | Accuracy | Confusion Matrix | High Precision | High Recall | High F1 | Low Precision | Low Recall | Low F1 |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:--------:|:----------------:|:--------------:|:-----------:|:-------:|:-------------:|:----------:|:------:|
-| No log        | 1.0   | 5    | 0.6403          | 0.3776 | 0.6067   | 0 70
 0 108       | 0.0            | 0.0         | 0.0     | 0.6067        | 1.0        | 0.7552 |
-| 0.9281        | 2.0   | 10   | 1.0296          | 0.3776 | 0.6067   | 0 70
 0 108       | 0.0            | 0.0         | 0.0     | 0.6067        | 1.0        | 0.7552 |
-| 0.9281        | 3.0   | 15   | 0.6033          | 0.6325 | 0.6910   | 26 44
-11 97      | 0.7027         | 0.3714      | 0.4860  | 0.6879        | 0.8981     | 0.7791 |
 ### Framework versions

 This model is a fine-tuned version of [EuroBERT/EuroBERT-210m](https://huggingface.co/EuroBERT/EuroBERT-210m) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.5507
+- F1: 0.7041
+- Accuracy: 0.7079
+- Confusion Matrix: 53 17
+35 73
+- High Precision: 0.6023
+- High Recall: 0.7571
+- High F1: 0.6709
+- Low Precision: 0.8111
+- Low Recall: 0.6759
+- Low F1: 0.7374
 ## Model description
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 50
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | F1     | Accuracy | Confusion Matrix | High Precision | High Recall | High F1 | Low Precision | Low Recall | Low F1 |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:--------:|:----------------:|:--------------:|:-----------:|:-------:|:-------------:|:----------:|:------:|
+| No log        | 1.0   | 5    | 0.7080          | 0.4341 | 0.4719   | 19 51
+43 65      | 0.3065         | 0.2714      | 0.2879  | 0.5603        | 0.6019     | 0.5804 |
+| 0.8946        | 2.0   | 10   | 0.8359          | 0.3776 | 0.6067   | 0 70
 0 108       | 0.0            | 0.0         | 0.0     | 0.6067        | 1.0        | 0.7552 |
+| 0.8946        | 3.0   | 15   | 0.6091          | 0.6435 | 0.6461   | 50 20
+43 65      | 0.5376         | 0.7143      | 0.6135  | 0.7647        | 0.6019     | 0.6736 |
+| 0.6111        | 4.0   | 20   | 0.7509          | 0.3776 | 0.6067   | 0 70
 0 108       | 0.0            | 0.0         | 0.0     | 0.6067        | 1.0        | 0.7552 |
+| 0.6111        | 5.0   | 25   | 0.7014          | 0.4200 | 0.6180   | 3 67
+1 107       | 0.75           | 0.0429      | 0.0811  | 0.6149        | 0.9907     | 0.7589 |
+| 0.5827        | 6.0   | 30   | 0.5507          | 0.7041 | 0.7079   | 53 17
+35 73      | 0.6023         | 0.7571      | 0.6709  | 0.8111        | 0.6759     | 0.7374 |
+| 0.5827        | 7.0   | 35   | 0.5907          | 0.6963 | 0.6966   | 59 11
+43 65      | 0.5784         | 0.8429      | 0.6860  | 0.8553        | 0.6019     | 0.7065 |
+| 0.3865        | 8.0   | 40   | 0.6183          | 0.6468 | 0.7079   | 26 44
+8 100      | 0.7647         | 0.3714      | 0.5     | 0.6944        | 0.9259     | 0.7937 |
+| 0.3865        | 9.0   | 45   | 1.1120          | 0.5645 | 0.6685   | 16 54
+5 103      | 0.7619         | 0.2286      | 0.3516  | 0.6561        | 0.9537     | 0.7774 |
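The log above ends at epoch 9 although `num_epochs` is 50, and the headline metrics match the epoch 6 row, which has the lowest validation loss. The diff does not say why. One possible explanation, assumed here and not confirmed by the source, is early stopping on validation loss with the best checkpoint restored afterwards. The sketch below replays the logged validation losses through a plain patience counter; the function name and `patience=3` are hypothetical choices for illustration only.

```python
def simulate_early_stopping(val_losses, patience):
    """Replay per-epoch validation losses through a simple patience rule.

    Returns (best_epoch, last_epoch_run), both 1-indexed.
    Assumption: training stops once `patience` consecutive epochs
    fail to improve on the best validation loss seen so far.
    """
    best_epoch = 0
    best_loss = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


# Validation losses copied from the new training log (epochs 1 to 9).
logged = [0.7080, 0.8359, 0.6091, 0.7509, 0.7014, 0.5507, 0.5907, 0.6183, 1.1120]
print(simulate_early_stopping(logged, patience=3))  # (6, 9)
```

Under that assumed patience, the replay stops at epoch 9 and keeps epoch 6, which is consistent with the log but does not prove that this mechanism was used.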
 ### Framework versions

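The per-class figures in the README diff above can be re-derived from the confusion matrices. The sketch below assumes rows are true labels and columns are predicted labels, with the classes ordered high then low; that layout is not stated in the card, but it is the one that reproduces the reported numbers. It also assumes the headline F1 is the unweighted mean of the two class F1 scores. The function name is hypothetical.

```python
def metrics_from_confusion(cm, labels=("high", "low")):
    """Compute accuracy, per-class precision/recall/F1, and macro F1.

    Assumption: cm[i][j] counts examples with true class i
    that were predicted as class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in cm)
    out = {"accuracy": sum(cm[i][i] for i in range(n)) / total}
    f1_scores = []
    for i, name in enumerate(labels):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        out[f"{name}_precision"] = precision
        out[f"{name}_recall"] = recall
        out[f"{name}_f1"] = f1
        f1_scores.append(f1)
    out["macro_f1"] = sum(f1_scores) / n
    return out


# Confusion matrix reported for the new checkpoint.
new = metrics_from_confusion([[53, 17], [35, 73]])
print({k: round(v, 4) for k, v in new.items()})
```

The degenerate rows `0 70 / 0 108` fall out of the same formulas: the model predicts low for every example, so high precision, recall, and F1 are all 0 while low recall is 1.0.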
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:272375a3cbbfa6e1e3d6b670ca2a2d8e5cdbe521318447d234b875e80c005d6a
 size 849445136

 version https://git-lfs.github.com/spec/v1
+oid sha256:4f8096e73f6416ce93d468a4e812432592e9cc04bc6f5494ee83bf8a8ea5b979
 size 849445136

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d127e74e35850f289edce130911cb2a71112871e7a6132fd8c6ed91fe35eb2a3
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:2bfef2f0bfb9b9447f2f0bacabfbf4a7cc1f73ddf435a8ec84dc46155e2c4e0a
 size 5432
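Both binary files are stored as Git LFS pointers, so the diff only shows the `oid` changing while `size` stays the same. As a minimal sketch, a downloaded file can be checked against such a pointer by hashing it with SHA-256 and comparing the digest and byte count. The helper names below are hypothetical, and the local file path is something the reader would supply.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:2bfef2f0bfb9b9447f2f0bacabfbf4a7cc1f73ddf435a8ec84dc46155e2c4e0a\n"
    "size 5432\n"
)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("training_args.bin", pointer)` on a local copy would then confirm whether it corresponds to this commit rather than the previous one.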