QA-DeBERTa-v3-large-threshold-v2
This model is a fine-tuned version of microsoft/deberta-v3-large on the saiteki-kai/Beavertails-it dataset. It achieves the following results on the evaluation set:
- Loss: 0.0802
- Accuracy: 0.6759
- Macro F1: 0.6770
- Macro Precision: 0.6738
- Macro Recall: 0.6865
- Micro F1: 0.7547
- Micro Precision: 0.7414
- Micro Recall: 0.7686
- Flagged/accuracy: 0.8562
- Flagged/precision: 0.8618
- Flagged/recall: 0.8832
- Flagged/f1: 0.8723
- Flagged/aucpr: 0.9050
- Flagged/fpr: 0.1777
- Animal Abuse/accuracy: 0.9948
- Animal Abuse/precision: 0.7917
- Animal Abuse/recall: 0.7456
- Animal Abuse/f1: 0.7680
- Animal Abuse/fpr: 0.0023
- Animal Abuse/threshold: 0.4173
- Child Abuse/accuracy: 0.9963
- Child Abuse/precision: 0.6697
- Child Abuse/recall: 0.6637
- Child Abuse/f1: 0.6667
- Child Abuse/fpr: 0.0018
- Child Abuse/threshold: 0.3082
- Controversial Topics,politics/accuracy: 0.9668
- Controversial Topics,politics/precision: 0.4676
- Controversial Topics,politics/recall: 0.5907
- Controversial Topics,politics/f1: 0.5219
- Controversial Topics,politics/fpr: 0.0213
- Controversial Topics,politics/threshold: 0.2379
- Discrimination,stereotype,injustice/accuracy: 0.9540
- Discrimination,stereotype,injustice/precision: 0.6977
- Discrimination,stereotype,injustice/recall: 0.7441
- Discrimination,stereotype,injustice/f1: 0.7202
- Discrimination,stereotype,injustice/fpr: 0.0278
- Discrimination,stereotype,injustice/threshold: 0.4154
- Drug Abuse,weapons,banned Substance/accuracy: 0.9736
- Drug Abuse,weapons,banned Substance/precision: 0.7451
- Drug Abuse,weapons,banned Substance/recall: 0.8080
- Drug Abuse,weapons,banned Substance/f1: 0.7753
- Drug Abuse,weapons,banned Substance/fpr: 0.0165
- Drug Abuse,weapons,banned Substance/threshold: 0.4045
- Financial Crime,property Crime,theft/accuracy: 0.9604
- Financial Crime,property Crime,theft/precision: 0.7733
- Financial Crime,property Crime,theft/recall: 0.8390
- Financial Crime,property Crime,theft/f1: 0.8048
- Financial Crime,property Crime,theft/fpr: 0.0265
- Financial Crime,property Crime,theft/threshold: 0.4254
- Hate Speech,offensive Language/accuracy: 0.9486
- Hate Speech,offensive Language/precision: 0.7312
- Hate Speech,offensive Language/recall: 0.6740
- Hate Speech,offensive Language/f1: 0.7015
- Hate Speech,offensive Language/fpr: 0.0244
- Hate Speech,offensive Language/threshold: 0.4311
- Misinformation Regarding Ethics,laws And Safety/accuracy: 0.9813
- Misinformation Regarding Ethics,laws And Safety/precision: 0.2497
- Misinformation Regarding Ethics,laws And Safety/recall: 0.2695
- Misinformation Regarding Ethics,laws And Safety/f1: 0.2592
- Misinformation Regarding Ethics,laws And Safety/fpr: 0.0100
- Misinformation Regarding Ethics,laws And Safety/threshold: 0.1225
- Non Violent Unethical Behavior/accuracy: 0.8828
- Non Violent Unethical Behavior/precision: 0.7124
- Non Violent Unethical Behavior/recall: 0.6883
- Non Violent Unethical Behavior/f1: 0.7001
- Non Violent Unethical Behavior/fpr: 0.0689
- Non Violent Unethical Behavior/threshold: 0.4187
- Privacy Violation/accuracy: 0.9812
- Privacy Violation/precision: 0.7919
- Privacy Violation/recall: 0.8402
- Privacy Violation/f1: 0.8153
- Privacy Violation/fpr: 0.0115
- Privacy Violation/threshold: 0.4144
- Self Harm/accuracy: 0.9970
- Self Harm/precision: 0.8966
- Self Harm/recall: 0.6341
- Self Harm/f1: 0.7429
- Self Harm/fpr: 0.0005
- Self Harm/threshold: 0.8032
- Sexually Explicit,adult Content/accuracy: 0.9830
- Sexually Explicit,adult Content/precision: 0.6224
- Sexually Explicit,adult Content/recall: 0.7484
- Sexually Explicit,adult Content/f1: 0.6796
- Sexually Explicit,adult Content/fpr: 0.0112
- Sexually Explicit,adult Content/threshold: 0.2524
- Terrorism,organized Crime/accuracy: 0.9911
- Terrorism,organized Crime/precision: 0.4498
- Terrorism,organized Crime/recall: 0.4844
- Terrorism,organized Crime/f1: 0.4665
- Terrorism,organized Crime/fpr: 0.0048
- Terrorism,organized Crime/threshold: 0.3817
- Violence,aiding And Abetting,incitement/accuracy: 0.9216
- Violence,aiding And Abetting,incitement/precision: 0.8338
- Violence,aiding And Abetting,incitement/recall: 0.8808
- Violence,aiding And Abetting,incitement/f1: 0.8566
- Violence,aiding And Abetting,incitement/fpr: 0.0636
- Violence,aiding And Abetting,incitement/threshold: 0.4182
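The per-category numbers above can be cross-checked and put to use with a short sketch. It copies the reported F1 and threshold for each category, shows that the unweighted mean of the 14 per-category F1 scores reproduces the reported Macro F1, and applies the thresholds to a set of per-category probabilities. The helper names (`CATEGORY_METRICS`, `macro_f1`, `predict_labels`) are hypothetical, and the decision rule (probability greater than or equal to the threshold) is an assumption; the card does not state how the thresholds are applied.

```python
# Per-category (F1, threshold) pairs copied from the evaluation list above.
CATEGORY_METRICS = {
    "Animal Abuse": (0.7680, 0.4173),
    "Child Abuse": (0.6667, 0.3082),
    "Controversial Topics,politics": (0.5219, 0.2379),
    "Discrimination,stereotype,injustice": (0.7202, 0.4154),
    "Drug Abuse,weapons,banned Substance": (0.7753, 0.4045),
    "Financial Crime,property Crime,theft": (0.8048, 0.4254),
    "Hate Speech,offensive Language": (0.7015, 0.4311),
    "Misinformation Regarding Ethics,laws And Safety": (0.2592, 0.1225),
    "Non Violent Unethical Behavior": (0.7001, 0.4187),
    "Privacy Violation": (0.8153, 0.4144),
    "Self Harm": (0.7429, 0.8032),
    "Sexually Explicit,adult Content": (0.6796, 0.2524),
    "Terrorism,organized Crime": (0.4665, 0.3817),
    "Violence,aiding And Abetting,incitement": (0.8566, 0.4182),
}


def macro_f1(metrics):
    """Unweighted mean of the per-category F1 scores."""
    f1_scores = [f1 for f1, _ in metrics.values()]
    return sum(f1_scores) / len(f1_scores)


def predict_labels(probs, metrics=CATEGORY_METRICS):
    """Hypothetical helper: return the categories whose probability reaches
    its threshold. Using >= (rather than >) is an assumption."""
    return [
        name
        for name, (_, threshold) in metrics.items()
        if probs.get(name, 0.0) >= threshold
    ]


print(round(macro_f1(CATEGORY_METRICS), 4))  # 0.677, matching the reported Macro F1
# Self Harm at 0.75 stays below its 0.8032 threshold; Child Abuse at 0.35 exceeds 0.3082.
print(predict_labels({"Self Harm": 0.75, "Child Abuse": 0.35}))  # ['Child Abuse']
```

The thresholds vary widely (0.1225 to 0.8032), so a single global cutoff such as 0.5 would likely give different per-category precision and recall than the figures listed here.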
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 10
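To make the learning-rate settings concrete, the following sketch computes a linear-warmup-then-cosine schedule from the values above. It is an illustration under stated assumptions, not the training code: it assumes the schedule spans all 10 configured epochs, takes 8454 optimizer steps per epoch from the Step column of the training results table, rounds the warmup length up, and decays to zero over a half cosine. The function name `lr_at` is hypothetical.

```python
import math

BASE_LR = 1e-05          # learning_rate
WARMUP_RATIO = 0.03      # lr_scheduler_warmup_ratio
NUM_EPOCHS = 10          # num_epochs
STEPS_PER_EPOCH = 8454   # assumption: step count at epoch 1.0 in the results table

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
# Assumption: the warmup length is the ratio of total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(TOTAL_STEPS, WARMUP_STEPS)
for epoch in (1, 3, 5, 10):
    print(epoch, lr_at(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the warmup finishes well inside the first epoch, so every row of the results table would fall in the cosine-decay phase.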
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Macro F1 | Macro Precision | Macro Recall | Micro F1 | Micro Precision | Micro Recall | Flagged/accuracy | Flagged/precision | Flagged/recall | Flagged/f1 | Flagged/aucpr | Flagged/fpr | Animal Abuse/accuracy | Animal Abuse/precision | Animal Abuse/recall | Animal Abuse/f1 | Animal Abuse/fpr | Animal Abuse/threshold | Child Abuse/accuracy | Child Abuse/precision | Child Abuse/recall | Child Abuse/f1 | Child Abuse/fpr | Child Abuse/threshold | Controversial Topics,politics/accuracy | Controversial Topics,politics/precision | Controversial Topics,politics/recall | Controversial Topics,politics/f1 | Controversial Topics,politics/fpr | Controversial Topics,politics/threshold | Discrimination,stereotype,injustice/accuracy | Discrimination,stereotype,injustice/precision | Discrimination,stereotype,injustice/recall | Discrimination,stereotype,injustice/f1 | Discrimination,stereotype,injustice/fpr | Discrimination,stereotype,injustice/threshold | Drug Abuse,weapons,banned Substance/accuracy | Drug Abuse,weapons,banned Substance/precision | Drug Abuse,weapons,banned Substance/recall | Drug Abuse,weapons,banned Substance/f1 | Drug Abuse,weapons,banned Substance/fpr | Drug Abuse,weapons,banned Substance/threshold | Financial Crime,property Crime,theft/accuracy | Financial Crime,property Crime,theft/precision | Financial Crime,property Crime,theft/recall | Financial Crime,property Crime,theft/f1 | Financial Crime,property Crime,theft/fpr | Financial Crime,property Crime,theft/threshold | Hate Speech,offensive Language/accuracy | Hate Speech,offensive Language/precision | Hate Speech,offensive Language/recall | Hate Speech,offensive Language/f1 | Hate Speech,offensive Language/fpr | Hate Speech,offensive Language/threshold | Misinformation Regarding Ethics,laws And Safety/accuracy | Misinformation Regarding Ethics,laws And Safety/precision | Misinformation Regarding Ethics,laws And Safety/recall | Misinformation Regarding Ethics,laws And Safety/f1 | Misinformation Regarding Ethics,laws And Safety/fpr | Misinformation Regarding Ethics,laws And Safety/threshold | Non Violent Unethical Behavior/accuracy | Non Violent Unethical Behavior/precision | Non Violent Unethical Behavior/recall | Non Violent Unethical Behavior/f1 | Non Violent Unethical Behavior/fpr | Non Violent Unethical Behavior/threshold | Privacy Violation/accuracy | Privacy Violation/precision | Privacy Violation/recall | Privacy Violation/f1 | Privacy Violation/fpr | Privacy Violation/threshold | Self Harm/accuracy | Self Harm/precision | Self Harm/recall | Self Harm/f1 | Self Harm/fpr | Self Harm/threshold | Sexually Explicit,adult Content/accuracy | Sexually Explicit,adult Content/precision | Sexually Explicit,adult Content/recall | Sexually Explicit,adult Content/f1 | Sexually Explicit,adult Content/fpr | Sexually Explicit,adult Content/threshold | Terrorism,organized Crime/accuracy | Terrorism,organized Crime/precision | Terrorism,organized Crime/recall | Terrorism,organized Crime/f1 | Terrorism,organized Crime/fpr | Terrorism,organized Crime/threshold | Violence,aiding And Abetting,incitement/accuracy | Violence,aiding And Abetting,incitement/precision | Violence,aiding And Abetting,incitement/recall | Violence,aiding And Abetting,incitement/f1 | Violence,aiding And Abetting,incitement/fpr | Violence,aiding And Abetting,incitement/threshold |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0686 | 1.0 | 8454 | 0.0838 | 0.6649 | 0.6665 | 0.6508 | 0.6927 | 0.7466 | 0.7268 | 0.7675 | 0.8479 | 0.8468 | 0.8871 | 0.8665 | 0.8984 | 0.2013 | 0.9946 | 0.7734 | 0.7442 | 0.7585 | 0.0025 | 0.4957 | 0.9961 | 0.6332 | 0.7207 | 0.6742 | 0.0023 | 0.3612 | 0.9631 | 0.4279 | 0.6075 | 0.5021 | 0.0257 | 0.2783 | 0.9551 | 0.7153 | 0.7234 | 0.7194 | 0.0249 | 0.4196 | 0.9730 | 0.7377 | 0.8089 | 0.7717 | 0.0172 | 0.3702 | 0.9593 | 0.7605 | 0.8489 | 0.8023 | 0.0288 | 0.4163 | 0.9459 | 0.7008 | 0.6908 | 0.6957 | 0.0290 | 0.3320 | 0.9792 | 0.1993 | 0.2353 | 0.2158 | 0.0116 | 0.1848 | 0.8788 | 0.6927 | 0.7007 | 0.6967 | 0.0771 | 0.3540 | 0.9803 | 0.7865 | 0.8237 | 0.8047 | 0.0116 | 0.3863 | 0.9969 | 0.8733 | 0.6390 | 0.7380 | 0.0006 | 0.8278 | 0.9833 | 0.6343 | 0.7263 | 0.6772 | 0.0103 | 0.1968 | 0.9877 | 0.3396 | 0.5634 | 0.4238 | 0.0088 | 0.1689 | 0.9193 | 0.8370 | 0.8654 | 0.8509 | 0.0611 | 0.3895 |
| 0.0769 | 2.0 | 16908 | 0.0809 | 0.6729 | 0.6788 | 0.6649 | 0.6993 | 0.7536 | 0.7394 | 0.7684 | 0.8551 | 0.8568 | 0.8880 | 0.8721 | 0.9036 | 0.1861 | 0.9948 | 0.7852 | 0.7544 | 0.7695 | 0.0024 | 0.4050 | 0.9967 | 0.7102 | 0.6697 | 0.6893 | 0.0015 | 0.2337 | 0.9642 | 0.4418 | 0.6450 | 0.5244 | 0.0258 | 0.2379 | 0.9535 | 0.6922 | 0.7481 | 0.7191 | 0.0287 | 0.3702 | 0.9738 | 0.7484 | 0.8048 | 0.7756 | 0.0161 | 0.5138 | 0.9610 | 0.7789 | 0.8373 | 0.8070 | 0.0256 | 0.3657 | 0.9489 | 0.7312 | 0.6787 | 0.7040 | 0.0245 | 0.3407 | 0.9797 | 0.2279 | 0.2818 | 0.2520 | 0.0118 | 0.2018 | 0.8838 | 0.7207 | 0.6780 | 0.6987 | 0.0652 | 0.3739 | 0.9816 | 0.8102 | 0.8189 | 0.8146 | 0.0100 | 0.5091 | 0.9967 | 0.7849 | 0.7122 | 0.7468 | 0.0013 | 0.1968 | 0.9839 | 0.6480 | 0.7215 | 0.6828 | 0.0097 | 0.4407 | 0.9896 | 0.3944 | 0.5593 | 0.4626 | 0.0069 | 0.2056 | 0.9219 | 0.8347 | 0.8811 | 0.8573 | 0.0633 | 0.4579 |
| 0.0619 | 3.0 | 25362 | 0.0802 | 0.6762 | 0.6771 | 0.6717 | 0.6874 | 0.7548 | 0.7418 | 0.7684 | 0.8563 | 0.8620 | 0.8832 | 0.8725 | 0.9051 | 0.1774 | 0.9948 | 0.7926 | 0.7442 | 0.7676 | 0.0023 | 0.4177 | 0.9963 | 0.6697 | 0.6637 | 0.6667 | 0.0018 | 0.3082 | 0.9668 | 0.4676 | 0.5907 | 0.5219 | 0.0213 | 0.2379 | 0.9541 | 0.6978 | 0.7450 | 0.7206 | 0.0279 | 0.4149 | 0.9737 | 0.7482 | 0.8048 | 0.7755 | 0.0162 | 0.4106 | 0.9604 | 0.7734 | 0.8388 | 0.8048 | 0.0265 | 0.4254 | 0.9486 | 0.7307 | 0.6748 | 0.7016 | 0.0245 | 0.4301 | 0.9813 | 0.2497 | 0.2695 | 0.2592 | 0.0100 | 0.1225 | 0.8831 | 0.7140 | 0.6870 | 0.7003 | 0.0682 | 0.4220 | 0.9812 | 0.7919 | 0.8402 | 0.8153 | 0.0115 | 0.4144 | 0.9969 | 0.8590 | 0.6537 | 0.7424 | 0.0007 | 0.7178 | 0.9830 | 0.6224 | 0.7484 | 0.6796 | 0.0112 | 0.2524 | 0.9912 | 0.4531 | 0.4823 | 0.4673 | 0.0047 | 0.3844 | 0.9215 | 0.8338 | 0.8807 | 0.8566 | 0.0636 | 0.4182 |
| 0.0665 | 4.0 | 33816 | 0.0801 | 0.6693 | 0.6764 | 0.6642 | 0.6971 | 0.7513 | 0.7364 | 0.7669 | 0.8551 | 0.8578 | 0.8865 | 0.8719 | 0.9037 | 0.1843 | 0.9950 | 0.7884 | 0.7689 | 0.7785 | 0.0024 | 0.4812 | 0.9967 | 0.7152 | 0.6637 | 0.6885 | 0.0015 | 0.5746 | 0.9664 | 0.4616 | 0.5836 | 0.5155 | 0.0215 | 0.2783 | 0.9528 | 0.6868 | 0.7462 | 0.7153 | 0.0294 | 0.3612 | 0.9726 | 0.7272 | 0.8204 | 0.7710 | 0.0184 | 0.4627 | 0.9607 | 0.7777 | 0.8351 | 0.8053 | 0.0257 | 0.4182 | 0.9494 | 0.7494 | 0.6542 | 0.6985 | 0.0215 | 0.4489 | 0.9797 | 0.2338 | 0.2955 | 0.2610 | 0.0119 | 0.1700 | 0.8804 | 0.7030 | 0.6893 | 0.6961 | 0.0722 | 0.4059 | 0.9820 | 0.8156 | 0.8200 | 0.8178 | 0.0096 | 0.4904 | 0.9969 | 0.8299 | 0.6780 | 0.7463 | 0.0010 | 0.4438 | 0.9826 | 0.6106 | 0.7650 | 0.6791 | 0.0120 | 0.2830 | 0.9886 | 0.3628 | 0.5634 | 0.4414 | 0.0080 | 0.2451 | 0.9216 | 0.8368 | 0.8761 | 0.8560 | 0.0619 | 0.5003 |
| 0.0607 | 5.0 | 42270 | 0.0824 | 0.6670 | 0.6729 | 0.6608 | 0.6997 | 0.7463 | 0.7353 | 0.7576 | 0.8499 | 0.8537 | 0.8814 | 0.8673 | 0.9005 | 0.1896 | 0.9949 | 0.7856 | 0.7616 | 0.7734 | 0.0024 | 0.5870 | 0.9968 | 0.7331 | 0.6517 | 0.6900 | 0.0013 | 0.5794 | 0.9661 | 0.4570 | 0.5689 | 0.5069 | 0.0214 | 0.3372 | 0.9538 | 0.7116 | 0.7050 | 0.7083 | 0.0247 | 0.4632 | 0.9735 | 0.7504 | 0.7930 | 0.7711 | 0.0157 | 0.5128 | 0.9593 | 0.7625 | 0.8446 | 0.8015 | 0.0284 | 0.4116 | 0.9481 | 0.7237 | 0.6807 | 0.7016 | 0.0256 | 0.4301 | 0.9745 | 0.1890 | 0.3338 | 0.2413 | 0.0176 | 0.1634 | 0.8829 | 0.7243 | 0.6628 | 0.6922 | 0.0626 | 0.4414 | 0.9820 | 0.8370 | 0.7876 | 0.8115 | 0.0080 | 0.5813 | 0.9968 | 0.8114 | 0.6927 | 0.7474 | 0.0011 | 0.5446 | 0.9815 | 0.5843 | 0.7975 | 0.6745 | 0.0140 | 0.2674 | 0.9872 | 0.3421 | 0.6466 | 0.4475 | 0.0100 | 0.1645 | 0.9209 | 0.8388 | 0.8696 | 0.8539 | 0.0606 | 0.4858 |
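The table is wide, so a small sketch can help when comparing epochs on the aggregate columns. The snippet below copies the validation loss, accuracy, macro F1, and micro F1 for each logged epoch and reports which epoch is best under each criterion. The card does not say which criterion was used to pick the reported checkpoint, so this is only a reading aid; `best_epoch` and `CRITERIA` are hypothetical names.

```python
# (epoch, validation loss, accuracy, macro F1, micro F1) copied from the table above.
EPOCHS = [
    (1, 0.0838, 0.6649, 0.6665, 0.7466),
    (2, 0.0809, 0.6729, 0.6788, 0.7536),
    (3, 0.0802, 0.6762, 0.6771, 0.7548),
    (4, 0.0801, 0.6693, 0.6764, 0.7513),
    (5, 0.0824, 0.6670, 0.6729, 0.7463),
]

# Criterion name -> (column index in EPOCHS rows, whether higher is better).
CRITERIA = {
    "validation loss": (1, False),
    "accuracy": (2, True),
    "macro F1": (3, True),
    "micro F1": (4, True),
}


def best_epoch(rows, column, higher_is_better):
    """Return the epoch number with the best value in the given column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


BEST = {
    name: best_epoch(EPOCHS, column, higher_is_better)
    for name, (column, higher_is_better) in CRITERIA.items()
}
for name, epoch in BEST.items():
    print(f"best by {name}: epoch {epoch}")
```

The criteria do not agree: validation loss favors epoch 4, macro F1 favors epoch 2, and accuracy and micro F1 favor epoch 3. The headline loss of 0.0802 reported at the top of the card coincides with the epoch 3 row.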
Framework versions
- Transformers 4.57.1
- Pytorch 2.7.1+cu118
- Datasets 4.4.1
- Tokenizers 0.22.1