arxiv:2503.05731
Satya
skrishna
AI & ML interests
Safe A(G)I
Models 41
skrishna/smolm-toxicity-classifier
Text Classification • 0.1B • Updated • 4
skrishna/sft-ref-policy-copy
Text Generation • 0.1B • Updated
skrishna/sft-model-copy
Text Generation • 0.1B • Updated
skrishna/gpt2-toxicity-classifier
skrishna/gpt2-fineweb-soap-20250422_112211
Text Generation • 0.1B • Updated
skrishna/gpt2-fineweb-20250421_194111-64
Text Generation • 0.1B • Updated • 1
skrishna/gpt2-fineweb
skrishna/ethicsU-llama3-8b-w2s
skrishna/ethicsU-gptxl-weak2
skrishna/ethicsU-gptxl-weak
Datasets 76
skrishna/toxigen_annotated_mod
Viewer • Updated • 8.96k • 16
skrishna/toy-toxicity-dataset
Viewer • Updated • 40k • 11
skrishna/toxicity-reward-dataset
Viewer • Updated • 40k • 12
skrishna/SECURE-VOOD
Viewer • Updated • 466 • 8
skrishna/SECURE-RERT
Viewer • Updated • 1k • 21
skrishna/SECURE-MAET
Viewer • Updated • 1.07k • 715
skrishna/SECURE-KCV
Viewer • Updated • 466 • 13
skrishna/SECURE-CPST
Viewer • Updated • 100 • 7
skrishna/SECURE-CWET
Viewer • Updated • 965 • 7
skrishna/cti-rcm
Viewer • Updated • 1k • 7