nvidia/Nemotron-4-340B-Instruct


okuchaiev committed on Jun 14, 2024

Commit 5738a00 · verified · 1 Parent(s): 70af0cf

Update README.md

Files changed (1)

README.md +1 -1


@@ -311,7 +311,7 @@ Evaluated using the CantTalkAboutThis Dataset as introduced in the CantTalkAbout
 The Nemotron-4 340B-Instruct model underwent extensive safety evaluation including adversarial testing via three distinct methods:
 - [Garak](https://docs.garak.ai/garak), is an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
-- [AEGIS](https://arxiv.org/pdf/2404.05993), is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
+- AEGIS, is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
 - Human Content Red Teaming leveraging human interaction and evaluation of the models' responses.
 ### Limitations