Article FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages Jul 8 • 32
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 8 days ago • 103
SIFT: Grounding LLM Reasoning in Contexts via Stickers Paper • 2502.14922 • Published Feb 19 • 32
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19 • 42
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps Paper • 2412.15035 • Published Dec 19, 2024 • 4
Word Sense Linking: Disambiguating Outside the Sandbox Paper • 2412.09370 • Published Dec 12, 2024 • 10
Word Sense Linking Collection Word Sense Linking is the task designed to identify and disambiguate spans of text to their most suitable senses from a reference inventory. • 6 items • Updated Jan 13 • 6
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS Paper • 2411.19655 • Published Nov 29, 2024 • 20
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published Oct 28, 2024 • 84
M-ALERT Collection Evaluating safety in LLMs on a large scale and with policy compliance. Multilingual eval available! • 4 items • Updated Jan 24 • 2
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Paper • 2404.05567 • Published Apr 8, 2024 • 10
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Paper • 2404.08676 • Published Apr 6, 2024 • 3
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30, 2024 • 42