The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 Explore synthetic data experiments on a virtual bookshelf • 183
PII & De-Identification Collection Models for extracting PII entities and de-identifying clinical text, with support for HIPAA and GDPR compliance. • 278 items • Updated 6 days ago • 33
OpenMed/OpenMed-PII-BioClinicalModern-Large-395M-v1 Token Classification • 0.4B • Updated Jan 13 • 18.7k • 9
AstroBench Collection Datasets to evaluate LLMs/SLMs in astronautics and space mission engineering • 1 item • Updated Jan 5