Large web-mined general corpus based on CommonCrawl.
Amir Hossein Kargaran
kargaranamir
AI & ML interests
#NLP, check out https://huggingface.co/cis-lmu
Recent Activity
liked a dataset 5 days ago: openlanguagedata/flores_plus
upvoted a collection 5 days ago: OLDI and friends
upvoted an article 11 days ago: Continuous batching from first principles