❄️January 2025 - Open releases from the Chinese community
Any-to-Any • Updated • 51.7k • 3.53k
deepseek-ai/Janus-Pro-1B
Any-to-Any • Updated • 8.61k • 465
tencent/Hunyuan3D-2
Image-to-3D • Updated • 80.9k • 1.68k
tencent/Hunyuan-7B-Instruct-0124
Text Generation • Updated • 86 • 50
ByteDance/Sa2VA-4B
Image-Text-to-Text • 4B • Updated • 173k • 90
Note: A unified model for dense grounded understanding of images and videos.
ByteDance-Seed/UI-TARS-72B-DPO
Image-Text-to-Text • 73B • Updated • 2.27k • 147
deepseek-ai/DeepSeek-R1
Text Generation • 685B • Updated • 1.21M • 12.9k
Note: 660B reasoning models with MIT license
deepseek-ai/DeepSeek-R1-Zero
Text Generation • 685B • Updated • 5k • 937
MiniMaxAI/MiniMax-VL-01
Image-Text-to-Text • 456B • Updated • 87.7k • 280
Note: A non-transformer-based (ViT-MLP-LLM framework) VLM
MiniMaxAI/MiniMax-Text-01
Text Generation • 456B • Updated • 1.48k • 650
Note: 456B LLM with a 1M-token training context
Qwen/Qwen2.5-Math-PRM-7B
Text Classification • 8B • Updated • 23.2k • 80
Note: Math model
Qwen/Qwen2.5-14B-Instruct-1M
Text Generation • 15B • Updated • 11.8k • 329
openbmb/MiniCPM-o-2_6
Any-to-Any • 9B • Updated • 96.9k • 1.27k
Note: End-side multimodal LLM that supports real-time conversation and video understanding.
ICTNLP/llava-mini-llama-3.1-8b
Image-Text-to-Text • 9B • Updated • 919 • 56
BlinkDL/rwkv-7-world
Text Generation • Updated • 104
Note: RNN+Transformers
HKUSTAudio/Llasa-3B
Text-to-Speech • 4B • Updated • 1.72k • 522
Note: TTS
DAMO-NLP-SG/VideoLLaMA3-7B
Video-Text-to-Text • 8B • Updated • 87.5k • 71
internlm/internlm3-8b-instruct
Text Generation • 9B • Updated • 16.3k • 227
baichuan-inc/Baichuan-M1-14B-Base
14B • Updated • 75 • 30
Note: Medical LLM
opencsg/Fineweb-Edu-Chinese-V2.1
Viewer • Updated • 958M • 18.9k • 51
Note: Dataset designed specifically for natural language processing (NLP) tasks in the education sector.
DAMO-NLP-SG/multimodal_textbook
Updated • 5.07k • 156
Note: A multimodal dataset for vision-language pretraining; includes 6.5M images + 0.8B text from 22k hours of instructional videos
hithink-ai/MME-Finance
Viewer • Updated • 2.06k • 290 • 8
KlingTeam/GameFactory-Dataset
Updated • 302 • 14
m-a-p/YuE-s1-7B-anneal-zh-cot
Text Generation • 6B • Updated • 349 • 40
m-a-p/YuE-s1-7B-anneal-jp-kr-cot
Text Generation • 6B • Updated • 424 • 21
m-a-p/YuE-s1-7B-anneal-en-cot
Text Generation • 6B • Updated • 20.2k • 434
Qwen/Qwen2.5-VL-3B-Instruct
Image-Text-to-Text • 4B • Updated • 7.75M • 567
Qwen/Qwen2.5-VL-7B-Instruct
Image-Text-to-Text • 8B • Updated • 3.37M • 1.38k
Hunyuan3D-2.0
🌍 3.11k • Text-to-3D and Image-to-3D Generation
UI-TARS
🌖 65 • Find click coordinates on images based on instructions
MiniMaxVL01
💬 64 • Generate responses to text and images in a chat interface
Chat With Janus-Pro-7B
🌍 2.01k • A unified multimodal understanding and generation model.