video-SALMONN 2 is an audio-visual large language model (LLM) that generates high-quality captions for videos with audio.
AI & ML interests
https://www.ee.tsinghua.edu.cn/en/
Organization Card
Department of Electronic Engineering, Tsinghua University
models 16
tsinghua-ee/video_SALMONN2plus_3B_audioAlign
5B • Updated • 9
tsinghua-ee/D-ORCA-8B-0210
10B • Updated • 22 • 1
tsinghua-ee/WAVE-7B
Updated • 23 • 1
tsinghua-ee/video_SALMONN2_7B_audioAlign
Updated • 21
tsinghua-ee/video_SALMONN2plus_72B_audioAlign
Updated • 4
tsinghua-ee/video_SALMONN2plus_7B_audioAlign
9B • Updated • 490
tsinghua-ee/SALMONN
Automatic Speech Recognition • Updated • 50
tsinghua-ee/video-SALMONN-2_plus_72B
Updated • 6 • 2
tsinghua-ee/video-SALMONN-2_plus_3B
Updated • 1.48k • 3
tsinghua-ee/video-SALMONN-2_plus_7B
Updated • 789 • 6
datasets 8
tsinghua-ee/ELViM
Viewer • Updated • 211 • 17
tsinghua-ee/SACRED-Bench
Viewer • Updated • 2.48k • 54
tsinghua-ee/F-16-NBA
Preview • Updated • 49
tsinghua-ee/AVUTBenchmark
Viewer • Updated • 3.28k • 5.19k • 1
tsinghua-ee/video-SALMONN_2_testset
Preview • Updated • 135
tsinghua-ee/QualiSpeech
Viewer • Updated • 14.6k • 583 • 21
tsinghua-ee/RivaBench
Viewer • Updated • 542 • 446 • 2
tsinghua-ee/SAVEBench
Preview • Updated • 69 • 3