DataTransfer

community

AI & ML interests

None defined yet.

Recent Activity

chenjoya updated a dataset 17 days ago

DataTransfer111/marker

neversa updated a dataset 18 days ago

DataTransfer111/marker

neversa authored a paper 7 months ago

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

updated a dataset 17 days ago

DataTransfer111/marker

Updated 17 days ago • 150

authored 4 papers 7 months ago

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

Paper • 2312.13108 • Published Dec 20, 2023 • 3

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11, 2025 • 45

From Charts to Code: A Hierarchical Benchmark for Multimodal Models

Paper • 2510.17932 • Published Oct 20, 2025 • 8

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

Paper • 2511.02778 • Published Nov 4, 2025 • 103

authored 7 papers about 1 year ago

Is Heuristic Sampling Necessary in Training Deep Object Detectors?

Paper • 1909.04868 • Published Sep 11, 2019

Bootstrapping SparseFormers from Vision Foundation Models

Paper • 2312.01987 • Published Dec 4, 2023 • 1

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Paper • 2306.08640 • Published Jun 14, 2023 • 27

Learning Video Context as Interleaved Multimodal Sequences

Paper • 2407.21757 • Published Jul 31, 2024

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation

Paper • 2408.16730 • Published Aug 29, 2024

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11, 2025 • 158

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Paper • 2504.16030 • Published Apr 22, 2025 • 38

authored a paper over 1 year ago

One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos

Paper • 2409.19603 • Published Sep 29, 2024 • 19

authored a paper almost 2 years ago

VideoLLM-online: Online Video Large Language Model for Streaming Video

Paper • 2406.11816 • Published Jun 17, 2024 • 26

authored a paper almost 3 years ago

UniVTG: Towards Unified Video-Language Temporal Grounding

Paper • 2307.16715 • Published Jul 31, 2023 • 12