Collections
Discover the best community collections!
Collections including paper arxiv:2507.04009
-
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper • 2508.18106 • Published • 345 -
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
Paper • 2411.02959 • Published • 70 -
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51 -
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm
Paper • 2506.05218 • Published • 2
-
Step-Audio-R1 Technical Report
Paper • 2511.15848 • Published • 51 -
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
Paper • 2410.17799 • Published • 5 -
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51
-
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
Paper • 2509.06917 • Published • 41 -
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51 -
WebDancer: Towards Autonomous Information Seeking Agency
Paper • 2505.22648 • Published • 33
-
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning
Paper • 2504.07128 • Published • 86 -
BM25S: Orders of magnitude faster lexical search via eager sparse scoring
Paper • 2407.03618 • Published • 13 -
Deep Think with Confidence
Paper • 2508.15260 • Published • 88 -
R-Zero: Self-Evolving Reasoning LLM from Zero Data
Paper • 2508.05004 • Published • 130
-
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51 -
AgentInstruct: Toward Generative Teaching with Agentic Flows
Paper • 2407.03502 • Published • 51 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 70
-
Step-Audio-R1 Technical Report
Paper • 2511.15848 • Published • 51 -
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
Paper • 2410.17799 • Published • 5 -
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51
-
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper • 2508.18106 • Published • 345 -
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
Paper • 2411.02959 • Published • 70 -
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51 -
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm
Paper • 2506.05218 • Published • 2
-
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
Paper • 2509.06917 • Published • 41 -
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51 -
WebDancer: Towards Autonomous Information Seeking Agency
Paper • 2505.22648 • Published • 33
-
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning
Paper • 2504.07128 • Published • 86 -
BM25S: Orders of magnitude faster lexical search via eager sparse scoring
Paper • 2407.03618 • Published • 13 -
Deep Think with Confidence
Paper • 2508.15260 • Published • 88 -
R-Zero: Self-Evolving Reasoning LLM from Zero Data
Paper • 2508.05004 • Published • 130
-
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper • 2507.04009 • Published • 51 -
AgentInstruct: Toward Generative Teaching with Agentic Flows
Paper • 2407.03502 • Published • 51 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 70