Taufiq Dwi Purnomo

taufiqdp

https://taufiqdp.com

AI & ML interests

SLM, VLM

Recent Activity

liked a model 5 days ago

deepseek-ai/DeepSeek-V3.2

upvoted a paper 5 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 6 days ago • 172

upvoted 2 collections 5 days ago

Ministral 3

Collection

A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 5 days ago • 115

Mistral Large 3

Collection

A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 5 days ago • 70

upvoted an article 6 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • 7 days ago • 224

upvoted a paper 14 days ago

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published 17 days ago • 107

upvoted a paper 20 days ago

DoPE: Denoising Rotary Position Embedding

Paper • 2511.09146 • Published 26 days ago • 92

upvoted a paper 24 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published 29 days ago • 128

upvoted 5 papers about 1 month ago

INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

Paper • 2510.25602 • Published Oct 29 • 76

Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation

Paper • 2510.22115 • Published Oct 25 • 83

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29 • 219

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17 • 89

Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning

Paper • 2510.19338 • Published Oct 22 • 114

upvoted 3 papers about 2 months ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14 • 114

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13 • 165

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13 • 176

upvoted 2 papers 2 months ago

Apriel-1.5-15b-Thinker

Paper • 2510.01141 • Published Oct 1 • 117

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26 • 136

upvoted a collection 3 months ago

Qwen3-Omni

Collection

6 items • Updated Oct 9 • 168

upvoted 2 papers 3 months ago

WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

Paper • 2509.13312 • Published Sep 16 • 105

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10 • 660
