
Unknown Entity

unknownentity

AI & ML interests

None yet

Recent Activity

reacted to SeaWolf-AI's post with ❤️ about 22 hours ago

🧬 Introducing Darwin-9B-NEG, the first model with Native Entropy Gating (NEG)

🔗 Try it now: https://huggingface.co/FINAL-Bench/Darwin-9B-NEG
🔗 Q4 bit: https://huggingface.co/FINAL-Bench/Darwin-9B-MFP4

We're thrilled to release Darwin-9B-NEG, a 9B-parameter reasoning model that embeds an architecturally-internalised sense of self-confidence directly into the transformer: our proprietary Native Entropy Gating (NEG) technology.

📊 GPQA Diamond (198 PhD-level questions):
▸ Baseline Darwin-9B (no NEG) → 51.01 %
▸ Pure NEG (greedy · 1× cost) → 63.64 % 🔥 +12.63 %p
▸ + Permutation (4× cost) → 76.26 %
▸ + Ensemble Refinement (~20×) → 84.34 % 🏆

With only 9 billion parameters and 1× inference cost, Pure NEG jumps +12.63 %p over the same model without NEG. Going all-in with ensemble refinement pushes it to 84.34 %, surpassing the published Qwen3.5-9B leaderboard score (81.7 %) by +2.64 %p.

🔬 What makes NEG different from Multi-Turn Iteration (MTI)?
Classical MTI needs 3-8× extra inference passes. NEG instead lives INSIDE the single decoding loop. Two tiny modules ride with the transformer: NEG-Head predicts per-token entropy from the last hidden state, and NEG-Gate conditionally restricts the top-k choice when confidence is low. The gate activates in only 4.36 % of tokens, essentially free at inference time.

✨ Key differentiators
• Architecturally internalised: the model file *is* the feature
• 1× inference cost (vs. 3-8× for MTI)
• Drop-in with vLLM / SGLang / TGI / transformers, no extra engine
• +12.63 %p reasoning at zero latency overhead
• Single-file deployment, Apache 2.0 licensed

🧬 Lineage
Qwen/Qwen3.5-9B → Darwin-9B-Opus (V7 evolutionary merge) → Darwin-9B-NEG (V8 + NEG training)

#Darwin #NEG #NativeEntropyGating #GPQA #Reasoning #LLM #OpenSource #Apache2
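
The post describes NEG only at a high level, so the following is a rough sketch of the general idea of entropy-gated top-k restriction, not the Darwin-9B-NEG implementation. It assumes several things the post does not state: the function name `entropy_gated_probs`, the `threshold` and `k` parameters, and the use of the exact entropy of the next-token distribution as a stand-in for what a learned NEG-Head would predict from the last hidden state.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: List[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_gated_probs(
    logits: List[float], threshold: float, k: int
) -> Tuple[List[float], bool]:
    """Hypothetical gate: restrict to the top-k tokens only when entropy is high.

    Assumption: the true entropy of the distribution stands in for the
    per-token entropy that a learned head would predict. Returns the
    (possibly restricted) distribution and whether the gate fired.
    """
    probs = softmax(logits)
    if entropy(probs) <= threshold:
        # Confident step: leave the distribution untouched.
        return probs, False

    # Uncertain step: keep the k most probable tokens and renormalise.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    keep = set(top)
    mass = sum(probs[i] for i in keep)
    gated = [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]
    return gated, True


# A peaked distribution passes through; a flat one is cut down to two candidates.
confident, fired_a = entropy_gated_probs([10.0, 0.0, 0.0, 0.0], threshold=1.0, k=2)
uncertain, fired_b = entropy_gated_probs([1.0, 0.9, 0.8, 0.7], threshold=1.0, k=2)
print(fired_a, fired_b)
```

In this sketch the gate is a cheap per-step check inside a single decoding loop, which is consistent with the post's claim that the gate fires on a small fraction of tokens. How the released model trains the entropy predictor or chooses its threshold is not described in the post.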

liked a model 1 day ago

deepseek-ai/DeepSeek-V4-Pro

upvoted a paper 3 days ago

Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model


Organizations

None yet

upvoted 4 papers 3 days ago

Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model

Paper • 2603.21986 • Published Mar 23 • 124

Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory

Paper • 2604.08995 • Published 16 days ago • 48

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published 11 days ago • 110

CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation

Paper • 2604.19636 • Published 5 days ago • 82

upvoted 2 papers 13 days ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published 17 days ago • 240

ELT: Elastic Looped Transformers for Visual Generation

Paper • 2604.09168 • Published 16 days ago • 19

upvoted a collection 29 days ago

UnifoLM_WBT_Dataset

Collection

11 items • Updated 10 days ago • 84

upvoted an article about 1 month ago

Article

The First Healthcare Robotics Dataset and Foundational Physical AI Models for Healthcare Robotics

Mar 16


upvoted a paper about 1 month ago

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published Mar 12 • 91

upvoted a collection about 2 months ago

Qwen3.5

Collection

21 items • Updated Mar 9 • 1.57k

upvoted a paper 2 months ago

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Paper • 2602.14041 • Published Feb 15 • 53

upvoted 6 papers 4 months ago

PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence

Paper • 2512.16793 • Published Dec 18, 2025 • 76

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Paper • 2512.13604 • Published Dec 15, 2025 • 76

RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards

Paper • 2512.00473 • Published Nov 29, 2025 • 27

InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

Paper • 2512.08829 • Published Dec 9, 2025 • 21

OmniPSD: Layered PSD Generation with Diffusion Transformer

Paper • 2512.09247 • Published Dec 10, 2025 • 51

Composing Concepts from Images and Videos via Concept-prompt Binding

Paper • 2512.09824 • Published Dec 10, 2025 • 28

upvoted 3 papers 5 months ago

StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation

Paper • 2512.09363 • Published Dec 10, 2025 • 74

Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality

Paper • 2512.07951 • Published Dec 8, 2025 • 51

Light-X: Generative 4D Video Rendering with Camera and Illumination Control

Paper • 2512.05115 • Published Dec 4, 2025 • 11
