All HF Hub posts

SeaWolf-AI 
posted an update 2 days ago
ALL Bench — Global AI Model Unified Leaderboard

FINAL-Bench/all-bench-leaderboard

If you've ever tried to compare GPT-5.2 and Claude Opus 4.6 side by side, you've probably hit the same wall: the official Hugging Face leaderboard only tracks open-source models, so the most widely used AI systems simply aren't there. ALL Bench fixes that by bringing closed-source models, open-weight models, and — uniquely — all four teams under South Korea's national sovereign AI program into a single leaderboard. Thirty-one frontier models, one consistent scoring scale.
Scoring works differently here too. Most leaderboards skip benchmarks a model hasn't submitted, which lets models game their ranking by withholding results. ALL Bench treats every missing entry as zero and divides by ten, so there's no advantage in hiding your weak spots.
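That rule is simple enough to sketch in a few lines (the scores below are illustrative, and the exact normalization used by the real leaderboard isn't specified in the post):

```python
NUM_BENCHMARKS = 10  # fixed divisor, regardless of how many results were submitted

def all_bench_composite(scores: dict[str, float]) -> float:
    """Sum only the submitted benchmark scores, but always divide by ten.

    Missing benchmarks therefore count as zero, so withholding a weak
    result can only lower the composite, never raise it.
    """
    return sum(scores.values()) / NUM_BENCHMARKS

# A model scoring 80 on eight benchmarks but withholding the other two
# still pays for the missing pair:
submitted = {
    "GPQA Diamond": 80.0, "AIME 2025": 80.0, "HLE": 80.0, "ARC-AGI-2": 80.0,
    "SWE-bench Verified": 80.0, "LiveCodeBench": 80.0, "IFEval": 80.0, "BFCL": 80.0,
}
print(all_bench_composite(submitted))  # 64.0, not 80.0
```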
The ten core benchmarks span reasoning (GPQA Diamond, AIME 2025, HLE, ARC-AGI-2), coding (SWE-bench Verified, LiveCodeBench), and instruction-following (IFEval, BFCL). The standout is FINAL Bench — the world's only benchmark measuring whether a model can catch and correct its own mistakes. It reached rank five in global dataset popularity on Hugging Face in February 2026 and has been covered by Seoul Shinmun, Asia Economy, IT Chosun, and Behind.
Nine interactive charts let you explore everything from composite score rankings and a full heatmap to an open-vs-closed scatter plot. Operational metrics like context window, output speed, and pricing are included alongside benchmark scores.
All data is sourced from Artificial Analysis Intelligence Index v4.0, arXiv technical reports, Chatbot Arena ELO ratings, and the Korean Ministry of Science and ICT's official evaluation results. Updates monthly.
MaziyarPanahi 
posted an update 2 days ago
DNA, mRNA, proteins, AI. I spent the last year going deep into computational biology as an ML engineer. This is Part I of what I found. 🧬

In 2024, AlphaFold won the Nobel Prize in Chemistry.

By 2026, the open-source community had built alternatives that outperform it.

That's the story I find most interesting about protein AI right now. Not just the science (which is incredible), but the speed at which open-source caught up. Multiple teams, independently, reproduced and then exceeded AlphaFold 3's accuracy with permissive licenses. The field went from prediction to generation: we're not just modeling known proteins anymore, we're designing new ones.

I spent months mapping this landscape for ML engineers. What the architectures actually are (spoiler: transformers and diffusion models), which tools to use for what, and which ones you can actually ship commercially.

New post on the Hugging Face blog: https://huggingface.co/blog/MaziyarPanahi/protein-ai-landscape

Hope you all enjoy! 🤗
kostakoff 
posted an update 2 days ago
Mining GPU Nvidia CMP 170HX - let's run some models!

To satisfy my curiosity, I investigated different GPUs and found this: a mining version of the A100 — the CMP 170HX.

It is a very interesting GPU. Based on public documentation, it has hardware similar to the datacenter A100. If you open it up and look at the board, you will see that it's very similar to an A100 board; it even has NVLink connectors.

Online, I found almost no information about how to run it, whether it works with LLMs, or if it's supported by default Nvidia drivers and CUDA. So, I decided to test it myself.
I installed it in my lab (see previous post https://huggingface.co/posts/kostakoff/584269728210158) and found that the default nvidia-driver-570 works with it out of the box. After that, I checked if CUDA was available, and it worked too.
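The CUDA check can be done in a few lines with PyTorch (a minimal sketch; any recent CUDA-enabled torch build should behave the same):

```python
import torch

# True only if the driver and CUDA runtime can see the card at all.
print(torch.cuda.is_available())

# Name and total memory of device 0 (the CMP 170HX in this setup).
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    props = torch.cuda.get_device_properties(0)
    print(f"{props.total_memory / 1024**3:.1f} GiB")
```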

The next step was to try running some models:
- Stable Diffusion XL with BNB4 quantization: It took around two minutes to generate an image, but it works!
- Compiled llama.cpp for CUDA (https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#compilation): I ran Mistral 7B Q4_K_M, and this worked even better: it generated 33 tokens per second and processed the prompt at 400 tokens per second.

There are some limitations related to power utilization:
- When running PyTorch, it doesn't utilize more than 80 watts.
- When running llama.cpp, utilization is a bit better but still limited to 113 watts.

I found this GitHub thread about the Nvidia CMP https://github.com/dartraiden/NVIDIA-patcher/issues/73, and it looks like this mining GPU has an internal rate limiter based on FMA compute calls. I haven't found a solution to bypass it yet.

mayafree 
posted an update 3 days ago
I built a Space that lets you switch between all three models in the official Qwen3.5 collection from a single interface.

MAYA-AI/QWEN-3_5-CHAT

The architecture is the key part. Instead of using Gradio as the UI, I use it purely as an API engine. FastAPI serves a fully custom HTML/JS frontend that calls /gradio_api/call/chat via SSE streaming. No DOM conflicts, no layout constraints.
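That two-step call can be sketched from Python as well (the JS frontend does the same with fetch; the endpoint shape follows Gradio's /gradio_api/call convention, but the Space URL and payload contents here are assumptions):

```python
import json
import urllib.request

def parse_sse_data(line: str):
    """Extract the JSON payload from an SSE 'data:' line, else None."""
    if line.startswith("data:"):
        return json.loads(line[len("data:"):].strip())
    return None

def stream_chat(base_url: str, payload: list):
    """Two-step Gradio API call: POST for an event_id, then GET the SSE stream."""
    req = urllib.request.Request(
        f"{base_url}/gradio_api/call/chat",
        data=json.dumps({"data": payload}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        event_id = json.load(resp)["event_id"]
    with urllib.request.urlopen(f"{base_url}/gradio_api/call/chat/{event_id}") as stream:
        for raw in stream:
            data = parse_sse_data(raw.decode().strip())
            if data is not None:
                yield data

# Usage (needs a live Space; payload shape is hypothetical):
# for chunk in stream_chat("https://<space-subdomain>.hf.space", ["Hello!"]):
#     print(chunk)
```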

Four main features: instant model switching with automatic spec adjustment (max tokens, temperature ceiling, Vision availability all update per model), Thinking Mode via /think prefix with collapsible reasoning chain, Vision image upload via base64 conversion, and HF OAuth implemented directly at the FastAPI level.

For model selection: 122B-A10B with Thinking Mode for math, logic, and agents. 27B for writing, translation, and instruction following. 35B-A3B for fast everyday questions.

A few surprises during development — Gradio 6.x removed several parameters quietly, base64 image strings broke gr.Image(type="pil") so I switched to gr.Textbox with backend PIL conversion, and Thinking Mode parsing needed a full rewrite with indexOf instead of regex.

Thanks to the Qwen team for making this possible. Try it out and let me know what you think.

#Qwen3 #Qwen35 #OpenSourceAI #HuggingFace #LLM #ThinkingAI #vidraft #MultimodalAI
abdurrahmanbutler 
posted an update 3 days ago
🚀 𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗞𝗮𝗻𝗼𝗻 𝟮 𝗘𝗻𝗿𝗶𝗰𝗵𝗲𝗿: 𝘁𝗵𝗲 𝘄𝗼𝗿𝗹𝗱’𝘀 𝗳𝗶𝗿𝘀𝘁 𝗵𝗶𝗲𝗿𝗮𝗿𝗰𝗵𝗶𝗰𝗮𝗹 𝗴𝗿𝗮𝗽𝗵𝗶𝘁𝗶𝘇𝗮𝘁𝗶𝗼𝗻 𝗺𝗼𝗱𝗲𝗹

Today we’re publicly releasing Kanon 2 Enricher, and with it, an entirely new class of AI model that we’re calling a hierarchical graphitization model.
This is fundamentally different from both universal extraction models and generative models.

As a hierarchical graphitization model, Kanon 2 Enricher natively outputs a 𝗸𝗻𝗼𝘄𝗹𝗲𝗱𝗴𝗲 𝗴𝗿𝗮𝗽𝗵 rather than tokens, which makes it architecturally incapable of hallucinating or inventing text that wasn’t present in the input.

What that enables in practice is unlike any other model or ML architecture on the market:

• 𝗡𝗼 𝗵𝗮𝗹𝗹𝘂𝗰𝗶𝗻𝗮𝘁𝗶𝗼𝗻𝘀 🤖
It cannot hallucinate. All references and links are stored as spans, meaning exact character offsets anchored to the original text.

• 𝗛𝗶𝗲𝗿𝗮𝗿𝗰𝗵𝗶𝗰𝗮𝗹 𝘀𝗲𝗴𝗺𝗲𝗻𝘁𝗮𝘁𝗶𝗼𝗻, 𝗻𝗼𝘁 𝗷𝘂𝘀𝘁 𝗲𝘅𝘁𝗿𝗮𝗰𝘁𝗶𝗼𝗻 📑
It deconstructs a document’s full nested hierarchy, down to chapters, sections, clauses, schedules, signatures, and even singular sentences, and classifies each span with dozens of contextual features.

• 𝗘𝗻𝘁𝗶𝘁𝘆 𝗲𝘅𝘁𝗿𝗮𝗰𝘁𝗶𝗼𝗻, 𝗱𝗶𝘀𝗮𝗺𝗯𝗶𝗴𝘂𝗮𝘁𝗶𝗼𝗻, 𝗮𝗻𝗱 𝗹𝗶𝗻𝗸𝗶𝗻𝗴 🔗
It resolves what references actually point to, then links entities, citations, and cross-references into a single coherent graph.

• 𝗚𝗿𝗮𝗽𝗵-𝗳𝗶𝗿𝘀𝘁 𝗲𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝗰𝘆 🏃‍➡️
Small enough to run locally on a consumer PC with sub-second latency, and it stays reliable on long documents where front
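The span representation described above can be illustrated with a small sketch (field names here are hypothetical, not the model's actual output schema):

```python
from dataclasses import dataclass

@dataclass
class Span:
    start: int   # character offset into the original text (inclusive)
    end: int     # character offset (exclusive)
    label: str   # e.g. "section_heading", "clause", "citation"

    def text(self, document: str) -> str:
        """Spans never synthesize text; they only index into the source."""
        return document[self.start:self.end]

doc = "Section 2. The Supplier shall deliver the goods by 1 March."
span = Span(start=0, end=10, label="section_heading")
print(span.text(doc))  # "Section 2."
```

Because every node in the graph is anchored this way, any claimed reference can be verified by slicing the source document at the stored offsets.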

To read more about our new model, check out our latest Hugging Face article:
https://huggingface.co/blog/isaacus/introducing-kanon-2-enricher
imnotkitty 
posted an update 3 days ago
A ranking of the most popular OpenClaw tools has been released!

🥇 Claw for All: The ultimate all-rounder. Simplifies deployment for both devs & pros with a seamless web/mobile experience.
🥈 OpenClaw Launch: Speed is king. Deploy your apps in under 30 seconds with a single click.
🥉 ClawTeam: Skip the setup. Get pre-configured AI agent blueprints built specifically for OpenClaw.
4️⃣ vibeclaw: Local-first. Run OpenClaw in your browser sandbox in literally 1 second.
5️⃣ Tinkerclaw: The startup favorite. Zero-code platform to deploy, manage, and scale AI assistants.
6️⃣ ClawWrapper: The "last mile" tool. Simplifies the entire packaging and launch process.

Which one are you adding to your stack? 🛠️
(Source: OpenClaw Directory)
BestWishYsh 
posted an update about 23 hours ago
🚀 Introducing Helios: a 14B real-time long-video generation model!

It’s completely wild—faster than 1.3B models and achieves this without using self-forcing. Welcome to the new era of video generation! 😎👇

💻 Code: https://github.com/PKU-YuanGroup/Helios
🏠 Page: https://pku-yuangroup.github.io/Helios-Page
📄 Paper: Helios: Real-Time Long Video Generation Model (2603.04379)

🔹 True Single-GPU Extreme Speed ⚡️
No need to rely on traditional workarounds like KV-cache, quantization, sparse/linear attention, or TinyVAE. Helios hits an end-to-end 19.5 FPS on a single H100!

Training is also highly accessible: a single 80GB GPU has enough VRAM to fit four 14B models.

🔹 Solving Long-Video "Drift" from the Core 🎥
Tired of visual drift and repetitive loops? We ditched traditional hacks (like error banks, self-forcing, or keyframe sampling).

Instead, our innovative training strategy simulates & eliminates drift directly, keeping minute-long videos incredibly coherent with stunning quality. ✨

🔹 3 Model Variants for Full Coverage 🛠️
With a unified architecture natively supporting T2V, I2V, and V2V, we are open-sourcing 3 flavors:

1️⃣ Base: Single-stage denoising for the highest fidelity.
2️⃣ Mid: Pyramid denoising + CFG-Zero for the perfect balance of quality & throughput.
3️⃣ Distilled: Adversarial Distillation (DMD) for ultra-fast, few-step generation.

🔹 Day-0 Ecosystem Ready 🌍
We wanted deployment to be a breeze from the second we launched. Helios drops with comprehensive Day-0 hardware and framework support:

✅ Huawei Ascend-NPU
✅ HuggingFace Diffusers
✅ vLLM-Omni
✅ SGLang-Diffusion

Try it out and let us know what you think!
ajibawa-2023 
posted an update 1 day ago
Cpp-Code-Large
Dataset: ajibawa-2023/Cpp-Code-Large

Cpp-Code-Large is a large-scale corpus of more than 5 million lines of C++ source code, designed to support research in large language model (LLM) pretraining, code intelligence, software engineering automation, and static program analysis for the C++ ecosystem.

As a high-volume, language-specific corpus, it enables systematic experimentation in C++-focused model training, domain adaptation, and downstream code understanding. It fills the need for a dedicated C++-only dataset at substantial scale, covering systems programming, performance-critical applications, embedded systems, game engines, and large-scale native software projects.
alvarobartt 
posted an update about 12 hours ago
Learn how to deploy Microsoft Research VibeVoice ASR on Microsoft Azure Foundry with Hugging Face to generate rich audio transcriptions with Who, When, and What! 💥

> 🕒 60-minute single-pass processing, no chunking or stitching
> 👤 Customized hotwords to guide recognition on domain-specific content
> 📝 Rich transcription: joint ASR + diarization + timestamping in one pass
> 🌍 50+ languages with automatic detection and code-switching support
> 🤗 Deployed on Microsoft Foundry via an OpenAI-compatible Chat Completions API

https://huggingface.co/docs/microsoft-azure/foundry/examples/deploy-vibevoice-asr