Why I think local, open-source models will eventually win.
The most useful AI applications are moving toward multi-turn agentic behavior: systems that take hundreds or even thousands of iterative steps to complete a task, e.g. Claude Code, or computer-control agents that click, type, and test repeatedly.
In these cases, the power of the model lies not in how smart it is per token, but in how quickly it can interact with its environment and tools across many steps. In that regime, model quality becomes secondary to latency.
An open-source model that can call tools quickly, check that the right thing was clicked, or verify that a code change actually passes tests can easily outperform a slightly "smarter" closed model that has to make remote API calls for every move.
Eventually, the balance tips: it becomes impractical for an agent to rely on remote inference for every micro-action. Just as no one would tolerate a keyboard that required a network request per keystroke, users won't accept agent workflows bottlenecked by latency. All devices will ship with local, open-source models that are "good enough" and the expectation will shift toward everything running locally. It'll happen sooner than most people think.
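The argument above is ultimately arithmetic: waiting time compounds linearly with step count. A minimal sketch, with purely illustrative latency numbers (the 500 ms remote and 50 ms local figures are assumptions, not benchmarks):

```python
def total_latency_s(steps: int, per_step_ms: float) -> float:
    """Total wall-clock time spent waiting on the model across all steps."""
    return steps * per_step_ms / 1000


# A coding agent looping edit -> run tests -> fix for 1000 steps:
steps = 1000
remote = total_latency_s(steps, 500)  # hypothetical remote API round trip
local = total_latency_s(steps, 50)    # hypothetical on-device inference

print(f"remote: {remote:.0f}s, local: {local:.0f}s")
```

Under these assumed numbers, the remote agent spends roughly eight minutes of pure waiting where the local one spends under a minute; the gap widens with every additional step, regardless of per-token quality.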
deepseek-ai/DeepSeek-OCR is out! My take:
> pretty insane that it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient vision-tokens-to-performance ratio
> covers 100 languages
IBM just released a small Swiss Army knife for document models: granite-docling-258M on Hugging Face
> not only a document converter: it can also do document question answering and understands multiple languages
> best part: released under the Apache 2.0 license, so you can use it in your commercial projects!
> it supports transformers, vLLM, and MLX from the get-go!
> built on SigLIP2 & granite-165M
The largest-ever dataset of co-folded 3D protein-ligand structures just dropped on HF!
Meet SAIR (Structurally Augmented IC50 Repository): 5M+ AI-generated complexes with experimentally measured drug potency data, from SandboxAQ.
The first vision-language model built on openai/gpt-oss-20b just dropped!
InternVL3.5 comes with 32 models: pre-trained, fine-tuned, and aligned variants in various sizes (OpenGVLab/internvl35-68ac87bd52ebe953485927fb). It uses gpt-oss or Qwen3 for the LLM part.