Ron Wolf

ron-wolf

AI & ML interests

None yet

Recent Activity

updated a collection 5 days ago

liked a model 5 days ago

bartowski/stepfun-ai_Step-3.5-Flash-GGUF

Organizations

None yet

upvoted a paper 5 days ago

Intelligence per Watt: Measuring Intelligence Efficiency of Local AI

Paper • 2511.07885 • Published Nov 11, 2025 • 10

upvoted a collection 16 days ago

MiniMax-M2.1

3 items • Updated 11 days ago • 13

upvoted a paper 18 days ago

SETOL: A Semi-Empirical Theory of (Deep) Learning

Paper • 2507.17912 • Published Jul 23, 2025 • 1

upvoted a collection 19 days ago

DeepSeekCoder-V2

6 items • Updated Nov 27, 2025 • 113

upvoted a paper 2 months ago

Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models

Paper • 2510.10964 • Published Oct 13, 2025 • 3

upvoted 2 collections 2 months ago

Devstral 2

A couple of agentic LLMs for software engineering tasks, excelling at using tools to explore codebases, edit multiple files, and power SWE Agents. • 3 items • Updated Dec 9, 2025 • 44

Nemotron-Post-Training-v3

Collection of datasets used in the post-training phase of Nemotron Nano v3. • 8 items • Updated about 14 hours ago • 65

upvoted 4 papers 4 months ago

Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples

Paper • 2510.07192 • Published Oct 8, 2025 • 5

Language Models are Injective and Hence Invertible

Paper • 2510.15511 • Published Oct 17, 2025 • 69

Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation

Paper • 2408.13586 • Published Aug 24, 2024 • 3

Locally Typical Sampling

Paper • 2202.00666 • Published Feb 1, 2022 • 4

upvoted 9 collections 6 months ago

Lingshu MLLMs

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning • 5 items • Updated about 7 hours ago • 21

Kimi-VL-A3B

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated 28 days ago • 78

Amoral Collection - Gemma 3 QAT

4 items • Updated May 1, 2025 • 7

Gemma 3 Release

28 items • Updated Aug 11, 2025 • 613

Gemma 3 Collection

Some fun things I've made on Gemma 3 • 6 items • Updated Apr 18, 2025 • 2

DeepSeek-V3.1

4 items • Updated Nov 27, 2025 • 261

CardProjector-v3

8 items • Updated Apr 2, 2025 • 5

RpR Models

RpR (RolePlay with Reasoning) models which are built on RPMax datasets with properly trained multi-turn reasoning. • 8 items • Updated Jun 25, 2025 • 18

GPT-OSS General (4.2B to 20B)

Collection of pruned GPT-OSS models spanning 1-32 experts, maintaining general capabilities across domains while reducing computational requirements. • 29 items • Updated Aug 13, 2025 • 10