Jonatan Borkowski PRO

j14i

jborkowski

AI & ML interests

None yet

Recent Activity

liked a model about 18 hours ago

Qwen/Qwen3.5-397B-A17B

Image-Text-to-Text • 403B • Updated 3 days ago • 303k • 907

reacted to qgallouedec's post with 🔥 3 days ago

Post

2379

@CohereLabs just released 🌿 Tiny Aya: a fully open-source 3B parameter model that speaks 70+ languages 🌍! But there’s a catch:

Tiny Aya is just a language model. It doesn’t support tool calling, the key capability that turns frontier models into powerful *agents*.
So the real question is:

How hard is it to turn Tiny Aya into an agent?

Turns out… it’s simple, thanks to Hugging Face TRL.
We’re sharing a hands-on example showing how to train Tiny Aya to turn it into a tool-calling agent using TRL, unlocking what could become the first *massively multilingual open agent*.

Small model. Global reach. Agent capabilities.

👉 https://github.com/huggingface/trl/blob/main/examples/notebooks/sft_tool_calling.ipynb

1 reply

liked a model 12 days ago

unsloth/GLM-5

Text Generation • Updated 11 days ago • 122 • 16

upvoted 2 articles 14 days ago

Article

From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output

15 days ago

Article

Transformers.js v4 Preview: Now Available on NPM!

14 days ago

reacted to danielhanchen's post with ❤️ 17 days ago

Post

3669

We created a tool-calling guide for local LLMs!

Learn how to use any open model like Qwen3-Coder-Next and GLM-4.7-Flash for function calling.

Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms

We provide hands-on examples for: story writing, Python execution, terminal tool calls, maths and more.

7 replies

liked a model 17 days ago

unsloth/Qwen3-Coder-Next-GGUF

Text Generation • 80B • Updated about 1 hour ago • 502k • 387

liked a model 24 days ago

ayanami-kitasan/code-pruner

Token Classification • 0.6B • Updated 27 days ago • 172 • 5

reacted to codelion's post with 🔥 24 days ago

Post

3146

Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models

I wrote a deep dive into how Magic AI's 100M token context window might work, starting from their HashHop benchmark and building up to MALM - a Memory-Augmented Language Model.

Key insight: treating each key as a single token enables perfect retrieval at unlimited context lengths.

The article covers:

- How HashHop works and why its perfect accuracy is suspicious
- Building a tokenized solver that achieves 100% accuracy
- Scaling to MALM for real code search tasks
- Why this approach could handle 100M+ tokens

Read the full article: https://huggingface.co/blog/codelion/reverse-engineering-magic-hashhop

Try the model: codelion/malm-165m

Code: https://github.com/codelion/hash-hop

1 reply

upvoted an article 26 days ago

Article

Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms

Nov 20, 2025

liked a model 26 days ago

unsloth/Kimi-K2-Thinking-GGUF

1T • Updated 27 days ago • 5.86k • 113

liked a model 29 days ago

kyutai/tts-voices

Updated 10 days ago • 132

reacted to danielhanchen's post with ❤️ about 1 month ago

Post

2615

You can now fine-tune embedding models in our free Unsloth notebook! 🤗

Fine-tuning embedding models improves retrieval & RAG by aligning vectors to your domain-specific notion of similarity, improving search, clustering, and recommendations on your data.

⭐ Blog + Notebooks: https://unsloth.ai/docs/new/embedding-finetuning

Unsloth trains embedding models 1.8-3.3x faster with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.

We'd like to thank Hugging Face and Unsloth contributor electroglyph for making this possible!

3 replies

reacted to mahimairaja's post with 🔥 about 1 month ago

Post

2129

My Favorite Open Source Models for Jan 2026

1. General Use - deepseek-ai/DeepSeek-V3.2
2. Reasoning - deepseek-ai/DeepSeek-V3.2-Speciale
3. Coding - Qwen/Qwen3-Coder-30B-A3B-Instruct
4. OCR - Qwen/Qwen3-VL-8B-Instruct
5. Image Generation - black-forest-labs/FLUX.2-dev
6. Image Editing - Qwen/Qwen-Image-Edit-2509

What model do you use regularly?

4 replies

reacted to sergiopaniego's post with 🔥 about 1 month ago

Post

1634

FunctionGemma Tuning Lab is a new no-code tool by @google that lets you fine-tune a model directly from the browser, with no coding knowledge required, using TRL behind the scenes.

blog: https://developers.googleblog.com/a-guide-to-fine-tuning-functiongemma/

try it out: google/functiongemma-tuning-lab

This example builds on a more advanced one for learning fine-tuning with SFT using TRL: https://ai.google.dev/gemma/docs/functiongemma/finetuning-with-functiongemma

1 reply

liked 3 models about 1 month ago

upvoted 2 articles about 1 month ago

Article

Scaling OpenEnv: From Free Usage to Thousands of Concurrent Environments

Jan 20

Article

NVIDIA brings agents to life with DGX Spark and Reachy Mini

Jan 5
