The new Qwen3.6 family (Qwen/Qwen3.6-27B, Qwen/Qwen3.6-35B-A3B) reuses the Qwen3.5-MoE architecture but ships a slightly different chat template, so we updated the stack end-to-end: a new training template with {% generation %} markers, tool-call response schema routing, and tiny test models for the VLM matrix. Fine-tuning with loss computed on assistant turns only:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# any conversational dataset works; Capybara shown here as an example
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen3.6-27B",
    args=SFTConfig(assistant_only_loss=True),
    train_dataset=dataset,
)
trainer.train()
```

You can now also pass tools=[...] to GRPOTrainer. On deck: trl vllm-serve draft-model work (Qwen3 MTP / Eagle3 drafts), 12 more KTO ↔ DPO alignment PRs (KTO promotion to stable is now in reach), three more {% generation %} chat templates (Gemma/Gemma 2, Phi-3, GLM-4-MoE), and a chunky SFT entropy bug fix.
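For context, assistant_only_loss relies on the chat template itself marking which spans the assistant produced, via the {% generation %} / {% endgeneration %} tags that transformers' apply_chat_template understands. A minimal sketch of such a template (illustrative ChatML-style markup, not the exact Qwen3.6 template):

```jinja
{%- for message in messages %}
{%- if message['role'] == 'assistant' %}
<|im_start|>assistant
{% generation %}{{ message['content'] }}<|im_end|>{% endgeneration %}
{%- else %}
<|im_start|>{{ message['role'] }}
{{ message['content'] }}<|im_end|>
{%- endif %}
{%- endfor %}
```

Everything wrapped in {% generation %} … {% endgeneration %} is flagged as assistant output, which is exactly the token mask the assistant-only loss trains on.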
View project metrics with an interactive dashboard
View and monitor your data with an interactive dashboard
Track and visualize your project data on an interactive dashboard
Track and visualize your data with a real‑time dashboard
View and manage your tracking data with an interactive dashboard
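To make the assistant-only masking concrete, here is a minimal sketch of the mechanism, assuming the collator yields an assistant-token mask (1 = assistant token, 0 = prompt/user token). Shapes and variable names are illustrative, not TRL internals:

```python
import torch
import torch.nn.functional as F

# toy batch: (batch, seq_len, vocab) logits and next-token targets
logits = torch.randn(1, 6, 32)
labels = torch.randint(0, 32, (1, 6))
assistant_mask = torch.tensor([[0, 0, 0, 1, 1, 1]])  # last 3 tokens are assistant output

# non-assistant positions get label -100, which cross_entropy ignores
masked_labels = labels.masked_fill(assistant_mask == 0, -100)
loss = F.cross_entropy(
    logits.view(-1, logits.size(-1)),
    masked_labels.view(-1),
    ignore_index=-100,
)
print(loss.item())  # loss averaged over the 3 assistant tokens only
```

The -100 sentinel is the same convention transformers uses for label padding, so masking user/prompt tokens this way composes cleanly with the rest of the training loop.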