Diffutron

non-profit

Recent Activity

suayptalha authored a paper about 1 month ago

Selectivity and Shape in the Design of Forward-Forward Goodness Functions

Q-bert authored a paper about 1 month ago

Selectivity and Shape in the Design of Forward-Forward Goodness Functions

Q-bert submitted a paper 2 months ago

Diffutron: A Masked Diffusion Language Model for Turkish Language

Papers

Diffutron: A Masked Diffusion Language Model for Turkish Language

Organization Card

Diffutron: A Masked Diffusion Language Model for Turkish Language

| 🤗 Models | 📊 Pre-training Dataset | 📄 Paper |

Overview

Diffutron is a lightweight, non-autoregressive Masked Diffusion Language Model (MDLM) optimized for Turkish. It uses a discrete diffusion process to generate text through iterative refinement, allowing for bidirectional context awareness and high parameter efficiency.
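To make "iterative refinement" concrete, here is a toy sketch of confidence-based unmasking, one common way masked diffusion models decode. It is not Diffutron's sampler: `toy_predictor`, `iterative_unmask`, the `[MASK]` string, and the tiny vocabulary are all hypothetical, and the predictor returns random guesses where a trained denoiser would condition on the unmasked tokens on both sides.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol, assumed for illustration


def toy_predictor(tokens, vocab, rng):
    """Hypothetical stand-in for a trained denoiser.

    Returns {position: (token, confidence)} for every masked position.
    The guesses here are random; a real model would use bidirectional context.
    """
    return {
        i: (rng.choice(vocab), rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def iterative_unmask(length, vocab, steps=4, seed=0):
    """Start fully masked and commit the most confident guesses step by step."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        guesses = toy_predictor(tokens, vocab, rng)
        if not guesses:
            break
        # Spread the remaining masked positions evenly over the remaining steps.
        remaining_steps = steps - step
        n_commit = -(-len(guesses) // remaining_steps)  # ceiling division
        ranked = sorted(guesses.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _conf) in ranked[:n_commit]:
            tokens[pos] = tok
    return tokens


if __name__ == "__main__":
    vocab = ["merhaba", "dünya", "bugün", "hava", "güzel"]
    print(iterative_unmask(8, vocab, steps=4, seed=0))
```

Every position is predicted at every step, and only the most confident guesses are committed. This differs from left-to-right autoregressive decoding, where each token is fixed in order.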

Core Features

Architecture: Discrete Masked Diffusion (MDLM) using a 307M parameter encoder backbone.
Efficiency: Diffutron-2nd-Stage (0.3B) averages 34.68 on the CETVEL benchmarks, ahead of TURNA (1.1B, 33.19), Kumru (2B, 34.09), and Aya-101 (13B, 31.78), though behind Kanarya (2B), Llama-3.2 (3B), and Trendyol (7B).
Adaptation: LoRA-based (r=256) continual pre-training on a 2M-sequence Turkish corpus.
Instruction Tuning: Progressive strategy using the LlamaTurk and InstrucTurca datasets for improved instruction following.
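For a sense of scale of the LoRA rank above, the sketch below counts the trainable parameters a rank-`r` adapter adds to a single weight matrix. Only `r=256` comes from the card. The hidden size of 1024 is an assumption chosen for illustration, and `lora_extra_params` is a hypothetical helper, not part of any library.

```python
def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors: A (rank x d_in) and B (d_out x rank).
    """
    return rank * (d_in + d_out)


HIDDEN = 1024  # ASSUMED hidden size, not stated on the Diffutron card
RANK = 256     # LoRA rank stated in the feature list

full = HIDDEN * HIDDEN                       # parameters in one square projection
extra = lora_extra_params(HIDDEN, HIDDEN, RANK)
ratio = extra / full

print(f"full matrix: {full:,} params")
print(f"LoRA adapter: {extra:,} params ({ratio:.0%} of the full matrix)")
```

Under that assumed hidden size, a rank-256 adapter carries half as many parameters as the matrix it adapts. That is a far higher rank than the single- or double-digit values often used for light fine-tuning.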

Benchmarks

Diffutron reports a significant reduction in perplexity (values are not listed on this card) and competitive scores across the CETVEL benchmark suite:

| Benchmark | Diffutron-1st-Stage (0.3B) | Diffutron-2nd-Stage (0.3B) | TURNA (1.1B) | Kumru (2B) | Kanarya (2B) | Llama-3.2 (3B) | Trendyol (7B) | Aya-101 (13B) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Belebele_TR | 22.22 | 27.00 | 22.56 | 29.00 | 28.11 | 55.78 | 36.22 | 22.89 |
| EXAMS_TR | 25.95 | 27.74 | 23.66 | 30.03 | 30.03 | 26.21 | 28.50 | 22.90 |
| IronyTR | 50.67 | 52.00 | 48.33 | 51.00 | 50.00 | 50.17 | 50.00 | 52.17 |
| News_Cat | 23.20 | 32.40 | 32.80 | 26.40 | 66.80 | 64.00 | 81.20 | 20.00 |
| MNLI_TR | 33.29 | 32.81 | 34.94 | 36.42 | 33.40 | 34.76 | 35.19 | 27.90 |
| STS_TR | 17.77 | 18.78 | 14.21 | 11.75 | 12.91 | 12.91 | 15.52 | 16.97 |
| XCOPA_TR | 53.80 | 52.00 | 55.80 | 54.00 | 64.20 | 54.60 | 61.00 | 59.60 |
| Average | 32.41 | 34.68 | 33.19 | 34.09 | 40.78 | 42.63 | 43.95 | 31.78 |
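The Average row can be re-derived from the seven per-task scores. The following is a minimal check with the numbers copied from the table. The variable names are illustrative, and the unweighted mean is an assumption about how the row was computed.

```python
# Per-task scores copied from the table, in row order:
# Belebele_TR, EXAMS_TR, IronyTR, News_Cat, MNLI_TR, STS_TR, XCOPA_TR
scores = {
    "Diffutron-1st-Stage (0.3B)": [22.22, 25.95, 50.67, 23.20, 33.29, 17.77, 53.80],
    "Diffutron-2nd-Stage (0.3B)": [27.00, 27.74, 52.00, 32.40, 32.81, 18.78, 52.00],
    "TURNA (1.1B)": [22.56, 23.66, 48.33, 32.80, 34.94, 14.21, 55.80],
    "Kumru (2B)": [29.00, 30.03, 51.00, 26.40, 36.42, 11.75, 54.00],
    "Kanarya (2B)": [28.11, 30.03, 50.00, 66.80, 33.40, 12.91, 64.20],
    "Llama-3.2 (3B)": [55.78, 26.21, 50.17, 64.00, 34.76, 12.91, 54.60],
    "Trendyol (7B)": [36.22, 28.50, 50.00, 81.20, 35.19, 15.52, 61.00],
    "Aya-101 (13B)": [22.89, 22.90, 52.17, 20.00, 27.90, 16.97, 59.60],
}

# The Average row as reported in the table.
reported_average = {
    "Diffutron-1st-Stage (0.3B)": 32.41,
    "Diffutron-2nd-Stage (0.3B)": 34.68,
    "TURNA (1.1B)": 33.19,
    "Kumru (2B)": 34.09,
    "Kanarya (2B)": 40.78,
    "Llama-3.2 (3B)": 42.63,
    "Trendyol (7B)": 43.95,
    "Aya-101 (13B)": 31.78,
}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


computed_average = {name: mean(vals) for name, vals in scores.items()}

# Print models from highest to lowest recomputed average.
for name, avg in sorted(computed_average.items(), key=lambda kv: kv[1], reverse=True):
    # A difference below 0.005 means the values agree to two decimals.
    agrees = abs(avg - reported_average[name]) < 0.005
    status = "matches" if agrees else "differs from"
    print(f"{name:28s} {avg:6.2f}  {status} reported {reported_average[name]:.2f}")
```

Run as written, every recomputed mean agrees with the reported Average row to two decimals.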

Citation

@misc{diffutron2026,
      title={Diffutron: A Masked Diffusion Language Model for Turkish Language}, 
      author={Şuayp Talha Kocabay and Talha Rüzgar Akkuş},
      year={2026},
      eprint={2603.20466},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.20466}, 
}

Collections 1

Diffutron: A Masked Diffusion Language Model for Turkish Language

models 3

diffutron/DiffutronLM-0.3B-Base

0.3B • Updated Mar 24 • 3 • 1

diffutron/DiffutronLM-0.3B-Instruct

Text Generation • 0.3B • Updated Mar 24 • 11 • 4

diffutron/DiffutronLM-0.3B-1st-Stage

datasets 1

diffutron/DiffutronLM-Pretraining-Corpus

Team members 2