mmBERT-32K Feedback Detector (Merged)

A 4-class user feedback classifier based on mmbert-32k-yarn. This is the merged version, with the LoRA adapter weights folded into the base model, so the PEFT library is not required for inference.

Model Description

This model classifies user messages into 4 feedback categories to help conversational AI systems understand user satisfaction:

Label               ID  Description
SAT                 0   User is satisfied with the response
NEED_CLARIFICATION  1   User needs more explanation or details
WRONG_ANSWER        2   User indicates the response was incorrect
WANT_DIFFERENT      3   User wants an alternative approach/answer
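
The label mapping can also be read from the checkpoint itself; a minimal sketch, assuming id2label and label2id were written to config.json during fine-tuning:

from transformers import AutoConfig

# Load only the config to inspect the label mapping (assumes id2label/label2id are set in config.json)
config = AutoConfig.from_pretrained("llm-semantic-router/mmbert32k-feedback-detector-merged")
print(config.id2label)   # expected: {0: "SAT", 1: "NEED_CLARIFICATION", 2: "WRONG_ANSWER", 3: "WANT_DIFFERENT"}
print(config.label2id)   # reverse mapping, useful when preparing training labels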

Performance

Validation Results (2,985 samples):

Metric         Value
Accuracy       98.83%
F1 (macro)     98.24%
F1 (weighted)  98.83%

Per-Class Performance:

Class               Precision  Recall  F1-Score  Support
SAT                 1.0000     1.0000  1.0000    1,491
NEED_CLARIFICATION  0.9980     0.9980  0.9980    498
WRONG_ANSWER        0.9604     0.9739  0.9671    498
WANT_DIFFERENT      0.9715     0.9578  0.9646    498
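
For reference, a minimal sketch of how these figures could be recomputed with scikit-learn, assuming you hold the validation set; val_labels and preds below are placeholders for the gold and predicted integer class IDs:

from sklearn.metrics import accuracy_score, classification_report, f1_score

# val_labels: gold class IDs (0-3); preds: model predictions (see Batch Inference below)
print("accuracy:     ", accuracy_score(val_labels, preds))
print("f1 (macro):   ", f1_score(val_labels, preds, average="macro"))
print("f1 (weighted):", f1_score(val_labels, preds, average="weighted"))
print(classification_report(
    val_labels, preds,
    target_names=["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"],
    digits=4,
))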

Quick Start

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
tokenizer = AutoTokenizer.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
model.eval()

# Label mapping
labels = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]

# Example inference
texts = [
    "Thank you, that's exactly what I needed!",
    "I don't understand, can you explain more?",
    "That's incorrect, the answer should be different.",
    "Can you give me another approach?",
]

for text in texts:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    
    probs = torch.softmax(outputs.logits, dim=-1)
    pred = outputs.logits.argmax(-1).item()
    conf = probs[0][pred].item()
    
    print(f"{labels[pred]:20} ({conf:.1%}) | {text}")

Output:

SAT                  (81.2%) | Thank you, that's exactly what I needed!
NEED_CLARIFICATION   (100.0%) | I don't understand, can you explain more?
WRONG_ANSWER         (100.0%) | That's incorrect, the answer should be different.
WANT_DIFFERENT       (100.0%) | Can you give me another approach?

Batch Inference

# Efficient batch processing
texts = ["Your text 1", "Your text 2", "Your text 3"]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

predictions = outputs.logits.argmax(-1).tolist()
feedback_types = [labels[p] for p in predictions]
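
For larger volumes, a sketch that adds per-item confidences and processes inputs in fixed-size chunks (the batch size of 32 is an arbitrary choice; model, tokenizer, and labels come from the Quick Start above):

def classify_batch(texts, batch_size=32):
    results = []
    for i in range(0, len(texts), batch_size):
        chunk = texts[i:i + batch_size]
        inputs = tokenizer(chunk, return_tensors="pt", padding=True, truncation=True, max_length=512)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs = torch.softmax(logits, dim=-1)
        for text, p in zip(chunk, probs):
            pred = int(p.argmax())
            results.append({"text": text, "label": labels[pred], "confidence": float(p[pred])})
    return results

print(classify_batch(["Thanks, that solved it!", "No, that's not right."]))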

Training Details

This model was fine-tuned using LoRA (Low-Rank Adaptation) with the following configuration:

Parameter      Value
Base Model     llm-semantic-router/mmbert-32k-yarn
LoRA Rank      64
LoRA Alpha     128
Learning Rate  2e-5
Batch Size     16
Epochs         10 (early stopped at ~5.4)
Precision      bf16
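
A minimal sketch of how this LoRA setup could be expressed with the peft library; the dropout, task type, and target_modules below are assumptions, not the exact training script:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification

base = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert-32k-yarn", num_labels=4
)
lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=64,                            # LoRA rank (from the table above)
    lora_alpha=128,                  # LoRA alpha (from the table above)
    lora_dropout=0.1,                # assumption: not stated in this card
    target_modules=["Wqkv", "Wo"],   # assumption: ModernBERT attention projections
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()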

Training Data

Hardware

  • GPU: AMD Instinct MI300X
  • Training Time: ~10 minutes

Model Architecture

  • Architecture: ModernBERT (Sequence Classification)
  • Parameters: ~321M (base model with LoRA weights merged in)
  • Max Context: 32,768 tokens (YaRN-scaled RoPE)
  • Hidden Size: 768
  • Layers: 22
  • Attention Heads: 12
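
These figures can be cross-checked against the checkpoint's config; a quick sketch (attribute names follow the standard transformers config conventions):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("llm-semantic-router/mmbert32k-feedback-detector-merged")
print(config.hidden_size)              # expected: 768
print(config.num_hidden_layers)        # expected: 22
print(config.num_attention_heads)      # expected: 12
print(config.max_position_embeddings)  # context length; the 32K window may also be expressed via rope_scaling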

Multilingual Support

Supports 1,800+ languages via the Glot500 tokenizer; a multilingual usage sketch follows the language list below. Best performance on:

  • English (primary)
  • Chinese
  • French
  • Spanish
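
A short sketch classifying non-English feedback with the same pipeline (the example sentences are illustrative only; model, tokenizer, and labels come from the Quick Start, and the predictions are not verified outputs):

multilingual_texts = [
    "谢谢，这正是我需要的！",  # Chinese: "Thanks, that's exactly what I needed!"
    "Je ne comprends pas, pouvez-vous expliquer davantage ?",  # French: asking for clarification
    "Eso es incorrecto, la respuesta debería ser otra.",  # Spanish: flagging a wrong answer
]
for text in multilingual_texts:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    print(labels[logits.argmax(-1).item()], "|", text)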

Use Cases

  • Conversational AI: Detect user satisfaction in chatbots
  • Customer Support: Route conversations based on feedback type
  • Quality Monitoring: Track user satisfaction trends
  • Dialog Systems: Trigger clarification or correction flows (a routing sketch follows below)
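
A minimal routing sketch for the last two use cases, mapping each predicted label to a hypothetical downstream action (the action names are placeholders; model, tokenizer, and labels come from the Quick Start):

def route_feedback(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    label = labels[logits.argmax(-1).item()]

    # Placeholder actions; wire these to your own dialog or support system
    actions = {
        "SAT": "close_ticket",
        "NEED_CLARIFICATION": "send_clarification",
        "WRONG_ANSWER": "escalate_correction",
        "WANT_DIFFERENT": "offer_alternative",
    }
    return label, actions[label]

print(route_feedback("That's incorrect, the answer should be different."))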

Comparison: LoRA vs Merged

Version        Size    Requires PEFT  Use Case
LoRA           ~54MB   Yes            Fine-tuning, research
Merged (this)  ~615MB  No             Production, inference
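
For completeness, a sketch of how the non-merged LoRA variant would be loaded with PEFT; the adapter repo id below is a placeholder, since only the merged checkpoint is documented here:

from peft import PeftModel
from transformers import AutoModelForSequenceClassification

base = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert-32k-yarn", num_labels=4
)
# Placeholder adapter id; substitute the actual LoRA adapter repo if you use that variant
model = PeftModel.from_pretrained(base, "llm-semantic-router/<lora-adapter-repo>")
model = model.merge_and_unload()  # optional: fold the adapters in, matching this merged release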

Citation

@misc{mmbert32k-feedback-detector,
  title={mmBERT-32K Feedback Detector},
  author={LLM Semantic Router Team},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/llm-semantic-router/mmbert32k-feedback-detector-merged}
}

License

Apache 2.0
