Part of the mom-multilingual-class collection: long-context models for MoM multilingual classifiers (domain, jailbreak, PII, factual, feedback).
A 4-class user feedback classifier based on mmbert-32k-yarn. This is the merged version with LoRA weights integrated - no PEFT library required.
This model classifies user messages into 4 feedback categories to help conversational AI systems understand user satisfaction:
| Label | ID | Description |
|---|---|---|
| SAT | 0 | User is satisfied with the response |
| NEED_CLARIFICATION | 1 | User needs more explanation or details |
| WRONG_ANSWER | 2 | User indicates the response was incorrect |
| WANT_DIFFERENT | 3 | User wants an alternative approach/answer |
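The same mapping ships with the checkpoint's Hugging Face config, so downstream code does not need to hard-code it. A minimal sketch of reading it at runtime (assuming `id2label`/`label2id` are populated in the repository's `config.json`, as is standard for sequence-classification checkpoints):

```python
from transformers import AutoConfig

# Inspect the label mapping stored with the checkpoint
# (assumes id2label/label2id are present in config.json).
config = AutoConfig.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
print(config.id2label)  # e.g. {0: "SAT", 1: "NEED_CLARIFICATION", ...}
print(config.label2id)  # inverse mapping used during training
```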
Validation Results (2,985 samples):
| Metric | Value |
|---|---|
| Accuracy | 98.83% |
| F1 (macro) | 98.24% |
| F1 (weighted) | 98.83% |
Per-Class Performance:
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| SAT | 1.0000 | 1.0000 | 1.0000 | 1,491 |
| NEED_CLARIFICATION | 0.9980 | 0.9980 | 0.9980 | 498 |
| WRONG_ANSWER | 0.9604 | 0.9739 | 0.9671 | 498 |
| WANT_DIFFERENT | 0.9715 | 0.9578 | 0.9646 | 498 |
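The table follows standard scikit-learn reporting conventions (macro vs. weighted F1, per-class precision/recall/support). A hedged sketch of how such a report can be reproduced on a labeled validation split; `y_true`/`y_pred` below are placeholders for integer label IDs, not the actual validation data:

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score

label_names = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]

# Placeholder IDs; in practice these come from the validation set
# and from running the model (see the inference snippets below).
y_true = [0, 0, 1, 2, 3]
y_pred = [0, 0, 1, 2, 3]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 (macro):", f1_score(y_true, y_pred, average="macro"))
print("F1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
print(classification_report(y_true, y_pred, target_names=label_names))
```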
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
tokenizer = AutoTokenizer.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
model.eval()

# Label mapping
labels = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]

# Example inference
texts = [
    "Thank you, that's exactly what I needed!",
    "I don't understand, can you explain more?",
    "That's incorrect, the answer should be different.",
    "Can you give me another approach?",
]

for text in texts:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    pred = outputs.logits.argmax(-1).item()
    conf = probs[0][pred].item()
    print(f"{labels[pred]:20} ({conf:.1%}) | {text}")
```
Output:
```
SAT                  (81.2%) | Thank you, that's exactly what I needed!
NEED_CLARIFICATION   (100.0%) | I don't understand, can you explain more?
WRONG_ANSWER         (100.0%) | That's incorrect, the answer should be different.
WANT_DIFFERENT       (100.0%) | Can you give me another approach?
```
```python
# Efficient batch processing
texts = ["Your text 1", "Your text 2", "Your text 3"]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(-1).tolist()
feedback_types = [labels[p] for p in predictions]
```
This model was fine-tuned using LoRA (Low-Rank Adaptation) with the following configuration:
| Parameter | Value |
|---|---|
| Base Model | llm-semantic-router/mmbert-32k-yarn |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Learning Rate | 2e-5 |
| Batch Size | 16 |
| Epochs | 10 (early stopped at ~5.4) |
| Precision | bf16 |
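For reference, the hyperparameters above roughly correspond to a PEFT `LoraConfig` like the one below. This is a reconstruction from the table, not the original training script; the target modules and dropout are assumptions:

```python
from peft import LoraConfig, TaskType

# Approximate LoRA setup reconstructed from the table above.
# target_modules and lora_dropout are assumptions; the original
# training script is not published with this model card.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=64,            # LoRA rank
    lora_alpha=128,  # LoRA alpha
    target_modules=["query", "key", "value"],  # assumed attention projections
    lora_dropout=0.1,                          # assumed
)
```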
Supports 1,800+ languages via the Glot500 tokenizer.
Two versions of this model are published:
| Version | Size | Requires PEFT | Use Case |
|---|---|---|---|
| LoRA | ~54MB | Yes | Fine-tuning, research |
| Merged (this) | ~615MB | No | Production, inference |
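To use the LoRA variant instead of this merged checkpoint, the adapter has to be attached to the base model with PEFT. A minimal sketch, assuming a PEFT-compatible adapter repository (the adapter repo id below is hypothetical):

```python
from peft import PeftModel
from transformers import AutoModelForSequenceClassification

# Merged version (this repository): load directly, no PEFT needed.
merged = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)

# LoRA version: load the base model, then attach the adapter.
base = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert-32k-yarn", num_labels=4
)
lora = PeftModel.from_pretrained(
    base,
    "llm-semantic-router/mmbert32k-feedback-detector",  # hypothetical adapter repo id
)
```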
Citation:
```bibtex
@misc{mmbert32k-feedback-detector,
  title={mmBERT-32K Feedback Detector},
  author={LLM Semantic Router Team},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/llm-semantic-router/mmbert32k-feedback-detector-merged}
}
```
License: Apache 2.0
Base model: jhu-clsp/mmBERT-base