Part of the mom-multilingual-class collection: long-context models for the MoM multilingual classifiers (domain, jailbreak, PII, factual, feedback).
Binary classifier for determining if a query requires fact-checking, based on mmBERT-32K-YaRN.
It distinguishes factual queries that need verification from creative or opinion queries that do not.
Factual (FACT_CHECK_NEEDED): queries that ask about verifiable facts, e.g. "What is the capital of France?"
Non-Factual (NO_FACT_CHECK_NEEDED): creative, opinion, or instruction-style queries that do not require verification, e.g. "Write me a poem about the ocean"
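The usage example below decodes predictions by hard-coding index 1 as FACT_CHECK_NEEDED. As a minimal sketch, the label mapping can instead be attached to the model config via the standard Transformers id2label/label2id arguments; the mapping itself is an assumption inferred from the example, not a published artifact of the adapter.

# Assumed label mapping (0 -> NO_FACT_CHECK_NEEDED, 1 -> FACT_CHECK_NEEDED),
# attached to the classification head so predictions can be decoded by name.
id2label = {0: "NO_FACT_CHECK_NEEDED", 1: "FACT_CHECK_NEEDED"}
label2id = {label: idx for idx, label in id2label.items()}

# Passed alongside num_labels=2 when loading the base model (see below):
# model = AutoModelForSequenceClassification.from_pretrained(
#     base_model, num_labels=2, id2label=id2label, label2id=label2id
# )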
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel
import torch

base_model = "llm-semantic-router/mmbert-32k-yarn"
adapter = "llm-semantic-router/mmbert32k-factcheck-classifier-lora"

# Load the tokenizer from the adapter repo and the base model with a
# two-label classification head, then attach the LoRA adapter weights.
tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

# Example queries
queries = [
    "What is the capital of France?",   # -> FACT_CHECK_NEEDED
    "Write me a poem about the ocean",  # -> NO_FACT_CHECK_NEEDED
]

for query in queries:
    inputs = tokenizer(query, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=-1).item()
    label = "FACT_CHECK_NEEDED" if prediction == 1 else "NO_FACT_CHECK_NEEDED"
    print(f"{query} -> {label}")
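For routing decisions it can be useful to look at the softmax confidence rather than only the argmax. The snippet below is a minimal sketch that reuses the model, tokenizer, and queries from the example above; the max_length=32768 value is an assumption based on the 32K-context base model, and the confidence reporting is illustrative rather than a documented feature of the adapter.

import torch.nn.functional as F

# Batch the queries and report per-class probabilities so low-confidence
# predictions can be flagged for review. max_length=32768 is assumed from
# the 32K-context base model, not taken from published settings.
batch = tokenizer(
    queries,
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=32768,
)
with torch.no_grad():
    logits = model(**batch).logits

probs = F.softmax(logits, dim=-1)
for query, p in zip(queries, probs):
    label = "FACT_CHECK_NEEDED" if p[1] > p[0] else "NO_FACT_CHECK_NEEDED"
    print(f"{query} -> {label} (confidence {p.max().item():.2f})")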
License: Apache 2.0