EuroBERT Geopolitical Classifier (Binary)
Fine-tuned EuroBERT/EuroBERT-210m for binary classification of geopolitical tension in European news text.
- Task: Sequence classification (binary)
- Labels: non_geopolitical (0), geopolitical (1)
- Intended use: Detect whether an article reflects geopolitical tension.
- Languages: English, German, French, Spanish, Italian
- Framework: 🤗 Transformers (PyTorch)
Quick start
Inference with transformers
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Durrani95/eurobert-geopolitical-binary"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

texts = [
    "Energy Sanctions Deepen Divide Between Western Bloc and Major Oil Exporters.",
    "Military Exercises Near Disputed Waters Raise Fears of Regional Escalations.",
]

# Tokenize as one padded batch; inputs longer than 512 tokens are truncated.
inputs = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

# Forward pass without gradient tracking, then convert logits to class probabilities.
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1)

# Print the top label and its probability for each text.
for text, p in zip(texts, probs):
    label_id = int(p.argmax())
    label = model.config.id2label[label_id]
    confidence = float(p[label_id])
    print(f"{label:>16} {confidence:6.2%} | {text}")
Labels
{
"0": "non_geopolitical",
"1": "geopolitical"
}
Instead of taking the argmax, you may apply a decision threshold to the geopolitical probability (e.g., predict geopolitical when score >= 0.5), chosen according to your precision/recall trade-off.
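A minimal sketch of applying a custom threshold, assuming the score is the softmax probability of the geopolitical class (index 1). The helper names `geopolitical_probability` and `classify_with_threshold` and the example logits are hypothetical; plain Python is used so the logic is visible without downloading the model.

```python
import math

ID2LABEL = {0: "non_geopolitical", 1: "geopolitical"}  # same mapping as the label map


def geopolitical_probability(logits):
    """Softmax probability of class 1 for one pair of logits [logit_0, logit_1]."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def classify_with_threshold(logits, threshold=0.5):
    """Return (label, score), predicting 'geopolitical' when score >= threshold."""
    score = geopolitical_probability(logits)
    label_id = 1 if score >= threshold else 0
    return ID2LABEL[label_id], score


# Hypothetical logits for one article: a mild preference for class 1.
example_logits = [0.0, 1.0]
default_label, score = classify_with_threshold(example_logits)  # threshold 0.5
strict_label, _ = classify_with_threshold(example_logits, threshold=0.9)  # favors precision
```

With real model output, each row of `logits` can be passed in as `row.tolist()`. Raising the threshold generally trades recall for precision, and lowering it does the opposite.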
Training & Evaluation
- Base model: EuroBERT/EuroBERT-210m
- Objective: Cross-entropy (binary)
- Data: European news text labeled for geopolitical relevance
- Hardware: A100 GPU
- Epochs: 1
- Optimizer: AdamW with linear scheduler
- Metrics (validation set):
| Metric | Score |
|---|---|
| Accuracy | 0.95 |
| F1-score | 0.95 |
| Precision | 0.93 |
| Recall | 0.97 |
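
As a consistency check, F1 is the harmonic mean of precision and recall, so the reported values can be related directly. The snippet below is a small sketch that assumes all three metrics are reported for the geopolitical class (the card does not state the averaging mode); the helper name `f1_from_precision_recall` is illustrative and not part of the model's code.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values from the validation table (rounded to two decimals there).
f1 = f1_from_precision_recall(precision=0.93, recall=0.97)
print(f"{f1:.2f}")  # prints 0.95
```

The result agrees with the reported F1-score after rounding.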
Training setup
| Parameter | Value |
|---|---|
| Learning rate | 3e-5 |
| Desired (effective) batch size | 64 |
| Actual GPU batch size | 16 |
| Gradient accumulation | 4 steps |
| Weight decay | 1e-5 |
| Betas | (0.9, 0.95) |
| Epsilon | 1e-8 |
| Max epochs | 1 |
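
The batch-size rows are related by: effective batch size = GPU batch size × gradient accumulation steps. The sketch below works through that arithmetic and a linear learning-rate decay. The dataset size `num_examples` is hypothetical (the card does not state it), and the schedule assumes no warmup and decay to zero, neither of which the card specifies.

```python
import math

LEARNING_RATE = 3e-5
GPU_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
EPOCHS = 1

# One optimizer update sees GPU_BATCH_SIZE * GRAD_ACCUM_STEPS examples.
effective_batch_size = GPU_BATCH_SIZE * GRAD_ACCUM_STEPS  # 64, as in the table

num_examples = 6400  # hypothetical dataset size, for illustration only
optimizer_steps = math.ceil(num_examples / effective_batch_size) * EPOCHS


def linear_lr(step, total_steps, base_lr=LEARNING_RATE):
    """Learning rate after `step` updates under linear decay to zero (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


lr_start = linear_lr(0, optimizer_steps)
lr_mid = linear_lr(optimizer_steps // 2, optimizer_steps)
lr_end = linear_lr(optimizer_steps, optimizer_steps)
```

Accumulating gradients over 4 micro-batches of 16 gives the same number of examples per update as a single batch of 64, while keeping per-step GPU memory at the smaller batch size.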
Limitations & Risks
- May be sensitive to domain shift (e.g., non-news or social media text)
- Class imbalance can affect thresholding; calibrate on your validation data
- Multilingual performance can vary across languages and registers
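
One way to calibrate the threshold is to sweep candidate values on held-out validation data and keep the one with the best F1. The sketch below assumes you already have per-example geopolitical probabilities and gold labels; the data shown is made up for illustration, and `f1_at_threshold` and `best_threshold` are hypothetical helpers.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 for the positive class when predicting 1 if score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest F1 (first one wins ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))


# Made-up validation scores (probability of class 1) and gold labels.
val_scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.10]
val_labels = [1, 1, 1, 0, 0, 0, 0]
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]

chosen = best_threshold(val_scores, val_labels, candidates)
```

Running the same sweep separately per language is a simple way to see whether one threshold suits all five languages. If precision matters more than recall in your setting, swap F1 for a metric that reflects that preference.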
How to cite
If you use this model, please cite this repository and the EuroBERT base model.