EuroBERT Geopolitical Classifier (Binary)

Fine-tuned EuroBERT/EuroBERT-210m for binary classification of geopolitical tension in European news text.

  • Task: Sequence classification (binary)
  • Labels: non_geopolitical (0), geopolitical (1)
  • Intended use: Detects whether an article reflects geopolitical tension.
  • Languages: English, German, French, Spanish, Italian
  • Framework: 🤗 Transformers (PyTorch)

Quick start

Inference with transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Durrani95/eurobert-geopolitical-binary"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
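# Note: EuroBERT uses a custom architecture; if loading fails, it may be
# necessary to pass trust_remote_code=True to both from_pretrained calls.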

texts = [
    "Energy Sanctions Deepen Divide Between Western Bloc and Major Oil Exporters.",
    "Military Exercises Near Disputed Waters Raise Fears of Regional Escalations.",

]

inputs = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1)

for text, p in zip(texts, probs):
    label_id = int(p.argmax())
    label = model.config.id2label[label_id]
    confidence = float(p[label_id])
    print(f"{label:>16}  {confidence:6.2%}  | {text}")

Labels

{
  "0": "non_geopolitical",
  "1": "geopolitical"
}

Rather than taking the argmax, you can apply a decision threshold to the geopolitical score (e.g., score >= 0.5), depending on your precision/recall trade-off.
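For example, continuing from the quick-start snippet above (the 0.5 value is a placeholder; tune it against your own validation data):

THRESHOLD = 0.5  # placeholder; choose via validation
geo_scores = probs[:, 1]                  # column 1 = "geopolitical"
preds = (geo_scores >= THRESHOLD).long()  # 1 = geopolitical, 0 = non_geopolitical

for text, score, pred in zip(texts, geo_scores, preds):
    print(f"{model.config.id2label[int(pred)]:>16}  {float(score):6.2%}  | {text}")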


Training & Evaluation

  • Base model: EuroBERT/EuroBERT-210m
  • Objective: Cross-entropy (binary)
  • Data: European news text labeled for geopolitical relevance
  • Hardware: A100 GPU
  • Epochs: 1
  • Optimizer: AdamW with linear scheduler
  • Metrics (validation set):

    Metric      Score
    Accuracy    0.95
    F1-score    0.95
    Precision   0.93
    Recall      0.97
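These are standard binary-classification metrics; a minimal sketch of how they can be computed with scikit-learn, written as a compute_metrics hook in the Hugging Face Trainer convention (names are illustrative):

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Metrics hook in the (logits, labels) format passed by the Trainer."""
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }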

Training setup

Parameter                      Value
Learning rate                  3e-5
Effective batch size           64 (16 per device × 4 accumulation steps)
Per-device batch size          16
Gradient accumulation steps    4
Weight decay                   1e-5
AdamW betas                    (0.9, 0.95)
AdamW epsilon                  1e-8
Max epochs                     1
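These settings map onto Hugging Face TrainingArguments roughly as follows; a minimal sketch, assuming tokenized train/validation splits (train_ds and val_ds are placeholders):

from transformers import Trainer, TrainingArguments

# Hyperparameters from the table above; 16 per device x 4 accumulation steps
# gives the effective batch size of 64.
args = TrainingArguments(
    output_dir="eurobert-geopolitical-binary",
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    weight_decay=1e-5,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
)

trainer = Trainer(
    model=model,              # AutoModelForSequenceClassification, num_labels=2
    args=args,
    train_dataset=train_ds,   # placeholder: your tokenized training split
    eval_dataset=val_ds,      # placeholder: your tokenized validation split
    compute_metrics=compute_metrics,  # the hook sketched in the metrics section
)
trainer.train()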

Limitations & Risks

  • May be sensitive to domain shift (non-news, social media text)
  • Class imbalance can affect thresholding; calibrate on your own validation data (a sketch follows this list)
  • Multilingual performance can vary across languages and registers
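One simple calibration recipe is to sweep candidate thresholds on held-out scores and keep the one with the best F1; a hypothetical helper (pick_threshold and its inputs are illustrative, not part of this repository):

import numpy as np
from sklearn.metrics import f1_score

def pick_threshold(val_scores, val_labels, grid=np.linspace(0.05, 0.95, 19)):
    """Return the threshold on the geopolitical score that maximizes F1."""
    f1s = [f1_score(val_labels, val_scores >= t) for t in grid]
    return float(grid[int(np.argmax(f1s))])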

How to cite

If you use this model, please cite this repository and the EuroBERT base model.
