---
title: Council Topics Classifier
emoji: 🏛️
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.36.0
app_file: src/streamlit_app.py
pinned: false
license: cc-by-4.0
---

# 🏛️ Council Topics Classifier

**Council Topics Classifier** is a system that automatically identifies the topics of **discussion subjects in Portuguese municipal meeting minutes**.

---

## 🎯 About

This demo showcases the classifier's ability to:
- Detect topics in discussion subjects from Portuguese municipal meeting minutes
- Use a hybrid feature set (TF-IDF + BERTimbau embeddings)
- Combine Logistic Regression and Gradient Boosting models in an adaptive weighted ensemble
- Apply dynamic thresholds optimized per topic
- Handle imbalanced topic distributions with active learning
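
As an illustration of per-topic thresholds, the sketch below picks, for each topic, the cutoff that maximizes F1 on held-out data. It is a minimal example and not the app's actual code: the helper names (`best_threshold`, `tune_thresholds`), the candidate grid, the choice of F1 as the objective, and the topic names in the toy data are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels given as lists of 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(y_true, scores, candidates=None):
    """Return the candidate threshold with the highest F1 (hypothetical helper).

    Ties are resolved in favour of the lowest candidate.
    """
    if candidates is None:
        candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t


def tune_thresholds(val_labels, val_scores):
    """Map each topic to its own threshold, tuned independently."""
    return {
        topic: best_threshold(val_labels[topic], val_scores[topic])
        for topic in val_labels
    }


# Toy validation data with made-up topic names and scores.
val_labels = {"urbanism": [1, 1, 0, 0], "education": [1, 0, 0, 0]}
val_scores = {"urbanism": [0.9, 0.6, 0.4, 0.2], "education": [0.3, 0.2, 0.1, 0.05]}

thresholds = tune_thresholds(val_labels, val_scores)
print(thresholds)
```

In this toy data the rarer topic receives systematically lower scores, so a single global cutoff of 0.5 would miss it entirely, while a per-topic cutoff still recovers the positive example.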

---

## 📊 Model Performance

- **Model Architecture**: Logistic Regression + 3x Gradient Boosting models
- **Features**: TF-IDF (unigrams to trigrams) + BERTimbau contextual embeddings
- **Adaptive weighting**: Rare topics get higher LogReg weight, common topics get higher GB weight
- **Dynamic thresholds**: Optimized per topic using validation data
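
The adaptive weighting can be sketched as follows. This is a hedged illustration only: the README does not give the weighting formula, so the linear interpolation, the frequency bounds (`rare`, `common`), the weight range (`w_min`, `w_max`), the simple averaging of the three Gradient Boosting outputs, and the function names are all assumptions.

```python
def logreg_weight(topic_frequency, rare=0.01, common=0.20, w_min=0.3, w_max=0.7):
    """Weight given to Logistic Regression for one topic (hypothetical rule).

    Rare topics get w_max, common topics get w_min, and the weight
    falls linearly in between.
    """
    if topic_frequency <= rare:
        return w_max
    if topic_frequency >= common:
        return w_min
    position = (topic_frequency - rare) / (common - rare)
    return w_max - position * (w_max - w_min)


def ensemble_probability(p_logreg, p_gb_models, topic_frequency):
    """Blend LogReg with the mean of the Gradient Boosting probabilities."""
    w_lr = logreg_weight(topic_frequency)
    p_gb = sum(p_gb_models) / len(p_gb_models)  # assumed: plain average of the GB models
    return w_lr * p_logreg + (1.0 - w_lr) * p_gb


# Same model outputs, different topic frequencies (share of training examples).
rare_score = ensemble_probability(0.8, [0.2, 0.2, 0.2], topic_frequency=0.005)
common_score = ensemble_probability(0.8, [0.2, 0.2, 0.2], topic_frequency=0.30)
print(round(rare_score, 2), round(common_score, 2))
```

With identical model outputs, the rare topic's score follows Logistic Regression more closely, while the common topic's score is pulled toward the Gradient Boosting average. Each blended score would then be compared against that topic's own threshold.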

---

## 📝 Usage

1. **Try Your Own Text**: Paste Portuguese municipal text in the input area  
2. **Demo Examples**: Select from pre-loaded examples to see topic predictions  
3. **View Results**: Confidence scores for each predicted topic are displayed interactively  

---

## 🔧 Running Locally

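```bash
# Install the dependencies
pip install -r requirements.txt
# Launch the app (path matches app_file in the Space configuration above)
streamlit run src/streamlit_app.py
```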