Initial upload

- README.md +308 -13
- app.py +342 -0
- config.py +78 -0
- pipeline/__init__.py +17 -0
- pipeline/critique_extraction.py +145 -0
- pipeline/disagreement_detection.py +174 -0
- pipeline/disagreement_resolution.py +242 -0
- pipeline/meta_review.py +170 -0
- pipeline/search_retrieval.py +224 -0
- requirements.txt +38 -0
- utils/__init__.py +29 -0
- utils/queue_manager.py +76 -0
- utils/rate_limiter.py +84 -0
- utils/validators.py +196 -0
README.md
CHANGED
@@ -1,13 +1,308 @@
# 🔬 Automated Consensus Analysis API

A comprehensive HuggingFace Spaces API for automated peer review consensus analysis using LLMs and search-augmented verification.

## 🌟 Features

- **Critique Extraction**: Extract structured critique points from peer reviews using Gemini 2.0
- **Disagreement Detection**: Identify conflicts and disagreements between reviewers
- **Search-Augmented Verification**: Retrieve supporting/contradicting evidence from academic sources
- **Disagreement Resolution**: AI-powered resolution using DeepSeek-R1 with reasoning
- **Meta-Review Generation**: Comprehensive meta-reviews synthesizing all analyses
- **Rate Limiting**: 10 requests per minute per client
- **Queue Management**: Up to 3 concurrent pipeline executions
- **Progress Tracking**: Real-time status updates for long-running tasks

## 🚀 Quick Start

### Local Development

1. **Clone and setup**

   ```bash
   cd api
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```

2. **Configure environment**

   ```bash
   cp .env.example .env
   # Edit .env with your API keys
   ```

3. **Run the application**

   ```bash
   python app.py
   ```

Visit `http://localhost:7860` to access the Gradio interface.

### HuggingFace Spaces Deployment

1. **Create a new Space**

   - Go to [HuggingFace Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Select "Gradio" as SDK

2. **Upload files**

   - Upload all files from the `api/` directory
   - Ensure `requirements.txt` and `app.py` are in the root

3. **Configure secrets**

   - Go to Space Settings → Repository secrets
   - Add the following secrets:
     - `GEMINI_API_KEY`
     - `OPENROUTER_API_KEY`
     - `TAVILY_API_KEY`
     - `SERPAPI_API_KEY`

4. **Deploy**

   - The Space will automatically build and deploy

## 📚 API Endpoints

### Full Pipeline

**Endpoint**: `/api/full_pipeline`
**Method**: POST
**Description**: Run the complete consensus analysis pipeline

**Request Body**:

```json
{
  "paper_title": "Visual Correspondence Hallucination",
  "paper_abstract": "This paper investigates...",
  "reviews": [
    "Review 1: The methodology is sound but...",
    "Review 2: While the experiments are comprehensive..."
  ]
}
```

**Response**:

```json
{
  "request_id": "req_123456789",
  "paper_title": "...",
  "critique_points": [...],
  "disagreements": [...],
  "search_results": {...},
  "resolution": [...],
  "meta_review": "..."
}
```
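A client can condense this response into a quick status line before reading the full meta-review. A minimal sketch (field names follow the response schema above; `summarize_results` and the sample payload are illustrative, not part of the API):

```python
def summarize_results(result: dict) -> str:
    """Condense a full_pipeline response into a one-line summary."""
    n_crit = len(result.get("critique_points", []))
    n_dis = len(result.get("disagreements", []))
    n_res = len(result.get("resolution", []))
    return (f"{result['paper_title']}: {n_crit} critique sets, "
            f"{n_dis} disagreements, {n_res} resolutions")

# Hypothetical payload shaped like the response above
sample = {
    "request_id": "req_123456789",
    "paper_title": "Novel Approach to X",
    "critique_points": [{}, {}],
    "disagreements": [{}],
    "resolution": [{}],
    "meta_review": "...",
}
print(summarize_results(sample))
# → Novel Approach to X: 2 critique sets, 1 disagreements, 1 resolutions
```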

### Individual Stages

#### Critique Extraction

**Endpoint**: `/api/critique_extraction`
**Method**: POST

```json
{
  "reviews": ["Review 1 text...", "Review 2 text..."]
}
```

#### Disagreement Detection

**Endpoint**: `/api/disagreement_detection`
**Method**: POST

```json
{
  "critiques": [
    {"Methodology": [...], "Experiments": [...]},
    {"Methodology": [...], "Experiments": [...]}
  ]
}
```

#### Search & Retrieval

**Endpoint**: `/api/search_retrieval`
**Method**: POST

```json
{
  "paper_title": "...",
  "paper_abstract": "...",
  "critiques": [...]
}
```

#### Progress Tracking

**Endpoint**: `/api/progress/{request_id}`
**Method**: GET

**Response**:

```json
{
  "stage": "search_retrieval",
  "progress": 0.5,
  "message": "Searching for relevant research...",
  "timestamp": "2025-01-15T10:30:00"
}
```
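A polling client can render this payload as a single progress line. A minimal sketch, assuming only the fields shown above (`format_progress` is an illustrative helper, not part of the API):

```python
def format_progress(status: dict) -> str:
    """Render a /api/progress payload as a human-readable progress line."""
    pct = int(status.get("progress", 0.0) * 100)
    return f"[{pct:3d}%] {status.get('stage', '?')}: {status.get('message', '')}"

# Payload shaped like the example response above
status = {
    "stage": "search_retrieval",
    "progress": 0.5,
    "message": "Searching for relevant research...",
    "timestamp": "2025-01-15T10:30:00",
}
print(format_progress(status))
# → [ 50%] search_retrieval: Searching for relevant research...
```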

## 🔧 Configuration

### Environment Variables

| Variable                  | Description                    | Default  |
| ------------------------- | ------------------------------ | -------- |
| `GEMINI_API_KEY`          | Google Gemini API key          | Required |
| `OPENROUTER_API_KEY`      | OpenRouter API key (DeepSeek)  | Required |
| `TAVILY_API_KEY`          | Tavily Search API key          | Required |
| `SERPAPI_API_KEY`         | SerpAPI key for Google Scholar | Optional |
| `MAX_REQUESTS_PER_MINUTE` | Rate limit                     | 10       |
| `MAX_CONCURRENT_TASKS`    | Max parallel executions        | 3        |
| `MAX_RETRIES`             | Retry attempts on failure      | 5        |
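
The numeric settings are read from the environment with fallbacks, as in `config.py`. A minimal sketch of the pattern (the `load_int` helper is illustrative; the defaults match the table above):

```python
import os

def load_int(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    return int(os.getenv(name, str(default)))

# Same defaults as the table above
MAX_REQUESTS_PER_MINUTE = load_int("MAX_REQUESTS_PER_MINUTE", 10)
MAX_CONCURRENT_TASKS = load_int("MAX_CONCURRENT_TASKS", 3)
MAX_RETRIES = load_int("MAX_RETRIES", 5)
```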

### Rate Limits

- **10 requests per minute** per client IP
- **Maximum 3 concurrent** pipeline executions
- **Queue size**: 20 pending requests
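
When the limit is hit, clients should back off and retry. A minimal sketch of an exponential backoff schedule (the 2-second base mirrors `BASE_RETRY_WAIT` in `config.py`; the 60-second cap is an assumption, not a documented value):

```python
def backoff_schedule(max_retries: int = 5, base_wait: float = 2.0,
                     cap: float = 60.0) -> list[float]:
    """Exponential backoff delays (seconds) to sleep between retries."""
    return [min(base_wait * (2 ** attempt), cap) for attempt in range(max_retries)]

print(backoff_schedule())
# → [2.0, 4.0, 8.0, 16.0, 32.0]
```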

## 🏗️ Architecture

```
api/
├── app.py                           # Main Gradio application
├── config.py                        # Configuration management
├── requirements.txt                 # Python dependencies
├── pipeline/                        # Pipeline modules
│   ├── critique_extraction.py       # Gemini-based extraction
│   ├── disagreement_detection.py
│   ├── search_retrieval.py          # LangChain search agent
│   ├── disagreement_resolution.py   # DeepSeek resolution
│   └── meta_review.py
└── utils/                           # Utility modules
    ├── rate_limiter.py
    ├── queue_manager.py
    └── validators.py
```

## 🔍 Pipeline Stages

1. **Critique Extraction** (Gemini 2.0)

   - Extracts structured critique points
   - Categories: Methodology, Experiments, Clarity, Significance, Novelty

2. **Disagreement Detection** (Gemini 2.0)

   - Compares all review pairs
   - Assigns disagreement scores (0-1)
   - Identifies specific conflict points

3. **Search & Retrieval** (LangChain + Multi-Search)

   - SoTA research discovery
   - Evidence validation
   - Sources: Semantic Scholar, arXiv, Google Scholar, Tavily

4. **Disagreement Resolution** (DeepSeek-R1)

   - Validates critique points
   - Accepts/rejects based on evidence
   - Provides resolution summaries

5. **Meta-Review Generation** (DeepSeek-R1)

   - Synthesizes all analyses
   - Provides final verdict
   - Offers actionable recommendations
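
Stage 2's "all review pairs" comparison grows quadratically with the number of reviews. A minimal sketch of the pairing step (scoring itself is done by the LLM; `review_pairs` is an illustrative helper, not a function from this codebase):

```python
from itertools import combinations

def review_pairs(critiques: list[dict]) -> list[tuple[int, int]]:
    """Enumerate the (i, j) index pairs of reviews compared in stage 2."""
    return list(combinations(range(len(critiques)), 2))

# Three reviews yield three pairwise comparisons
print(review_pairs([{}, {}, {}]))
# → [(0, 1), (0, 2), (1, 2)]
```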

## 📊 Example Usage

### Python

```python
import requests

response = requests.post(
    "https://your-space.hf.space/api/full_pipeline",
    json={
        "paper_title": "Novel Approach to X",
        "paper_abstract": "We propose...",
        "reviews": [
            "Reviewer 1: Strong methodology...",
            "Reviewer 2: Weak experimental validation..."
        ]
    }
)

result = response.json()
print(result["meta_review"])
```

### cURL

```bash
curl -X POST https://your-space.hf.space/api/full_pipeline \
  -H "Content-Type: application/json" \
  -d '{
    "paper_title": "Novel Approach to X",
    "paper_abstract": "We propose...",
    "reviews": ["Review 1...", "Review 2..."]
  }'
```

## 🛠️ Development

### Running Tests

```bash
pytest tests/
```

### Code Quality

```bash
# Format code
black .

# Type checking
mypy .

# Linting
ruff check .
```

## 📝 License

See the main project LICENSE file.

## 🤝 Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Submit a pull request

## 📧 Support

For issues or questions:

- Open an issue on GitHub
- Contact: [Your contact info]

## 🔗 Links

- [HuggingFace Space](https://huggingface.co/spaces/your-username/consensus-analysis)
- [Main Repository](https://github.com/your-username/automated-consensus-analysis)
- [Documentation](https://your-docs-site.com)
app.py
ADDED
@@ -0,0 +1,342 @@
import gradio as gr
import json
import os
from typing import Dict, List, Optional
from datetime import datetime
import asyncio
from functools import wraps

from pipeline.critique_extraction import extract_critiques
from pipeline.disagreement_detection import detect_disagreements
from pipeline.search_retrieval import search_and_retrieve
from pipeline.disagreement_resolution import resolve_disagreements
from pipeline.meta_review import generate_meta_review
from utils.rate_limiter import RateLimiter
from utils.queue_manager import QueueManager
from utils.validators import validate_paper_input

from dotenv import load_dotenv
load_dotenv()

# Initialize rate limiter and queue manager
rate_limiter = RateLimiter(max_requests_per_minute=10)
queue_manager = QueueManager(max_concurrent=3)

# Progress tracking
progress_store = {}

def update_progress(request_id: str, stage: str, progress: float, message: str):
    """Update progress for a request"""
    progress_store[request_id] = {
        "stage": stage,
        "progress": progress,
        "message": message,
        "timestamp": datetime.now().isoformat()
    }

async def full_pipeline(
    paper_title: str,
    paper_abstract: str,
    reviews: List[str],
    request_id: Optional[str] = None
) -> Dict:
    """
    Run the complete consensus analysis pipeline

    Args:
        paper_title: Title of the paper
        paper_abstract: Abstract of the paper
        reviews: List of review texts
        request_id: Optional request ID for progress tracking

    Returns:
        Complete pipeline results
    """
    if not request_id:
        request_id = f"req_{datetime.now().timestamp()}"

    results = {
        "request_id": request_id,
        "paper_title": paper_title,
        "paper_abstract": paper_abstract
    }

    try:
        # Stage 1: Critique Extraction
        update_progress(request_id, "critique_extraction", 0.1, "Extracting critique points...")
        critique_results = await extract_critiques(reviews)
        results["critique_points"] = critique_results

        # Stage 2: Disagreement Detection
        update_progress(request_id, "disagreement_detection", 0.3, "Detecting disagreements...")
        disagreement_results = await detect_disagreements(critique_results)
        results["disagreements"] = disagreement_results

        # Stage 3: Search & Retrieval
        update_progress(request_id, "search_retrieval", 0.5, "Searching for relevant research...")
        search_results = await search_and_retrieve(paper_title, paper_abstract, critique_results)
        results["search_results"] = search_results

        # Stage 4: Disagreement Resolution
        update_progress(request_id, "disagreement_resolution", 0.7, "Resolving disagreements...")
        resolution_results = await resolve_disagreements(
            paper_title,
            paper_abstract,
            disagreement_results,
            critique_results,
            search_results
        )
        results["resolution"] = resolution_results

        # Stage 5: Meta-Review Generation
        update_progress(request_id, "meta_review", 0.9, "Generating meta-review...")
        meta_review = await generate_meta_review(
            paper_title,
            paper_abstract,
            resolution_results,
            search_results
        )
        results["meta_review"] = meta_review

        update_progress(request_id, "complete", 1.0, "Pipeline complete!")
        return results

    except Exception as e:
        update_progress(request_id, "error", 0.0, f"Error: {str(e)}")
        raise

# Gradio Interface Functions
def run_full_pipeline_ui(title: str, abstract: str, reviews_json: str) -> str:
    """UI wrapper for full pipeline"""
    try:
        # Validate and parse input
        reviews = json.loads(reviews_json)
        if not isinstance(reviews, list):
            return json.dumps({"error": "Reviews must be a list of strings"}, indent=2)

        # Check rate limit
        if not rate_limiter.allow_request():
            return json.dumps({"error": "Rate limit exceeded. Please try again later."}, indent=2)

        # Add to queue and run
        request_id = f"ui_{datetime.now().timestamp()}"
        result = asyncio.run(queue_manager.add_task(
            full_pipeline(title, abstract, reviews, request_id)
        ))

        return json.dumps(result, indent=2)

    except json.JSONDecodeError:
        return json.dumps({"error": "Invalid JSON format for reviews"}, indent=2)
    except Exception as e:
        return json.dumps({"error": str(e)}, indent=2)

def run_critique_extraction_ui(reviews_json: str) -> str:
    """UI wrapper for critique extraction"""
    try:
        reviews = json.loads(reviews_json)
        if not rate_limiter.allow_request():
            return json.dumps({"error": "Rate limit exceeded"}, indent=2)

        result = asyncio.run(extract_critiques(reviews))
        return json.dumps(result, indent=2)
    except Exception as e:
        return json.dumps({"error": str(e)}, indent=2)

def run_disagreement_detection_ui(critiques_json: str) -> str:
    """UI wrapper for disagreement detection"""
    try:
        critiques = json.loads(critiques_json)
        if not rate_limiter.allow_request():
            return json.dumps({"error": "Rate limit exceeded"}, indent=2)

        result = asyncio.run(detect_disagreements(critiques))
        return json.dumps(result, indent=2)
    except Exception as e:
        return json.dumps({"error": str(e)}, indent=2)

def run_search_retrieval_ui(title: str, abstract: str, critiques_json: str) -> str:
    """UI wrapper for search retrieval"""
    try:
        critiques = json.loads(critiques_json)
        if not rate_limiter.allow_request():
            return json.dumps({"error": "Rate limit exceeded"}, indent=2)

        result = asyncio.run(search_and_retrieve(title, abstract, critiques))
        return json.dumps(result, indent=2)
    except Exception as e:
        return json.dumps({"error": str(e)}, indent=2)

def check_progress_ui(request_id: str) -> str:
    """Check progress of a request"""
    if request_id in progress_store:
        return json.dumps(progress_store[request_id], indent=2)
    return json.dumps({"error": "Request ID not found"}, indent=2)

# Build Gradio Interface
with gr.Blocks(title="Automated Consensus Analysis API", theme=gr.themes.Soft()) as demo:
    gr.Markdown("""
    # 🔬 Automated Consensus Analysis API

    This API provides automated peer review consensus analysis using LLMs and search-augmented verification.

    ## Features:
    - **Critique Extraction**: Extract structured critique points from reviews
    - **Disagreement Detection**: Identify conflicts between reviewers
    - **Search Retrieval**: Find supporting/contradicting evidence
    - **Resolution**: Resolve disagreements with evidence
    - **Meta-Review**: Generate comprehensive meta-reviews
    """)

    with gr.Tabs():
        # Full Pipeline Tab
        with gr.Tab("📋 Full Pipeline"):
            gr.Markdown("### Run the complete analysis pipeline")
            with gr.Row():
                with gr.Column():
                    full_title = gr.Textbox(label="Paper Title", placeholder="Enter paper title...")
                    full_abstract = gr.Textbox(label="Paper Abstract", lines=5, placeholder="Enter paper abstract...")
                    full_reviews = gr.Code(
                        label="Reviews (JSON Array)",
                        language="json",
                        value='["Review 1 text...", "Review 2 text..."]'
                    )
                    full_submit = gr.Button("🚀 Run Full Pipeline", variant="primary")
                with gr.Column():
                    full_output = gr.Code(label="Results", language="json")

            full_submit.click(
                fn=run_full_pipeline_ui,
                inputs=[full_title, full_abstract, full_reviews],
                outputs=full_output
            )

        # Individual Stages
        with gr.Tab("🔍 Critique Extraction"):
            gr.Markdown("### Extract critique points from reviews")
            critique_reviews = gr.Code(
                label="Reviews (JSON Array)",
                language="json",
                value='["Review 1...", "Review 2..."]'
            )
            critique_submit = gr.Button("Extract Critiques")
            critique_output = gr.Code(label="Extracted Critiques", language="json")

            critique_submit.click(
                fn=run_critique_extraction_ui,
                inputs=critique_reviews,
                outputs=critique_output
            )

        with gr.Tab("⚡ Disagreement Detection"):
            gr.Markdown("### Detect disagreements between reviews")
            disagree_critiques = gr.Code(
                label="Critique Points (JSON)",
                language="json"
            )
            disagree_submit = gr.Button("Detect Disagreements")
            disagree_output = gr.Code(label="Disagreement Analysis", language="json")

            disagree_submit.click(
                fn=run_disagreement_detection_ui,
                inputs=disagree_critiques,
                outputs=disagree_output
            )

        with gr.Tab("🔎 Search & Retrieval"):
            gr.Markdown("### Search for supporting evidence")
            with gr.Row():
                with gr.Column():
                    search_title = gr.Textbox(label="Paper Title")
                    search_abstract = gr.Textbox(label="Paper Abstract", lines=3)
                    search_critiques = gr.Code(label="Critiques (JSON)", language="json")
                    search_submit = gr.Button("Search Evidence")
                with gr.Column():
                    search_output = gr.Code(label="Search Results", language="json")

            search_submit.click(
                fn=run_search_retrieval_ui,
                inputs=[search_title, search_abstract, search_critiques],
                outputs=search_output
            )

        with gr.Tab("📊 Progress Tracking"):
            gr.Markdown("### Check pipeline progress")
            progress_id = gr.Textbox(label="Request ID", placeholder="Enter request ID...")
            progress_check = gr.Button("Check Progress")
            progress_output = gr.Code(label="Progress Status", language="json")

            progress_check.click(
                fn=check_progress_ui,
                inputs=progress_id,
                outputs=progress_output
            )

        with gr.Tab("📖 API Documentation"):
            gr.Markdown("""
            ## API Endpoints

            ### POST /api/full_pipeline
            Run the complete consensus analysis pipeline.

            **Request Body:**
            ```json
            {
              "paper_title": "string",
              "paper_abstract": "string",
              "reviews": ["review1", "review2", ...]
            }
            ```

            ### POST /api/critique_extraction
            Extract critique points from reviews.

            **Request Body:**
            ```json
            {
              "reviews": ["review1", "review2", ...]
            }
            ```

            ### POST /api/disagreement_detection
            Detect disagreements in critique points.

            **Request Body:**
            ```json
            {
              "critiques": [{"Methodology": [...], ...}, ...]
            }
            ```

            ### POST /api/search_retrieval
            Search for supporting evidence.

            **Request Body:**
            ```json
            {
              "paper_title": "string",
              "paper_abstract": "string",
              "critiques": [...]
            }
            ```

            ### GET /api/progress/{request_id}
            Check progress of a pipeline execution.

            ## Rate Limits
            - 10 requests per minute per IP
            - Maximum 3 concurrent pipeline executions

            ## Authentication
            API keys are managed through HuggingFace Spaces secrets.
            """)

# Launch the app
if __name__ == "__main__":
    demo.queue(max_size=20)  # Enable queuing
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        share=False
    )
config.py
ADDED
@@ -0,0 +1,78 @@
import os
from pathlib import Path

# Base directory
BASE_DIR = Path(__file__).parent

# API Configuration
API_TITLE = "Automated Consensus Analysis API"
API_VERSION = "1.0.0"
API_DESCRIPTION = """
## Automated Consensus Analysis for Peer Reviews

This API provides comprehensive analysis of peer review disagreements using:
- **LLM-based critique extraction** (Gemini 2.0)
- **Disagreement detection** between reviewers
- **Search-augmented evidence retrieval** (Semantic Scholar, arXiv, Google Scholar, Tavily)
- **AI-powered disagreement resolution** (DeepSeek-R1)
- **Meta-review generation**

### Features:
- ✅ Full pipeline or individual stage execution
- ✅ Rate limiting and queue management
- ✅ Progress tracking
- ✅ JSON and form data support
"""

# Rate Limiting
MAX_REQUESTS_PER_MINUTE = int(os.getenv("MAX_REQUESTS_PER_MINUTE", "10"))
MAX_CONCURRENT_TASKS = int(os.getenv("MAX_CONCURRENT_TASKS", "3"))
QUEUE_MAX_SIZE = int(os.getenv("QUEUE_MAX_SIZE", "20"))

# Model Configuration
GEMINI_MODEL = os.getenv("GEMINI_MODEL", "gemini-2.0-flash")
DEEPSEEK_MODEL = os.getenv("DEEPSEEK_MODEL", "deepseek/deepseek-r1")

# API Keys (from HF Spaces secrets)
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")
SERPAPI_API_KEY = os.getenv("SERPAPI_API_KEY")

# Retry Configuration
MAX_RETRIES = int(os.getenv("MAX_RETRIES", "5"))
BASE_RETRY_WAIT = int(os.getenv("BASE_RETRY_WAIT", "2"))

# Timeout Configuration
REQUEST_TIMEOUT = int(os.getenv("REQUEST_TIMEOUT", "300"))  # 5 minutes
SEARCH_TIMEOUT = int(os.getenv("SEARCH_TIMEOUT", "60"))  # 1 minute

# Logging
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")

def validate_environment():
    """
    Validate that all required environment variables are set

    Raises:
        ValueError: If required variables are missing
    """
    required_vars = {
        "GEMINI_API_KEY": GEMINI_API_KEY,
        "OPENROUTER_API_KEY": OPENROUTER_API_KEY,
        "TAVILY_API_KEY": TAVILY_API_KEY,
    }

    missing = [var for var, value in required_vars.items() if not value]

    if missing:
        raise ValueError(
            f"Missing required environment variables: {', '.join(missing)}\n"
            "Please set them in HuggingFace Spaces secrets."
        )

# Validate on import
try:
    validate_environment()
except ValueError as e:
    print(f"⚠️ Configuration Warning: {e}")
pipeline/__init__.py
ADDED
@@ -0,0 +1,17 @@
"""
Pipeline modules for automated consensus analysis
"""

from .critique_extraction import extract_critiques
from .disagreement_detection import detect_disagreements
from .search_retrieval import search_and_retrieve
from .disagreement_resolution import resolve_disagreements
from .meta_review import generate_meta_review

__all__ = [
    'extract_critiques',
    'detect_disagreements',
    'search_and_retrieve',
    'resolve_disagreements',
    'generate_meta_review',
]
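The exports above are the five pipeline stages, run in order: extract critiques, detect disagreements, search for evidence, resolve disagreements, generate the meta-review. A minimal sketch of that data flow with stub stages in place of the real modules (the stubs and their return shapes are assumptions for illustration; the real functions call external APIs):

```python
import asyncio

# Stub stages mirroring the exported pipeline order; they just thread
# data through so the wiring can be shown without network access.
async def extract_critiques(reviews):
    return [{"Methodology": [r]} for r in reviews]

async def detect_disagreements(critiques):
    # One entry per compared pair; here a single hypothetical pair.
    return [{"review_pair": [0, 1], "disagreement_score": 0.5}]

async def run_pipeline(reviews):
    critiques = await extract_critiques(reviews)
    disagreements = await detect_disagreements(critiques)
    return critiques, disagreements

critiques, disagreements = asyncio.run(run_pipeline(["review A", "review B"]))
print(len(critiques), disagreements[0]["review_pair"])  # 2 [0, 1]
```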
pipeline/critique_extraction.py
ADDED
@@ -0,0 +1,145 @@
import json
import os
from typing import List, Dict
import google.generativeai as genai
from pydantic import BaseModel
import asyncio
import time

from dotenv import load_dotenv
load_dotenv()

# Configure Gemini
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))


class CritiquePoint(BaseModel):
    Methodology: List[str] = []
    Experiments: List[str] = []
    Clarity: List[str] = []
    Significance: List[str] = []
    Novelty: List[str] = []


async def extract_single_critique(review_text: str, retries: int = 5) -> Dict:
    """
    Extract critique points from a single review using Gemini

    Args:
        review_text: The review text to analyze
        retries: Maximum number of retries

    Returns:
        Dictionary with categorized critique points
    """
    prompt = f"""
Extract key critique points from the following research paper review.
Categorize them into aspects: Methodology, Experiments, Clarity, Significance, Novelty.
Return a structured JSON with these categories as keys and lists of critique points as values.

Review:
{review_text}

Respond with ONLY valid JSON in this format:
{{
    "Methodology": ["point1", "point2"],
    "Experiments": ["point1"],
    "Clarity": ["point1", "point2"],
    "Significance": ["point1"],
    "Novelty": ["point1"]
}}
"""

    model = genai.GenerativeModel(
        model_name="gemini-2.0-flash",
        generation_config={
            "response_mime_type": "application/json",
        }
    )

    for attempt in range(retries):
        try:
            response = await asyncio.to_thread(
                model.generate_content,
                prompt
            )

            if not response.text.strip():
                raise ValueError("Empty response from Gemini")

            result = json.loads(response.text)

            # Validate structure
            critique = CritiquePoint(**result)
            return critique.model_dump()

        except genai.types.generation_types.BlockedPromptException as e:
            print(f"Content blocked by safety filters: {e}")
            return {
                "Methodology": [],
                "Experiments": [],
                "Clarity": [],
                "Significance": [],
                "Novelty": [],
                "error": "Content blocked by safety filters"
            }

        except Exception as e:
            wait_time = 2 ** attempt
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {wait_time}s...")

            if attempt < retries - 1:
                await asyncio.sleep(wait_time)
            else:
                return {
                    "Methodology": [],
                    "Experiments": [],
                    "Clarity": [],
                    "Significance": [],
                    "Novelty": [],
                    "error": str(e)
                }


async def extract_critiques(reviews: List[str]) -> List[Dict]:
    """
    Extract critique points from multiple reviews

    Args:
        reviews: List of review texts

    Returns:
        List of dictionaries with categorized critique points
    """
    if not reviews:
        return []

    # Filter valid reviews (must be strings with substantial content)
    valid_reviews = [r for r in reviews if isinstance(r, str) and len(r.strip()) > 100]

    if not valid_reviews:
        return []

    # Process reviews concurrently with rate limiting
    tasks = []
    for review in valid_reviews:
        # create_task schedules the coroutine immediately, so the sleep
        # below actually staggers the API calls (a bare coroutine would
        # not start running until asyncio.gather)
        tasks.append(asyncio.create_task(extract_single_critique(review)))
        # Small delay to avoid overwhelming the API
        await asyncio.sleep(0.5)

    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Filter out exceptions and return valid results
    critiques = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            print(f"Review {i} failed: {result}")
            critiques.append({
                "Methodology": [],
                "Experiments": [],
                "Clarity": [],
                "Significance": [],
                "Novelty": [],
                "error": str(result)
            })
        else:
            critiques.append(result)

    return critiques
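The retry loop above waits `2 ** attempt` seconds between failed attempts. A small sketch of that exponential backoff schedule for the default `retries=5`:

```python
def backoff_schedule(retries: int) -> list:
    """Seconds waited after each failed attempt, as in the retry loop
    of extract_single_critique (wait_time = 2 ** attempt)."""
    return [2 ** attempt for attempt in range(retries)]

print(backoff_schedule(5))  # [1, 2, 4, 8, 16]
```

With five retries, a fully failing request therefore spends up to 15 seconds sleeping before the final attempt (1 + 2 + 4 + 8; no sleep follows the last failure).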
pipeline/disagreement_detection.py
ADDED
@@ -0,0 +1,174 @@
import json
import os
from typing import List, Dict
from itertools import combinations
import google.generativeai as genai
from pydantic import BaseModel, Field
import asyncio

from dotenv import load_dotenv
load_dotenv()

# Configure Gemini
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))


class DisagreementDetails(BaseModel):
    Methodology: List[str] = Field(default_factory=list)
    Experiments: List[str] = Field(default_factory=list)
    Clarity: List[str] = Field(default_factory=list)
    Significance: List[str] = Field(default_factory=list)
    Novelty: List[str] = Field(default_factory=list)


class DisagreementResult(BaseModel):
    review_pair: List[int]
    disagreement_score: float = Field(..., ge=0.0, le=1.0)
    disagreement_details: DisagreementDetails


def list_to_string(lst: List[str]) -> str:
    """Convert list to formatted string"""
    return "\n".join(f"- {item}" for item in lst) if lst else "None"


async def compare_review_pair(
    review1: Dict,
    review2: Dict,
    idx1: int,
    idx2: int,
    retries: int = 5
) -> Dict:
    """
    Compare two reviews and detect disagreements

    Args:
        review1: First review's critique points
        review2: Second review's critique points
        idx1: Index of first review
        idx2: Index of second review
        retries: Maximum retry attempts

    Returns:
        Disagreement analysis results
    """
    prompt = f"""
Compare the following two reviews and identify disagreements across different aspects.
Assess disagreement level (0.0 = perfect agreement, 1.0 = complete disagreement) and
list specific points of disagreement for each category.

Review 1:
Methodology: {list_to_string(review1.get('Methodology', []))}
Experiments: {list_to_string(review1.get('Experiments', []))}
Clarity: {list_to_string(review1.get('Clarity', []))}
Significance: {list_to_string(review1.get('Significance', []))}
Novelty: {list_to_string(review1.get('Novelty', []))}

Review 2:
Methodology: {list_to_string(review2.get('Methodology', []))}
Experiments: {list_to_string(review2.get('Experiments', []))}
Clarity: {list_to_string(review2.get('Clarity', []))}
Significance: {list_to_string(review2.get('Significance', []))}
Novelty: {list_to_string(review2.get('Novelty', []))}

Respond with ONLY valid JSON in this exact format:
{{
    "disagreement_score": 0.5,
    "disagreement_details": {{
        "Methodology": ["specific disagreement point 1"],
        "Experiments": ["specific disagreement point 1"],
        "Clarity": [],
        "Significance": ["specific disagreement point 1"],
        "Novelty": []
    }}
}}
"""

    model = genai.GenerativeModel(
        model_name="gemini-2.0-flash",
        generation_config={
            "response_mime_type": "application/json",
        }
    )

    for attempt in range(retries):
        try:
            response = await asyncio.to_thread(
                model.generate_content,
                prompt
            )

            if not response.text.strip():
                raise ValueError("Empty response from Gemini")

            result = json.loads(response.text)

            # Validate structure
            disagreement = DisagreementResult(
                review_pair=[idx1, idx2],
                disagreement_score=result["disagreement_score"],
                disagreement_details=result["disagreement_details"]
            )

            return disagreement.model_dump()

        except Exception as e:
            wait_time = 2 ** attempt
            print(f"Disagreement detection attempt {attempt + 1} failed: {e}")

            if attempt < retries - 1:
                await asyncio.sleep(wait_time)
            else:
                return {
                    "review_pair": [idx1, idx2],
                    "disagreement_score": 0.0,
                    "disagreement_details": {
                        "Methodology": [],
                        "Experiments": [],
                        "Clarity": [],
                        "Significance": [],
                        "Novelty": []
                    },
                    "error": str(e)
                }


async def detect_disagreements(critique_points: List[Dict]) -> List[Dict]:
    """
    Detect disagreements across all review pairs

    Args:
        critique_points: List of critique point dictionaries

    Returns:
        List of disagreement analyses
    """
    if len(critique_points) < 2:
        return []

    # Generate all review pairs
    review_pairs = list(combinations(range(len(critique_points)), 2))

    if not review_pairs:
        return []

    # Process pairs concurrently with rate limiting
    tasks = []
    for idx1, idx2 in review_pairs:
        # create_task starts the coroutine right away so the sleep below
        # actually spaces out the API calls
        tasks.append(
            asyncio.create_task(
                compare_review_pair(
                    critique_points[idx1],
                    critique_points[idx2],
                    idx1,
                    idx2
                )
            )
        )
        # Small delay between API calls
        await asyncio.sleep(0.3)

    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Filter results
    disagreements = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            print(f"Review pair {review_pairs[i]} failed: {result}")
        else:
            disagreements.append(result)

    return disagreements
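`detect_disagreements` compares every unordered pair of reviews via `itertools.combinations`, so the number of LLM comparisons grows quadratically: N reviews yield N * (N - 1) / 2 pairs. A quick illustration for four reviews:

```python
from itertools import combinations

# The same pair generation used in detect_disagreements: all unordered
# index pairs for 4 reviews, i.e. 4 * 3 / 2 = 6 comparisons.
review_pairs = list(combinations(range(4), 2))
print(review_pairs)
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

This is why the per-pair rate-limiting delay matters: a paper with 6 reviews already triggers 15 comparison calls.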
pipeline/disagreement_resolution.py
ADDED
@@ -0,0 +1,242 @@
import json
import os
from typing import List, Dict
from openai import OpenAI
from pydantic import BaseModel
import asyncio

from dotenv import load_dotenv
load_dotenv()

# Initialize OpenRouter client
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.getenv("OPENROUTER_API_KEY"),
)


class ResolutionDetails(BaseModel):
    accepted_critique_points: Dict[str, List[str]]
    rejected_critique_points: Dict[str, List[str]]
    final_resolution_summary: str


class DisagreementResolutionResult(BaseModel):
    review_pair: List[int]
    resolution_details: ResolutionDetails


def construct_resolution_prompt(
    paper_title: str,
    paper_abstract: str,
    disagreement: Dict,
    combined_critiques: Dict,
    sota_results: str,
    retrieved_evidence: Dict
) -> tuple:
    """
    Construct prompt for disagreement resolution

    Args:
        paper_title: Title of the paper
        paper_abstract: Abstract of the paper
        disagreement: Disagreement analysis results
        combined_critiques: Combined critique points
        sota_results: State-of-the-art findings
        retrieved_evidence: Retrieved evidence per category

    Returns:
        Tuple of (system_prompt, user_prompt)
    """
    system_prompt = """
You are an AI specialized in resolving academic peer review disagreements.
Your task is to analyze critiques, verify evidence, and provide a structured resolution.

Respond in the following JSON format:
{
    "accepted_critique_points": {"category": ["critique_1", "critique_2"]},
    "rejected_critique_points": {"category": ["critique_3"]},
    "final_resolution_summary": "After analyzing critiques and evidence, we conclude that..."
}
"""

    disagreement_details = disagreement.get('disagreement_details', {})
    disagreement_score = disagreement.get('disagreement_score', 0.0)

    user_prompt = f"""
### **Paper Details**
**Title:** {paper_title}
**Abstract:** {paper_abstract}

### **Reviewer Disagreement (Score: {disagreement_score})**
- **Methodology:** {', '.join(disagreement_details.get('Methodology', ['N/A']))}
- **Experiments:** {', '.join(disagreement_details.get('Experiments', ['N/A']))}
- **Clarity:** {', '.join(disagreement_details.get('Clarity', ['N/A']))}
- **Significance:** {', '.join(disagreement_details.get('Significance', ['N/A']))}
- **Novelty:** {', '.join(disagreement_details.get('Novelty', ['N/A']))}

### **Supporting Information**
**Combined Critique Points from Reviews:**
{json.dumps(combined_critiques, indent=2)}

**State-of-the-Art (SoTA) Findings:**
{sota_results[:2000]}

**Retrieved Evidence:**
{json.dumps(retrieved_evidence, indent=2)[:2000]}

### **Resolution Task**
1. Validate critique points and categorize them into accepted or rejected.
2. Compare with SoTA research and retrieved evidence.
3. Provide a final resolution summary explaining whether the disagreement is justified.

Respond with ONLY valid JSON.
"""

    return system_prompt, user_prompt


async def resolve_single_disagreement(
    paper_title: str,
    paper_abstract: str,
    disagreement: Dict,
    combined_critiques: Dict,
    sota_results: str,
    retrieved_evidence: Dict,
    retries: int = 5
) -> Dict:
    """
    Resolve a single disagreement using DeepSeek-R1

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        disagreement: Disagreement analysis
        combined_critiques: Combined critique points
        sota_results: SoTA findings
        retrieved_evidence: Evidence per category
        retries: Maximum retry attempts

    Returns:
        Resolution results
    """
    system_prompt, user_prompt = construct_resolution_prompt(
        paper_title,
        paper_abstract,
        disagreement,
        combined_critiques,
        sota_results,
        retrieved_evidence
    )

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

    for attempt in range(retries):
        try:
            response = await asyncio.to_thread(
                client.chat.completions.create,
                model="deepseek/deepseek-r1",
                messages=messages,
                response_format={"type": "json_object"},
            )

            if not response.choices or not response.choices[0].message.content.strip():
                raise ValueError("Empty response from DeepSeek-R1")

            # Parse response (remove potential code-fence prefix)
            content = response.choices[0].message.content.strip()
            if content.startswith("```json"):
                content = content[7:-3].strip()
            elif content.startswith("```"):
                content = content[3:-3].strip()

            llm_output = json.loads(content)

            # Validate required keys
            required_keys = {
                "accepted_critique_points",
                "rejected_critique_points",
                "final_resolution_summary"
            }

            if not required_keys.issubset(llm_output.keys()):
                raise ValueError(f"Missing keys. Present: {llm_output.keys()}")

            # Validate structure
            resolution = DisagreementResolutionResult(
                review_pair=disagreement.get('review_pair', [0, 1]),
                resolution_details=ResolutionDetails(**llm_output)
            )

            return resolution.model_dump()

        except Exception as e:
            wait_time = 2 ** attempt
            print(f"Resolution attempt {attempt + 1} failed: {e}")

            if attempt < retries - 1:
                await asyncio.sleep(wait_time)
            else:
                return {
                    "review_pair": disagreement.get('review_pair', [0, 1]),
                    "resolution_details": {
                        "accepted_critique_points": {},
                        "rejected_critique_points": {},
                        "final_resolution_summary": f"Error: {str(e)}"
                    },
                    "error": str(e)
                }


async def resolve_disagreements(
    paper_title: str,
    paper_abstract: str,
    disagreements: List[Dict],
    critique_points: List[Dict],
    search_results: Dict
) -> List[Dict]:
    """
    Resolve all disagreements

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        disagreements: List of disagreement analyses
        critique_points: List of critique points
        search_results: Search and retrieval results

    Returns:
        List of resolution results
    """
    if not disagreements:
        return []

    combined_critiques = search_results.get('Combined_Critiques', {})
    sota_results = search_results.get('SoTA_Results', '')
    retrieved_evidence = search_results.get('Retrieved_Evidence', {})

    # Process disagreements with rate limiting
    tasks = []
    for disagreement in disagreements:
        # create_task starts the coroutine right away so the sleep below
        # actually spaces out the API calls
        tasks.append(
            asyncio.create_task(
                resolve_single_disagreement(
                    paper_title,
                    paper_abstract,
                    disagreement,
                    combined_critiques,
                    sota_results,
                    retrieved_evidence
                )
            )
        )
        # Delay between API calls
        await asyncio.sleep(1)

    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Filter results
    resolutions = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            print(f"Resolution {i} failed: {result}")
        else:
            resolutions.append(result)

    return resolutions
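Before `json.loads`, `resolve_single_disagreement` strips a markdown code fence that reasoning models sometimes wrap around their JSON. The slicing relies on the fence being exactly ` ```json ` (7 characters) or ` ``` ` (3 characters) at both ends. A standalone sketch of that same stripping:

```python
def strip_code_fence(content: str) -> str:
    """Mirror of the fence-stripping logic in resolve_single_disagreement:
    drop a leading ```json / ``` and the matching trailing ```."""
    content = content.strip()
    if content.startswith("```json"):
        content = content[7:-3].strip()
    elif content.startswith("```"):
        content = content[3:-3].strip()
    return content

print(strip_code_fence('```json\n{"a": 1}\n```'))  # {"a": 1}
print(strip_code_fence('{"a": 1}'))                # unchanged: {"a": 1}
```

If parsing still fails after stripping (e.g. the model added prose before the fence), the retry loop simply tries again with exponential backoff.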
pipeline/meta_review.py
ADDED
@@ -0,0 +1,170 @@
import json
import os
from typing import List, Dict
from openai import OpenAI
from pydantic import BaseModel
import asyncio

from dotenv import load_dotenv
load_dotenv()

# Initialize OpenRouter client
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.getenv("OPENROUTER_API_KEY"),
)


class MetaReviewResult(BaseModel):
    meta_review: str


def construct_meta_review_prompt(
    paper_title: str,
    paper_abstract: str,
    resolutions: List[Dict],
    search_results: Dict
) -> tuple:
    """
    Construct prompt for meta-review generation

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        resolutions: List of disagreement resolutions
        search_results: Search and retrieval results

    Returns:
        Tuple of (system_prompt, user_prompt)
    """
    # Aggregate all resolutions
    all_accepted = {}
    all_rejected = {}
    resolution_summaries = []

    for resolution in resolutions:
        details = resolution.get('resolution_details', {})

        # Merge accepted points
        accepted = details.get('accepted_critique_points', {})
        for category, points in accepted.items():
            if category not in all_accepted:
                all_accepted[category] = []
            all_accepted[category].extend(points)

        # Merge rejected points
        rejected = details.get('rejected_critique_points', {})
        for category, points in rejected.items():
            if category not in all_rejected:
                all_rejected[category] = []
            all_rejected[category].extend(points)

        # Collect summaries
        summary = details.get('final_resolution_summary', '')
        if summary:
            resolution_summaries.append(summary)

    system_prompt = """
You are an expert meta-reviewer. Your task is to generate a structured, comprehensive
meta-review based on reviewer critiques, disagreements, and resolutions.
Your review should be clear, concise, well-structured, and provide actionable feedback.

Respond with ONLY the meta-review text (no JSON, no preamble).
"""

    user_prompt = f"""
### **Paper Details**
**Title:** {paper_title}
**Abstract:** {paper_abstract}

### **Disagreement Resolution Summaries**
{chr(10).join(f"- {summary}" for summary in resolution_summaries)}

### **Accepted Critique Points (Valid Feedback)**
{json.dumps(all_accepted, indent=2)}

### **Rejected Critique Points (Unjustified Criticism)**
{json.dumps(all_rejected, indent=2)}

### **State-of-the-Art (SoTA) Findings**
{search_results.get('SoTA_Results', '')[:2000]}

### **Retrieved Evidence for Validation**
{json.dumps(search_results.get('Retrieved_Evidence', {}), indent=2)[:2000]}

### **Meta-Review Task**
Generate a comprehensive meta-review that:
1. Summarizes the paper's main contribution and approach
2. Discusses the strengths of the paper (based on accepted critiques and evidence)
3. Discusses the weaknesses and concerns (based on valid accepted critiques)
4. Addresses key disagreements among reviewers and how they were resolved
5. Compares the paper's claims with state-of-the-art research
6. Provides a final verdict on the paper's quality, novelty, significance, and clarity
7. Offers constructive recommendations for improvement

Format the meta-review professionally with clear sections.
"""

    return system_prompt, user_prompt


async def generate_meta_review(
    paper_title: str,
    paper_abstract: str,
    resolutions: List[Dict],
    search_results: Dict,
    retries: int = 5
) -> str:
    """
    Generate a meta-review using DeepSeek-R1

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        resolutions: List of disagreement resolutions
        search_results: Search and retrieval results
        retries: Maximum retry attempts

    Returns:
        Generated meta-review text
    """
    if not resolutions:
        return "Unable to generate meta-review: No disagreement resolutions available."

    system_prompt, user_prompt = construct_meta_review_prompt(
        paper_title,
        paper_abstract,
        resolutions,
        search_results
    )

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

    for attempt in range(retries):
        try:
            response = await asyncio.to_thread(
                client.chat.completions.create,
                model="deepseek/deepseek-r1",
                messages=messages,
            )

            if not response.choices or not response.choices[0].message.content.strip():
                raise ValueError("Empty response from DeepSeek-R1")

            meta_review_text = response.choices[0].message.content.strip()

            # Remove any code-fence formatting if present
            if meta_review_text.startswith("```"):
                lines = meta_review_text.split("\n")
                meta_review_text = "\n".join(lines[1:-1])

            return meta_review_text

        except Exception as e:
            wait_time = 2 ** attempt
            print(f"Meta-review generation attempt {attempt + 1} failed: {e}")

            if attempt < retries - 1:
                await asyncio.sleep(wait_time)
            else:
                return f"Error generating meta-review: {str(e)}"
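`construct_meta_review_prompt` folds the per-pair resolutions into two global maps by extending the per-category lists. A standalone sketch of that merge (the `merge_points` helper and the sample critique strings are illustrative, not part of `meta_review.py`):

```python
def merge_points(resolutions, key):
    """Merge per-category critique lists across resolutions, as the
    aggregation loop in construct_meta_review_prompt does."""
    merged = {}
    for resolution in resolutions:
        details = resolution.get('resolution_details', {})
        for category, points in details.get(key, {}).items():
            merged.setdefault(category, []).extend(points)
    return merged

resolutions = [
    {"resolution_details": {"accepted_critique_points": {"Methodology": ["weak baselines"]}}},
    {"resolution_details": {"accepted_critique_points": {"Methodology": ["no ablation study"],
                                                         "Clarity": ["dense notation"]}}},
]
print(merge_points(resolutions, "accepted_critique_points"))
# {'Methodology': ['weak baselines', 'no ablation study'], 'Clarity': ['dense notation']}
```

Note the merge deliberately keeps duplicates: if two review pairs accept the same critique, it appears twice, which signals its weight to the meta-reviewer model.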
pipeline/search_retrieval.py
ADDED
@@ -0,0 +1,224 @@
import os
from typing import Dict, List
import asyncio
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.utilities import ArxivAPIWrapper, SerpAPIWrapper
from langchain_community.tools.semanticscholar.tool import SemanticScholarQueryRun
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain.agents import AgentType, initialize_agent, AgentExecutor
from langchain.tools import Tool

from dotenv import load_dotenv
load_dotenv()

# Initialize LLM
llm = ChatGoogleGenerativeAI(
    model=os.getenv("GEMINI_MODEL"),
    google_api_key=os.getenv("GEMINI_API_KEY"),
    max_retries=2,
)

# Initialize search tools
semantic_scholar = SemanticScholarQueryRun()
google_scholar = SerpAPIWrapper(params={"engine": "google_scholar"})
arxiv_search = ArxivAPIWrapper()
tavily_search = TavilySearchResults(max_results=5)

# Define tools
tools = [
    Tool(
        name="TavilySearch",
        func=tavily_search.run,
        description="Retrieves the latest State-of-the-Art (SoTA) research and current academic information"
    ),
    Tool(
        name="SemanticScholar",
        func=semantic_scholar.run,
        description="Find academic papers from Semantic Scholar database"
    ),
    Tool(
        name="GoogleScholar",
        func=google_scholar.run,
        description="Search for scholarly articles and citations"
    ),
    Tool(
        name="ArxivSearch",
        func=arxiv_search.run,
        description="Find research papers from ArXiv preprint repository"
    ),
]

# Initialize agent
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=False,
    handle_parsing_errors=True,
    max_iterations=10
)

def combine_critiques(critique_points: List[Dict]) -> Dict[str, str]:
    """
    Combine critique points from multiple reviews into categories

    Args:
        critique_points: List of critique dictionaries

    Returns:
        Dictionary with combined critiques per category
    """
    categories = ["Methodology", "Clarity", "Experiments", "Significance", "Novelty"]
    combined = {cat: [] for cat in categories}

    for review in critique_points:
        for category in categories:
            if category in review and review[category]:
                combined[category].extend(review[category])

    # Join into strings
    for category in categories:
        combined[category] = " | ".join(combined[category]) if combined[category] else "No critiques"

    return combined

async def search_sota(paper_title: str, paper_abstract: str, retries: int = 3) -> str:
    """
    Search for state-of-the-art research related to the paper

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        retries: Maximum retry attempts

    Returns:
        Summary of SoTA findings
    """
    query = (
        f"Find the latest state-of-the-art research related to: '{paper_title}'. "
        f"Abstract: {paper_abstract[:500]}. "
        f"Focus on recent advances, similar methodologies, and competing approaches."
    )

    for attempt in range(retries):
        try:
            result = await asyncio.to_thread(agent.run, query)

            if not result or len(result.strip()) < 50:
                raise ValueError("Empty or insufficient response")

            return result

        except Exception as e:
            wait_time = 2 ** attempt
            print(f"SoTA search attempt {attempt + 1} failed: {e}")

            if attempt < retries - 1:
                await asyncio.sleep(wait_time)
            else:
                return f"Error retrieving SoTA research: {str(e)}"

async def retrieve_evidence_for_category(
    category: str,
    critiques: str,
    retries: int = 3
) -> str:
    """
    Retrieve evidence for critiques in a specific category

    Args:
        category: Category name (e.g., "Methodology")
        critiques: Combined critique text
        retries: Maximum retry attempts

    Returns:
        Evidence findings
    """
    if critiques == "No critiques" or not critiques.strip():
        return f"No critiques to validate for {category}"

    query = (
        f"Find research papers that support or contradict these critiques "
        f"related to {category}: {critiques[:500]}"
    )

    for attempt in range(retries):
        try:
            result = await asyncio.to_thread(agent.run, query)

            if not result:
                raise ValueError("Empty response")

            return result

        except Exception as e:
            wait_time = 2 ** attempt
            print(f"Evidence retrieval for {category} attempt {attempt + 1} failed: {e}")

            if attempt < retries - 1:
                await asyncio.sleep(wait_time)
            else:
                return f"Error retrieving evidence for {category}: {str(e)}"

async def retrieve_evidence(combined_critiques: Dict[str, str]) -> Dict[str, str]:
    """
    Retrieve evidence for all critique categories

    Args:
        combined_critiques: Dictionary of combined critiques per category

    Returns:
        Dictionary of evidence per category
    """
    evidence_results = {}

    # Process categories with rate limiting
    for category, critiques in combined_critiques.items():
        evidence_results[category] = await retrieve_evidence_for_category(
            category,
            critiques
        )
        # Delay between requests
        await asyncio.sleep(1)

    return evidence_results

async def search_and_retrieve(
    paper_title: str,
    paper_abstract: str,
    critique_points: List[Dict]
) -> Dict:
    """
    Complete search and retrieval pipeline

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        critique_points: List of critique point dictionaries

    Returns:
        Dictionary with SoTA results, combined critiques, and evidence
    """
    try:
        # Step 1: Search for SoTA research
        sota_results = await search_sota(paper_title, paper_abstract)

        # Step 2: Combine critique points
        combined_critiques = combine_critiques(critique_points)

        # Step 3: Retrieve evidence for critiques
        evidence = await retrieve_evidence(combined_critiques)

        return {
            "SoTA_Results": sota_results,
            "Combined_Critiques": combined_critiques,
            "Retrieved_Evidence": evidence
        }

    except Exception as e:
        return {
            "error": str(e),
            "SoTA_Results": "",
            "Combined_Critiques": {},
            "Retrieved_Evidence": {}
        }
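`combine_critiques` in pipeline/search_retrieval.py is a pure function and easy to exercise in isolation. The sketch below copies its merging logic and feeds it two hypothetical critique dicts (the review contents are made up for illustration):

```python
from typing import Dict, List

def combine_critiques(critique_points: List[Dict]) -> Dict[str, str]:
    # Same merging logic as pipeline/search_retrieval.py: gather per-category
    # critique lists across all reviews, then join each list with " | ".
    categories = ["Methodology", "Clarity", "Experiments", "Significance", "Novelty"]
    combined = {cat: [] for cat in categories}
    for review in critique_points:
        for category in categories:
            if category in review and review[category]:
                combined[category].extend(review[category])
    return {
        cat: " | ".join(vals) if vals else "No critiques"
        for cat, vals in combined.items()
    }

reviews = [
    {"Methodology": ["Missing baselines"], "Clarity": []},
    {"Methodology": ["No ablations"], "Novelty": ["Incremental over prior work"]},
]
out = combine_critiques(reviews)
print(out["Methodology"])   # Missing baselines | No ablations
print(out["Experiments"])   # No critiques
```

Categories absent from every review collapse to the sentinel string "No critiques", which `retrieve_evidence_for_category` later uses to skip the search entirely.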
requirements.txt
ADDED
@@ -0,0 +1,38 @@
# Web Framework
gradio==5.9.1

# LLM Libraries
openai==1.59.5
google-generativeai==0.8.3

# LangChain and Tools
langchain==0.3.13
langchain-community==0.3.13
langchain-google-genai==2.0.8
langgraph==0.2.59
langgraph-checkpoint-sqlite==2.0.5

# Search APIs
tavily-python==0.5.0
semanticscholar==0.8.4
arxiv==2.1.3
google-search-results==2.4.2

# Data Processing
pandas==2.2.3
pydantic==2.10.4
python-dotenv==1.0.1

# API & Async
fastapi==0.115.6
uvicorn==0.34.0
aiohttp==3.11.11
httpx==0.28.1

# Utilities
tqdm==4.67.1
mlflow==2.19.0

# Rate Limiting & Queue
ratelimit==2.2.1
asyncio-throttle==1.0.1
utils/__init__.py
ADDED
@@ -0,0 +1,29 @@
"""
Utility modules for API functionality
"""

from .rate_limiter import RateLimiter
from .queue_manager import QueueManager
from .validators import (
    validate_paper_input,
    validate_critique_input,
    validate_disagreement_input,
    validate_search_input,
    PaperInput,
    CritiqueInput,
    DisagreementInput,
    SearchInput,
)

__all__ = [
    'RateLimiter',
    'QueueManager',
    'validate_paper_input',
    'validate_critique_input',
    'validate_disagreement_input',
    'validate_search_input',
    'PaperInput',
    'CritiqueInput',
    'DisagreementInput',
    'SearchInput',
]
utils/queue_manager.py
ADDED
@@ -0,0 +1,76 @@
import asyncio
from typing import Coroutine, Any
from asyncio import Semaphore, Queue
from datetime import datetime

class QueueManager:
    """
    Async queue manager for handling concurrent pipeline executions
    """

    def __init__(self, max_concurrent: int = 3):
        """
        Initialize queue manager

        Args:
            max_concurrent: Maximum number of concurrent tasks
        """
        self.max_concurrent = max_concurrent
        self.semaphore = Semaphore(max_concurrent)
        self.queue: Queue = Queue()
        self.active_tasks = 0
        self.total_processed = 0

    async def add_task(self, coro: Coroutine) -> Any:
        """
        Add a task to the queue and execute it

        Args:
            coro: Coroutine to execute

        Returns:
            Result from the coroutine
        """
        async with self.semaphore:
            self.active_tasks += 1
            try:
                result = await coro
                self.total_processed += 1
                return result
            finally:
                self.active_tasks -= 1

    def get_queue_status(self) -> dict:
        """
        Get current queue status

        Returns:
            Dictionary with queue statistics
        """
        return {
            "active_tasks": self.active_tasks,
            "max_concurrent": self.max_concurrent,
            "total_processed": self.total_processed,
            "available_slots": self.max_concurrent - self.active_tasks,
            "timestamp": datetime.now().isoformat()
        }

    async def wait_for_slot(self, timeout: float = 60.0) -> bool:
        """
        Wait for an available slot in the queue

        Args:
            timeout: Maximum time to wait in seconds

        Returns:
            True if slot became available, False if timeout
        """
        start_time = asyncio.get_event_loop().time()

        while self.active_tasks >= self.max_concurrent:
            if asyncio.get_event_loop().time() - start_time > timeout:
                return False

            await asyncio.sleep(0.5)

        return True
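`QueueManager.add_task` is, at its core, semaphore-bounded awaiting. The same idea in a dependency-free sketch that also records peak concurrency so the bound is observable (the `job` coroutine and its 0.01s sleep are illustrative stand-ins for a pipeline run):

```python
import asyncio

async def bounded_gather(coros, max_concurrent: int = 3):
    # Semaphore-bounded concurrency, the same mechanism QueueManager.add_task
    # uses; peak["max"] records the highest number of tasks running at once.
    sem = asyncio.Semaphore(max_concurrent)
    peak = {"now": 0, "max": 0}

    async def run(coro):
        async with sem:
            peak["now"] += 1
            peak["max"] = max(peak["max"], peak["now"])
            try:
                return await coro
            finally:
                peak["now"] -= 1

    results = await asyncio.gather(*(run(c) for c in coros))
    return results, peak["max"]

async def job(i):
    # Illustrative workload standing in for one pipeline execution.
    await asyncio.sleep(0.01)
    return i * 2

results, peak = asyncio.run(
    bounded_gather([job(i) for i in range(6)], max_concurrent=2)
)
print(results, peak)  # [0, 2, 4, 6, 8, 10] 2
```

Note that `wait_for_slot` polls `active_tasks` every 0.5s rather than waiting on the semaphore itself; it is a coarse readiness check, while the semaphore in `add_task` is what actually enforces the limit.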
utils/rate_limiter.py
ADDED
@@ -0,0 +1,84 @@
import time
from collections import defaultdict, deque
from threading import Lock
from typing import Dict

class RateLimiter:
    """
    Thread-safe rate limiter for API requests
    """

    def __init__(self, max_requests_per_minute: int = 10):
        """
        Initialize rate limiter

        Args:
            max_requests_per_minute: Maximum requests allowed per minute
        """
        self.max_requests = max_requests_per_minute
        self.window_seconds = 60
        self.requests: Dict[str, deque] = defaultdict(deque)
        self.lock = Lock()

    def _clean_old_requests(self, client_id: str):
        """Remove requests older than the time window"""
        current_time = time.time()
        cutoff_time = current_time - self.window_seconds

        while self.requests[client_id] and self.requests[client_id][0] < cutoff_time:
            self.requests[client_id].popleft()

    def allow_request(self, client_id: str = "default") -> bool:
        """
        Check if a request is allowed

        Args:
            client_id: Identifier for the client (e.g., IP address)

        Returns:
            True if request is allowed, False otherwise
        """
        with self.lock:
            self._clean_old_requests(client_id)

            if len(self.requests[client_id]) >= self.max_requests:
                return False

            self.requests[client_id].append(time.time())
            return True

    def get_remaining_requests(self, client_id: str = "default") -> int:
        """
        Get number of remaining requests in current window

        Args:
            client_id: Identifier for the client

        Returns:
            Number of remaining requests
        """
        with self.lock:
            self._clean_old_requests(client_id)
            return max(0, self.max_requests - len(self.requests[client_id]))

    def get_reset_time(self, client_id: str = "default") -> float:
        """
        Get time until rate limit resets

        Args:
            client_id: Identifier for the client

        Returns:
            Seconds until oldest request expires
        """
        with self.lock:
            self._clean_old_requests(client_id)

            if not self.requests[client_id]:
                return 0

            oldest_request = self.requests[client_id][0]
            current_time = time.time()
            reset_time = oldest_request + self.window_seconds

            return max(0, reset_time - current_time)
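The limiter above is a sliding-window counter: timestamps older than the window are evicted from a deque, and a request is allowed only while the deque holds fewer than `max_requests` entries. A single-client sketch of the same logic, small enough to see the window behavior directly:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    # Minimal single-client version of utils/rate_limiter.py's RateLimiter.
    def __init__(self, max_requests: int, window_seconds: float = 60):
        self.max_requests = max_requests
        self.window = window_seconds
        self.stamps = deque()

    def allow(self) -> bool:
        # Evict timestamps older than the window, then admit if under quota.
        cutoff = time.time() - self.window
        while self.stamps and self.stamps[0] < cutoff:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_requests:
            return False
        self.stamps.append(time.time())
        return True

limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
outcomes = [limiter.allow() for _ in range(5)]
print(outcomes)  # [True, True, True, False, False]
```

Unlike a fixed-window counter, the sliding window never admits a burst of 2x the quota at a window boundary, at the cost of storing one timestamp per admitted request.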
utils/validators.py
ADDED
@@ -0,0 +1,196 @@
from typing import List, Dict, Tuple
from pydantic import BaseModel, Field, field_validator

class PaperInput(BaseModel):
    """Validated paper input schema"""
    paper_title: str = Field(..., min_length=5, max_length=500)
    paper_abstract: str = Field(..., min_length=50, max_length=5000)
    reviews: List[str] = Field(..., min_length=1, max_length=10)

    @field_validator('paper_title')
    @classmethod
    def validate_title(cls, v: str) -> str:
        """Validate paper title"""
        if not v or not v.strip():
            raise ValueError("Paper title cannot be empty")
        return v.strip()

    @field_validator('paper_abstract')
    @classmethod
    def validate_abstract(cls, v: str) -> str:
        """Validate paper abstract"""
        if not v or not v.strip():
            raise ValueError("Paper abstract cannot be empty")
        if len(v.strip()) < 50:
            raise ValueError("Paper abstract must be at least 50 characters")
        return v.strip()

    @field_validator('reviews')
    @classmethod
    def validate_reviews(cls, v: List[str]) -> List[str]:
        """Validate reviews"""
        if not v:
            raise ValueError("At least one review is required")

        valid_reviews = []
        for i, review in enumerate(v):
            if not isinstance(review, str):
                raise ValueError(f"Review {i} must be a string")

            cleaned = review.strip()
            if len(cleaned) < 50:
                raise ValueError(f"Review {i} must be at least 50 characters")

            valid_reviews.append(cleaned)

        return valid_reviews

class CritiqueInput(BaseModel):
    """Validated critique input schema"""
    reviews: List[str] = Field(..., min_length=1, max_length=10)

    @field_validator('reviews')
    @classmethod
    def validate_reviews(cls, v: List[str]) -> List[str]:
        """Validate reviews"""
        if not v:
            raise ValueError("At least one review is required")

        valid_reviews = []
        for review in v:
            if isinstance(review, str) and len(review.strip()) >= 50:
                valid_reviews.append(review.strip())

        if not valid_reviews:
            raise ValueError("No valid reviews found (must be at least 50 characters)")

        return valid_reviews

class DisagreementInput(BaseModel):
    """Validated disagreement detection input schema"""
    critiques: List[Dict] = Field(..., min_length=2)

    @field_validator('critiques')
    @classmethod
    def validate_critiques(cls, v: List[Dict]) -> List[Dict]:
        """Validate critique structure"""
        if len(v) < 2:
            raise ValueError("At least 2 critiques required for disagreement detection")

        required_keys = {'Methodology', 'Experiments', 'Clarity', 'Significance', 'Novelty'}

        for i, critique in enumerate(v):
            if not isinstance(critique, dict):
                raise ValueError(f"Critique {i} must be a dictionary")

            # Check if critique has the expected structure
            if not any(key in critique for key in required_keys):
                raise ValueError(f"Critique {i} missing required categories")

        return v

class SearchInput(BaseModel):
    """Validated search input schema"""
    paper_title: str = Field(..., min_length=5, max_length=500)
    paper_abstract: str = Field(..., min_length=50, max_length=5000)
    critiques: List[Dict] = Field(..., min_length=1)

    @field_validator('paper_title')
    @classmethod
    def validate_title(cls, v: str) -> str:
        """Validate paper title"""
        if not v or not v.strip():
            raise ValueError("Paper title cannot be empty")
        return v.strip()

    @field_validator('paper_abstract')
    @classmethod
    def validate_abstract(cls, v: str) -> str:
        """Validate paper abstract"""
        if not v or not v.strip():
            raise ValueError("Paper abstract cannot be empty")
        return v.strip()

def validate_paper_input(
    paper_title: str,
    paper_abstract: str,
    reviews: List[str]
) -> Tuple[bool, str]:
    """
    Validate paper input data

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        reviews: List of review texts

    Returns:
        Tuple of (is_valid, error_message)
    """
    try:
        PaperInput(
            paper_title=paper_title,
            paper_abstract=paper_abstract,
            reviews=reviews
        )
        return True, ""
    except Exception as e:
        return False, str(e)

def validate_critique_input(reviews: List[str]) -> Tuple[bool, str]:
    """
    Validate critique extraction input

    Args:
        reviews: List of review texts

    Returns:
        Tuple of (is_valid, error_message)
    """
    try:
        CritiqueInput(reviews=reviews)
        return True, ""
    except Exception as e:
        return False, str(e)

def validate_disagreement_input(critiques: List[Dict]) -> Tuple[bool, str]:
    """
    Validate disagreement detection input

    Args:
        critiques: List of critique dictionaries

    Returns:
        Tuple of (is_valid, error_message)
    """
    try:
        DisagreementInput(critiques=critiques)
        return True, ""
    except Exception as e:
        return False, str(e)

def validate_search_input(
    paper_title: str,
    paper_abstract: str,
    critiques: List[Dict]
) -> Tuple[bool, str]:
    """
    Validate search input

    Args:
        paper_title: Paper title
        paper_abstract: Paper abstract
        critiques: List of critique dictionaries

    Returns:
        Tuple of (is_valid, error_message)
    """
    try:
        SearchInput(
            paper_title=paper_title,
            paper_abstract=paper_abstract,
            critiques=critiques
        )
        return True, ""
    except Exception as e:
        return False, str(e)
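Each wrapper above converts a pydantic `ValidationError` into the `(is_valid, error_message)` tuple the app layer consumes. A dependency-free sketch of that same contract (the function name, messages, and length rules here mirror the module above but are written in plain Python for illustration):

```python
from typing import List, Tuple

def check_paper_input(paper_title: str, paper_abstract: str,
                      reviews: List[str]) -> Tuple[bool, str]:
    # Plain-Python sketch of the (is_valid, error_message) contract exposed
    # by the pydantic-backed validate_paper_input; same length rules assumed.
    title = paper_title.strip()
    if not (5 <= len(title) <= 500):
        return False, "Paper title must be 5-500 characters"
    if len(paper_abstract.strip()) < 50:
        return False, "Paper abstract must be at least 50 characters"
    if not reviews or any(len(r.strip()) < 50 for r in reviews):
        return False, "Each review must be at least 50 characters"
    return True, ""

print(check_paper_input("A Valid Paper Title", "x" * 60, ["y" * 60]))
# (True, '')
print(check_paper_input("A Valid Paper Title", "too short", ["y" * 60])[0])
# False
```

Returning a tuple instead of raising keeps the Gradio handlers free of try/except blocks: they can branch on `is_valid` and surface `error_message` directly in the UI.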