VibecoderMcSwaggins committed 10 days ago

Commit

8d97867

1 Parent(s): dacd086

docs: Remove outdated architecture and implementation documents

- Delete obsolete files related to the dual-mode architecture plan, including situation analysis, architecture specification, implementation phases, immediate actions, follow-up review request, and senior agent review prompt.
- These documents are no longer relevant following the recent architectural decisions and updates to the project structure.

Files changed (6)

docs/decisions/architecture-2025-11/00_SITUATION_AND_PLAN.md +0 -189
docs/decisions/architecture-2025-11/01_ARCHITECTURE_SPEC.md +0 -289
docs/decisions/architecture-2025-11/02_IMPLEMENTATION_PHASES.md +0 -112
docs/decisions/architecture-2025-11/03_IMMEDIATE_ACTIONS.md +0 -112
docs/decisions/architecture-2025-11/04_FOLLOWUP_REVIEW_REQUEST.md +0 -158
docs/decisions/architecture-2025-11/REVIEW_PROMPT_FOR_SENIOR_AGENT.md +0 -113

docs/decisions/architecture-2025-11/00_SITUATION_AND_PLAN.md DELETED

@@ -1,189 +0,0 @@
-# Situation Analysis: Pydantic-AI + Microsoft Agent Framework Integration
-**Date:** November 27, 2025
-**Status:** ACTIVE DECISION REQUIRED
-**Risk Level:** HIGH - DO NOT MERGE PR #41 UNTIL RESOLVED
----
-## 1. The Problem
-We almost merged a refactor that would have **deleted** multi-agent orchestration capability from the codebase, mistakenly believing pydantic-ai and Microsoft Agent Framework were mutually exclusive.
-**They are not.** They are complementary:
-- **pydantic-ai** (Library): Ensures LLM outputs match Pydantic schemas
-- **Microsoft Agent Framework** (Framework): Orchestrates multi-agent workflows
----
-## 2. Current Branch State
-| Branch | Location | Has Agent Framework? | Has Pydantic-AI Improvements? | Status |
-|--------|----------|---------------------|------------------------------|--------|
-| `origin/dev` | GitHub | YES | NO | **SAFE - Source of Truth** |
-| `huggingface-upstream/dev` | HF Spaces | YES | NO | **SAFE - Same as GitHub** |
-| `origin/main` | GitHub | YES | NO | **SAFE** |
-| `feat/pubmed-fulltext` | GitHub | NO (deleted) | YES | **DANGER - Has destructive refactor** |
-| `refactor/pydantic-unification` | Local | NO (deleted) | YES | **DANGER - Redundant, delete** |
-| Local `dev` | Local only | NO (deleted) | YES | **DANGER - NOT PUSHED (thankfully)** |
-### Key Files at Risk
-**On `origin/dev` (PRESERVED):**
-```text
-src/agents/
-├── analysis_agent.py      # StatisticalAnalyzer wrapper
-├── hypothesis_agent.py    # Hypothesis generation
-├── judge_agent.py         # JudgeHandler wrapper
-├── magentic_agents.py     # Multi-agent definitions
-├── report_agent.py        # Report synthesis
-├── search_agent.py        # SearchHandler wrapper
-├── state.py               # Thread-safe state management
-└── tools.py               # @ai_function decorated tools
-src/orchestrator_magentic.py  # Multi-agent orchestrator
-src/utils/llm_factory.py      # Centralized LLM client factory
-```
-**Deleted in refactor branch (would be lost if merged):**
-- All of the above
----
-## 3. Target Architecture
-```text
-┌─────────────────────────────────────────────────────────────────┐
-│  Microsoft Agent Framework (Orchestration Layer)                │
-│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
-│  │ SearchAgent  │→ │ JudgeAgent   │→ │ ReportAgent  │          │
-│  │ (BaseAgent)  │  │ (BaseAgent)  │  │ (BaseAgent)  │          │
-│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘          │
-│         │                 │                 │                  │
-│         ▼                 ▼                 ▼                  │
-│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
-│  │ pydantic-ai  │  │ pydantic-ai  │  │ pydantic-ai  │          │
-│  │ Agent()      │  │ Agent()      │  │ Agent()      │          │
-│  │ output_type= │  │ output_type= │  │ output_type= │          │
-│  │ SearchResult │  │ JudgeAssess  │  │ Report       │          │
-│  └──────────────┘  └──────────────┘  └──────────────┘          │
-└─────────────────────────────────────────────────────────────────┘
-```
-**Why this architecture:**
-1. **Agent Framework** handles: workflow coordination, state passing, middleware, observability
-2. **pydantic-ai** handles: type-safe LLM calls within each agent
----
-## 4. CRITICAL: Naming Confusion Clarification
-> **Senior Agent Review Finding:** The codebase uses "magentic" in file names (e.g., `orchestrator_magentic.py`, `magentic_agents.py`) but this is **NOT** the `magentic` PyPI package by Jack Collins. It refers to Microsoft Agent Framework (`agent-framework-core`).
-**The naming confusion:**
-- `magentic` (PyPI package): A different library for structured LLM outputs
-- "Magentic" (in our codebase): Our internal name for Microsoft Agent Framework integration
-- `agent-framework-core` (PyPI package): Microsoft's actual multi-agent orchestration framework
-**Recommended future action:** Rename `orchestrator_magentic.py` → `orchestrator_advanced.py` to eliminate confusion.
----
-## 5. What the Refactor DID Get Right
-The refactor branch (`feat/pubmed-fulltext`) has some valuable improvements:
-1. **`judges.py` unified `get_model()`** - Supports OpenAI, Anthropic, AND HuggingFace via pydantic-ai
-2. **HuggingFace free tier support** - `HuggingFaceModel` integration
-3. **Test fix** - Properly mocks `HuggingFaceModel` class
-4. **Removed broken magentic optional dependency** from pyproject.toml (this was correct - the old `magentic` package is different from Microsoft Agent Framework)
-**What it got WRONG:**
-1. Deleted `src/agents/` entirely instead of refactoring them
-2. Deleted `src/orchestrator_magentic.py` instead of fixing it
-3. Conflated "magentic" (old package) with "Microsoft Agent Framework" (current framework)
----
-## 6. Options for Path Forward
-### Option A: Abandon Refactor, Start Fresh
-- Close PR #41
-- Delete `feat/pubmed-fulltext` and `refactor/pydantic-unification` branches
-- Reset local `dev` to match `origin/dev`
-- Cherry-pick ONLY the good parts (judges.py improvements, HF support)
-- **Pros:** Clean, safe
-- **Cons:** Lose some work, need to redo carefully
-### Option B: Cherry-Pick Good Parts to origin/dev
-- Do NOT merge PR #41
-- Create new branch from `origin/dev`
-- Cherry-pick specific commits/changes that improve pydantic-ai usage
-- Keep agent framework code intact
-- **Pros:** Preserves both, surgical
-- **Cons:** Requires careful file-by-file review
-### Option C: Revert Deletions in Refactor Branch
-- On `feat/pubmed-fulltext`, restore deleted agent files from `origin/dev`
-- Keep the pydantic-ai improvements
-- Merge THAT to dev
-- **Pros:** Gets both
-- **Cons:** Complex git operations, risk of conflicts
----
-## 7. Recommended Action: Option B (Cherry-Pick)
-**Step-by-step:**
-1. **Close PR #41** (do not merge)
-2. **Delete redundant branches:**
-   - `refactor/pydantic-unification` (local)
-   - Reset local `dev` to `origin/dev`
-3. **Create new branch from origin/dev:**
-   ```bash
-   git checkout -b feat/pydantic-ai-improvements origin/dev
-   ```
-4. **Cherry-pick or manually port these improvements:**
-   - `src/agent_factory/judges.py` - the unified `get_model()` function
-   - `examples/free_tier_demo.py` - HuggingFace demo
-   - Test improvements
-5. **Do NOT delete any agent framework files**
-6. **Create PR for review**
----
-## 8. Files to Cherry-Pick (Safe Improvements)
-| File | What Changed | Safe to Port? |
-|------|-------------|---------------|
-| `src/agent_factory/judges.py` | Added `HuggingFaceModel` support in `get_model()` | YES |
-| `examples/free_tier_demo.py` | New demo for HF inference | YES |
-| `tests/unit/agent_factory/test_judges.py` | Fixed HF model mocking | YES |
-| `pyproject.toml` | Removed old `magentic` optional dep | MAYBE (review carefully) |
----
-## 9. Questions to Answer Before Proceeding
-1. **For the hackathon**: Do we need full multi-agent orchestration, or is single-agent sufficient?
-2. **For DeepBoner mainline**: Is the plan to use Microsoft Agent Framework for orchestration?
-3. **Timeline**: How much time do we have to get this right?
----
-## 10. Immediate Actions (DO NOW)
-- [ ] **DO NOT merge PR #41**
-- [ ] Close PR #41 with comment explaining the situation
-- [ ] Do not push local `dev` branch anywhere
-- [ ] Confirm HuggingFace Spaces is untouched (it is - verified)
----
-## 11. Decision Log
-| Date | Decision | Rationale |
-|------|----------|-----------|
-| 2025-11-27 | Pause refactor merge | Discovered agent framework and pydantic-ai are complementary, not exclusive |
-| TBD | ? | Awaiting decision on path forward |

docs/decisions/architecture-2025-11/01_ARCHITECTURE_SPEC.md DELETED

@@ -1,289 +0,0 @@
-# Architecture Specification: Dual-Mode Agent System
-**Date:** November 27, 2025
-**Status:** SPECIFICATION
-**Goal:** Graceful degradation from full multi-agent orchestration to simple single-agent mode
----
-## 1. Core Concept: Two Operating Modes
-```text
-┌─────────────────────────────────────────────────────────────────────┐
-│                        USER REQUEST                                 │
-│                            │                                        │
-│                            ▼                                        │
-│                   ┌─────────────────┐                               │
-│                   │  Mode Selection │                               │
-│                   │  (Auto-detect)  │                               │
-│                   └────────┬────────┘                               │
-│                            │                                        │
-│            ┌───────────────┴───────────────┐                        │
-│            │                               │                        │
-│            ▼                               ▼                        │
-│   ┌─────────────────┐             ┌─────────────────┐               │
-│   │   SIMPLE MODE   │             │  ADVANCED MODE  │               │
-│   │  (Free Tier)    │             │  (Paid Tier)    │               │
-│   │                 │             │                 │               │
-│   │  pydantic-ai    │             │  MS Agent Fwk   │               │
-│   │  single-agent   │             │  + pydantic-ai  │               │
-│   │  loop           │             │  multi-agent    │               │
-│   └─────────────────┘             └─────────────────┘               │
-│            │                               │                        │
-│            └───────────────┬───────────────┘                        │
-│                            ▼                                        │
-│                   ┌─────────────────┐                               │
-│                   │  Research Report │                              │
-│                   │  with Citations  │                              │
-│                   └─────────────────┘                               │
-└─────────────────────────────────────────────────────────────────────┘
-```
----
-## 2. Mode Comparison
-| Aspect | Simple Mode | Advanced Mode |
-|--------|-------------|---------------|
-| **Trigger** | No API key OR `LLM_PROVIDER=huggingface` | OpenAI API key present (currently OpenAI only) |
-| **Framework** | pydantic-ai only | Microsoft Agent Framework + pydantic-ai |
-| **Architecture** | Single orchestrator loop | Multi-agent coordination |
-| **Agents** | One agent does Search→Judge→Report | SearchAgent, JudgeAgent, ReportAgent, AnalysisAgent |
-| **State Management** | Simple dict | Thread-safe `MagenticState` with context vars |
-| **Quality** | Good (functional) | Better (specialized agents, coordination) |
-| **Cost** | Free (HuggingFace Inference) | Paid (OpenAI/Anthropic) |
-| **Use Case** | Demos, hackathon, budget-constrained | Production, research quality |
----
-## 3. Simple Mode Architecture (pydantic-ai Only)
-```text
-┌─────────────────────────────────────────────────────┐
-│                  Orchestrator                       │
-│                                                     │
-│   while not sufficient and iteration < max:        │
-│       1. SearchHandler.execute(query)              │
-│       2. JudgeHandler.assess(evidence)    ◄── pydantic-ai Agent  │
-│       3. if sufficient: break                      │
-│       4. query = judge.next_queries                │
-│                                                     │
-│   return ReportGenerator.generate(evidence)        │
-└─────────────────────────────────────────────────────┘
-```
-**Components:**
-- `src/orchestrator.py` - Simple loop orchestrator
-- `src/agent_factory/judges.py` - JudgeHandler with pydantic-ai
-- `src/tools/search_handler.py` - Scatter-gather search
-- `src/tools/pubmed.py`, `clinicaltrials.py`, `europepmc.py` - Search tools
----
-## 4. Advanced Mode Architecture (MS Agent Framework + pydantic-ai)
-```text
-┌─────────────────────────────────────────────────────────────────────┐
-│              Microsoft Agent Framework Orchestrator                 │
-│                                                                     │
-│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐            │
-│   │ SearchAgent │───▶│ JudgeAgent  │───▶│ ReportAgent │            │
-│   │ (BaseAgent) │    │ (BaseAgent) │    │ (BaseAgent) │            │
-│   └──────┬──────┘    └──────┬──────┘    └──────┬──────┘            │
-│          │                  │                  │                    │
-│          ▼                  ▼                  ▼                    │
-│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐            │
-│   │ pydantic-ai │    │ pydantic-ai │    │ pydantic-ai │            │
-│   │ Agent()     │    │ Agent()     │    │ Agent()     │            │
-│   │ output_type=│    │ output_type=│    │ output_type=│            │
-│   │ SearchResult│    │ JudgeAssess │    │ Report      │            │
-│   └─────────────┘    └─────────────┘    └─────────────┘            │
-│                                                                     │
-│   Shared State: MagenticState (thread-safe via contextvars)        │
-│   - evidence: list[Evidence]                                       │
-│   - embedding_service: EmbeddingService                            │
-└─────────────────────────────────────────────────────────────────────┘
-```
-**Components:**
-- `src/orchestrator_magentic.py` - Multi-agent orchestrator
-- `src/agents/search_agent.py` - SearchAgent (BaseAgent)
-- `src/agents/judge_agent.py` - JudgeAgent (BaseAgent)
-- `src/agents/report_agent.py` - ReportAgent (BaseAgent)
-- `src/agents/analysis_agent.py` - AnalysisAgent (BaseAgent)
-- `src/agents/state.py` - Thread-safe state management
-- `src/agents/tools.py` - @ai_function decorated tools
----
-## 5. Mode Selection Logic
-```python
-# src/orchestrator_factory.py (actual implementation)
-def create_orchestrator(
-    search_handler: SearchHandlerProtocol | None = None,
-    judge_handler: JudgeHandlerProtocol | None = None,
-    config: OrchestratorConfig | None = None,
-    mode: Literal["simple", "magentic", "advanced"] | None = None,
-) -> Any:
-    """
-    Auto-select orchestrator based on available credentials.
-    Priority:
-    1. If mode explicitly set, use that
-    2. If OpenAI key available -> Advanced Mode (currently OpenAI only)
-    3. Otherwise -> Simple Mode (HuggingFace free tier)
-    """
-    effective_mode = _determine_mode(mode)
-    if effective_mode == "advanced":
-        orchestrator_cls = _get_magentic_orchestrator_class()
-        return orchestrator_cls(max_rounds=config.max_iterations if config else 10)
-    # Simple mode requires handlers
-    if search_handler is None or judge_handler is None:
-        raise ValueError("Simple mode requires search_handler and judge_handler")
-    return Orchestrator(
-        search_handler=search_handler,
-        judge_handler=judge_handler,
-        config=config,
-    )
-```
----
-## 6. Shared Components (Both Modes Use)
-These components work in both modes:
-| Component | Purpose |
-|-----------|---------|
-| `src/tools/pubmed.py` | PubMed search |
-| `src/tools/clinicaltrials.py` | ClinicalTrials.gov search |
-| `src/tools/europepmc.py` | Europe PMC search |
-| `src/tools/search_handler.py` | Scatter-gather orchestration |
-| `src/tools/rate_limiter.py` | Rate limiting |
-| `src/utils/models.py` | Evidence, Citation, JudgeAssessment |
-| `src/utils/config.py` | Settings |
-| `src/services/embeddings.py` | Vector search (optional) |
----
-## 7. pydantic-ai Integration Points
-Both modes use pydantic-ai for structured LLM outputs:
-```python
-# In JudgeHandler (both modes)
-from pydantic_ai import Agent
-from pydantic_ai.models.huggingface import HuggingFaceModel
-from pydantic_ai.models.openai import OpenAIModel
-from pydantic_ai.models.anthropic import AnthropicModel
-class JudgeHandler:
-    def __init__(self, model: Any = None):
-        self.model = model or get_model()  # Auto-selects based on config
-        self.agent = Agent(
-            model=self.model,
-            output_type=JudgeAssessment,  # Structured output!
-            system_prompt=SYSTEM_PROMPT,
-        )
-    async def assess(self, question: str, evidence: list[Evidence]) -> JudgeAssessment:
-        result = await self.agent.run(format_prompt(question, evidence))
-        return result.output  # Guaranteed to be JudgeAssessment
-```
----
-## 8. Microsoft Agent Framework Integration Points
-Advanced mode wraps pydantic-ai agents in BaseAgent:
-```python
-# In JudgeAgent (advanced mode only)
-from agent_framework import BaseAgent, AgentRunResponse, ChatMessage, Role
-class JudgeAgent(BaseAgent):
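-    def __init__(
-        self,
-        judge_handler: JudgeHandlerProtocol,
-        evidence_store: dict[str, list[Evidence]] | None = None,
-    ):
-        super().__init__(
-            name="JudgeAgent",
-            description="Evaluates evidence quality",
-        )
-        self._handler = judge_handler  # Uses pydantic-ai internally
-        # run() reads from this store, so it must be set here; defaults to empty
-        self._evidence_store = evidence_store if evidence_store is not None else {}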
-    async def run(self, messages, **kwargs) -> AgentRunResponse:
-        question = extract_question(messages)
-        evidence = self._evidence_store.get("current", [])
-        # Delegate to pydantic-ai powered handler
-        assessment = await self._handler.assess(question, evidence)
-        return AgentRunResponse(
-            messages=[ChatMessage(role=Role.ASSISTANT, text=format_response(assessment))],
-            additional_properties={"assessment": assessment.model_dump()},
-        )
-```
----
-## 9. Benefits of This Architecture
-1. **Graceful Degradation**: Works without API keys (free tier)
-2. **Progressive Enhancement**: Better with API keys (orchestration)
-3. **Code Reuse**: pydantic-ai handlers shared between modes
-4. **Hackathon Ready**: Demo works without requiring paid keys
-5. **Production Ready**: Full orchestration available when needed
-6. **Future Proof**: Can add more agents to advanced mode
-7. **Testable**: Simple mode is easier to unit test
----
-## 10. Known Risks and Mitigations
-> **From Senior Agent Review**
-### 10.1 Bridge Complexity (MEDIUM)
-**Risk:** In Advanced Mode, agents (Agent Framework) wrap handlers (pydantic-ai). Both are async. Context variables (`MagenticState`) must propagate correctly through the pydantic-ai call stack.
-**Mitigation:**
-- pydantic-ai uses standard Python `contextvars`, which naturally propagate through `await` chains
-- Test context propagation explicitly in integration tests
-- If issues arise, pass state explicitly rather than via context vars
-### 10.2 Integration Drift (MEDIUM)
-**Risk:** Simple Mode and Advanced Mode might diverge in behavior over time (e.g., Simple Mode uses logic A, Advanced Mode uses logic B).
-**Mitigation:**
-- Both modes MUST call the exact same underlying Tools (`src/tools/*`) and Handlers (`src/agent_factory/*`)
-- Handlers are the single source of truth for business logic
-- Agents are thin wrappers that delegate to handlers
-### 10.3 Testing Burden (LOW-MEDIUM)
-**Risk:** Maintaining two distinct orchestrators (`src/orchestrator.py` and `src/orchestrator_magentic.py`) doubles the integration-testing surface area.
-**Mitigation:**
-- Unit test handlers independently (shared code)
-- Integration tests for each mode separately
-- End-to-end tests verify same output for same input (determinism permitting)
-### 10.4 Dependency Conflicts (LOW)
-**Risk:** `agent-framework-core` might conflict with `pydantic-ai`'s dependencies (e.g., different pydantic versions).
-**Status:** Both use `pydantic>=2.x`. Should be compatible.
----
-## 11. Naming Clarification
-> See `00_SITUATION_AND_PLAN.md` Section 4 for full details.
-**Important:** The codebase uses "magentic" in file names (`orchestrator_magentic.py`, `magentic_agents.py`) but this refers to our internal naming for Microsoft Agent Framework integration, **NOT** the `magentic` PyPI package.
-**Future action:** Rename to `orchestrator_advanced.py` to eliminate confusion.
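
Section 10.1 of the deleted `01_ARCHITECTURE_SPEC.md` relies on `contextvars` propagating through `await` chains. The following is a minimal, stdlib-only sketch of that behavior, not the project's code: `evidence_var` is a hypothetical stand-in for the context variable behind `MagenticState`, and the coroutine names are illustrative. It shows that a value set by the caller is visible through a plain `await` and in a task created afterwards, while a value set inside a child task does not flow back to the parent.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the context variable behind MagenticState.
evidence_var: contextvars.ContextVar[tuple[str, ...]] = contextvars.ContextVar(
    "evidence", default=()
)


async def handler_assess() -> tuple[str, ...]:
    # Plays the role of a pydantic-ai powered handler: it only reads state.
    await asyncio.sleep(0)
    return evidence_var.get()


async def agent_run() -> tuple[str, ...]:
    # Plays the role of an agent delegating to its handler via a plain await.
    return await handler_assess()


async def child_overwrites() -> None:
    # Runs in its own task, which received a copy of the parent's context.
    evidence_var.set(("set-in-child",))


async def main() -> dict[str, tuple[str, ...]]:
    evidence_var.set(("pmid:1",))
    seen_via_await = await agent_run()
    seen_in_task = await asyncio.create_task(handler_assess())
    await asyncio.create_task(child_overwrites())
    return {
        "await": seen_via_await,
        "task": seen_in_task,
        "parent_after_child": evidence_var.get(),
    }


results = asyncio.run(main())
```

All three entries in `results` hold the parent's value. The context copy made for a task is shallow, so if the stored object were a mutable list, in-place mutations made by a child task would still be visible to the parent; that is one reason the spec's fallback of passing state explicitly may be easier to reason about.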

docs/decisions/architecture-2025-11/02_IMPLEMENTATION_PHASES.md DELETED

@@ -1,112 +0,0 @@
-# Implementation Phases: Dual-Mode Agent System
-**Date:** November 27, 2025
-**Status:** IMPLEMENTATION PLAN (REVISED)
-**Strategy:** TDD (Test-Driven Development), SOLID Principles
-**Dependency Strategy:** PyPI (agent-framework-core)
----
-## Phase 0: Environment Validation & Cleanup
-**Goal:** Ensure clean state and dependencies are correctly installed.
-### Step 0.1: Verify PyPI Package
-The `agent-framework-core` package is published on PyPI by Microsoft. Verify installation:
-```bash
-uv sync --all-extras
-python -c "from agent_framework import ChatAgent; print('OK')"
-```
-### Step 0.2: Branch State
-We are on `feat/dual-mode-architecture`. Ensure it is up to date with `origin/dev` before starting.
-**Note:** The `reference_repos/agent-framework` folder is kept for reference/documentation only.
-The production dependency uses the official PyPI release.
----
-## Phase 1: Pydantic-AI Improvements (Simple Mode)
-**Goal:** Implement `HuggingFaceModel` support in `JudgeHandler` using strict TDD.
-### Step 1.1: Test First (Red)
-Create `tests/unit/agent_factory/test_judges_factory.py`:
-- Test `get_model()` returns `HuggingFaceModel` when `LLM_PROVIDER=huggingface`.
-- Test `get_model()` respects `HF_TOKEN`.
-- Test fallback to OpenAI.
-### Step 1.2: Implementation (Green)
-Update `src/utils/config.py`:
-- Add `huggingface_model` and `hf_token` fields.
-Update `src/agent_factory/judges.py`:
-- Implement `get_model` with the logic derived from the tests.
-- Use dependency injection for the model where possible.
-### Step 1.3: Refactor
-Ensure `JudgeHandler` is loosely coupled from the specific model provider.
----
-## Phase 2: Orchestrator Factory (The Switch)
-**Goal:** Implement the factory pattern to switch between Simple and Advanced modes.
-### Step 2.1: Test First (Red)
-Create `tests/unit/test_orchestrator_factory.py`:
-- Test `create_orchestrator` returns `Orchestrator` (simple) when API keys are missing.
-- Test `create_orchestrator` returns `MagenticOrchestrator` (advanced) when OpenAI key exists.
-- Test explicit mode override.
-### Step 2.2: Implementation (Green)
-Update `src/orchestrator_factory.py` to implement the selection logic.
----
-## Phase 3: Agent Framework Integration (Advanced Mode)
-**Goal:** Integrate Microsoft Agent Framework from PyPI.
-### Step 3.1: Dependency Management
-The `agent-framework-core` package is installed from PyPI:
-```toml
-[project.optional-dependencies]
-magentic = [
-    "agent-framework-core>=1.0.0b251120,<2.0.0",  # Microsoft Agent Framework (PyPI)
-]
-```
-Install with: `uv sync --all-extras`
-### Step 3.2: Verify Imports (Test First)
-Create `tests/unit/agents/test_agent_imports.py`:
-- Verify `from agent_framework import ChatAgent` works.
-- Verify instantiation of `ChatAgent` with a mock client.
-### Step 3.3: Update Agents
-Refactor `src/agents/*.py` to ensure they match the exact signature of the `ChatAgent` class from the installed `agent-framework-core` package.
-- **SOLID:** Ensure agents have single responsibilities.
-- **DRY:** Share tool definitions between Pydantic-AI simple mode and Agent Framework advanced mode.
----
-## Phase 4: UI & End-to-End Verification
-**Goal:** Update Gradio to reflect the active mode.
-### Step 4.1: UI Updates
-Update `src/app.py` to display "Simple Mode" vs "Advanced Mode".
-### Step 4.2: End-to-End Test
-Run the full loop:
-1. Simple Mode (No Keys) -> Search -> Judge (HF) -> Report.
-2. Advanced Mode (OpenAI Key) -> SearchAgent -> JudgeAgent -> ReportAgent.
----
-## Phase 5: Cleanup & Documentation
-- Remove unused code.
-- Update main README.md.
-- Final `make check`.
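
Phase 2 of the deleted `02_IMPLEMENTATION_PHASES.md` asks for tests of the mode selection, but the body of `_determine_mode` is not shown in any of these documents. The sketch below is a hypothetical stand-in rather than the project's implementation. It assumes the OpenAI key is read from an `OPENAI_API_KEY` environment variable and that the legacy name `"magentic"` is an alias for `"advanced"`; neither assumption is confirmed by the source.

```python
from typing import Literal, Mapping, Optional

Mode = Literal["simple", "advanced"]


def determine_mode(
    mode: Optional[Literal["simple", "magentic", "advanced"]],
    env: Mapping[str, str],
) -> Mode:
    """Hypothetical stand-in for `_determine_mode`; not the project's code."""
    if mode is not None:
        # Priority 1: an explicit mode always wins.
        # Assumption: the legacy name "magentic" maps to advanced mode.
        return "advanced" if mode in ("magentic", "advanced") else "simple"
    # Priority 2: an OpenAI key enables advanced mode.
    # Assumption: the key lives in the OPENAI_API_KEY variable.
    if env.get("OPENAI_API_KEY", "").strip():
        return "advanced"
    # Priority 3: fall back to simple mode (free tier).
    return "simple"
```

Passing the environment in as a mapping, instead of reading `os.environ` inside the function, lets the three test cases from Step 2.1 be written as plain assertions without monkeypatching.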

docs/decisions/architecture-2025-11/03_IMMEDIATE_ACTIONS.md DELETED

@@ -1,112 +0,0 @@
-# Immediate Actions Checklist
-**Date:** November 27, 2025
-**Priority:** Execute in order
----
-## Before Starting Implementation
-### 1. Close PR #41 (CRITICAL)
-```bash
-gh pr close 41 --comment "Architecture decision changed. Cherry-picking improvements to preserve both pydantic-ai and Agent Framework capabilities."
-```
-### 2. Verify HuggingFace Spaces is Safe
-```bash
-# Should show agent framework files exist
-git ls-tree --name-only huggingface-upstream/dev -- src/agents/
-git ls-tree --name-only huggingface-upstream/dev -- src/orchestrator_magentic.py
-```
-Expected output: Files should exist (they do as of this writing).
-### 3. Clean Local Environment
-```bash
-# Switch to main first
-git checkout main
-# Delete problematic branches
-git branch -D refactor/pydantic-unification 2>/dev/null || true
-git branch -D feat/pubmed-fulltext 2>/dev/null || true
-# Reset local dev to origin/dev
-git branch -D dev 2>/dev/null || true
-git checkout -b dev origin/dev
-# Verify agent framework code exists
-ls src/agents/
-# Expected: __init__.py, analysis_agent.py, hypothesis_agent.py, judge_agent.py,
-#           magentic_agents.py, report_agent.py, search_agent.py, state.py, tools.py
-ls src/orchestrator_magentic.py
-# Expected: file exists
-```
-### 4. Create Fresh Feature Branch
-```bash
-git checkout -b feat/dual-mode-architecture origin/dev
-```
----
-## Decision Points
-Before proceeding, confirm:
-1. **For hackathon**: Do we need advanced mode, or is simple mode sufficient?
-   - Simple mode = faster to implement, works today
-   - Advanced mode = better quality, more work
-2. **Timeline**: How much time do we have?
-   - If < 1 day: Focus on simple mode only
-   - If > 1 day: Implement dual-mode
-3. **Dependencies**: Is `agent-framework-core` available?
-   - Check: `pip index versions agent-framework-core`
-   - If not on PyPI, may need to install from GitHub
----
-## Quick Start (Simple Mode Only)
-If time is limited, implement only simple mode improvements:
-```bash
-# On feat/dual-mode-architecture branch
-# 1. Update judges.py to add HuggingFace support
-# 2. Update config.py to add HF settings
-# 3. Create free_tier_demo.py
-# 4. Run make check
-# 5. Create PR to dev
-```
-This gives you free-tier capability without touching agent framework code.
----
-## Quick Start (Full Dual-Mode)
-If time permits, implement full dual-mode:
-Follow Phases 0-5 in `02_IMPLEMENTATION_PHASES.md`
----
-## Emergency Rollback
-If anything goes wrong:
-```bash
-# Reset to safe state
-git checkout main
-git branch -D feat/dual-mode-architecture
-git checkout -b feat/dual-mode-architecture origin/dev
-```
-`origin/dev` is the safe fallback: it has the agent framework code intact.

docs/decisions/architecture-2025-11/04_FOLLOWUP_REVIEW_REQUEST.md DELETED

@@ -1,158 +0,0 @@
-# Follow-Up Review Request: Did We Implement Your Feedback?
-**Date:** November 27, 2025
-**Context:** You previously reviewed our dual-mode architecture plan and provided feedback. We have updated the documentation. Please verify we correctly implemented your recommendations.
----
-## Your Original Feedback vs Our Changes
-### 1. Naming Confusion Clarification
-**Your feedback:** "You are using Microsoft Agent Framework, but you've named your integration 'Magentic'. This caused the confusion."
-**Our change:** Added Section 4 in `00_SITUATION_AND_PLAN.md`:
-```markdown
-## 4. CRITICAL: Naming Confusion Clarification
-> **Senior Agent Review Finding:** The codebase uses "magentic" in file names
-> (e.g., `orchestrator_magentic.py`, `magentic_agents.py`) but this is **NOT**
-> the `magentic` PyPI package by Jack Collins. It's Microsoft Agent Framework.
-**The naming confusion:**
-- `magentic` (PyPI package): A different library for structured LLM outputs
-- "Magentic" (in our codebase): Our internal name for Microsoft Agent Framework integration
-- `agent-framework-core` (PyPI package): Microsoft's actual multi-agent orchestration framework
-**Recommended future action:** Rename `orchestrator_magentic.py` → `orchestrator_advanced.py`
-```
-**Status:** ✅ IMPLEMENTED
----
-### 2. Bridge Complexity Warning
-**Your feedback:** "You must ensure MagenticState (context vars) propagates correctly through the pydantic-ai call stack."
-**Our change:** Added Section 10.1 in `01_ARCHITECTURE_SPEC.md`:
-```markdown
-### 10.1 Bridge Complexity (MEDIUM)
-**Risk:** In Advanced Mode, agents (Agent Framework) wrap handlers (pydantic-ai).
-Both are async. Context variables (`MagenticState`) must propagate correctly.
-**Mitigation:**
-- pydantic-ai uses standard Python `contextvars`, which naturally propagate through `await` chains
-- Test context propagation explicitly in integration tests
-- If issues arise, pass state explicitly rather than via context vars
-```
-**Status:** ✅ IMPLEMENTED
----
-### 3. Integration Drift Warning
-**Your feedback:** "Simple Mode and Advanced Mode might diverge in behavior."
-**Our change:** Added Section 10.2 in `01_ARCHITECTURE_SPEC.md`:
-```markdown
-### 10.2 Integration Drift (MEDIUM)
-**Risk:** Simple Mode and Advanced Mode might diverge in behavior over time.
-**Mitigation:**
-- Both modes MUST call the exact same underlying Tools (`src/tools/*`) and Handlers (`src/agent_factory/*`)
-- Handlers are the single source of truth for business logic
-- Agents are thin wrappers that delegate to handlers
-```
-**Status:** ✅ IMPLEMENTED
----
-### 4. Testing Burden Warning
-**Your feedback:** "You now have two distinct orchestrators to maintain. This doubles your integration testing surface area."
-**Our change:** Added Section 10.3 in `01_ARCHITECTURE_SPEC.md`:
-```markdown
-### 10.3 Testing Burden (LOW-MEDIUM)
-**Risk:** Maintaining two distinct orchestrators doubles the integration testing surface area.
-**Mitigation:**
-- Unit test handlers independently (shared code)
-- Integration tests for each mode separately
-- End-to-end tests verify same output for same input
-```
-**Status:** ✅ IMPLEMENTED
----
-### 5. Rename Recommendation
-**Your feedback:** "Rename `src/orchestrator_magentic.py` to `src/orchestrator_advanced.py`"
-**Our change:** Added Step 3.4 in `02_IMPLEMENTATION_PHASES.md`:
-```markdown
-### Step 3.4: (OPTIONAL) Rename "Magentic" to "Advanced"
-> **Senior Agent Recommendation:** Rename files to eliminate confusion.
-git mv src/orchestrator_magentic.py src/orchestrator_advanced.py
-git mv src/agents/magentic_agents.py src/agents/advanced_agents.py
-**Note:** This is optional for the hackathon. Can be done in a follow-up PR.
-```
-**Status:** ✅ DOCUMENTED (marked as optional for hackathon)
----
-### 6. Standardize Wrapper Recommendation
-**Your feedback:** "Create a generic `PydanticAiAgentWrapper(BaseAgent)` class instead of manually wrapping each handler."
-**Our change:** NOT YET DOCUMENTED
-**Status:** ⚠️ NOT IMPLEMENTED - Should we add this?
----
-## Questions for Your Review
-1. **Did we correctly implement your feedback?** Are there any misunderstandings in how we interpreted your recommendations?
-2. **Is the "Standardize Wrapper" recommendation critical?** Should we add it to the implementation phases, or is it a nice-to-have for later?
-3. **Dependency versioning:** You noted `agent-framework-core>=1.0.0b251120` might be ephemeral. Should we:
-   - Pin to a specific version?
-   - Use a version range?
-   - Install from GitHub source?
-4. **Anything else we missed?**
----
-## Files to Re-Review
-1. `00_SITUATION_AND_PLAN.md` - Added Section 4 (Naming Clarification)
-2. `01_ARCHITECTURE_SPEC.md` - Added Sections 10-11 (Risks, Naming)
-3. `02_IMPLEMENTATION_PHASES.md` - Added Step 3.4 (Optional Rename)
----
-## Current Branch State
-We are now on `feat/dual-mode-architecture`, branched from `origin/dev`:
-- ✅ Agent framework code intact (`src/agents/`, `src/orchestrator_magentic.py`)
-- ✅ Documentation committed
-- ❌ PR #41 still open (need to close it)
-- ❌ Cherry-pick of pydantic-ai improvements not yet done
----
-Please confirm: **GO / NO-GO** to proceed with Phase 1 (cherry-picking pydantic-ai improvements)?
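
For the open "Standardize Wrapper" question in item 6, the recommendation can be sketched roughly as follows. This is a minimal illustration, not the project's code: `BaseAgent` here is a hypothetical stand-in for the orchestration framework's base class, and `Verdict`, `judge_handler`, and the `run` method name are invented for the example. The only idea taken from the review feedback is that one generic wrapper delegates to a handler instead of each handler being wrapped by hand.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Generic, TypeVar

InT = TypeVar("InT")
OutT = TypeVar("OutT")


class BaseAgent:
    """Hypothetical stand-in for the orchestration framework's agent base class."""

    def __init__(self, name: str) -> None:
        self.name = name


class PydanticAiAgentWrapper(BaseAgent, Generic[InT, OutT]):
    """Thin adapter: holds no business logic and only delegates to a handler."""

    def __init__(self, name: str, handler: Callable[[InT], Awaitable[OutT]]) -> None:
        super().__init__(name)
        self._handler = handler

    async def run(self, payload: InT) -> OutT:
        # The handler stays the single source of truth for business logic.
        return await self._handler(payload)


@dataclass
class Verdict:
    """Hypothetical structured output; a real handler would return a Pydantic model."""

    sufficient: bool
    reason: str


async def judge_handler(evidence: list[str]) -> Verdict:
    # Placeholder logic standing in for a structured-output LLM call.
    enough = len(evidence) >= 2
    return Verdict(sufficient=enough, reason=f"{len(evidence)} evidence item(s)")


judge_agent = PydanticAiAgentWrapper("judge", judge_handler)
result = asyncio.run(judge_agent.run(["paper A", "paper B"]))
```

With this shape, adding another agent means writing one handler and one constructor call, and the handler can still be unit tested without the wrapper.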

docs/decisions/architecture-2025-11/REVIEW_PROMPT_FOR_SENIOR_AGENT.md DELETED

@@ -1,113 +0,0 @@
-# Senior Agent Review Prompt
-Copy and paste everything below this line into a fresh Claude/AI session:
----
-## Context
-I am a junior developer working on a HuggingFace hackathon project called DeepBoner. We made a significant architectural mistake and are now trying to course-correct. I need you to act as a **senior staff engineer** and critically review our proposed solution.
-## The Situation
-We almost merged a refactor that would have **deleted** our multi-agent orchestration capability, mistakenly believing that `pydantic-ai` (a library for structured LLM outputs) and Microsoft's `agent-framework` (a framework for multi-agent orchestration) were mutually exclusive alternatives.
-**They are not.** They are complementary:
-- `pydantic-ai` ensures LLM responses match Pydantic schemas (type-safe outputs)
-- `agent-framework` orchestrates multiple agents working together (coordination layer)
-We now want to implement a **dual-mode architecture** where:
-- **Simple Mode (No API key):** Uses only pydantic-ai with HuggingFace free tier
-- **Advanced Mode (With API key):** Uses Microsoft Agent Framework for orchestration, with pydantic-ai inside each agent for structured outputs
-## Your Task
-Please perform a **deep, critical review** of:
-1. **The architecture diagram** (image attached: `assets/magentic-pydantic.png`)
-2. **Our documentation** (4 files listed below)
-3. **The actual codebase** to verify our claims
-## Specific Questions to Answer
-### Architecture Validation
-1. Is our understanding correct that pydantic-ai and agent-framework are complementary, not competing?
-2. Does the dual-mode architecture diagram accurately represent how these should integrate?
-3. Are there any architectural flaws or anti-patterns in our proposed design?
-### Documentation Accuracy
-4. Are the branch states we documented accurate? (Check `git log`, `git ls-tree`)
-5. Is our understanding of which code exists on which branch correct?
-6. Are the implementation phases realistic and in the correct order?
-7. Are there any missing steps or dependencies we overlooked?
-### Codebase Reality Check
-8. Does `origin/dev` actually have the agent framework code intact? Verify by checking:
-   - `git ls-tree origin/dev -- src/agents/`
-   - `git ls-tree origin/dev -- src/orchestrator_magentic.py`
-9. What does the current `src/agents/` code actually import? Does it import the `agent_framework` module, and is that module provided by the `agent-framework-core` package?
-10. Is the `agent-framework-core` package actually available on PyPI, or do we need to install from source?
-### Implementation Feasibility
-11. Can the cherry-pick strategy we outlined actually work, or are there merge conflicts we're not seeing?
-12. Is the mode auto-detection logic sound?
-13. What are the risks we haven't identified?
-### Critical Errors Check
-14. Did we miss anything critical in our analysis?
-15. Are there any factual errors in our documentation?
-16. Would a Google/DeepMind senior engineer approve this plan, or would they flag issues?
-## Files to Review
-Please read these files in order:
-1. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/00_SITUATION_AND_PLAN.md`
-2. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/01_ARCHITECTURE_SPEC.md`
-3. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/02_IMPLEMENTATION_PHASES.md`
-4. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/docs/brainstorming/magentic-pydantic/03_IMMEDIATE_ACTIONS.md`
-And the architecture diagram:
-5. `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/assets/magentic-pydantic.png`
-## Reference Repositories to Consult
-We have local clones of the source-of-truth repositories:
-- **Original DeepBoner:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/DeepBoner/`
-- **Microsoft Agent Framework:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/agent-framework/`
-- **Microsoft AutoGen:** `/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/reference_repos/autogen-microsoft/`
-Please cross-reference our hackathon fork against these to verify architectural alignment.
-## Codebase to Analyze
-Our hackathon fork is at:
-`/Users/ray/Desktop/CLARITY-DIGITAL-TWIN/DeepBoner-1/`
-Key files to examine:
-- `src/agents/` - Agent framework integration
-- `src/agent_factory/judges.py` - pydantic-ai integration
-- `src/orchestrator.py` - Simple mode orchestrator
-- `src/orchestrator_magentic.py` - Advanced mode orchestrator
-- `src/orchestrator_factory.py` - Mode selection
-- `pyproject.toml` - Dependencies
-## Expected Output
-Please provide:
-1. **Validation Summary:** Is our plan sound? (YES/NO with explanation)
-2. **Errors Found:** List any factual errors in our documentation
-3. **Missing Items:** What did we overlook?
-4. **Risk Assessment:** What could go wrong?
-5. **Recommended Changes:** Specific edits to our documentation or plan
-6. **Go/No-Go Recommendation:** Should we proceed with this plan?
-## Tone
-Be brutally honest. If our plan is flawed, say so directly. We would rather know now than after implementation. Do not soften criticism; we need accuracy.
----
-END OF PROMPT
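
The mode auto-detection that the prompt asks the reviewer to assess (Simple Mode without an API key, Advanced Mode with one) could look something like the sketch below. It is an assumption-laden illustration: the function name `detect_mode`, the mode strings, and the environment variable `OPENAI_API_KEY` are hypothetical, since the deleted documents do not say which key or which names `src/orchestrator_factory.py` actually uses.

```python
import os
from typing import Mapping, Optional

# Assumed variable name(s); the real project may check a different key.
API_KEY_VARS = ("OPENAI_API_KEY",)


def detect_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "advanced" when any API key is set, otherwise "simple"."""
    env = os.environ if env is None else env
    # Treat empty or whitespace-only values as "no key configured".
    has_key = any(env.get(name, "").strip() for name in API_KEY_VARS)
    return "advanced" if has_key else "simple"
```

Passing the environment in as a mapping keeps the check testable without mutating `os.environ`; for example, `detect_mode({})` selects the simple mode.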