Spaces: VibecoderMcSwaggins / DeepBoner

VibecoderMcSwaggins committed 23 days ago

Commit 78ec52a

1 Parent(s): c6e9843

docs: Add P1 bug doc for HuggingFace Novita 500 error

- New bug: Free tier fails due to HuggingFace routing Qwen to Novita
which is returning 500 Internal Server Error
- Same pattern as Llama/Hyperbolic issue we saw before
- Updated ACTIVE_BUGS.md with current status
- Marked repr bug as FIXED (PR #117 Accumulator Pattern)

Pausing for senior review before implementing fix.

Files changed (2)

docs/bugs/ACTIVE_BUGS.md +22 -14
docs/bugs/P1_HUGGINGFACE_NOVITA_500_ERROR.md +133 -0

docs/bugs/ACTIVE_BUGS.md CHANGED

@@ -1,34 +1,42 @@
 # Active Bugs
-> Last updated: 2025-12-01 (21:00 PST)
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
 > **See also:** [Code Quality Audit Findings (2025-11-30)](AUDIT_FINDINGS_2025_11_30.md)
 > **See also:** [ARCHITECTURE.md](../ARCHITECTURE.md) for unified architecture plan
-## P0 - Critical (BLOCKED)
-### Free Tier Broken (Upstream #2562)
-**Issue:** [#105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105), [#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)
-**Status:** BLOCKED - Waiting for upstream PR #2566
-**Problem:** Free tier (Advanced Mode + HuggingFace) shows repr garbage output.
-**Cause:** Microsoft Agent Framework upstream bug #2562.
-**Fix:** Upstream PR #2566 will fix this. Once merged:
-1. Update `agent-framework` dependency
-2. Verify Advanced + HuggingFace works
-3. Unified architecture complete
-**Architecture Note:** We have ONE unified architecture. `simple.py` is deleted.
-Simple Mode behavior is INTEGRATED via `HuggingFaceChatClient`, not a parallel orchestrator.
 ---
 ## Resolved Bugs
 ### ~~P0 - AIFunction Not JSON Serializable~~ FIXED
 **File:** `docs/bugs/P0_AIFUNCTION_NOT_JSON_SERIALIZABLE.md`

 # Active Bugs
+> Last updated: 2025-12-02 (07:30 EST)
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
 > **See also:** [Code Quality Audit Findings (2025-11-30)](AUDIT_FINDINGS_2025_11_30.md)
 > **See also:** [ARCHITECTURE.md](../ARCHITECTURE.md) for unified architecture plan
+## P1 - High (ACTIVE)
+### HuggingFace Novita Provider 500 Error
+**File:** `docs/bugs/P1_HUGGINGFACE_NOVITA_500_ERROR.md`
+**Status:** ACTIVE - Upstream Infrastructure Issue
+**Problem:** Free tier (no API key) fails with a 500 error from the Novita provider.
+**Cause:** HuggingFace routes Qwen/Qwen2.5-72B-Instruct to Novita (third-party), and Novita is returning 500 errors.
+**Fix Options:**
+1. Switch to a model hosted natively by HuggingFace
+2. Implement fallback model logic
+3. Wait for Novita to fix their infrastructure
 ---
 ## Resolved Bugs
+### ~~P0 - Repr Bug (Display Garbage)~~ FIXED
+**File:** `P0_REPR_BUG_ROOT_CAUSE_ANALYSIS.md`, `docs/specs/SPEC_17_ACCUMULATOR_PATTERN.md`
+**Found:** 2025-12-01
+**Resolved:** 2025-12-02 (PR #117)
+- Problem: Free tier showed `<agent_framework._types.ChatMessage object at 0x...>` instead of text
+- Root Cause: We were using the API incorrectly - reading from `MagenticAgentMessageEvent.message` instead of `MagenticAgentDeltaEvent.text`
+- Fix: Implemented **Accumulator Pattern** (SPEC-17) - bypasses the upstream bug by using the API correctly
+- Note: Upstream fix (PR #2566) is now moot - we don't need it anymore
 ### ~~P0 - AIFunction Not JSON Serializable~~ FIXED
 **File:** `docs/bugs/P0_AIFUNCTION_NOT_JSON_SERIALIZABLE.md`

docs/bugs/P1_HUGGINGFACE_NOVITA_500_ERROR.md ADDED

	@@ -0,0 +1,133 @@

+# P1 BUG: HuggingFace Router 500 Error via Novita Provider
+**Status**: ACTIVE - Upstream Infrastructure Issue
+**Priority**: P1 (Free Tier Broken)
+**Discovered**: 2025-12-02
+**Related**: CLAUDE.md (Llama/Hyperbolic issue)
+---
+## Symptom
+```
+❌ **ERROR**: Workflow error: 500 Server Error: Internal Server Error for url:
+https://router.huggingface.co/novita/v3/openai/chat/completions
+```
+Free tier users (no API key) cannot use the system.
+---
+## Stack Trace
+```text
+User (no API key)
+    ↓
+src/clients/factory.py:get_chat_client()
+    ↓
+src/clients/huggingface.py:HuggingFaceChatClient
+    ↓
+Model: Qwen/Qwen2.5-72B-Instruct (from config.py)
+    ↓
+huggingface_hub.InferenceClient
+    ↓
+HuggingFace Router: router.huggingface.co
+    ↓
+Routes to: NOVITA (third-party inference provider)
+    ↓
+❌ Novita returns 500 Internal Server Error
+```
+---
+## Root Cause
+**HuggingFace doesn't host all models directly.** For some models, they route to third-party inference providers:
+| Model | Provider | Status |
+|-------|----------|--------|
+| Llama-3.1-70B | Hyperbolic | ❌ "staging mode" auth issues |
+| Qwen2.5-72B | Novita | ❌ 500 Internal Server Error |
+We switched from Llama to Qwen specifically to avoid Hyperbolic's issues. Now Novita is having its own problems.
+**This is an upstream infrastructure issue - not a bug in our code.**
+---
+## Evidence
+From the error URL:
+```
+https://router.huggingface.co/novita/v3/openai/chat/completions
+                              ^^^^^^
+                              Third-party provider in URL path
+```
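
The same check can be scripted when triaging logs. This is a minimal sketch, assuming the provider slug is always the first path segment of a `router.huggingface.co` URL (inferred only from the error URL above, not from documented behavior). `provider_from_router_url` is a hypothetical helper and does not exist in the codebase.

```python
from typing import Optional
from urllib.parse import urlparse

ROUTER_HOST = "router.huggingface.co"


def provider_from_router_url(url: str) -> Optional[str]:
    """Return the first path segment of a router URL, or None.

    Assumption: the provider slug is the first path segment, as in the
    error URL quoted in this bug doc. Non-router hosts return None.
    """
    parsed = urlparse(url)
    if parsed.hostname != ROUTER_HOST:
        return None
    # Drop empty segments produced by leading or trailing slashes.
    segments = [s for s in parsed.path.split("/") if s]
    return segments[0] if segments else None


error_url = "https://router.huggingface.co/novita/v3/openai/chat/completions"
print(provider_from_router_url(error_url))
```

Logging this value next to each failure would make it easier to tell a provider-specific outage from a problem in the router itself.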
+---
+## Potential Fixes
+### Option 1: Try a Different Model (Quick)
+Find a model that HuggingFace hosts natively (not routed to partners):
+```python
+# Candidates to test:
+# - mistralai/Mistral-7B-Instruct-v0.3
+# - microsoft/Phi-3-mini-4k-instruct
+# - google/gemma-2-9b-it
+```
+### Option 2: Add Fallback Logic (Robust)
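+```python
+from huggingface_hub import AsyncInferenceClient
+from huggingface_hub.utils import HfHubHTTPError
+FALLBACK_MODELS = [
+    "Qwen/Qwen2.5-72B-Instruct",
+    "mistralai/Mistral-7B-Instruct-v0.3",
+    "microsoft/Phi-3-mini-4k-instruct",
+]
+class AllModelsFailedError(RuntimeError):
+    """Raised when every model in FALLBACK_MODELS returned a 500."""
+# Sketch only: assumes an AsyncInferenceClient, since the call is awaited.
+async def get_response_with_fallback(client: AsyncInferenceClient, messages: list, **kwargs):
+    for model in FALLBACK_MODELS:
+        try:
+            return await client.chat_completion(messages=messages, model=model, **kwargs)
+        except HfHubHTTPError as e:
+            # The status code lives on the attached HTTP response, if any.
+            if e.response is not None and e.response.status_code == 500:
+                continue  # provider outage: try the next model
+            raise  # other errors (auth, bad request) are not retried
+    raise AllModelsFailedError(f"All fallback models failed: {FALLBACK_MODELS}")
+```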
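
The fallback loop above can be exercised without touching the network. The sketch below is one way to do that, under stated assumptions: `FakeClient` and `ProviderError` are hypothetical stand-ins for the real client and for `HfHubHTTPError`, and the stub exposes `status_code` directly for brevity. It only demonstrates the control flow (skip on 500, re-raise anything else, fail once the list is exhausted).

```python
import asyncio


class ProviderError(Exception):
    """Hypothetical stand-in for HfHubHTTPError; carries only a status code."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


class AllModelsFailedError(RuntimeError):
    """Raised when every candidate model returned a 500."""


class FakeClient:
    """Hypothetical stub: raises a 500 for any model listed in `broken`."""

    def __init__(self, broken):
        self.broken = set(broken)
        self.calls = []  # records the order in which models were tried

    async def chat_completion(self, model, messages):
        self.calls.append(model)
        if model in self.broken:
            raise ProviderError(500)
        return f"reply from {model}"


async def get_response_with_fallback(client, models, messages):
    for model in models:
        try:
            return await client.chat_completion(model=model, messages=messages)
        except ProviderError as e:
            if e.status_code == 500:
                continue  # provider outage: try the next model
            raise  # other errors are not retried
    raise AllModelsFailedError(f"all models failed: {models}")


MODELS = ["Qwen/Qwen2.5-72B-Instruct", "mistralai/Mistral-7B-Instruct-v0.3"]
client = FakeClient(broken={"Qwen/Qwen2.5-72B-Instruct"})
result = asyncio.run(
    get_response_with_fallback(client, MODELS, [{"role": "user", "content": "hi"}])
)
print(result)
```

Passing the model list as a parameter, rather than reading a module-level constant, keeps the loop easy to test with different orderings.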
+### Option 3: Wait for Novita Fix (Passive)
+500 errors are often transient, so the problem may clear up once Novita fixes their infrastructure, with no change on our side.
+---
+## Verification
+To check whether the issue is resolved:
+```bash
+curl -X POST "https://router.huggingface.co/novita/v3/openai/chat/completions" \
+  -H "Authorization: Bearer $HF_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"model": "Qwen/Qwen2.5-72B-Instruct", "messages": [{"role": "user", "content": "hi"}]}'
+```
+---
+## Historical Context
+From `CLAUDE.md`:
+```
+- **HuggingFace (Free Tier):** `Qwen/Qwen2.5-72B-Instruct`
+  - Changed from Llama-3.1-70B (Dec 2025) due to HuggingFace routing Llama
+    to Hyperbolic provider which has unreliable "staging mode" auth.
+```
+Now Qwen is being routed to Novita, continuing the pattern of unreliable third-party routing.
+---
+## Recommendation
+**Short-term**: Switch to a model hosted natively by HuggingFace (test candidates above)
+**Long-term**: Implement fallback model logic to handle provider outages gracefully