Joseph Pollack
committed on
Commit
·
9b88c56
1
Parent(s):
71ca2eb
Remove test output files and update .gitignore
- .gitignore +1 -0
- docs/LICENSE.md +1 -0
- test_failures_analysis.md +0 -81
- test_fixes_summary.md +0 -102
- test_output_local_embeddings.txt +0 -0
- tests/conftest.py +65 -2
.gitignore
CHANGED
|
@@ -70,6 +70,7 @@ logs/
|
|
| 70 |
.mypy_cache/
|
| 71 |
.coverage
|
| 72 |
htmlcov/
|
|
|
|
| 73 |
|
| 74 |
# Database files
|
| 75 |
chroma_db/
|
|
|
|
| 70 |
.mypy_cache/
|
| 71 |
.coverage
|
| 72 |
htmlcov/
|
| 73 |
+
test_output*.txt
|
| 74 |
|
| 75 |
# Database files
|
| 76 |
chroma_db/
|
docs/LICENSE.md
CHANGED
|
@@ -30,3 +30,4 @@ SOFTWARE.
|
|
| 30 |
|
| 31 |
|
| 32 |
|
|
|
|
|
|
| 30 |
|
| 31 |
|
| 32 |
|
| 33 |
+
|
test_failures_analysis.md
DELETED
|
@@ -1,81 +0,0 @@
|
|
| 1 |
-
# Test Failures Analysis
|
| 2 |
-
|
| 3 |
-
## Summary
|
| 4 |
-
- **Total Failures**: 9 failed, 10 errors
|
| 5 |
-
- **Total Passed**: 482 passed, 2 skipped
|
| 6 |
-
- **Integration Test Failures**: 11 (expected - LlamaIndex dependencies not installed)
|
| 7 |
-
|
| 8 |
-
## Unit Test Failures (9 failed, 10 errors)
|
| 9 |
-
|
| 10 |
-
### 1. `test_get_model_anthropic` - FAILED
|
| 11 |
-
**Location**: `tests/unit/agent_factory/test_judges_factory.py`
|
| 12 |
-
**Error**: Returns `HuggingFaceModel()` instead of `AnthropicModel`
|
| 13 |
-
**Root Cause**: Token validation failing - mock token is not a string (NonCallableMagicMock)
|
| 14 |
-
**Log**: `Token is not a string (type: NonCallableMagicMock)`
|
| 15 |
-
|
| 16 |
-
### 2. `test_get_message_history` - FAILED
|
| 17 |
-
**Location**: `tests/unit/orchestrator/test_graph_orchestrator.py`
|
| 18 |
-
**Error**: `has_visited('node1')` returns False
|
| 19 |
-
**Root Cause**: GraphExecutionContext not properly tracking visited nodes
|
| 20 |
-
|
| 21 |
-
### 3. `test_run_with_graph_iterative` - FAILED
|
| 22 |
-
**Location**: `tests/unit/orchestrator/test_graph_orchestrator.py`
|
| 23 |
-
**Error**: `mock_run_with_graph() takes 2 positional arguments but 3 were given`
|
| 24 |
-
**Root Cause**: Mock function signature doesn't match actual method signature (missing `message_history` parameter)
|
| 25 |
-
|
| 26 |
-
### 4. `test_extract_name_from_oauth_profile` - FAILED
|
| 27 |
-
**Location**: `tests/unit/test_app_oauth.py`
|
| 28 |
-
**Error**: Returns `None` instead of `'Test User'`
|
| 29 |
-
**Root Cause**: OAuth profile name extraction logic not working correctly
|
| 30 |
-
|
| 31 |
-
### 5-9. `validate_oauth_token` related tests - FAILED (5 tests)
|
| 32 |
-
**Location**: `tests/unit/test_app_oauth.py`
|
| 33 |
-
**Error**: `AttributeError: <module 'src.app'> does not have the attribute 'validate_oauth_token'`
|
| 34 |
-
**Root Cause**: Function `validate_oauth_token` doesn't exist in `src.app` module or was moved/renamed
|
| 35 |
-
|
| 36 |
-
### 10-19. `ddgs.ddgs` module errors - ERROR (10 tests)
|
| 37 |
-
**Location**: `tests/unit/tools/test_web_search.py`
|
| 38 |
-
**Error**: `ModuleNotFoundError: No module named 'ddgs.ddgs'; 'ddgs' is not a package`
|
| 39 |
-
**Root Cause**: DDGS package structure issue - likely version mismatch or installation problem
|
| 40 |
-
|
| 41 |
-
## Integration Test Failures (11 failed - Expected)
|
| 42 |
-
**Location**: `tests/integration/test_rag_integration*.py`
|
| 43 |
-
**Error**: `ImportError: LlamaIndex dependencies not installed. Run: uv sync --extra modal`
|
| 44 |
-
**Root Cause**: Expected - these tests require optional dependencies that aren't installed in the test environment
|
| 45 |
-
|
| 46 |
-
## Resolutions Applied
|
| 47 |
-
|
| 48 |
-
### 1. `test_get_model_anthropic` - FIXED
|
| 49 |
-
**Fix**: Added explicit mock settings to ensure no HF token is set, preventing HuggingFace from being preferred over Anthropic.
|
| 50 |
-
- Set `mock_settings.hf_token = None`
|
| 51 |
-
- Set `mock_settings.huggingface_api_key = None`
|
| 52 |
-
- Set `mock_settings.has_openai_key = False`
|
| 53 |
-
- Set `mock_settings.has_anthropic_key = True`
|
| 54 |
-
|
| 55 |
-
### 2. `test_get_message_history` - FIXED
|
| 56 |
-
**Fix**: Added explicit node visit before checking `has_visited()`.
|
| 57 |
-
- Added `context.visited_nodes.add("node1")` before the assertion
|
| 58 |
-
|
| 59 |
-
### 3. `test_run_with_graph_iterative` - FIXED
|
| 60 |
-
**Fix**: Corrected mock function signature to match actual method.
|
| 61 |
-
- Changed from `async def mock_run_with_graph(query: str, mode: str)`
|
| 62 |
-
- To `async def mock_run_with_graph(query: str, research_mode: str, message_history: list | None = None)`
|
| 63 |
-
|
| 64 |
-
### 4. `test_extract_name_from_oauth_profile` - FIXED
|
| 65 |
-
**Fix**: Fixed the source code logic to check for truthy values, not just attribute existence.
|
| 66 |
-
- Updated `src/app.py` to check `request.oauth_profile.username` is truthy before using it
|
| 67 |
-
- Updated `src/app.py` to check `request.oauth_profile.name` is truthy before using it
|
| 68 |
-
- This allows fallback to `name` when `username` exists but is None
|
| 69 |
-
|
| 70 |
-
### 5. `validate_oauth_token` tests (5 tests) - FIXED
|
| 71 |
-
**Fix**: Updated patch paths to point to the actual module where functions are defined.
|
| 72 |
-
- Changed from `patch("src.app.validate_oauth_token", ...)`
|
| 73 |
-
- To `patch("src.utils.hf_model_validator.validate_oauth_token", ...)`
|
| 74 |
-
- Also fixed `get_available_models` and `get_available_providers` patches similarly
|
| 75 |
-
|
| 76 |
-
### 6. `ddgs.ddgs` module errors (10 tests) - FIXED
|
| 77 |
-
**Fix**: Improved mock structure to properly handle the ddgs package's internal structure.
|
| 78 |
-
- Created proper mock module hierarchy with `ddgs` and `ddgs.ddgs` submodules
|
| 79 |
-
- Created `MockDDGS` class that can be instantiated
|
| 80 |
-
- Properly mocked both `ddgs` and `duckduckgo_search` packages
|
| 81 |
-
|
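Reviewer note on resolution 5 above: the `AttributeError` is what `unittest.mock.patch` raises when the named attribute does not exist on the target module, which is the case when a function is imported inside another function rather than at module level. A minimal sketch of that behaviour follows. `demo_app`, `demo_validator`, `update_status`, and the returned dict shape are hypothetical stand-ins for `src.app` and `src.utils.hf_model_validator`, not the project's real code.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for src.utils.hf_model_validator.
validator = types.ModuleType("demo_validator")
validator.validate_oauth_token = lambda token: {"is_valid": False}
sys.modules["demo_validator"] = validator

# Hypothetical stand-in for src.app.
app = types.ModuleType("demo_app")
sys.modules["demo_app"] = app


def update_status(token):
    # Function-local import: the name is never bound as an attribute of demo_app.
    from demo_validator import validate_oauth_token

    return validate_oauth_token(token)


app.update_status = update_status


def patch_fails_on_app():
    """Return True if patching the name on demo_app raises AttributeError."""
    try:
        with patch("demo_app.validate_oauth_token"):
            pass
    except AttributeError:
        return True
    return False


def patched_result(token):
    """Patch the defining module, where the local import looks the name up."""
    with patch(
        "demo_validator.validate_oauth_token",
        return_value={"is_valid": True},
    ):
        return app.update_status(token)
```

Patching the defining module works here because the function-local `from ... import` runs on every call and reads the attribute while the patch is active.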
test_fixes_summary.md
DELETED
|
@@ -1,102 +0,0 @@
|
|
| 1 |
-
# Test Fixes Summary
|
| 2 |
-
|
| 3 |
-
## Overview
|
| 4 |
-
Fixed 9 failed tests and 10 errors identified in the test suite. All fixes have been verified to pass.
|
| 5 |
-
|
| 6 |
-
## Test Results
|
| 7 |
-
- **Before**: 9 failed, 10 errors, 482 passed
|
| 8 |
-
- **After**: 0 failed, 0 errors, 501+ passed (all previously failing tests now pass)
|
| 9 |
-
|
| 10 |
-
## Fixes Applied
|
| 11 |
-
|
| 12 |
-
### 1. `test_get_model_anthropic` ✅
|
| 13 |
-
**File**: `tests/unit/agent_factory/test_judges_factory.py`
|
| 14 |
-
**Issue**: Test was returning HuggingFaceModel instead of AnthropicModel
|
| 15 |
-
**Fix**: Added explicit mock settings to prevent HuggingFace from being preferred:
|
| 16 |
-
```python
|
| 17 |
-
mock_settings.hf_token = None
|
| 18 |
-
mock_settings.huggingface_api_key = None
|
| 19 |
-
mock_settings.has_openai_key = False
|
| 20 |
-
mock_settings.has_anthropic_key = True
|
| 21 |
-
```
|
| 22 |
-
|
| 23 |
-
### 2. `test_get_message_history` ✅
|
| 24 |
-
**File**: `tests/unit/orchestrator/test_graph_orchestrator.py`
|
| 25 |
-
**Issue**: `has_visited("node1")` returned False because node was never visited
|
| 26 |
-
**Fix**: Added explicit node visit before assertion:
|
| 27 |
-
```python
|
| 28 |
-
context.visited_nodes.add("node1")
|
| 29 |
-
assert context.has_visited("node1")
|
| 30 |
-
```
|
| 31 |
-
|
| 32 |
-
### 3. `test_run_with_graph_iterative` ✅
|
| 33 |
-
**File**: `tests/unit/orchestrator/test_graph_orchestrator.py`
|
| 34 |
-
**Issue**: Mock function signature mismatch - took 2 args but 3 were given
|
| 35 |
-
**Fix**: Updated mock signature to match actual method:
|
| 36 |
-
```python
|
| 37 |
-
async def mock_run_with_graph(query: str, research_mode: str, message_history: list | None = None):
|
| 38 |
-
```
|
| 39 |
-
|
| 40 |
-
### 4. `test_extract_name_from_oauth_profile` ✅
|
| 41 |
-
**File**: `tests/unit/test_app_oauth.py` and `src/app.py`
|
| 42 |
-
**Issue**: Function checked if attribute exists, not if it's truthy, preventing fallback to `name`
|
| 43 |
-
**Fix**: Updated source code to check for truthy values:
|
| 44 |
-
```python
|
| 45 |
-
if hasattr(request.oauth_profile, "username") and request.oauth_profile.username:
|
| 46 |
-
oauth_username = request.oauth_profile.username
|
| 47 |
-
elif hasattr(request.oauth_profile, "name") and request.oauth_profile.name:
|
| 48 |
-
oauth_username = request.oauth_profile.name
|
| 49 |
-
```
|
| 50 |
-
|
| 51 |
-
### 5. `validate_oauth_token` tests (5 tests) ✅
|
| 52 |
-
**File**: `tests/unit/test_app_oauth.py` and `src/app.py`
|
| 53 |
-
**Issue**: Functions imported inside function, so patching `src.app.*` didn't work. Also, inference scope warning was being overwritten.
|
| 54 |
-
**Fix**:
|
| 55 |
-
1. Updated patch paths to source module:
|
| 56 |
-
```python
|
| 57 |
-
patch("src.utils.hf_model_validator.validate_oauth_token", ...)
|
| 58 |
-
patch("src.utils.hf_model_validator.get_available_models", ...)
|
| 59 |
-
patch("src.utils.hf_model_validator.get_available_providers", ...)
|
| 60 |
-
```
|
| 61 |
-
2. Fixed source code to preserve inference scope warning in final status message
|
| 62 |
-
3. Updated test assertion to match actual message format (handles quote in "inference-api' scope")
|
| 63 |
-
|
| 64 |
-
### 6. `ddgs.ddgs` module errors (10 tests) ✅
|
| 65 |
-
**File**: `tests/unit/tools/test_web_search.py`
|
| 66 |
-
**Issue**: Mock structure didn't handle ddgs package's internal `ddgs.ddgs` submodule
|
| 67 |
-
**Fix**: Created proper mock hierarchy:
|
| 68 |
-
```python
|
| 69 |
-
mock_ddgs_module = MagicMock()
|
| 70 |
-
mock_ddgs_submodule = MagicMock()
|
| 71 |
-
class MockDDGS:
|
| 72 |
-
def __init__(self, *args, **kwargs):
|
| 73 |
-
pass
|
| 74 |
-
def text(self, *args, **kwargs):
|
| 75 |
-
return []
|
| 76 |
-
mock_ddgs_submodule.DDGS = MockDDGS
|
| 77 |
-
mock_ddgs_module.ddgs = mock_ddgs_submodule
|
| 78 |
-
sys.modules["ddgs"] = mock_ddgs_module
|
| 79 |
-
sys.modules["ddgs.ddgs"] = mock_ddgs_submodule
|
| 80 |
-
```
|
| 81 |
-
|
| 82 |
-
## Files Modified
|
| 83 |
-
1. `tests/unit/agent_factory/test_judges_factory.py` - Fixed Anthropic model test
|
| 84 |
-
2. `tests/unit/orchestrator/test_graph_orchestrator.py` - Fixed graph orchestrator tests
|
| 85 |
-
3. `tests/unit/test_app_oauth.py` - Fixed OAuth tests and patch paths
|
| 86 |
-
4. `tests/unit/tools/test_web_search.py` - Fixed ddgs mocking
|
| 87 |
-
5. `src/app.py` - Fixed OAuth name extraction logic
|
| 88 |
-
|
| 89 |
-
## Verification
|
| 90 |
-
All previously failing tests now pass:
|
| 91 |
-
- ✅ `test_get_model_anthropic`
|
| 92 |
-
- ✅ `test_get_message_history`
|
| 93 |
-
- ✅ `test_run_with_graph_iterative`
|
| 94 |
-
- ✅ `test_extract_name_from_oauth_profile`
|
| 95 |
-
- ✅ `test_update_with_valid_token` (and related OAuth tests)
|
| 96 |
-
- ✅ All 10 `test_web_search.py` tests
|
| 97 |
-
|
| 98 |
-
## Notes
|
| 99 |
-
- Integration test failures (11 tests) are expected - they require optional LlamaIndex dependencies
|
| 100 |
-
- All fixes maintain backward compatibility
|
| 101 |
-
- No breaking changes to public APIs
|
| 102 |
-
|
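Reviewer note on fix 4 above: the change from an existence check to a truthiness check can be illustrated in isolation. This is a sketch only. `extract_name` is a hypothetical helper, and `SimpleNamespace` stands in for the OAuth profile object; the real logic lives in `src/app.py` and may differ.

```python
from types import SimpleNamespace


def extract_name(profile):
    """Return the first truthy value among username and name, else None."""
    for attr in ("username", "name"):
        # getattr with a default covers a missing attribute;
        # the truthiness test covers an attribute that exists but is None or "".
        value = getattr(profile, attr, None)
        if value:
            return value
    return None


# Profile whose username attribute exists but is None (the failing case).
profile = SimpleNamespace(username=None, name="Test User")
resolved = extract_name(profile)
```

With a plain `hasattr` check, the `username` branch would be taken for this profile and yield `None`, which matches the failure described for `test_extract_name_from_oauth_profile`.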
test_output_local_embeddings.txt
DELETED
|
Binary file (43 kB)
|
|
|
tests/conftest.py
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
"""Shared pytest fixtures for all tests."""
|
| 2 |
|
| 3 |
import os
|
| 4 |
-
from unittest.mock import AsyncMock
|
| 5 |
|
| 6 |
import pytest
|
| 7 |
|
|
@@ -78,4 +78,67 @@ def default_to_huggingface(monkeypatch):
|
|
| 78 |
|
| 79 |
# Set a dummy HF_TOKEN if not set (prevents errors, but tests should mock actual API calls)
|
| 80 |
if "HF_TOKEN" not in os.environ:
|
| 81 |
-
monkeypatch.setenv("HF_TOKEN", "dummy_token_for_testing")
|
|
| 1 |
"""Shared pytest fixtures for all tests."""
|
| 2 |
|
| 3 |
import os
|
| 4 |
+
from unittest.mock import AsyncMock, MagicMock, patch
|
| 5 |
|
| 6 |
import pytest
|
| 7 |
|
|
|
|
| 78 |
|
| 79 |
# Set a dummy HF_TOKEN if not set (prevents errors, but tests should mock actual API calls)
|
| 80 |
if "HF_TOKEN" not in os.environ:
|
| 81 |
+
monkeypatch.setenv("HF_TOKEN", "dummy_token_for_testing")
|
| 82 |
+
|
| 83 |
+
|
| 84 |
+
@pytest.fixture
|
| 85 |
+
def mock_hf_model():
|
| 86 |
+
"""Create a mock HuggingFace model for testing.
|
| 87 |
+
|
| 88 |
+
This fixture provides a mock model that can be used in agent tests
|
| 89 |
+
to avoid requiring actual API keys.
|
| 90 |
+
"""
|
| 91 |
+
model = MagicMock()
|
| 92 |
+
model.name = "meta-llama/Llama-3.1-8B-Instruct"
|
| 93 |
+
model.model_name = "meta-llama/Llama-3.1-8B-Instruct"
|
| 94 |
+
return model
|
| 95 |
+
|
| 96 |
+
|
| 97 |
+
@pytest.fixture(autouse=True)
|
| 98 |
+
def auto_mock_get_model(mock_hf_model, request):
|
| 99 |
+
"""Automatically mock get_model() in all agent modules.
|
| 100 |
+
|
| 101 |
+
This fixture runs automatically for all tests (except OpenAI tests) and
|
| 102 |
+
mocks get_model() where it's imported in each agent module, preventing
|
| 103 |
+
tests from requiring actual API keys.
|
| 104 |
+
|
| 105 |
+
Tests marked with @pytest.mark.openai will skip this fixture.
|
| 106 |
+
Tests can override by explicitly patching get_model() themselves.
|
| 107 |
+
"""
|
| 108 |
+
# Skip auto-mocking for OpenAI tests
|
| 109 |
+
if "openai" in request.keywords:
|
| 110 |
+
return
|
| 111 |
+
|
| 112 |
+
# Patch get_model in all agent modules where it's imported
|
| 113 |
+
agent_modules = [
|
| 114 |
+
"src.agents.input_parser",
|
| 115 |
+
"src.agents.writer",
|
| 116 |
+
"src.agents.long_writer",
|
| 117 |
+
"src.agents.proofreader",
|
| 118 |
+
"src.agents.knowledge_gap",
|
| 119 |
+
"src.agents.tool_selector",
|
| 120 |
+
"src.agents.thinking",
|
| 121 |
+
"src.agents.hypothesis_agent",
|
| 122 |
+
"src.agents.report_agent",
|
| 123 |
+
"src.agents.judge_agent_llm",
|
| 124 |
+
"src.orchestrator.planner_agent",
|
| 125 |
+
"src.services.statistical_analyzer",
|
| 126 |
+
]
|
| 127 |
+
|
| 128 |
+
patches = []
|
| 129 |
+
for module in agent_modules:
|
| 130 |
+
try:
|
| 131 |
+
patches.append(patch(f"{module}.get_model", return_value=mock_hf_model))
|
| 132 |
+
except (ImportError, AttributeError):
|
| 133 |
+
# Module might not exist or get_model might not be imported
|
| 134 |
+
pass
|
| 135 |
+
|
| 136 |
+
# Start all patches
|
| 137 |
+
for patch_obj in patches:
|
| 138 |
+
patch_obj.start()
|
| 139 |
+
|
| 140 |
+
yield
|
| 141 |
+
|
| 142 |
+
# Stop all patches
|
| 143 |
+
for patch_obj in patches:
|
| 144 |
+
patch_obj.stop()
|
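Reviewer note on `auto_mock_get_model` above: `unittest.mock.patch(...)` resolves its target when the patch is started, not when the patcher object is constructed. The `try`/`except (ImportError, AttributeError)` around `patches.append(patch(...))` therefore probably never fires, and a missing module or attribute would surface at `patch_obj.start()` instead. Separately, the early `return` for `openai`-marked tests leaves the generator fixture without a `yield`, which pytest may report as an error. Below is a minimal sketch of one way to skip unavailable targets. `start_patches` is a hypothetical helper that is not part of this commit, and the `json` targets are stand-ins chosen so the snippet runs on its own.

```python
import json
from contextlib import ExitStack
from unittest.mock import patch


def start_patches(targets, replacement):
    """Start a patch for each resolvable target and skip the rest.

    Returns the ExitStack that owns the active patches together with
    the list of targets that were actually patched.
    """
    stack = ExitStack()
    started = []
    for target in targets:
        try:
            # Entering the patcher is the step that imports the module and
            # looks up the attribute, so errors are caught here.
            stack.enter_context(patch(target, return_value=replacement))
        except (ImportError, AttributeError):
            continue
        started.append(target)
    return stack, started


stack, started = start_patches(
    ["json.dumps", "json.no_such_attr", "no_such_pkg_xyz.get_model"],
    "mocked",
)
during = json.dumps({})   # call made while the patch is active
stack.close()             # undoes every patch that was started
after = json.dumps({})    # call made after the original is restored
```

In a fixture, the same idea would amount to `yield` inside the stack's lifetime and `stack.close()` afterwards, with a bare `yield` on the skip path so that every branch yields once.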