Contributing to The DETERMINATOR
Thank you for your interest in contributing to The DETERMINATOR! This guide will help you get started.
Table of Contents
- Git Workflow
- Getting Started
- Development Commands
- MCP Integration
- Common Pitfalls
- Key Principles
- Pull Request Process
Note: Additional sections (Code Style, Error Handling, Testing, Implementation Patterns, Code Quality, and Prompt Engineering) are available as separate pages in the documentation.
Note on Project Names: "The DETERMINATOR" is the product name, "DeepCritical" is the organization/project name, and "determinator" is the Python package name.
Repository Information
- GitHub Repository: `DeepCritical/GradioDemo` (source of truth, PRs, code review)
- HuggingFace Space: `DataQuests/DeepCritical` (deployment/demo)
- Package Name: `determinator` (Python package name in `pyproject.toml`)
Git Workflow
- `main`: Production-ready (GitHub)
- `dev`: Development integration (GitHub)
- Use feature branches: `yourname-dev`
- NEVER push directly to `main` or `dev` on HuggingFace
- GitHub is source of truth; HuggingFace is for deployment
Dual Repository Setup
This project uses a dual repository setup:
- GitHub (`DeepCritical/GradioDemo`): Source of truth for code, PRs, and code review
- HuggingFace (`DataQuests/DeepCritical`): Deployment target for the Gradio demo
Remote Configuration
When cloning, set up remotes as follows:
```bash
# Clone from GitHub
git clone https://github.com/DeepCritical/GradioDemo.git
cd GradioDemo

# Add HuggingFace remote (optional, for deployment)
git remote add huggingface-upstream https://huggingface.co/spaces/DataQuests/DeepCritical
```
Important: Never push directly to `main` or `dev` on HuggingFace. Always work through GitHub PRs. GitHub is the source of truth; HuggingFace is for deployment/demo only.
Getting Started
1. Fork the repository on GitHub: `DeepCritical/GradioDemo`
2. Clone your fork:
   ```bash
   git clone https://github.com/yourusername/GradioDemo.git
   cd GradioDemo
   ```
3. Install dependencies:
   ```bash
   uv sync --all-extras
   uv run pre-commit install
   ```
4. Create a feature branch: `git checkout -b yourname-feature-name`
5. Make your changes following the guidelines below
6. Run checks:
   ```bash
   uv run ruff check src tests
   uv run mypy src
   uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire
   ```
7. Commit and push:
   ```bash
   git commit -m "Description of changes"
   git push origin yourname-feature-name
   ```
8. Create a pull request on GitHub
Package Manager
This project uses `uv` as the package manager. Prefix all commands with `uv run` so they execute in the project environment.
Installation
```bash
# Install uv if you haven't already (recommended: standalone installer)
# Unix/macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Alternative: pipx install uv
# Or: pip install uv

# Sync all dependencies including dev extras
uv sync --all-extras

# Install pre-commit hooks
uv run pre-commit install
```
Development Commands
```bash
# Installation
uv sync --all-extras       # Install all dependencies including dev
uv run pre-commit install  # Install pre-commit hooks

# Code Quality Checks (run all before committing)
uv run ruff check src tests   # Lint with ruff
uv run ruff format src tests  # Format with ruff
uv run mypy src               # Type checking
uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with coverage

# Testing Commands
uv run pytest tests/unit/ -v -m "not openai" -p no:logfire  # Run unit tests (excludes OpenAI tests)
uv run pytest tests/ -v -m "huggingface" -p no:logfire      # Run HuggingFace tests
uv run pytest tests/ -v -p no:logfire                       # Run all tests
uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire  # Tests with terminal coverage
uv run pytest --cov=src --cov-report=html -p no:logfire     # Generate HTML coverage report (opens htmlcov/index.html)

# Documentation Commands
uv run mkdocs build  # Build documentation
uv run mkdocs serve  # Serve documentation locally (http://127.0.0.1:8000)
```
Test Markers
The project uses pytest markers to categorize tests. See Testing Guidelines for details:
- `unit`: Unit tests (mocked, fast)
- `integration`: Integration tests (real APIs)
- `slow`: Slow tests
- `openai`: Tests requiring an OpenAI API key
- `huggingface`: Tests requiring a HuggingFace API key
- `embedding_provider`: Tests requiring API-based embedding providers
- `local_embeddings`: Tests using local embeddings

Note: The `-p no:logfire` flag disables the logfire plugin to avoid conflicts during testing.
Code Style & Conventions
Type Safety
- ALWAYS use type hints for all function parameters and return types
- Maintain `mypy --strict` compliance (no `Any` unless absolutely necessary)
- Use `TYPE_CHECKING` imports for circular dependencies, as in the sketch below
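A minimal sketch of the `TYPE_CHECKING` pattern, assuming an `Evidence` model in `src/utils/models.py` (the imported name is illustrative):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only during type checking, so there is no circular import at runtime
    from src.utils.models import Evidence


def count_evidence(items: "list[Evidence]") -> int:
    # The string annotation keeps the name unresolved until type-check time
    return len(items)
```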
Pydantic Models
- All data exchange uses Pydantic models (`src/utils/models.py`)
- Models are frozen (`model_config = {"frozen": True}`) for immutability
- Use `Field()` with descriptions for all model fields
- Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints
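A sketch of a model following these rules (the model and field names are hypothetical, not taken from `src/utils/models.py`):

```python
from pydantic import BaseModel, Field


class SearchQuery(BaseModel):
    model_config = {"frozen": True}  # instances are immutable after creation

    query: str = Field(min_length=1, description="Free-text search query")
    max_results: int = Field(default=10, ge=1, le=100, description="Cap on returned results")
```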
Async Patterns
- ALL I/O operations must be async (`async def`, `await`)
- Use `asyncio.gather()` for parallel operations (see the sketch after this list)
- CPU-bound work (embeddings, parsing) must use `run_in_executor()`:

```python
loop = asyncio.get_running_loop()
result = await loop.run_in_executor(None, cpu_bound_function, args)
```

- Never block the event loop with synchronous I/O
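A hedged sketch of the `asyncio.gather()` guideline, fanning search tools out in parallel; `return_exceptions=True` matches the "Search tool failed" logging pattern shown later, but the real orchestrator may differ:

```python
import asyncio


async def search_all(tools, query: str) -> list:
    # Each element is either that tool's list[Evidence] or the exception it raised
    return await asyncio.gather(
        *(tool.search(query, max_results=10) for tool in tools),
        return_exceptions=True,  # one failing tool should not sink the rest
    )
```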
Linting
- Ruff with 100-char line length
- Ignore rules documented in `pyproject.toml`:
  - `PLR0913`: Too many arguments (agents need many params)
  - `PLR0912`: Too many branches (complex orchestrator logic)
  - `PLR0911`: Too many return statements (complex agent logic)
  - `PLR2004`: Magic values (statistical constants)
  - `PLW0603`: Global statement (singleton pattern)
  - `PLC0415`: Lazy imports for optional dependencies
Pre-commit
- Pre-commit hooks run automatically on commit
- Must pass: lint + typecheck + test-cov
- Install hooks with: `uv run pre-commit install`
- Note: `uv sync --all-extras` installs the pre-commit package, but you must run `uv run pre-commit install` separately to set up the git hooks
Error Handling & Logging
Exception Hierarchy
Use the custom exception hierarchy (`src/utils/exceptions.py`):
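A hedged sketch of its shape; only `SearchError` and `RateLimitError` are named elsewhere in this guide, so the base class name here is an assumption:

```python
class DeterminatorError(Exception):
    """Assumed base class for project-specific errors."""


class SearchError(DeterminatorError):
    """Raised when a search tool fails."""


class RateLimitError(SearchError):
    """Raised when an external API rate limit is exceeded."""
```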
Error Handling Rules
- Always chain exceptions: `raise SearchError(...) from e`
- Log errors with context using `structlog`: `logger.error("Operation failed", error=str(e), context=value)`
- Never silently swallow exceptions
- Provide actionable error messages
Logging
- Use `structlog` for all logging (NOT `print` or `logging`)
- Import: `import structlog; logger = structlog.get_logger()`
- Log with structured data: `logger.info("event", key=value)`
- Use appropriate levels: DEBUG, INFO, WARNING, ERROR
Logging Examples
```python
logger.info("Starting search", query=query, tools=[t.name for t in tools])
logger.warning("Search tool failed", tool=tool.name, error=str(result))
logger.error("Assessment failed", error=str(e))
```
Error Chaining
Always preserve exception context:
```python
try:
    result = await api_call()
except httpx.HTTPError as e:
    raise SearchError(f"API call failed: {e}") from e
```
Testing Requirements
Test Structure
- Unit tests in `tests/unit/` (mocked, fast)
- Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`)
- Use markers: `unit`, `integration`, `slow`
Mocking
- Use `respx` for httpx mocking
- Use `pytest-mock` for general mocking
- Mock LLM calls in unit tests (use `MockJudgeHandler`)
- Fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`
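A hedged sketch of `respx`-based mocking (the URL and payload are illustrative, not the project's fixtures):

```python
import httpx
import respx


@respx.mock
async def test_search_http_mocked():
    # Any GET to this URL now returns the canned JSON instead of hitting the network
    respx.get("https://api.example.org/search").mock(
        return_value=httpx.Response(200, json={"results": []})
    )
    async with httpx.AsyncClient() as client:
        resp = await client.get("https://api.example.org/search")
    assert resp.json() == {"results": []}
```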
TDD Workflow
1. Write a failing test in `tests/unit/`
2. Implement in `src/`
3. Ensure the test passes
4. Run checks: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
Test Examples
```python
@pytest.mark.unit
async def test_pubmed_search(mock_httpx_client):
    tool = PubMedTool()
    results = await tool.search("metformin", max_results=5)
    assert len(results) > 0
    assert all(isinstance(r, Evidence) for r in results)


@pytest.mark.integration
async def test_real_pubmed_search():
    tool = PubMedTool()
    results = await tool.search("metformin", max_results=3)
    assert len(results) <= 3
```
Test Coverage
- Run `uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire` for a terminal coverage report
- Run `uv run pytest --cov=src --cov-report=html -p no:logfire` for an HTML coverage report (opens `htmlcov/index.html`)
- Aim for >80% coverage on critical paths
- Exclude: `__init__.py`, `TYPE_CHECKING` blocks
Implementation Patterns
Search Tools
All tools implement the `SearchTool` protocol (`src/tools/base.py`):

- Must have a `name` property
- Must implement `async def search(query, max_results) -> list[Evidence]`
- Use the `@retry` decorator from tenacity for resilience
- Rate limiting: Implement `_rate_limit()` for APIs with limits (e.g., PubMed)
- Error handling: Raise `SearchError` or `RateLimitError` on failures
Example pattern:
```python
class MySearchTool:
    @property
    def name(self) -> str:
        return "mytool"

    @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))
    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        # Implementation
        return evidence_list
```
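A hedged sketch of the `_rate_limit()` helper mentioned above; the interval is illustrative, and the real implementation may track state differently:

```python
import asyncio
import time


class RateLimitMixin:
    """Serialize calls so they stay at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float = 0.5) -> None:
        self._min_interval = min_interval
        self._last_call = 0.0
        self._lock = asyncio.Lock()

    async def _rate_limit(self) -> None:
        async with self._lock:
            elapsed = time.monotonic() - self._last_call
            if elapsed < self._min_interval:
                await asyncio.sleep(self._min_interval - elapsed)
            self._last_call = time.monotonic()
```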
Judge Handlers
- Implement `JudgeHandlerProtocol` (`async def assess(question, evidence) -> JudgeAssessment`)
- Use a pydantic-ai `Agent` with `output_type=JudgeAssessment`
- System prompts in `src/prompts/judge.py`
- Support fallback handlers: `MockJudgeHandler`, `HFInferenceJudgeHandler`
- Always return a valid `JudgeAssessment` (never raise exceptions)
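A hedged sketch of a pydantic-ai judge following this protocol; the model name and prompt text are illustrative, and result accessors vary across pydantic-ai versions:

```python
from pydantic_ai import Agent

from src.utils.models import Evidence, JudgeAssessment  # assumed import path

judge_agent = Agent(
    "openai:gpt-4o",  # illustrative model choice
    output_type=JudgeAssessment,
    system_prompt="Assess whether the evidence is sufficient to answer the question.",
)


class LLMJudgeHandler:
    async def assess(self, question: str, evidence: list[Evidence]) -> JudgeAssessment:
        result = await judge_agent.run(f"Question: {question}\n\nEvidence: {evidence}")
        return result.output
```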
Agent Factory Pattern
- Use factory functions for creating agents (`src/agent_factory/`)
- Lazy initialization for optional dependencies (e.g., embeddings, Modal)
- Check requirements before initialization, as in the sketch below
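A hedged sketch of a requirements check along these lines; the function name and the specific checks are assumptions, not the project's actual `agent_factory` code:

```python
import importlib.util
import os


def check_magentic_requirements() -> tuple[bool, str]:
    """Return (ok, reason) describing whether Magentic mode can be initialized."""
    if importlib.util.find_spec("magentic") is None:
        return False, "magentic is not installed"
    if not os.getenv("OPENAI_API_KEY"):
        return False, "OPENAI_API_KEY is not set"
    return True, "ok"
```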
State Management
- Magentic Mode: Use `ContextVar` for thread-safe state (`src/agents/state.py`), as sketched below
- Simple Mode: Pass state via function parameters
- Never use global mutable state (except singletons via `@lru_cache`)
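A minimal `ContextVar` sketch; the variable name is hypothetical, and the real module is `src/agents/state.py`:

```python
from contextvars import ContextVar

# Each async task sees its own value, so concurrent requests don't collide
current_question: ContextVar[str | None] = ContextVar("current_question", default=None)


def set_question(question: str) -> None:
    current_question.set(question)


def get_question() -> str | None:
    return current_question.get()
```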
Singleton Pattern
- Use `@lru_cache(maxsize=1)` for singletons (sketch below)
- Lazy initialization avoids requiring dependencies at import time
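A hedged sketch of the pattern; the embedding backend shown is an illustrative choice, not necessarily what the project uses:

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_embedder():
    # Lazy import: the optional dependency is only required on first call,
    # and lru_cache guarantees the same instance is returned afterwards
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer("all-MiniLM-L6-v2")
```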
Code Quality & Documentation
Docstrings
- Google-style docstrings for all public functions
- Include Args, Returns, Raises sections
- Use type hints in docstrings only if needed for clarity
Example:
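A hedged sketch in the required Google style, reusing the `search` signature from the Search Tools section (the wording is illustrative):

```python
from __future__ import annotations  # keeps the Evidence annotation lazy


async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
    """Search the data source and return supporting evidence.

    Args:
        query: Free-text search query.
        max_results: Maximum number of results to return.

    Returns:
        A list of Evidence items, at most max_results long.

    Raises:
        SearchError: If the underlying API call fails.
    """
    ...
```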
Code Comments
- Explain WHY, not WHAT
- Document non-obvious patterns (e.g., why `requests`, not `httpx`, for ClinicalTrials)
- Mark critical sections: `# CRITICAL: ...`
- Document rate limiting rationale
- Explain async patterns when non-obvious
Prompt Engineering & Citation Validation
Judge Prompts
- System prompt in `src/prompts/judge.py`
- Format evidence with truncation (1500 chars per item)
- Handle the empty-evidence case separately
- Always request structured JSON output
- Use the `format_user_prompt()` and `format_empty_evidence_prompt()` helpers
Hypothesis Prompts
- Use diverse evidence selection (MMR algorithm)
- Sentence-aware truncation (`truncate_at_sentence()`)
- Format: Drug → Target → Pathway → Effect
- System prompt emphasizes mechanistic reasoning
- Use `format_hypothesis_prompt()` with embeddings for diversity
Report Prompts
- Include full citation details for validation
- Use diverse evidence selection (n=20)
- CRITICAL: Emphasize citation validation rules
- Format hypotheses with support/contradiction counts
- System prompt includes explicit JSON structure requirements
Citation Validation
- ALWAYS validate references before returning reports
- Use `validate_references()` from `src/utils/citation_validator.py`
- Remove hallucinated citations (URLs not in evidence)
- Log warnings for removed citations
- Never trust LLM-generated citations without validation
Citation Validation Rules
- Every reference URL must EXACTLY match a provided evidence URL
- Do NOT invent, fabricate, or hallucinate any references
- Do NOT modify paper titles, authors, dates, or URLs
- If unsure about a citation, OMIT it rather than guess
- Copy URLs exactly as provided - do not create similar-looking URLs
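A hedged sketch of validation under these rules; the real `validate_references()` may differ in signature and reference shape:

```python
import structlog

logger = structlog.get_logger()


def validate_references(references: list[dict], evidence_urls: set[str]) -> list[dict]:
    """Keep only references whose URL exactly matches a provided evidence URL."""
    valid = []
    for ref in references:
        if ref.get("url") in evidence_urls:
            valid.append(ref)
        else:
            # Hallucinated or mutated URL: drop it and leave a trace in the logs
            logger.warning("Removed hallucinated citation", url=ref.get("url"))
    return valid
```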
Evidence Selection
- Use `select_diverse_evidence()` for MMR-based selection (see the sketch after this list)
- Balance relevance vs diversity (lambda=0.7 default)
- Sentence-aware truncation preserves meaning
- Limit evidence per prompt to avoid context overflow
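A hedged sketch of MMR-style selection with lambda = 0.7; the project's `select_diverse_evidence()` works over embeddings and may differ in signature:

```python
def mmr_select(candidates, relevance, similarity, k: int, lam: float = 0.7) -> list:
    """Greedily pick k items, trading relevance against redundancy."""
    selected: list = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(c):
            # Penalize similarity to anything already selected
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance(c) - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```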
MCP Integration
MCP Tools
- Functions in `src/mcp_tools.py` for Claude Desktop
- Full type hints required
- Google-style docstrings with Args/Returns sections
- Formatted string returns (markdown)
Gradio MCP Server
- Enable with `mcp_server=True` in `demo.launch()`
- Endpoint: `/gradio_api/mcp/`
- Use `ssr_mode=False` to fix hydration issues in HF Spaces
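A minimal launch sketch combining both flags (the Blocks content is a placeholder):

```python
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("The DETERMINATOR demo")

# mcp_server=True exposes the MCP endpoint; ssr_mode=False avoids
# hydration issues on HuggingFace Spaces, as noted above
demo.launch(mcp_server=True, ssr_mode=False)
```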
Common Pitfalls
- Blocking the event loop: Never use sync I/O in async functions
- Missing type hints: All functions must have complete type annotations
- Hallucinated citations: Always validate references
- Global mutable state: Use ContextVar or pass via parameters
- Import errors: Lazy-load optional dependencies (magentic, modal, embeddings)
- Rate limiting: Always implement for external APIs
- Error chaining: Always use `from e` when raising exceptions
Key Principles
- Type Safety First: All code must pass `mypy --strict`
- Async Everything: All I/O must be async
- Test-Driven: Write tests before implementation
- No Hallucinations: Validate all citations
- Graceful Degradation: Support free tier (HF Inference) when no API keys
- Lazy Loading: Don't require optional dependencies at import time
- Structured Logging: Use structlog, never print()
- Error Chaining: Always preserve exception context
Pull Request Process
- Ensure all checks pass: `uv run ruff check src tests && uv run mypy src && uv run pytest --cov=src --cov-report=term-missing tests/unit/ -v -m "not openai" -p no:logfire`
- Update documentation if needed
- Add tests for new features
- Update CHANGELOG if applicable
- Request review from maintainers
- Address review feedback
- Wait for approval before merging
Project Structure
- `src/`: Main source code
- `tests/`: Test files (`unit/` and `integration/`)
- `docs/`: Documentation source files (MkDocs)
- `examples/`: Example usage scripts
- `pyproject.toml`: Project configuration and dependencies
- `.pre-commit-config.yaml`: Pre-commit hook configuration
Questions?
- Open an issue on GitHub
- Check existing documentation
- Review code examples in the codebase
Thank you for contributing to The DETERMINATOR!