awellis committed
Commit 0021e2f
1 Parent(s): 5df4a2a

Refactor RAG Email Assistant for in-memory processing; update configurations, implement memory indexing and retrieval, enhance Gradio UI, and streamline document ingestion.

.env.example CHANGED
@@ -1,19 +1,10 @@
-# OpenAI Configuration
+# OpenAI Configuration (required)
 OPENAI_API_KEY=your_openai_api_key_here
-LLM_MODEL=gpt-4o
+LLM_MODEL=gpt-5-nano
 EMBEDDING_MODEL=text-embedding-3-small
 LLM_TEMPERATURE=0.7
 LLM_MAX_TOKENS=2000
 
-# OpenSearch Configuration
-OPENSEARCH_HOST=localhost
-OPENSEARCH_PORT=9200
-OPENSEARCH_USER=admin
-OPENSEARCH_PASSWORD=your_password_here
-OPENSEARCH_USE_SSL=true
-OPENSEARCH_VERIFY_CERTS=false
-INDEX_NAME=bfh_admin_docs
-
 # Document Processing Configuration
 DOCUMENTS_PATH=assets/markdown
 CHUNK_SIZE=300
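
For orientation, here is a minimal sketch of how these variables are typically read at startup. It uses python-dotenv (pinned in requirements.txt) and the variable names from the new `.env.example`, but the snippet is illustrative only; the project's actual loader lives in `src/config.py`.

```python
# Illustrative only: reading the trimmed .env with python-dotenv.
import os

from dotenv import load_dotenv

load_dotenv()  # picks up .env from the working directory

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]  # required; raises KeyError if missing
LLM_MODEL = os.getenv("LLM_MODEL", "gpt-5-nano")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "text-embedding-3-small")
LLM_TEMPERATURE = float(os.getenv("LLM_TEMPERATURE", "0.7"))
LLM_MAX_TOKENS = int(os.getenv("LLM_MAX_TOKENS", "2000"))
```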
.gitignore CHANGED
@@ -159,6 +159,7 @@ rag_email_assistant_haystack_2_pydantic_ai_gradio_modular_2025_baseline.py
 *.xlsx
 *.xls
 *.parquet
+*.pkl
 data/
 datasets/
 
QUICKSTART.md CHANGED
@@ -1,15 +1,22 @@
-# Quick Start Guide
+# Quick Start Guide (No Docker Needed!)
 
 ## Prerequisites
 
 1. **Python 3.10+** installed
-2. **OpenSearch instance** running with k-NN plugin enabled
-3. **OpenAI API key**
+2. **OpenAI API key** (gpt-4o-mini keeps costs low)
 
-## Setup (5 minutes)
+**That's it!** No Docker, no OpenSearch needed!
+
+## Setup (2 minutes)
 
 ### 1. Install Dependencies
 
+Using `uv` (recommended, faster):
+```bash
+uv pip install -r requirements.txt
+```
+
+Or using `pip`:
 ```bash
 pip install -r requirements.txt
 ```
@@ -17,46 +24,46 @@ pip install -r requirements.txt
 ### 2. Configure Environment
 
 ```bash
-# Copy the example environment file
+# Copy the example file
 cp .env.example .env
 
-# Edit .env and add your credentials
+# Edit and add your OpenAI API key
 nano .env  # or use your preferred editor
 ```
 
-**Required variables:**
-- `OPENAI_API_KEY` - Your OpenAI API key
-- `OPENSEARCH_HOST` - OpenSearch host (e.g., localhost)
-- `OPENSEARCH_PORT` - OpenSearch port (e.g., 9200)
-- `OPENSEARCH_USER` - OpenSearch username
-- `OPENSEARCH_PASSWORD` - OpenSearch password
-
-### 3. Index Documents
-
+**Required:**
 ```bash
-python scripts/ingest_documents.py
+OPENAI_API_KEY=sk-your-key-here
 ```
 
-This will:
-- Load markdown documents from `assets/markdown/`
-- Chunk them semantically
-- Generate embeddings
-- Index in OpenSearch
-
-Expected output:
-```
-Successfully indexed X document chunks
-Total documents in index: X
-✅ Document ingestion completed successfully!
+**Optional (has good defaults):**
+```bash
+LLM_MODEL=gpt-4o-mini  # Very affordable!
+EMBEDDING_MODEL=text-embedding-3-small
 ```
 
-### 4. Run the Application
+### 3. Run the Application
 
 ```bash
 python app.py
 ```
 
-The Gradio interface will launch at `http://localhost:7860`
+**That's it!** The app will:
+- Automatically load markdown documents from `assets/markdown/`
+- Create an in-memory document store
+- Generate embeddings (the first run takes ~30 seconds)
+- Save the document store to `data/document_store.pkl` for faster subsequent runs
+- Launch the Gradio interface at `http://localhost:7860`
+
+## First Run
+
+The first time you run the app, it will:
+1. Load 8 administrative documents
+2. Chunk them into ~30-50 pieces
+3. Generate embeddings using OpenAI
+4. Save to `data/document_store.pkl`
+
+**Next runs are instant**: it loads from the pickle file!
 
 ## Usage
 
@@ -71,46 +78,77 @@ The Gradio interface will launch at `http://localhost:7860`
 
 ## Example Queries
 
-German:
+**German:**
 - "Wie kann ich mich exmatrikulieren?"
 - "Was kostet eine Namensänderung?"
 - "Ich möchte ein Modul zurückziehen. Was muss ich beachten?"
 - "Welche Fristen gibt es für die Beurlaubung?"
 
-English:
+**English:**
 - "How can I withdraw from the university?"
 - "What are the fees for changing my name?"
 - "I want to take a leave of absence. What do I need to know?"
 
-## Troubleshooting
+## Pre-indexing (Optional)
 
-### Cannot connect to OpenSearch
-- Check that OpenSearch is running: `curl -X GET "localhost:9200"`
-- Verify credentials in `.env`
-- Check firewall settings
+If you want to pre-index documents separately:
 
-### No documents indexed
-- Verify markdown files exist in `assets/markdown/`
-- Check OpenSearch index: `curl -X GET "localhost:9200/_cat/indices"`
-- Review ingestion script logs
+```bash
+python scripts/ingest_documents_memory.py
+```
+
+This creates `data/document_store.pkl`, which the app will use automatically.
+
+## Cost Estimate
+
+With **gpt-4o-mini**:
+- Typical email: **< $0.001** (less than a tenth of a cent)
+- First-time indexing (8 documents): **~$0.01**
+- Embeddings are cached in the pickle file
+
+## Hugging Face Spaces Deployment
+
+1. **Push your code** to an HF Space
+2. **Add a secret:** `OPENAI_API_KEY` in the Space settings
+3. **Done!** The app auto-indexes on first run
+
+The document store persists in the Space, so it only indexes once.
 
-### OpenAI API errors
+## Troubleshooting
+
+### First run is slow
+- Normal! It's generating embeddings for all documents
+- Subsequent runs load from the pickle file (instant)
+
+### OpenAI API errors
 - Verify API key in `.env`
 - Check API quota and billing
 - Ensure internet connectivity
 
+### Import errors
+- Run: `uv pip install -r requirements.txt` or `pip install -r requirements.txt`
+
+## Advantages Over Docker Version
+
+✅ **No Docker needed**
+✅ **No OpenSearch setup**
+✅ **Works on any machine**
+✅ **Perfect for HF Spaces**
+✅ **Faster setup (2 min vs 15 min)**
+✅ **In-memory = instant retrieval**
+✅ **Portable (just copy the pickle file)**
+
 ## Next Steps
 
 - Review [README.md](README.md) for full documentation
-- Check [docs/RAG_Email_Assistant_Specifications_v1.0.md](docs/RAG_Email_Assistant_Specifications_v1.0.md) for architecture details
+- Check [docs/RAG_Email_Assistant_Specifications_v1.0.md](docs/RAG_Email_Assistant_Specifications_v1.0.md) for architecture
 - See [CLAUDE.md](CLAUDE.md) for development guidance
 
 ## Support
 
-For issues, please check:
-1. Environment variables are correctly set
-2. OpenSearch is accessible
-3. Documents are properly indexed
-4. API keys are valid
-
-Need help? Open an issue on GitHub.
+Need help? The setup is simple:
+1. Install dependencies
+2. Add your OpenAI API key
+3. Run `python app.py`
+
+That's it! 🚀
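
The caching behavior the new QUICKSTART describes boils down to a pickle round-trip of the in-memory store. A minimal sketch, using the path and store class from this commit (the app's real logic, including the re-indexing fallback, is in `src/ui/gradio_app_memory.py` below):

```python
# Sketch: fast path loads the cached store, slow path creates and caches a new one.
import pickle
from pathlib import Path

from haystack.document_stores.in_memory import InMemoryDocumentStore

store_path = Path("data/document_store.pkl")

if store_path.exists():
    with open(store_path, "rb") as f:
        store: InMemoryDocumentStore = pickle.load(f)  # reuses cached embeddings
else:
    store = InMemoryDocumentStore()  # embedding/indexing would happen here
    store_path.parent.mkdir(parents=True, exist_ok=True)
    with open(store_path, "wb") as f:
        pickle.dump(store, f)

print(f"Documents available: {store.count_documents()}")
```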
app.py CHANGED
@@ -1,7 +1,7 @@
 """Main application entry point for Hugging Face Spaces deployment."""
 
 import logging
-from src.ui.gradio_app import create_gradio_interface
+from src.ui.gradio_app_memory import create_gradio_interface
 
 # Configure logging
 logging.basicConfig(
@@ -12,7 +12,7 @@ logging.basicConfig(
 logger = logging.getLogger(__name__)
 
 # Create and launch the Gradio interface
-logger.info("Starting BFH Student Administration Email Assistant...")
+logger.info("Starting BFH Student Administration Email Assistant (in-memory mode)...")
 
 demo = create_gradio_interface()
 
app_hf.py ADDED
@@ -0,0 +1,139 @@
+"""Hugging Face Spaces version using HF Inference API with gpt-oss-20b."""
+
+import gradio as gr
+from huggingface_hub import InferenceClient
+import os
+
+# Initialize HF Inference Client
+client = InferenceClient(model="openai/gpt-oss-20b")
+
+
+def compose_email(
+    query: str,
+    history: list,
+    system_message: str,
+    max_tokens: int,
+    temperature: float,
+    top_p: float,
+    hf_token: gr.OAuthToken | None = None,
+) -> str:
+    """Compose email response using HF Inference API."""
+
+    # Use OAuth token if available
+    token = hf_token.token if hf_token else os.getenv("HF_TOKEN")
+    client_with_token = InferenceClient(model="openai/gpt-oss-20b", token=token)
+
+    # Enhanced system message for email composition
+    email_system_prompt = """You are an AI assistant for BFH (Bern University of Applied Sciences) administrative staff.
+
+Your task is to help compose professional email responses to student inquiries about:
+- Exmatriculation (leaving university)
+- Leave of absence (Beurlaubung)
+- Name changes
+- Insurance matters (AHV, health insurance)
+- Fees and payments
+- Course withdrawals and deadlines
+
+Compose professional, accurate, and helpful email responses in the same language as the query.
+Include a subject line and body. Use formal tone for German (Sie form).
+
+Format your response as:
+Subject: [subject line]
+
+[email body]"""
+
+    messages = [{"role": "system", "content": email_system_prompt}]
+
+    # Add history
+    if history:
+        messages.extend(history)
+
+    # Add current query
+    messages.append({"role": "user", "content": f"Student query: {query}\n\nCompose an appropriate email response."})
+
+    # Stream response
+    response = ""
+    for message in client_with_token.chat_completion(
+        messages,
+        max_tokens=max_tokens,
+        stream=True,
+        temperature=temperature,
+        top_p=top_p,
+    ):
+        if message.choices and message.choices[0].delta.content:
+            response += message.choices[0].delta.content
+            yield response
+
+    return response
+
+
+# Create Gradio interface
+with gr.Blocks(title="BFH Email Assistant", theme=gr.themes.Soft()) as demo:
+    gr.Markdown(
+        """
+        # 📧 BFH Student Administration Email Assistant
+
+        AI-powered assistant for composing email responses to student inquiries.
+        Uses **gpt-oss-20b** via Hugging Face Inference API (free!).
+        """
+    )
+
+    chatbot = gr.ChatInterface(
+        compose_email,
+        type="messages",
+        additional_inputs=[
+            gr.Textbox(
+                value="You are a professional university administrative assistant.",
+                label="System message",
+                visible=False,
+            ),
+            gr.Slider(
+                minimum=256,
+                maximum=2048,
+                value=1024,
+                step=1,
+                label="Max tokens",
+            ),
+            gr.Slider(
+                minimum=0.1,
+                maximum=2.0,
+                value=0.7,
+                step=0.1,
+                label="Temperature",
+            ),
+            gr.Slider(
+                minimum=0.1,
+                maximum=1.0,
+                value=0.95,
+                step=0.05,
+                label="Top-p",
+            ),
+        ],
+        examples=[
+            ["Wie kann ich mich exmatrikulieren?"],
+            ["What are the fees for changing my name?"],
+            ["Ich möchte ein Modul zurückziehen. Was muss ich beachten?"],
+            ["How do I apply for a leave of absence?"],
+        ],
+    )
+
+    with gr.Sidebar():
+        gr.LoginButton()
+        gr.Markdown(
+            """
+            ### About
+            This assistant helps compose email responses for BFH administrative staff.
+
+            ### Topics Covered
+            - Exmatriculation
+            - Leave of absence
+            - Name changes
+            - Insurance
+            - Fees
+            - Course withdrawals
+            """
+        )
+
+
+if __name__ == "__main__":
+    demo.launch()
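
Stripped of the Gradio wiring, the streaming call in `compose_email` follows the standard `huggingface_hub` chat-completion pattern. A standalone sketch, assuming a valid `HF_TOKEN` in the environment:

```python
# Minimal streaming sketch with the same client and model as app_hf.py.
import os

from huggingface_hub import InferenceClient

client = InferenceClient(model="openai/gpt-oss-20b", token=os.getenv("HF_TOKEN"))
messages = [{"role": "user", "content": "Wie kann ich mich exmatrikulieren?"}]

text = ""
for chunk in client.chat_completion(messages, max_tokens=256, stream=True):
    if chunk.choices and chunk.choices[0].delta.content:
        text += chunk.choices[0].delta.content  # accumulate partial tokens
print(text)
```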
requirements.txt CHANGED
@@ -1,10 +1,8 @@
 # Core dependencies
 python-dotenv==1.1.1
 
-# Haystack and integrations
+# Haystack (no OpenSearch needed!)
 haystack-ai==2.8.0
-opensearch-haystack==1.1.0
-opensearch-py==2.8.0
 
 # PydanticAI for agents
 pydantic-ai==0.0.14
scripts/ingest_documents_memory.py ADDED
@@ -0,0 +1,81 @@
+#!/usr/bin/env python3
+"""Script to ingest documents and save to pickle for in-memory use."""
+
+import sys
+import logging
+import pickle
+from pathlib import Path
+
+# Add src to path
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from src.config import get_config
+from src.document_processing.loader import MarkdownDocumentLoader
+from src.document_processing.chunker import SemanticChunker
+from src.indexing.memory_indexer import MemoryDocumentIndexer
+
+
+def setup_logging():
+    """Configure logging."""
+    logging.basicConfig(
+        level=logging.INFO,
+        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
+    )
+
+
+def main():
+    """Main ingestion workflow."""
+    setup_logging()
+    logger = logging.getLogger(__name__)
+
+    logger.info("Starting document ingestion process (in-memory)...")
+
+    # Load configuration
+    config = get_config()
+    logger.info(f"Using documents path: {config.document_processing.documents_path}")
+
+    # Load documents
+    logger.info("Loading markdown documents...")
+    loader = MarkdownDocumentLoader(config.document_processing.documents_path)
+    documents = loader.load_documents()
+
+    if not documents:
+        logger.error("No documents loaded. Exiting.")
+        sys.exit(1)
+
+    logger.info(f"Loaded {len(documents)} documents")
+
+    # Chunk documents
+    logger.info("Chunking documents...")
+    chunker = SemanticChunker(
+        chunk_size=config.document_processing.chunk_size,
+        chunk_overlap=config.document_processing.chunk_overlap,
+        min_chunk_size=config.document_processing.min_chunk_size,
+    )
+    chunked_documents = chunker.chunk_documents(documents)
+
+    logger.info(f"Created {len(chunked_documents)} chunks")
+
+    # Index documents in memory
+    logger.info("Indexing documents in memory...")
+    indexer = MemoryDocumentIndexer(llm_config=config.llm)
+
+    indexed_count = indexer.index_documents(chunked_documents)
+
+    logger.info(f"Successfully indexed {indexed_count} document chunks")
+
+    # Save document store to pickle for later use
+    output_file = Path("data/document_store.pkl")
+    output_file.parent.mkdir(parents=True, exist_ok=True)
+
+    logger.info(f"Saving document store to {output_file}...")
+    with open(output_file, "wb") as f:
+        pickle.dump(indexer.document_store, f)
+
+    logger.info("✅ Document ingestion completed successfully!")
+    logger.info(f"Document store saved to: {output_file}")
+    logger.info(f"Total documents indexed: {indexed_count}")
+
+
+if __name__ == "__main__":
+    main()
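
After running the script, the pickle can be sanity-checked from a REPL. This small snippet (illustrative, not part of the commit) simply reloads the store and prints the chunk count:

```python
# Quick sanity check of the ingestion output.
import pickle

with open("data/document_store.pkl", "rb") as f:
    store = pickle.load(f)

print(f"Chunks in store: {store.count_documents()}")
```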
src/config.py CHANGED
@@ -105,7 +105,6 @@ class RetrievalConfig:
 class AppConfig:
     """Main application configuration."""
 
-    opensearch: OpenSearchConfig
     llm: LLMConfig
     document_processing: DocumentProcessingConfig
     retrieval: RetrievalConfig
@@ -115,7 +114,6 @@ class AppConfig:
     def from_env(cls) -> "AppConfig":
         """Create complete configuration from environment variables."""
         return cls(
-            opensearch=OpenSearchConfig.from_env(),
             llm=LLMConfig.from_env(),
             document_processing=DocumentProcessingConfig.from_env(),
             retrieval=RetrievalConfig.from_env(),
src/indexing/memory_indexer.py ADDED
@@ -0,0 +1,96 @@
+"""Document indexer using in-memory document store (no Docker/OpenSearch needed)."""
+
+from typing import List
+from haystack import Document
+from haystack.components.embedders import OpenAIDocumentEmbedder
+from haystack.document_stores.in_memory import InMemoryDocumentStore
+import logging
+
+from ..config import LLMConfig
+
+logger = logging.getLogger(__name__)
+
+
+class MemoryDocumentIndexer:
+    """Indexes documents in memory with embeddings (no external dependencies)."""
+
+    def __init__(self, llm_config: LLMConfig):
+        """
+        Initialize the in-memory document indexer.
+
+        Args:
+            llm_config: LLM configuration for embeddings
+        """
+        self.llm_config = llm_config
+
+        # Initialize in-memory document store
+        self.document_store = InMemoryDocumentStore()
+
+        # Initialize embedder
+        self.embedder = OpenAIDocumentEmbedder(
+            api_key=llm_config.api_key,
+            model=llm_config.embedding_model,
+        )
+
+    def index_documents(self, documents: List[Document]) -> int:
+        """
+        Index documents with embeddings in memory.
+
+        Args:
+            documents: List of documents to index
+
+        Returns:
+            Number of documents successfully indexed
+        """
+        if not documents:
+            logger.warning("No documents to index")
+            return 0
+
+        logger.info(f"Indexing {len(documents)} documents in memory")
+
+        try:
+            # Generate embeddings for documents
+            logger.info("Generating embeddings...")
+            result = self.embedder.run(documents=documents)
+            embedded_docs = result.get("documents", [])
+
+            if not embedded_docs:
+                logger.error("Failed to generate embeddings")
+                return 0
+
+            logger.info(f"Generated embeddings for {len(embedded_docs)} documents")
+
+            # Write documents to in-memory store
+            logger.info("Writing documents to memory...")
+            self.document_store.write_documents(embedded_docs)
+
+            doc_count = self.document_store.count_documents()
+            logger.info(f"Successfully indexed documents. Total documents in store: {doc_count}")
+
+            return len(embedded_docs)
+
+        except Exception as e:
+            logger.error(f"Error indexing documents: {e}")
+            raise
+
+    def clear_index(self):
+        """Clear all documents from the index."""
+        try:
+            # delete_documents requires explicit ids in Haystack 2.x,
+            # so collect every stored document id first.
+            all_ids = [doc.id for doc in self.document_store.filter_documents()]
+            self.document_store.delete_documents(document_ids=all_ids)
+            logger.info("Cleared all documents from index")
+        except Exception as e:
+            logger.error(f"Error clearing index: {e}")
+            raise
+
+    def get_document_count(self) -> int:
+        """
+        Get number of documents in the index.
+
+        Returns:
+            Document count
+        """
+        try:
+            return self.document_store.count_documents()
+        except Exception as e:
+            logger.error(f"Error getting document count: {e}")
+            return 0
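
A minimal usage sketch for the indexer, assuming `OPENAI_API_KEY` is set and the project root is on `sys.path`; the sample document content is made up for illustration:

```python
# Hypothetical usage of MemoryDocumentIndexer (one embedding API call per batch).
from haystack import Document

from src.config import get_config
from src.indexing.memory_indexer import MemoryDocumentIndexer

config = get_config()
indexer = MemoryDocumentIndexer(llm_config=config.llm)

docs = [
    Document(
        content="Exmatriculation requests must be submitted in writing.",
        meta={"source_file": "exmatrikulation.md"},
    )
]
indexed = indexer.index_documents(docs)
print(indexed, indexer.get_document_count())  # -> 1 1
```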
src/pipeline/memory_orchestrator.py ADDED
@@ -0,0 +1,192 @@
+"""RAG pipeline orchestrator using in-memory components (no Docker needed)."""
+
+from typing import Dict, Any, List
+from pydantic import BaseModel
+from haystack import Document
+import logging
+
+from ..config import AppConfig
+from ..agents.intent_agent import IntentAgent, IntentData
+from ..agents.composer_agent import ComposerAgent, EmailDraft
+from ..agents.fact_checker_agent import FactCheckerAgent, FactCheckResult
+from ..retrieval.memory_retriever import MemoryRetriever
+from ..indexing.memory_indexer import MemoryDocumentIndexer
+
+logger = logging.getLogger(__name__)
+
+
+class PipelineResult(BaseModel):
+    """Complete result from the RAG pipeline."""
+
+    query: str
+    intent: IntentData
+    retrieved_docs: List[Dict[str, Any]]
+    email_draft: EmailDraft
+    fact_check: FactCheckResult
+    processing_time: float = 0.0
+
+
+class MemoryRAGOrchestrator:
+    """Orchestrates the multi-agent RAG pipeline using in-memory components."""
+
+    def __init__(self, config: AppConfig, document_indexer: MemoryDocumentIndexer):
+        """
+        Initialize the RAG orchestrator.
+
+        Args:
+            config: Application configuration
+            document_indexer: Memory document indexer instance
+        """
+        self.config = config
+
+        # Initialize agents
+        self.intent_agent = IntentAgent(
+            api_key=config.llm.api_key,
+            model=f"openai:{config.llm.model_name}",
+        )
+
+        self.composer_agent = ComposerAgent(
+            api_key=config.llm.api_key,
+            model=f"openai:{config.llm.model_name}",
+        )
+
+        self.fact_checker_agent = FactCheckerAgent(
+            api_key=config.llm.api_key,
+            model=f"openai:{config.llm.model_name}",
+        )
+
+        # Initialize retriever
+        self.retriever = MemoryRetriever(
+            document_store=document_indexer.document_store,
+            llm_config=config.llm,
+            retrieval_config=config.retrieval,
+        )
+
+    async def process_query(self, query: str) -> PipelineResult:
+        """
+        Process a user query through the complete RAG pipeline.
+
+        Args:
+            query: User's query text
+
+        Returns:
+            Complete pipeline result
+        """
+        import time
+
+        start_time = time.time()
+
+        logger.info(f"Processing query: {query[:100]}...")
+
+        try:
+            # Step 1: Extract intent
+            logger.info("Step 1: Extracting intent...")
+            intent = await self.intent_agent.extract_intent(query)
+
+            # Step 2: Retrieve relevant documents
+            logger.info("Step 2: Retrieving relevant documents...")
+            retrieved_docs = self.retriever.retrieve(query)
+
+            logger.info(f"Retrieved {len(retrieved_docs)} documents")
+
+            # Step 3: Compose email draft
+            logger.info("Step 3: Composing email draft...")
+            email_draft = await self.composer_agent.compose_email(
+                query=query,
+                intent=intent,
+                context_docs=retrieved_docs,
+            )
+
+            # Step 4: Fact-check the draft
+            logger.info("Step 4: Fact-checking email draft...")
+            fact_check = await self.fact_checker_agent.fact_check(
+                email_draft=email_draft,
+                source_docs=retrieved_docs,
+            )
+
+            processing_time = time.time() - start_time
+
+            # Build result
+            result = PipelineResult(
+                query=query,
+                intent=intent,
+                retrieved_docs=self._serialize_documents(retrieved_docs),
+                email_draft=email_draft,
+                fact_check=fact_check,
+                processing_time=processing_time,
+            )
+
+            logger.info(f"Pipeline completed in {processing_time:.2f}s")
+
+            return result
+
+        except Exception as e:
+            logger.error(f"Error in pipeline: {e}")
+            raise
+
+    def _serialize_documents(self, documents: List[Document]) -> List[Dict[str, Any]]:
+        """
+        Serialize Haystack documents to dictionaries.
+
+        Args:
+            documents: List of Haystack documents
+
+        Returns:
+            List of document dictionaries
+        """
+        serialized = []
+        for doc in documents:
+            serialized.append(
+                {
+                    "content": doc.content,
+                    "score": doc.score,
+                    "meta": doc.meta or {},
+                }
+            )
+        return serialized
+
+    async def refine_draft(
+        self,
+        original_query: str,
+        current_draft: str,
+        user_feedback: str,
+        retrieved_docs: List[Document],
+    ) -> EmailDraft:
+        """
+        Refine an email draft based on user feedback.
+
+        Args:
+            original_query: Original user query
+            current_draft: Current email draft text
+            user_feedback: User's feedback or refinement request
+            retrieved_docs: Previously retrieved documents
+
+        Returns:
+            Refined email draft
+        """
+        logger.info("Refining email draft based on user feedback...")
+
+        # Create refinement prompt
+        refinement_query = f"""Original Query: {original_query}
+
+Current Draft:
+{current_draft}
+
+User Feedback/Refinement Request:
+{user_feedback}
+
+Please revise the email draft according to the user's feedback while maintaining accuracy and professionalism."""
+
+        # Re-extract intent with refinement context
+        intent = await self.intent_agent.extract_intent(refinement_query)
+
+        # Compose refined draft
+        refined_draft = await self.composer_agent.compose_email(
+            query=refinement_query,
+            intent=intent,
+            context_docs=retrieved_docs,
+        )
+
+        logger.info("Email draft refined")
+
+        return refined_draft
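
End to end, the orchestrator only needs a config and a populated indexer; `process_query` is a coroutine, so a script would drive it with `asyncio.run`. A hedged sketch, assuming documents have already been indexed and all agent dependencies are importable:

```python
# Hypothetical driver for MemoryRAGOrchestrator.
import asyncio

from src.config import get_config
from src.indexing.memory_indexer import MemoryDocumentIndexer
from src.pipeline.memory_orchestrator import MemoryRAGOrchestrator

config = get_config()
indexer = MemoryDocumentIndexer(llm_config=config.llm)
# ...load the pickled store or call indexer.index_documents(...) first...

orchestrator = MemoryRAGOrchestrator(config=config, document_indexer=indexer)
result = asyncio.run(orchestrator.process_query("Wie kann ich mich exmatrikulieren?"))

print(result.email_draft.subject)
print(f"{result.processing_time:.2f}s, {len(result.retrieved_docs)} sources")
```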
src/retrieval/memory_retriever.py ADDED
@@ -0,0 +1,200 @@
+"""Retriever using in-memory document store (no Docker/OpenSearch needed)."""
+
+from typing import List, Dict, Any
+from haystack import Document
+from haystack.components.embedders import OpenAITextEmbedder
+from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever, InMemoryBM25Retriever
+from haystack.document_stores.in_memory import InMemoryDocumentStore
+import logging
+
+from ..config import RetrievalConfig, LLMConfig
+
+logger = logging.getLogger(__name__)
+
+
+class MemoryRetriever:
+    """Retrieves documents using in-memory hybrid BM25 + vector search."""
+
+    def __init__(
+        self,
+        document_store: InMemoryDocumentStore,
+        llm_config: LLMConfig,
+        retrieval_config: RetrievalConfig,
+    ):
+        """
+        Initialize the in-memory hybrid retriever.
+
+        Args:
+            document_store: InMemory document store
+            llm_config: LLM configuration for embeddings
+            retrieval_config: Retrieval configuration
+        """
+        self.document_store = document_store
+        self.llm_config = llm_config
+        self.retrieval_config = retrieval_config
+
+        # Initialize BM25 retriever
+        self.bm25_retriever = InMemoryBM25Retriever(
+            document_store=document_store,
+        )
+
+        # Initialize embedding retriever
+        self.embedding_retriever = InMemoryEmbeddingRetriever(
+            document_store=document_store,
+        )
+
+        # Initialize text embedder for queries
+        self.text_embedder = OpenAITextEmbedder(
+            api_key=llm_config.api_key,
+            model=llm_config.embedding_model,
+        )
+
+    def retrieve(self, query: str) -> List[Document]:
+        """
+        Retrieve documents using hybrid search.
+
+        Args:
+            query: Search query
+
+        Returns:
+            List of relevant documents with scores
+        """
+        logger.info(f"Retrieving documents for query: {query[:100]}...")
+
+        try:
+            # Get BM25 results
+            logger.debug("Running BM25 retrieval...")
+            bm25_results = self.bm25_retriever.run(
+                query=query,
+                top_k=self.retrieval_config.top_k * 2,
+            )
+            bm25_docs = bm25_results.get("documents", [])
+            logger.debug(f"BM25 retrieved {len(bm25_docs)} documents")
+
+            # Generate query embedding
+            logger.debug("Generating query embedding...")
+            embedding_result = self.text_embedder.run(text=query)
+            query_embedding = embedding_result.get("embedding")
+
+            if not query_embedding:
+                logger.warning("Failed to generate query embedding, using BM25 only")
+                return self._apply_score_threshold(bm25_docs)
+
+            # Get vector search results
+            logger.debug("Running vector retrieval...")
+            vector_results = self.embedding_retriever.run(
+                query_embedding=query_embedding,
+                top_k=self.retrieval_config.top_k * 2,
+            )
+            vector_docs = vector_results.get("documents", [])
+            logger.debug(f"Vector search retrieved {len(vector_docs)} documents")
+
+            # Merge and rank results
+            merged_docs = self._merge_results(bm25_docs, vector_docs)
+
+            # Apply score threshold and limit
+            final_docs = self._apply_score_threshold(merged_docs)
+            final_docs = final_docs[: self.retrieval_config.top_k]
+
+            logger.info(f"Retrieved {len(final_docs)} documents after hybrid ranking")
+
+            return final_docs
+
+        except Exception as e:
+            logger.error(f"Error during retrieval: {e}")
+            return []
+
+    def _merge_results(
+        self, bm25_docs: List[Document], vector_docs: List[Document]
+    ) -> List[Document]:
+        """
+        Merge BM25 and vector search results using weighted scoring.
+
+        Args:
+            bm25_docs: Documents from BM25 search
+            vector_docs: Documents from vector search
+
+        Returns:
+            Merged and ranked documents
+        """
+        # Create score maps
+        doc_scores: Dict[str, Dict[str, Any]] = {}
+
+        # Process BM25 results
+        for doc in bm25_docs:
+            doc_id = doc.id or doc.content[:50]
+            bm25_score = doc.score or 0.0
+
+            if doc_id not in doc_scores:
+                doc_scores[doc_id] = {
+                    "document": doc,
+                    "bm25_score": 0.0,
+                    "vector_score": 0.0,
+                }
+            doc_scores[doc_id]["bm25_score"] = bm25_score
+
+        # Process vector results
+        for doc in vector_docs:
+            doc_id = doc.id or doc.content[:50]
+            vector_score = doc.score or 0.0
+
+            if doc_id not in doc_scores:
+                doc_scores[doc_id] = {
+                    "document": doc,
+                    "bm25_score": 0.0,
+                    "vector_score": 0.0,
+                }
+            doc_scores[doc_id]["vector_score"] = vector_score
+
+        # Normalize and combine scores
+        bm25_scores = [info["bm25_score"] for info in doc_scores.values()]
+        vector_scores = [info["vector_score"] for info in doc_scores.values()]
+
+        max_bm25 = max(bm25_scores) if bm25_scores else 1.0
+        max_vector = max(vector_scores) if vector_scores else 1.0
+
+        merged_docs = []
+        for doc_id, info in doc_scores.items():
+            # Normalize scores
+            norm_bm25 = info["bm25_score"] / max_bm25 if max_bm25 > 0 else 0.0
+            norm_vector = info["vector_score"] / max_vector if max_vector > 0 else 0.0
+
+            # Combine with weights
+            combined_score = (
+                self.retrieval_config.bm25_weight * norm_bm25
+                + self.retrieval_config.vector_weight * norm_vector
+            )
+
+            doc = info["document"]
+            doc.score = combined_score
+
+            if doc.meta is None:
+                doc.meta = {}
+            doc.meta["bm25_score"] = info["bm25_score"]
+            doc.meta["vector_score"] = info["vector_score"]
+            doc.meta["combined_score"] = combined_score
+
+            merged_docs.append(doc)
+
+        # Sort by combined score
+        merged_docs.sort(key=lambda x: x.score or 0.0, reverse=True)
+
+        return merged_docs
+
+    def _apply_score_threshold(self, documents: List[Document]) -> List[Document]:
+        """
+        Filter documents by minimum score threshold.
+
+        Args:
+            documents: Documents to filter
+
+        Returns:
+            Filtered documents
+        """
+        return [
+            doc
+            for doc in documents
+            if doc.score and doc.score >= self.retrieval_config.min_score
+        ]
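
To make `_merge_results` concrete, here is a tiny worked example with made-up scores and assumed 0.5/0.5 weights (the real weights come from `RetrievalConfig`): each score list is max-normalized, then combined as a weighted sum.

```python
# Worked example of the hybrid score merge, reduced to plain arithmetic.
bm25_weight, vector_weight = 0.5, 0.5  # assumed; configured in RetrievalConfig

bm25 = {"doc_a": 7.2, "doc_b": 3.1}      # raw BM25 scores
vector = {"doc_a": 0.83, "doc_c": 0.79}  # raw embedding similarities

max_bm25 = max(bm25.values())      # 7.2
max_vector = max(vector.values())  # 0.83

combined = {}
for doc_id in set(bm25) | set(vector):
    norm_bm25 = bm25.get(doc_id, 0.0) / max_bm25
    norm_vector = vector.get(doc_id, 0.0) / max_vector
    combined[doc_id] = bm25_weight * norm_bm25 + vector_weight * norm_vector

# doc_a: 0.5 * 1.0 + 0.5 * 1.0          = 1.000  (top of both lists)
# doc_c: 0.5 * 0.0 + 0.5 * (0.79/0.83) ~= 0.476
# doc_b: 0.5 * (3.1/7.2) + 0.5 * 0.0   ~= 0.215
print(sorted(combined.items(), key=lambda kv: kv[1], reverse=True))
```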
src/ui/gradio_app_memory.py ADDED
@@ -0,0 +1,326 @@
+"""Gradio UI for the RAG Email Assistant (in-memory, no Docker needed)."""
+
+import gradio as gr
+from typing import Tuple, List, Dict, Any
+import logging
+import asyncio
+import pickle
+from pathlib import Path
+
+from ..config import get_config, AppConfig
+from ..indexing.memory_indexer import MemoryDocumentIndexer
+from ..pipeline.memory_orchestrator import MemoryRAGOrchestrator, PipelineResult
+from ..document_processing.loader import MarkdownDocumentLoader
+from ..document_processing.chunker import SemanticChunker
+
+logger = logging.getLogger(__name__)
+
+
+class GradioEmailAssistant:
+    """Gradio interface for the email assistant (in-memory)."""
+
+    def __init__(self, config: AppConfig):
+        """
+        Initialize the Gradio assistant.
+
+        Args:
+            config: Application configuration
+        """
+        self.config = config
+
+        # Initialize indexer
+        self.indexer = MemoryDocumentIndexer(llm_config=config.llm)
+
+        # Load or create document store
+        self._load_or_create_documents()
+
+        # Initialize orchestrator
+        self.orchestrator = MemoryRAGOrchestrator(
+            config=config,
+            document_indexer=self.indexer,
+        )
+
+        # Store last pipeline result for refinement
+        self.last_result: PipelineResult | None = None
+
+    def _load_or_create_documents(self):
+        """Load documents from pickle or create fresh."""
+        doc_store_path = Path("data/document_store.pkl")
+
+        if doc_store_path.exists():
+            logger.info(f"Loading document store from {doc_store_path}...")
+            try:
+                with open(doc_store_path, "rb") as f:
+                    self.indexer.document_store = pickle.load(f)
+                logger.info(f"Loaded {self.indexer.get_document_count()} documents")
+                return
+            except Exception as e:
+                logger.warning(f"Failed to load document store: {e}")
+
+        # Create documents if not found
+        logger.info("Creating fresh document index...")
+        loader = MarkdownDocumentLoader(self.config.document_processing.documents_path)
+        documents = loader.load_documents()
+
+        chunker = SemanticChunker(
+            chunk_size=self.config.document_processing.chunk_size,
+            chunk_overlap=self.config.document_processing.chunk_overlap,
+            min_chunk_size=self.config.document_processing.min_chunk_size,
+        )
+        chunked_docs = chunker.chunk_documents(documents)
+
+        self.indexer.index_documents(chunked_docs)
+
+        # Save for next time
+        doc_store_path.parent.mkdir(parents=True, exist_ok=True)
+        with open(doc_store_path, "wb") as f:
+            pickle.dump(self.indexer.document_store, f)
+        logger.info(f"Saved document store to {doc_store_path}")
+
+    async def process_query_async(
+        self, query: str
+    ) -> Tuple[str, str, str, str, str, List[Dict[str, Any]]]:
+        """
+        Process a user query asynchronously.
+
+        Args:
+            query: User query text
+
+        Returns:
+            Tuple of (subject, body, intent_info, fact_check_info, stats, sources)
+        """
+        try:
+            # Process through pipeline
+            result = await self.orchestrator.process_query(query)
+            self.last_result = result
+
+            # Extract components
+            subject = result.email_draft.subject
+            body = result.email_draft.body
+
+            # Format intent information
+            intent_info = f"""**Action Type:** {result.intent.action_type}
+**Topic:** {result.intent.topic}
+**Language:** {result.intent.language}
+**Urgency:** {result.intent.urgency}
+**Key Entities:** {', '.join(result.intent.key_entities) if result.intent.key_entities else 'None'}
+**Questions:** {', '.join(result.intent.specific_questions) if result.intent.specific_questions else 'None'}"""
+
+            # Format fact check information
+            accuracy_emoji = "✅" if result.fact_check.is_accurate else "⚠️"
+            fact_check_info = f"""**Status:** {accuracy_emoji} {'Accurate' if result.fact_check.is_accurate else 'Issues Found'}
+**Accuracy Score:** {result.fact_check.accuracy_score:.1%}
+
+**Verified Claims:**
+{self._format_list(result.fact_check.verified_claims)}
+
+**Issues Found:**
+{self._format_list(result.fact_check.issues_found) if result.fact_check.issues_found else 'None'}
+
+**Suggestions:**
+{self._format_list(result.fact_check.suggestions) if result.fact_check.suggestions else 'None'}"""
+
+            # Format statistics
+            stats = f"""**Processing Time:** {result.processing_time:.2f}s
+**Documents Retrieved:** {len(result.retrieved_docs)}
+**Confidence:** {result.email_draft.confidence:.1%}"""
+
+            # Format sources
+            sources = []
+            for i, doc in enumerate(result.retrieved_docs, 1):
+                sources.append(
+                    {
+                        "Number": i,
+                        "Source": doc["meta"].get("source_file", "Unknown"),
+                        "Score": f"{doc['score']:.3f}",
+                        "Preview": doc["content"][:200] + "...",
+                    }
+                )
+
+            return subject, body, intent_info, fact_check_info, stats, sources
+
+        except Exception as e:
+            logger.error(f"Error processing query: {e}")
+            error_msg = f"Error: {str(e)}"
+            return (
+                "Error",
+                error_msg,
+                error_msg,
+                error_msg,
+                error_msg,
+                [],
+            )
+
+    def process_query_sync(
+        self, query: str
+    ) -> Tuple[str, str, str, str, str, List[Dict[str, Any]]]:
+        """Synchronous wrapper for async query processing."""
+        return asyncio.run(self.process_query_async(query))
+
+    async def refine_draft_async(
+        self, subject: str, body: str, feedback: str
+    ) -> Tuple[str, str]:
+        """
+        Refine the current draft based on user feedback.
+
+        Args:
+            subject: Current subject
+            body: Current body
+            feedback: User feedback
+
+        Returns:
+            Tuple of (new_subject, new_body)
+        """
+        if not self.last_result:
+            return subject, "Error: No draft to refine. Please generate a draft first."
+
+        try:
+            # Get retrieved docs from last result
+            from haystack import Document
+
+            retrieved_docs = [
+                Document(content=doc["content"], meta=doc["meta"])
+                for doc in self.last_result.retrieved_docs
+            ]
+
+            # Refine the draft
+            refined = await self.orchestrator.refine_draft(
+                original_query=self.last_result.query,
+                current_draft=body,
+                user_feedback=feedback,
+                retrieved_docs=retrieved_docs,
+            )
+
+            return refined.subject, refined.body
+
+        except Exception as e:
+            logger.error(f"Error refining draft: {e}")
+            return subject, f"Error refining draft: {str(e)}"
+
+    def refine_draft_sync(self, subject: str, body: str, feedback: str) -> Tuple[str, str]:
+        """Synchronous wrapper for async draft refinement."""
+        return asyncio.run(self.refine_draft_async(subject, body, feedback))
+
+    def _format_list(self, items: List[str]) -> str:
+        """Format a list of items as markdown."""
+        if not items:
+            return "None"
+        return "\n".join([f"- {item}" for item in items])
+
+
+def create_gradio_interface() -> gr.Blocks:
+    """
+    Create and configure the Gradio interface.
+
+    Returns:
+        Gradio Blocks interface
+    """
+    # Load configuration
+    config = get_config()
+
+    # Initialize assistant
+    assistant = GradioEmailAssistant(config)
+
+    # Create interface
+    with gr.Blocks(
+        title="BFH Student Administration Email Assistant",
+        theme=gr.themes.Soft(),
+    ) as demo:
+        gr.Markdown(
+            """
+            # 📧 BFH Student Administration Email Assistant
+
+            AI-powered email assistant for university administrative staff using RAG (Retrieval-Augmented Generation).
+
+            **No Docker Required!** Uses in-memory document store.
+
+            **Features:**
+            - Intent extraction from student queries
+            - Hybrid retrieval (BM25 + semantic search)
+            - Multi-agent email composition
+            - Automated fact-checking
+            - Draft refinement based on feedback
+            """
+        )
+
+        with gr.Row():
+            with gr.Column(scale=1):
+                gr.Markdown("### 📝 Query Input")
+                query_input = gr.Textbox(
+                    label="Student Query",
+                    placeholder="Enter the student's question or email content here...",
+                    lines=5,
+                )
+                process_btn = gr.Button("Generate Email Draft", variant="primary")
+
+            with gr.Column(scale=1):
+                gr.Markdown("### 📊 Analysis")
+                intent_output = gr.Markdown(label="Intent Analysis")
+                stats_output = gr.Markdown(label="Statistics")
+
+        gr.Markdown("### ✉️ Email Draft")
+
+        with gr.Row():
+            with gr.Column(scale=2):
+                subject_output = gr.Textbox(label="Subject", lines=1)
+                body_output = gr.Textbox(label="Body", lines=15)
+
+            with gr.Column(scale=1):
+                fact_check_output = gr.Markdown(label="Fact Check Results")
+
+        gr.Markdown("### 🔄 Refine Draft")
+
+        with gr.Row():
+            feedback_input = gr.Textbox(
+                label="Feedback / Refinement Instructions",
+                placeholder="E.g., 'Make it more formal', 'Add information about deadlines', 'Translate to English'",
+                lines=3,
+            )
+            refine_btn = gr.Button("Refine Draft", variant="secondary")
+
+        gr.Markdown("### 📚 Retrieved Sources")
+        sources_output = gr.Dataframe(
+            headers=["Number", "Source", "Score", "Preview"],
+            label="Source Documents",
+        )
+
+        # Event handlers
+        process_btn.click(
+            fn=assistant.process_query_sync,
+            inputs=[query_input],
+            outputs=[
+                subject_output,
+                body_output,
+                intent_output,
+                fact_check_output,
+                stats_output,
+                sources_output,
+            ],
+        )
+
+        refine_btn.click(
+            fn=assistant.refine_draft_sync,
+            inputs=[subject_output, body_output, feedback_input],
+            outputs=[subject_output, body_output],
+        )
+
+        gr.Markdown(
+            """
+            ---
+            **Note:** This system uses AI to assist with email composition. Always review and verify the generated content before sending.
+            """
+        )
+
+    return demo
+
+
+if __name__ == "__main__":
+    # Configure logging
+    logging.basicConfig(
+        level=logging.INFO,
+        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
+    )
+
+    # Create and launch interface
+    demo = create_gradio_interface()
+    demo.launch()