---
title: CTP Slack Bot
emoji: 🦥
colorFrom: red
colorTo: green
sdk: docker
pinned: false
license: mit
short_description: Spring 2025 CTP Slack Bot RAG system
app_port: 8080
---

# CTP Slack Bot

## _Modus Operandi_ in a Nutshell

* Intelligently responds to Slack messages (when mentioned) based on a repository of data.
* Periodically checks for new content to add to its repository.

## How to Run the Application

You need to configure it first. This is done via environment variables, or an `.env` file based on the template, `.env.template`.

Obtaining the values requires setting up API tokens/secrets with:

* Slack: for `slack_bot_token` and `slack_app_token`
* MongoDB: for `mongodb_uri`
* OpenAI: for `openai_api_key`
* Google Drive: for `google_project_id`, `google_client_id`, `google_client_email`, `google_private_key_id`, and `google_private_key`
  * For Google Drive, set up a service account. It’s the only supported authentication type.

### Normally

Just run the Docker image. 😉

Build it with:

```sh
docker build . -t ctp-slack-bot
```

Run it with:

```sh
docker run --volume ./logs:/data --env-file=.env -p 8000:8000 --name my-ctp-slack-bot-instance ctp-slack-bot
```

### For Development

Development usually requires rapid iteration. That means a change in the code ought to be reflected as soon as possible in the behavior of the application.

First, make sure you are set up with a Python virtual environment created by the Python `venv` module and that it’s activated. Then install dependencies from `pyproject.toml` within the environment using:

```sh
pip3 install -e .
```

Make a copy of `.env.template` as `.env` and define the environment variables. (You can also define them by other means, but this has the least friction.) This file should not be committed and is excluded by `.gitignore`!
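Before starting the application, it can help to confirm that the settings are actually defined. The following is a minimal sketch, not part of the project: `REQUIRED_SETTINGS` and `find_missing_settings` are hypothetical names, the list only covers the variables named in this README (`.env.template` remains the authoritative list), and the case-insensitive comparison is an assumption about how the application reads its configuration. It inspects the process environment, so export the values from `.env` first.

```python
import os
from typing import Mapping

# Setting names as listed in this README; .env.template may define more.
REQUIRED_SETTINGS = (
    "slack_bot_token",
    "slack_app_token",
    "mongodb_uri",
    "openai_api_key",
    "google_project_id",
    "google_client_id",
    "google_client_email",
    "google_private_key_id",
    "google_private_key",
)


def find_missing_settings(environment: Mapping[str, str]) -> list[str]:
    """Return the required names that are absent or blank in `environment`.

    Names are compared case-insensitively (an assumption), so
    SLACK_BOT_TOKEN satisfies slack_bot_token.
    """
    defined = {name.lower() for name, value in environment.items() if value.strip()}
    return [name for name in REQUIRED_SETTINGS if name not in defined]


if __name__ == "__main__":
    missing = find_missing_settings(os.environ)
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are defined.")
```

Taking the environment as a parameter, rather than reading `os.environ` inside the function, keeps the check easy to exercise with a plain dictionary.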

If `localhost` port `8080` is free, running the following will make the application available on that port:

```sh
scripts/run-dev.sh
```

Visiting http://localhost:8080/health will return HTTP status OK and a payload containing the health status of individual components if everything is working.

## Tech Stack

* Hugging Face Spaces for hosting
* OpenAI for embeddings and language models
* Google Drive for reference data (i.e., the material to be incorporated into the bot’s knowledge base)
* MongoDB for data persistence
* Docker for containerization
* Python
  * Slack Bolt client for interfacing with Slack
  * See `pyproject.toml` for additional Python packages.

## General Project Structure

Not every file or folder is listed, but the important stuff is here.

* `src/`
  * `ctp_slack_bot/`
    * `core/`: fundamental components like configuration (using pydantic), logging setup (loguru), and custom exceptions
      * `config.py`: application settings model
    * `db/`: data connection and interface logic
      * `repositories/`: data collection/table interface logic
        * `mongo_db_vectorized_repository_base.py`: base implementation of a repository corresponding to a MongoDB collection with a search index
        * `vectorized_chunk_repository.py`: repository interface for `VectorizedChunk`s
    * `models/`: data models
    * `mime_type_handlers`: parsers for converting bytes of different MIME types to `Chunk`s
    * `services/`: business logic
      * `answer_retrieval_service.py`: obtains an answer to a question from a language model using relevant context
      * `application_health_service.py`: collects the health status of the application components
      * `content_ingestion_service.py`: converts content into chunks and stores them into the database
      * `context_retrieval_service.py`: queries for relevant context from the database to answer a question
      * `embeddings_model_service.py`: converts text to embeddings
      * `event_brokerage_service.py`: brokers events between decoupled components
      * `google_drive_service.py`: interfaces with Google Drive
      * `language_model_service.py`: answers questions using relevant context
      * `question_dispatch_service.py`: listens for questions and retrieves relevant context to get answers
      * `task_service.py`: runs periodic background tasks
      * `slack_service.py`: handles events from Slack and sends back responses
      * `vectorization_service.py`: converts chunks into chunks with embeddings
    * `tasks/`: scheduled tasks to run in the background
    * `utils/`: reusable utilities
    * `app.py`: application entry point
    * `containers.py`: the dependency injection container
* `tests/`: unit tests
* `scripts/`: utility scripts for development, deployment, etc.
  * `run-dev.sh`: script to run the application locally
* `notebooks/`: Jupyter notebooks for exploration and model development
* `.env`: local environment variables for development purposes (to be created for local use only from `.env.template`)
* `Dockerfile`: Docker container build definition
* `pyproject.toml`: project definition and dependencies
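
To make the roles of the services listed above more concrete, here is a toy, in-memory sketch of the flow they describe: ingest content into chunks, vectorize the chunks, retrieve the most relevant ones for a question, and assemble a prompt for a language model. It is an illustration under loud assumptions, not the project’s code: every function name is hypothetical, the `VectorizedChunk` class is a stand-in for the real model, the word-count "embedding" replaces the OpenAI embeddings, a Python list replaces the MongoDB search index, and the sample sentences are made up.

```python
import math
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorizedChunk:
    """Hypothetical stand-in for the project's `VectorizedChunk` model."""
    text: str
    embedding: tuple[float, ...]


def embed(text: str, vocabulary: tuple[str, ...]) -> tuple[float, ...]:
    """Toy embedding: word counts over a fixed vocabulary (not a real model)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return tuple(float(words.count(term)) for term in vocabulary)


def cosine_similarity(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ingest(documents: list[str], vocabulary: tuple[str, ...]) -> list[VectorizedChunk]:
    """Ingestion plus vectorization; one chunk per document for brevity."""
    return [VectorizedChunk(doc, embed(doc, vocabulary)) for doc in documents]


def retrieve_context(
    question: str,
    chunks: list[VectorizedChunk],
    vocabulary: tuple[str, ...],
    top_k: int = 1,
) -> list[VectorizedChunk]:
    """Context retrieval: rank stored chunks by similarity to the question."""
    query = embed(question, vocabulary)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query, chunk.embedding),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context: list[VectorizedChunk]) -> str:
    """Assemble the text a language model would be asked to answer from."""
    context_text = "\n".join(f"- {chunk.text}" for chunk in context)
    return f"Context:\n{context_text}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    vocabulary = ("deadline", "project", "office", "hours")
    chunks = ingest(
        ["The project deadline is Friday.", "Office hours are on Tuesday."],
        vocabulary,
    )
    question = "When is the project deadline?"
    print(build_prompt(question, retrieve_context(question, chunks, vocabulary)))
```

In the real application these steps are spread across separate services and connected through the event broker and the dependency injection container; the sketch collapses them into plain functions only to show the order of operations.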