---
title: StepWise Math AI
emoji: 🎓
colorFrom: gray
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
tags:
- building-mcp-track-consumer
- building-mcp-track-creative
- mcp
- gradio
- gemini
- education
- mathematics
- ai
- visualization
- interactive-learning
---
# StepWise Math
**Transform Static Math Problems into Living, Interactive Step-by-Step Visual Proofs**
![](./img/hero.jpg)
This is a **Gradio MCP Framework** implementation of the StepWise Math React app, providing the same powerful features in a Python-based web interface.
[![MCP's 1st Birthday Hackathon](https://img.shields.io/badge/MCP%27s%201st%20Birthday-Hackathon-blue)](https://github.com/modelcontextprotocol)
[![Track 1: Building MCP](https://img.shields.io/badge/Track%201-Building%20MCP%20(Consumer)-blue)](https://huggingface.co/MCP-1st-Birthday)
[![Powered by Google Gemini 2.5 Flash](https://img.shields.io/badge/Powered%20by-Google%20Gemini%202.5%20Flash-blue)](https://ai.google.dev/)
[![Powered by Google Gemini 3.0 Pro](https://img.shields.io/badge/Powered%20by-Google%20Gemini%203.0%20Pro-blue)](https://ai.google.dev/)
## Overview
### What This Project Does
StepWise Math is an **MCP-capable service** that transforms static math problems into interactive, step-by-step visual proofs. Built as a Gradio-based MCP server, it provides both a user-friendly web interface and programmatic MCP endpoints for AI agents and developer tools.
The system operates through a **two-stage AI pipeline**:
- **Stage 1 - Concept Analysis** (Gemini 2.5 Flash): Generates a pedagogical JSON specification from text, URL, or image input
- **Stage 2 - Code Generation** (Gemini 3.0 Pro): Synthesizes a self-contained HTML/JS interactive proof application
### Why This Project Matters
- **Educational Impact**: Empowers teachers and students to visualize mathematical reasoning step-by-step, transforming abstract concepts into concrete, interactive experiences
- **MCP Showcase**: Demonstrates best practices for building MCP servers that integrate multi-step LLM workflows, streaming thoughts, and developer-facing prompts/resources
- **Reference Implementation**: Provides a complete example of combining AI-powered analysis with code generation in an MCP-compliant architecture
## Documentation
For complete product specifications, feature requirements, and technical implementation details, see the **[Product Requirements Document (PRD.md)](./PRD.md)**.
The PRD covers:
- Target audience and user personas (Grades 6-10 students, teachers, tutors)
- Detailed functional requirements and data models
- UI/UX design specifications
- Example use cases (Pythagorean Theorem, Slope-Intercept Form)
- System constraints and technical architecture
## Quick Start
### Using Gradio UI
1. Enter your **Gemini API Key** in the Configuration section (get one free at [ai.google.dev](https://ai.google.dev/)). You only need this when the embedded API key has run out of credits.
2. Choose your input method (Text, Image, or URL)
3. Describe a math problem or concept
4. Click **Generate Guided Proof**
5. Explore the interactive visualization!
### Using MCP Clients
1. Point your MCP client (e.g., Claude Desktop, VSCode) to the deployed MCP server URL: https://mcp-1st-birthday-stepwise-math-ai.hf.space/gradio_api/mcp/
2. Configure the MCP server settings in your client as follows:
<details>
<summary><strong>Claude Desktop</strong> (<code>claude_desktop_config.json</code>)</summary>
```json
{
  "mcpServers": {
    "stepwise": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://mcp-1st-birthday-stepwise-math-ai.hf.space/gradio_api/mcp/",
        "--transport",
        "streamable-http"
      ]
    }
  }
}
```
After updating, restart Claude Desktop.
</details>
<details>
<summary><strong>VSCode</strong> (<code>settings.json</code>)</summary>
```json
{
  "servers": {
    "stepwise": {
      "url": "https://mcp-1st-birthday-stepwise-math-ai.hf.space/gradio_api/mcp/",
      "type": "http"
    }
  }
}
```
</details>
3. Open your MCP client and discover the available prompts and tools.
4. Invoke the `create_visual_math_proof` prompt or the underlying tools to generate interactive proofs programmatically.
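The two client configurations in step 2 differ only in their wrapper structure. As a minimal sketch, they could be generated from a single server URL. The helper names `claude_desktop_config` and `vscode_config` are hypothetical and not part of this project.

```python
import json

# Deployed MCP endpoint, as given in step 1 above.
MCP_URL = "https://mcp-1st-birthday-stepwise-math-ai.hf.space/gradio_api/mcp/"


def claude_desktop_config(url: str, name: str = "stepwise") -> dict:
    """Build a claude_desktop_config.json body that bridges through mcp-remote."""
    return {
        "mcpServers": {
            name: {
                "command": "npx",
                "args": ["mcp-remote", url, "--transport", "streamable-http"],
            }
        }
    }


def vscode_config(url: str, name: str = "stepwise") -> dict:
    """Build the VSCode server entry for a remote HTTP MCP server."""
    return {"servers": {name: {"url": url, "type": "http"}}}


if __name__ == "__main__":
    # Print both configurations so they can be pasted into the client files.
    print(json.dumps(claude_desktop_config(MCP_URL), indent=2))
    print(json.dumps(vscode_config(MCP_URL), indent=2))
```

Generating both files from one constant keeps the endpoint URL consistent across clients.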
## Features
### MCP Tools, Prompts & Resources
- **Prompts**: High-level conversational wrappers
- **Tools**: Programmatic calls
- **Resources**: JSON templates and examples
- **Discovery**: All prompts/tools are registered in the server schema for MCP client access
- **Authentication**: Provide a Gemini API key (`GEMINI_API_KEY`)
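One way the key lookup might work is sketched below. The function name `resolve_api_key` and the precedence (a key typed by the user wins over the `GEMINI_API_KEY` environment variable) are assumptions for illustration, not the project's confirmed behavior.

```python
import os
from typing import Optional


def resolve_api_key(user_key: Optional[str] = None) -> str:
    """Return the user-supplied key if present, else the GEMINI_API_KEY env var."""
    # Assumption: a key entered by the user overrides the embedded one.
    key = (user_key or "").strip() or os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise ValueError("No Gemini API key: pass one or set GEMINI_API_KEY")
    return key
```

Failing early with a clear error avoids sending an unauthenticated request to the model API.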
### Multi-Modal Input
- **Text Input**: Describe any math problem in natural language
- **Image Upload**: Upload photos of textbook problems or handwritten equations
- **URL Import**: Reference YouTube videos, Khan Academy lessons, or web resources
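Each input mode corresponds to one of the `create_math_specification_from_*` tools listed in this README. The sketch below shows how a selected mode could be routed to a tool name. The `select_tool` helper and its URL validation rule are hypothetical.

```python
from urllib.parse import urlparse

# Tool names as listed in this README; the routing logic itself is an assumption.
TOOLS = {
    "text": "create_math_specification_from_text",
    "url": "create_math_specification_from_url",
    "image": "create_math_specification_from_image",
}


def select_tool(mode: str, value: str) -> str:
    """Map the chosen input mode to a tool name, validating URL input."""
    mode = mode.strip().lower()
    if mode not in TOOLS:
        raise ValueError(f"Unknown input mode: {mode!r}")
    if mode == "url":
        parsed = urlparse(value.strip())
        # Only absolute http(s) links are accepted in this sketch.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an http(s) URL: {value!r}")
    return TOOLS[mode]
```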
### Dual-Stage AI Pipeline
- **Stage 1 - The Teacher** (Gemini 2.5 Flash): Analyzes concepts and designs pedagogical step sequences
- **Stage 2 - The Engineer** (Gemini 3.0 Pro + Extended Thinking): Generates production-ready interactive visualizations
### Interactive Step Navigation
- Progressive disclosure of mathematical concepts
- Back/Forward buttons to review at your own pace
- Visual state changes synchronized with each step
- Real-time equation updates as you interact
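The navigation behavior above amounts to a clamped index over an ordered list of steps. The generated proofs implement this in HTML/JS. The Python class below is only an illustrative sketch of the logic, and `StepNavigator` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepNavigator:
    """Tracks the current step and clamps Back/Forward at both ends."""

    steps: List[str]
    index: int = 0

    def __post_init__(self) -> None:
        if not self.steps:
            raise ValueError("at least one step is required")

    @property
    def current(self) -> str:
        return self.steps[self.index]

    def forward(self) -> str:
        # Stay on the last step instead of running past the end.
        self.index = min(self.index + 1, len(self.steps) - 1)
        return self.current

    def back(self) -> str:
        # Stay on the first step instead of going negative.
        self.index = max(self.index - 1, 0)
        return self.current
```

In a real proof, each index change would also redraw the canvas and update the displayed equation so that the visual state stays synchronized with the step.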
## System Architecture
The system utilizes a **Two-Stage AI Pipeline** orchestrated by a Python/Gradio core to transform abstract math concepts into interactive HTML5 applications.
1. **Ingestion (MCP & Web):** Users submit requests via MCP-enabled clients (Claude Desktop, VSCode) or the direct Web UI.
2. **Stage 1 - Analysis (Gemini 2.5 Flash):** The "Architect" model decomposes the mathematical concept into a structured `MathSpec` JSON, defining learning steps and visual logic.
3. **Stage 2 - Implementation (Gemini 3.0 Pro):** The "Builder" model consumes the spec to generate self-contained, interactive HTML5/Canvas code.
4. **Delivery:** The final executable app is rendered in the UI or returned to the MCP client for immediate use.
![architecture](./img/architecture.jpg)
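The data flow between the two stages can be sketched as follows. This is not the project's actual code: the `MathSpec` fields shown here (`concept`, `steps`, `title`, `explanation`), the prompts, and all function names are assumptions, and stub functions stand in for the Gemini calls so the example runs offline.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List

ModelCall = Callable[[str], str]  # takes a prompt, returns the model's text


@dataclass
class Step:
    title: str          # hypothetical field
    explanation: str    # hypothetical field


@dataclass
class MathSpec:
    concept: str        # hypothetical field
    steps: List[Step]


def analyze_concept(problem: str, call_model: ModelCall) -> MathSpec:
    """Stage 1: ask the analysis model for a JSON spec and parse it."""
    data = json.loads(call_model(f"Return a JSON lesson spec for: {problem}"))
    return MathSpec(concept=data["concept"], steps=[Step(**s) for s in data["steps"]])


def generate_app(spec: MathSpec, call_model: ModelCall) -> str:
    """Stage 2: pass the serialized spec to the code model and return its HTML."""
    return call_model("Build a self-contained HTML app for: " + json.dumps(asdict(spec)))


def run_pipeline(problem: str, stage1: ModelCall, stage2: ModelCall) -> str:
    return generate_app(analyze_concept(problem, stage1), stage2)


# Stub models used in place of real Gemini requests.
def fake_stage1(prompt: str) -> str:
    return json.dumps(
        {"concept": "Triangle angle sum",
         "steps": [{"title": "Draw", "explanation": "Draw a triangle."}]}
    )


def fake_stage2(prompt: str) -> str:
    return "<!DOCTYPE html><html><body><canvas></canvas></body></html>"


if __name__ == "__main__":
    print(run_pipeline("Prove the triangle angle sum", fake_stage1, fake_stage2))
```

Keeping the spec as a typed intermediate means Stage 2 never sees the raw user input, only the structured plan produced by Stage 1.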
**What's Included:**
- Functioning **MCP server** exposing tools for creating math specifications and building interactive proofs
- **Gradio UI** (`app.py`) for submitting text, URL, or image inputs and viewing generated proofs
- MCP **prompts** and **resources** accessible to MCP clients (Claude Desktop, VSCode, etc.)
## Usage Guide
### Generating Your First Proof
1. **Select Input Method**: Choose Text, Image, or URL
2. **Provide Your Problem**:
   - Text: "Prove that the sum of angles in a triangle is 180 degrees"
   - Image: Upload a photo of a textbook problem or handwritten equation
   - URL: Paste a link to a math problem, for example https://cemc.uwaterloo.ca/sites/default/files/documents/2025/POTWC-25-G-11-P.html
3. **Click Generate**: The AI will analyze and create your interactive proof
4. **Explore**: Navigate through the steps in the Guided Proof tab
### Refining Your Proof
1. **View the Generated Proof**: Check the interactive simulation
2. **Provide Feedback**: Type suggestions like "Make the triangle larger" or "Add labels to the vertices"
3. **Apply Refinement**: The AI will regenerate with your feedback
## Technical Details
### MCP Integration Architecture & Programmatic Flow
The diagram below illustrates how the Gradio application exposes its functionality for programmatic access via the Model Context Protocol (MCP). The **Gradio App Server** acts as the central hub, exposing capabilities in two categories:
* **MCP Prompts:** High-level, conversational wrappers (e.g., `create_visual_math_proof`) for guiding multi-step workflows.
* **MCP Resources / Tools:** Direct, callable functions (e.g., `create_math_specification_from_text`, `build_interactive_proof_from_specification`) for specific tasks.
MCP-aware clients or agents can interact with these components by first **discovering** the available tools through the server schema and then **authenticating** with an API key.
The bottom section of the diagram details a typical **programmatic flow**:
1. **Discover** available prompts and resources.
2. **Call** a `create_math_specification_from_*` tool with the appropriate input (Text, URL, or Image) to generate a structured `MathSpec JSON`.
3. **Call** the `build_interactive_proof_from_specification` tool with the generated JSON to produce the final, self-contained `index.html` interactive proof.
![programmatic-flow](./img/programmatic-flow.jpg)
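At the protocol level, MCP messages are JSON-RPC 2.0 requests, with `tools/list` for discovery and `tools/call` for invocation. The sketch below builds the three request payloads for the flow above without sending them. The argument keys `text` and `specification` are assumptions; the real parameter names come from the schema returned by `tools/list`. An MCP client library would normally also handle the initialization handshake and transport.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids must be unique per session


def rpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def call_tool(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request for the named tool."""
    return rpc_request("tools/call", {"name": name, "arguments": arguments})


# Step 1: discover the available tools.
discover = rpc_request("tools/list")

# Step 2: request a MathSpec from text (argument key is hypothetical).
make_spec = call_tool(
    "create_math_specification_from_text",
    {"text": "Prove that the sum of angles in a triangle is 180 degrees"},
)

# Step 3: build the proof from the spec returned by step 2 (placeholder value).
build = call_tool(
    "build_interactive_proof_from_specification",
    {"specification": "<MathSpec JSON from step 2>"},
)

if __name__ == "__main__":
    for message in (discover, make_spec, build):
        print(json.dumps(message, indent=2))
```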
***Note**: All data is processed securely. Your API key is only used to communicate with Google's Gemini API.*
## Resources & Links
### Live Demo & Demo Video
- **Live Demo**: [StepWise Math](https://huggingface.co/spaces/MCP-1st-Birthday/StepWise-Math-AI)
- **Demo Video**: [StepWise Math AI](https://www.youtube.com/watch?v=UjYFjKAgJh0)
### Social Media
Read the announcement and join the discussion:
- **Blog Post**: [Bringing Math to Life: Building StepWise Math for the MCP Hackathon](https://huggingface.co/blog/MCP-1st-Birthday/bringing-math-to-life-stepwise-math)
- **Twitter/X**: [Introducing StepWise Math](https://x.com/vikasgupta1812/status/1993875748948447272?s=20)
- **LinkedIn**: [StepWise Math - An MCP Server Built with Gradio and Gemini](https://www.linkedin.com/posts/vikasgupta1812_im-excited-to-share-stepwise-math-my-submission-activity-7399645567385264128-aRFu)
- **Discord**: [HuggingFace Discord](https://discord.com/channels/879548962464493619/1443152796500103274/1443440922137198673)
## Judging Criteria
This submission addresses all hackathon requirements for Track 1 (Building MCP) as follows:
- **Completion**:
  - [x] [Live HF Space](https://huggingface.co/spaces/MCP-1st-Birthday/StepWise-Math-AI)
  - [x] [Demo Video](https://www.youtube.com/watch?v=UjYFjKAgJh0)
  - [x] [Social Posts](#social-media)
  - [x] Complete Documentation
- **UI/UX Polish**: Dark-themed interface with auto-loaded examples, collapsible accordions for advanced features, and responsive iframe rendering for seamless proof exploration
- **Functionality**: MCP server at `/gradio_api/mcp/` exposing 4 tools (`create_math_specification_from_text/url/image`, `build_interactive_proof_from_specification`), 3 prompts and 4 resources for programmatic access from Claude Desktop, VSCode, and other MCP clients
- **Creativity**: Two-stage AI pipeline (Stage 1: Gemini 2.5 Flash concept analysis + Stage 2: Gemini 3.0 Pro code generation), multi-modal input (text/URL/image with OCR), streaming thought processes, and interactive HTML5/Canvas visualizations
- **Documentation**: Comprehensive [README.md](./README.md), [PRD.md](./PRD.md) with technical specs, [demo video](https://www.youtube.com/watch?v=UjYFjKAgJh0), and inline code documentation
## Contributing & Support
**Contributions Welcome!** Areas of interest:
- Performance optimizations
- UI/UX improvements
- Additional mathematical domains
- Documentation and examples
**Support Channels:**
- **Discussions**: [HuggingFace Discussions](https://huggingface.co/spaces/MCP-1st-Birthday/StepWise-Math-AI/discussions)
## Acknowledgments
- **Google AI** for the incredible Gemini models
- **Gradio** for the amazing Python web framework
- **Nano Banana** for image assets
- **GitHub Copilot** for Vibe Coding
## License
This project is licensed under the MIT License. Original work created for the MCP's 1st Birthday Hackathon (November 2025).
<div align="center">
**Built with ❤️ for visual learners everywhere**
*Making abstract math concrete, one proof at a time*
**Like this space if you found it helpful!**
</div>