---
title: Chronos 2 Forecasting
emoji: 📈
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
---
# Chronos 2 Time Series Forecasting Application
A production-ready web application for testing Amazon's **Chronos 2** time series forecasting model using the latest `Chronos2Pipeline` API. Built with Dash for enterprise scalability and designed for both local development and cloud deployment.
## Features
- **Latest Chronos 2 API**: Uses `Chronos2Pipeline.predict_df()` with a DataFrame-based interface
- **Interactive Forecasting**: Generate forecasts up to 365 days with adjustable confidence intervals
- **Dual Model Support**: Switch between Fast (Chronos-Bolt) and Accurate (Chronos-2) variants
- **Multivariate Ready**: Built on Chronos 2 architecture supporting multivariate forecasting
- **Flexible Data Input**: Upload CSV/Excel files or use sample datasets
- **Rich Visualizations**: Interactive Plotly charts with confidence bands and zoom capabilities
- **Data Quality Analysis**: Automatic preprocessing with quality reports
- **GPU Acceleration**: Automatic CUDA support with CPU fallback
- **Security Hardened**: Non-root Docker containers, server-side validation, filename sanitization
- **Production Ready**: Designed for deployment on local machines or Databricks Apps
## Architecture
Built following best practices for scalability and maintainability:
- **Dash Framework**: Designed to scale to many concurrent users when run behind multiple workers
- **Plotly Visualizations**: Smooth rendering of 100K+ data points
- **Model Caching**: Chronos 2 loaded once at startup for fast inference
- **Client-Side State**: Efficient state management without server sessions
- **Modular Design**: Clean separation of components, services, and utilities
## Installation
### Prerequisites
- Python 3.10+
- CUDA-capable GPU (optional, for faster inference)
- 8GB+ RAM (4-8GB for model + overhead)
### Local Setup
1. **Clone the repository**
```bash
git clone <repository-url>
cd chronos2-forecasting-app
```
2. **Create a virtual environment**
```bash
python -m venv venv
# On Windows
venv\Scripts\activate
# On Linux/Mac
source venv/bin/activate
```
3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Run the application**
```bash
python app.py
```
5. **Access the app**
Open your browser to `http://127.0.0.1:8050`
## Usage Guide
### Quick Start
1. **Load Sample Data**
- Click one of the sample dataset buttons (Electricity, Retail, Manufacturing)
- Or upload your own CSV/Excel file
2. **Configure Data**
- Select the date column
- Select the target variable to forecast
- (Optional) Select an ID column when the file contains multiple series
3. **Set Forecast Parameters**
- Adjust the forecast horizon (1-365 days)
- Select confidence levels (80%, 90%, 95%, 99%)
- Choose model variant (Fast or Accurate)
4. **Generate Forecast**
- Click the "Generate Forecast" button
- Wait for model inference (typically 1-5 seconds)
- View interactive chart with confidence intervals
### Data Requirements
Your data should have:
- **Date column**: Any standard date format
- **Target column**: Numeric values to forecast
- **Minimum rows**: At least 2x the forecast horizon
- **File size**: Up to 100MB
- **Formats**: CSV, XLSX, XLS
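
The requirements above can be expressed as a small pre-flight check. The sketch below is an assumption about how such a check might look, not the code in `utils/validators.py`; `check_upload` is a hypothetical helper, and treating 100MB as `100 * 1024 * 1024` bytes is also an assumption.

```python
from pathlib import Path

MAX_FILE_BYTES = 100 * 1024 * 1024  # 100MB upload limit (byte interpretation assumed)
ALLOWED_SUFFIXES = {".csv", ".xlsx", ".xls"}


def check_upload(filename: str, size_bytes: int, n_rows: int, horizon: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the upload passes."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("unsupported file format")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 100MB")
    if n_rows < 2 * horizon:
        problems.append(f"need at least {2 * horizon} rows for a horizon of {horizon}")
    return problems
```

For example, a 30-day horizon needs at least 60 rows of history, so `check_upload("sales.csv", 1024, 60, 30)` returns an empty list.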
### Tips for Best Results
- Use at least 2x the forecast horizon in historical data
- Clean your data before upload (though the app handles basic preprocessing)
- Start with the Fast model variant for quick testing
- Use the Accurate variant for final forecasts
- Higher confidence levels produce wider intervals, giving more conservative uncertainty estimates
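
To see why a higher confidence level gives a wider band, it helps to map each level to the pair of quantiles that bound it. The sketch below assumes central (equal-tailed) intervals; `interval_quantiles` is a hypothetical helper and the app may compute its bands differently.

```python
def interval_quantiles(confidence: float) -> tuple[float, float]:
    """Map a central confidence level (e.g. 0.80) to lower/upper quantile levels."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    tail = (1.0 - confidence) / 2.0  # probability mass left in each tail
    return round(tail, 4), round(1.0 - tail, 4)
```

Under this assumption, an 80% level corresponds to the 0.1 and 0.9 quantiles, while a 95% level stretches out to 0.025 and 0.975.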
## Project Structure
```
chronos2-forecasting-app/
├── app.py                  # Main Dash application
├── components/             # UI components
│   ├── upload.py           # File upload component
│   ├── chart.py            # Chart generation
│   └── controls.py         # Parameter controls
├── services/               # Business logic
│   ├── model_service.py    # Chronos model wrapper
│   ├── data_processor.py   # Data preprocessing
│   └── cache_manager.py    # Caching logic
├── utils/                  # Utilities
│   ├── validators.py       # Input validation
│   └── metrics.py          # Forecast metrics
├── config/                 # Configuration
│   ├── settings.py         # Environment settings
│   └── constants.py        # App constants
├── datasets/               # Sample datasets
├── static/                 # Static assets
│   └── custom.css          # Custom styles
├── requirements.txt        # Python dependencies
├── Dockerfile              # Container definition
└── README.md               # This file
```
## Configuration
### Environment Variables
- `ENVIRONMENT`: Set to `local` or `production`
- `DEVICE`: Set to `auto`, `cuda`, or `cpu`
- `LOG_LEVEL`: Set to `DEBUG`, `INFO`, `WARNING`, or `ERROR`
- `DATABRICKS_APP_PORT`: Port for Databricks deployment (default: 8080)
### Local vs Databricks Configuration
The app automatically detects the environment and adjusts settings:
**Local Development:**
- Host: 127.0.0.1
- Port: 8050
- Debug: Enabled
- Storage: Local directories
**Databricks Deployment:**
- Host: 0.0.0.0
- Port: 8080 (or DATABRICKS_APP_PORT)
- Debug: Disabled
- Storage: /tmp and /dbfs
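
The local and Databricks profiles above can be derived from the environment variables in a few lines. This is a minimal sketch, not the contents of `config/settings.py`: the `Settings` class and `load_settings` function are hypothetical, the switch is assumed to be driven by `ENVIRONMENT`, and the defaults for `DEVICE` and `LOG_LEVEL` are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    debug: bool
    device: str
    log_level: str


def load_settings(env=None) -> Settings:
    """Build settings from environment variables, defaulting to local development."""
    env = os.environ if env is None else env
    production = env.get("ENVIRONMENT", "local") == "production"
    return Settings(
        host="0.0.0.0" if production else "127.0.0.1",
        port=int(env.get("DATABRICKS_APP_PORT", "8080")) if production else 8050,
        debug=not production,
        device=env.get("DEVICE", "auto"),      # default is an assumption
        log_level=env.get("LOG_LEVEL", "INFO"),  # default is an assumption
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.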
## Deployment
### Hugging Face Spaces (Recommended for Free Hosting)
The easiest way to deploy this app for free:
1. **Create a Hugging Face account** at https://huggingface.co
2. **Create a new Space**
- Go to https://huggingface.co/spaces
- Click "Create new Space"
- Select "Docker" as the SDK (matching `sdk: docker` in this README's metadata)
- Choose a name for your Space
3. **Upload your code**
- Option A: Connect your GitHub repository (recommended)
- Option B: Upload files directly through the web interface
4. **Configure the Space**
- The Space builds from the `Dockerfile`, which should launch `app.py`
- The free CPU tier of Hugging Face Spaces provides 16GB RAM (sufficient for Chronos-2)
- Optional: Request GPU upgrade for faster inference
5. **Access your deployed app**
- Your app will be live at: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`
**Note**: First startup may take 2-3 minutes as the Chronos-2 model downloads (~500MB).
### Docker Deployment
1. **Build the image**
```bash
docker build -t chronos2-forecasting .
```
2. **Run the container**
```bash
docker run -p 8080:8080 chronos2-forecasting
```
3. **With GPU support**
```bash
docker run --gpus all -p 8080:8080 chronos2-forecasting
```
### Databricks Apps Deployment
1. **Upload code to DBFS**
```bash
databricks fs cp -r . dbfs:/apps/chronos2-forecasting/
```
2. **Create Databricks App**
- Use the Databricks Apps UI
- Point to the uploaded directory
- Set environment variable: `ENVIRONMENT=production`
3. **Configure resources**
- Minimum: 8GB RAM
- Recommended: GPU instance for faster inference
### Production Considerations
- **Memory**: Allocate 6-8GB for the model + overhead
- **Scaling**: Use multiple workers with Gunicorn
- **Monitoring**: Check `/health` endpoint for status
- **Logging**: Logs to stdout for easy collection
- **Timeouts**: Set to 300s+ for large forecasts
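
Gunicorn reads its configuration from a Python file, so the scaling, logging, and timeout points above can be captured in one place. The file name and values below are illustrative assumptions rather than files shipped with this project.

```python
# gunicorn.conf.py (hypothetical file name and values; adjust to your deployment)
import os

# Listen on all interfaces; the port follows DATABRICKS_APP_PORT with the documented default.
bind = f"0.0.0.0:{os.environ.get('DATABRICKS_APP_PORT', '8080')}"

# Each worker is a separate process and typically holds its own copy of the model,
# so total memory grows roughly with the worker count.
workers = 4

# Allow long-running forecasts to finish before Gunicorn restarts the worker.
timeout = 300

# Send access and error logs to stdout/stderr for collection by the platform.
accesslog = "-"
errorlog = "-"
```

It could then be started with `gunicorn -c gunicorn.conf.py app:server`, assuming `app.py` exposes the underlying Flask server as `server` (a common Dash convention, not confirmed by this README).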
## API Reference
### Health Check Endpoint
```
GET /health
```
Returns:
```json
{
"status": "healthy",
"model_loaded": true,
"model_variant": "fast",
"device": "cuda"
}
```
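
A monitoring script can poll this endpoint with the standard library alone. The sketch below assumes the response shape shown above; `parse_health` and `check_health` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def parse_health(body: str) -> bool:
    """Return True only when the service reports healthy with the model loaded."""
    payload = json.loads(body)
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch <base_url>/health and interpret the JSON body; errors count as unhealthy."""
    try:
        with urlopen(f"{base_url.rstrip('/')}/health", timeout=timeout) as response:
            return parse_health(response.read().decode("utf-8"))
    except (OSError, ValueError):  # connection failures, HTTP errors, malformed JSON
        return False
```

For a local run this would be called as `check_health("http://127.0.0.1:8050")`.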
## Troubleshooting
### Model Loading Issues
**Problem**: Model fails to load
- Check available memory (need 4-8GB)
- Try CPU mode: Set `DEVICE=cpu`
- Check internet connection (first run downloads model)
### GPU Not Detected
**Problem**: CUDA device not found
- Verify CUDA installation: `python -c "import torch; print(torch.cuda.is_available())"`
- Install correct PyTorch version for your CUDA
- App will automatically fall back to CPU
### Upload Failures
**Problem**: File upload fails
- Check file size (<100MB)
- Verify file format (CSV, XLSX, XLS)
- Ensure file is not corrupted
### Slow Performance
**Problem**: Forecasts take too long
- Use Fast model variant instead of Accurate
- Reduce forecast horizon
- Enable GPU acceleration
- Limit data points (app decimates to 10K for display)
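
Display decimation of the kind mentioned in the last point can be as simple as keeping every k-th sample. The sketch below uses plain stride-based downsampling, which is an assumption; the app's chart code may use a different method, and `decimate` is a hypothetical name.

```python
import math


def decimate(values: list[float], max_points: int = 10_000) -> list[float]:
    """Keep every k-th point so at most max_points remain; short series pass through."""
    if max_points < 1:
        raise ValueError("max_points must be positive")
    if len(values) <= max_points:
        return list(values)
    step = math.ceil(len(values) / max_points)  # smallest stride that fits the budget
    return values[::step]
```

Stride-based decimation is cheap but can skip short spikes, so it suits display only; forecasting should still use the full series.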
### Memory Errors
**Problem**: Out of memory during inference
- Switch to Fast model variant (smaller)
- Use CPU instead of GPU if GPU memory is the limiting factor
- Reduce batch size in `services/model_service.py`
- Close other applications
## Performance Tuning
### For Development
- Enable debug mode for detailed logging
- Use Fast model variant
- Work with smaller datasets initially
### For Production
- Disable debug mode
- Use GPU for inference
- Enable caching (already configured)
- Use Gunicorn with 4 workers
- Set up monitoring and alerting
## Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## License
This project is licensed under Apache-2.0, as declared in the Space metadata, and is provided as-is for educational and research purposes.
## Acknowledgments
- **Chronos Model**: Amazon Science
- **Dash Framework**: Plotly
- **Sample Data**: Generated for demonstration purposes
## Support
For issues, questions, or suggestions:
- Open an issue in the repository
- Check existing documentation
- Review troubleshooting guide above
## Changelog
### Version 1.0.1 (Latest - Chronos 2 Full Implementation)
- **BREAKING**: Migrated to Chronos 2 API with `Chronos2Pipeline`
- Fixed deprecated pandas methods (`fillna(method=...)` → `ffill()`/`bfill()`)
- Updated to `chronos-forecasting==2.0.0` package
- Fixed type hints (`any` → `Any`) across all modules
- Added DataFrame-based prediction interface
- Security improvements:
- Non-root user in Docker container
- Server-side file validation
- Filename sanitization
- Health check timeout configuration
- Updated model paths to support Chronos-2 (s3://autogluon/chronos-2)
- Fixed data format compatibility (id/timestamp/target columns)
- Added `requests` library for health checks
### Version 1.0.0 (Initial Release)
- Chronos 2 model integration
- Single-page Dash application
- CSV/Excel upload support
- Interactive visualizations
- Confidence interval display
- Sample datasets included
- Docker deployment ready
- Databricks Apps compatible
## Roadmap
Future enhancements being considered:
- Multi-series forecasting UI
- Model comparison features
- Export forecast results
- Custom model fine-tuning
- Real-time data streaming
- Advanced metrics dashboard
- API-only mode for programmatic access
---
Built with Dash and Chronos 2 for production-ready time series forecasting.