metadata
title: PDF OCR (Detectron2 + TrOCR)
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
PDF OCR (Detectron2 + TrOCR) - Hugging Face Spaces
This repo contains a deployable Gradio app that detects text lines with Detectron2 and reads them with TrOCR. Optional Gemini correction can refine the text.
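The data flow described above (detect lines, read each line, optionally correct the text) can be sketched roughly as follows. This is only an illustration, not the code in inference.py: `run_page`, `Line`, the box layout, and the three callables are hypothetical names, and sorting boxes top to bottom is an assumed way to get reading order.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumed box layout: (x0, y0, x1, y1) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Line:
    box: Box
    text: str


def run_page(
    page: object,
    detect_lines: Callable[[object], List[Box]],   # stand-in for the Detectron2 step
    read_line: Callable[[object, Box], str],       # stand-in for the TrOCR step
    correct: Optional[Callable[[str], str]] = None,  # stand-in for Gemini correction
) -> List[Line]:
    """Detect line boxes, read each one, and optionally correct the text."""
    # Sort top to bottom, then left to right, so output follows reading order.
    boxes = sorted(detect_lines(page), key=lambda b: (b[1], b[0]))
    lines = []
    for box in boxes:
        text = read_line(page, box)
        if correct is not None:
            text = correct(text)
        lines.append(Line(box=box, text=text))
    return lines
```

Passing the detector, recognizer, and corrector as callables keeps the sketch free of model dependencies; in a real run they would wrap the Detectron2 predictor, the TrOCR model, and the Gemini call.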
Files
- app.py: Gradio UI
- inference.py: OCR pipeline (Detectron2 + TrOCR)
- requirements.txt: Python dependencies (Detectron2 installed in Dockerfile)
- Dockerfile: CUDA-enabled image for GPU Space
- model_final.pth: Detectron2 weights
Deploy on Hugging Face Spaces (Docker Space)
- Create a new Space on Hugging Face → Type: Docker → Hardware: GPU (T4/A10G).
- Push these files to the Space repository (or connect this folder and git push).
- Set optional secret: GEMINI_API_KEY (for correction) in Space Settings → Secrets.
- Wait for the build to finish. The app will start on port 7860.
Use
- Upload a PDF.
- (Optional) Toggle Split-page and Gemini correction. Split-page currently has no effect: the standard pipeline is used either way.
- Click Process.
- Download the ZIP of per-page JSONs. The full combined text is shown in the textbox.
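The two outputs in the last step (a ZIP of per-page JSONs and the combined text) could be produced along these lines. This is a minimal sketch using only the standard library; `build_outputs`, the `page_001.json` naming, and the `{"page", "lines"}` schema are assumptions for illustration, not the app's actual format.

```python
import io
import json
import zipfile
from typing import List, Tuple


def build_outputs(page_texts: List[List[str]]) -> Tuple[bytes, str]:
    """Return (zip_bytes, combined_text) for pages given as lists of line strings."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for i, lines in enumerate(page_texts, start=1):
            # Hypothetical per-page schema: page number plus recognized lines.
            record = {"page": i, "lines": lines}
            zf.writestr(
                f"page_{i:03d}.json",
                json.dumps(record, ensure_ascii=False, indent=2),
            )
    # Lines are joined with newlines; pages are separated by a blank line.
    combined = "\n\n".join("\n".join(lines) for lines in page_texts)
    return buf.getvalue(), combined
```

Building the archive in memory avoids temporary files; the bytes can be written to disk wherever the UI needs a downloadable path.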
Local run (GPU recommended)
docker build -t ocr-app .
docker run --gpus all -p 7860:7860 ocr-app
Then open http://localhost:7860
Notes
- Detectron2 requires GPU for reasonable speed; CPU will be slow.
- TEXTLINE_MODEL_PATH can be overridden via env var if the weights are elsewhere.
- TrOCR models are downloaded on first run and cached in the container layer after warmup.
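One way the environment-based configuration might be read is sketched below. `Settings` and `load_settings` are hypothetical names, and the default path `model_final.pth` is an assumption based on the file list above; only the variable names TEXTLINE_MODEL_PATH and GEMINI_API_KEY come from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    model_path: str
    gemini_api_key: Optional[str]

    @property
    def correction_available(self) -> bool:
        # Gemini correction is optional and only possible when a key is set.
        return self.gemini_api_key is not None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Assumed default: weights sit next to the app as model_final.pth.
    model_path = env.get("TEXTLINE_MODEL_PATH", "model_final.pth")
    # Treat an empty or whitespace-only secret as unset.
    key = (env.get("GEMINI_API_KEY") or "").strip() or None
    return Settings(model_path=model_path, gemini_api_key=key)
```

For a local run, the same variables can be passed to the container with `docker run -e TEXTLINE_MODEL_PATH=... -e GEMINI_API_KEY=...`.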