John Ho PRO
- Running, Featured, 238
PaddleOCR-VL Online Demo
Extract text, tables, formulas, and charts from images
- Running on Zero, Featured, 449
DeepSeek OCR Demo
An interactive demo for the DeepSeek-OCR model.
- Running on Zero, Featured, 109
LightOnOCR 2 1B Demo
Extract text from images or PDFs with OCR
- Running on Zero, MCP, Featured, 142
Multimodal OCR2
FireRed / Nanonets / Monkey / Thyme / Typhoon / SmolDocling
- Build error, 51
Quant
Display interactive data visualizations and apps
- Running, Featured, 46
Porting nanochat to Transformers: an AI modeling history lesson
Learn about ML and Transformers through nanochat
- Running on CPU Upgrade, Featured, 3.1k
The Smol Training Playbook
The secrets to building world-class LLMs
- Running on Zero, Featured, 837
Florence 2
Perform image captioning, detection, OCR and more with Florence-2
- Runtime error, Featured, 515
Florence2 + SAM2
Segment and caption objects in images and videos
- Running on T4, Featured, 117
SAM2 Video Predictor
Segment objects in a video with click-based masks
- Running, 23
SAM2 Video Predictor
Segment and track objects in videos
- EvanZhouDev/open-genmoji
Text-to-Image • Updated • 294 • 68
- Running on Zero, Featured, 656
ACE Step
A Step Towards Music Generation Foundation Model
- Running on Zero, Featured, 598
DreamO
A Unified Framework for Image Customization
- Running on Zero, Featured, 984
Tile Upscaler
Enhance and upscale images with HDR and AI details
- Configuration error, Featured, 1.45k
EasyControl Ghibli
New Ghibli EasyControl model is now released!!
- akiyamasho/AnimeBackgroundGAN-Miyazaki
Image-to-Image • Updated • 25
- Build error, 72
Ghibli Multilingual Text-Rendering
Elevating Ghibli-style AI art beyond ChatGPT's capabilities.
- Running on A100, MCP, 45
EasyControl Ghibli
New Ghibli EasyControl model is now released!!
- Running on Zero, Featured, 62
LightGlue
LightGlue demo
- Running on Zero, MCP, Featured, 33
Qwen3 VL HF Demo
Object Detection, Visual Grounding, Keypoint Detection
- prithivMLmods/MetaCLIP-2-Age-Range-Estimator
Image Classification • 21.7M • Updated • 15 • 7
- Running, Featured, 737
Remove Background Web
In-browser background removal
- Runtime error, 16
AI Video Editor
Create videos with FFMPEG + Qwen2.5-Coder
- Searchium-ai/clip4clip-webvid150k
Text-to-Video • 0.2B • Updated • 621 • 44
- Configuration error, Featured, 446
FastVLM WebGPU
Real-time video captioning powered by FastVLM
- Runtime error, Featured, 36
AudioRag Demo
Search audio for relevant chunks
- Running on T4, Featured, 467
Parakeet-TDT-0.6b-V2
Transcribe audio files with timestamps and download transcripts
- Running on Zero, 52
Fast Whisper Turbo
Ultra-fast Whisper Turbo inference
- openai/whisper-large-v3-turbo
Automatic Speech Recognition • 0.8B • Updated • 5.75M • 2.91k
- Running on Zero, Featured, 346
Realtime Whisper Turbo
Realtime implementation of Whisper large turbo
- Running on T4, 131
RF-DETR
SOTA real-time object detection model
- Running on CPU Upgrade, 50
YOLO ARENA
compare performance of top object detectors
- Running, 23
SAM2 Video Predictor
Segment and track objects in videos
- Running on Zero, Featured, 114
VLM Object Understanding
Explore object detection, visual grounding, keypoint Detecti
- Running on Zero, Featured, 110
Qwen2 VL Localization
Detect objects in images using text prompts
- Build error, Featured, 160
Seed1.5 VL
Seed1.5-VL API Demo
- Runtime error, 2
Vision Language SmolVLM2
Video + text to text with SmolVLM2
- Running on Zero, Featured, 142
Gemma 3n E4B It
Chat with a multimodal assistant using text, images, audio, or video
- Runtime error, 9
Cantonese TTS Text To Speech
Generate Cantonese speech from text
- Runtime error, 4
Cantonese TTS Playground
Generate speech from Cantonese text using selected or custom voice
- Running on Zero, Featured, 1.76k
Dia 1.6B
Generate realistic dialogue from a script, using Dia!
- Runtime error, Featured, 81
Daily Paper Podcast
Generates a podcast about today's top trending paper.