EchoGemma
EchoGemma generates multimodal echocardiography reports from DICOM studies. It combines an EchoPrime video encoder with a LoRA-fine-tuned MedGemma language model to process a full echocardiographic study and produce a clinical text report.
Input
A folder of DICOM echocardiography video files that together form one complete study. The model processes all video clips in the folder, extracts embeddings and view classifications, and then generates a structured clinical report.
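
As a rough illustration of what "a folder of DICOM files" means in practice, the sketch below gathers candidate files from a study folder before they are handed to the model. It is not part of EchoGemma: the helper names `is_dicom_file` and `collect_study_files` are hypothetical, and the recursive search is an assumption about folder layout. The check relies only on the standard DICOM Part 10 file layout (a 128-byte preamble followed by the marker `DICM`), so legacy files without that preamble would be skipped.

```python
from pathlib import Path

DICOM_MAGIC = b"DICM"
PREAMBLE_LENGTH = 128  # bytes that precede the marker in a DICOM Part 10 file


def is_dicom_file(path: Path) -> bool:
    """Return True if the file has the 'DICM' marker after the 128-byte preamble."""
    try:
        with path.open("rb") as handle:
            handle.seek(PREAMBLE_LENGTH)
            return handle.read(4) == DICOM_MAGIC
    except OSError:
        # Unreadable files are treated as non-DICOM rather than aborting the scan.
        return False


def collect_study_files(study_dir: str) -> list[Path]:
    """Hypothetical helper: list DICOM files under study_dir in sorted order."""
    root = Path(study_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"Study folder not found: {study_dir}")
    # Assumption: clips may sit in subfolders, so the search is recursive.
    return sorted(p for p in root.rglob("*") if p.is_file() and is_dicom_file(p))
```

Calling `collect_study_files("path/to/study")` returns the paths of files that look like DICOM, which is a cheap way to catch an empty or mistyped study folder before loading any model weights.
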
Output
A structured echocardiography text report.
Requirements
- Python >= 3.10
- PyTorch >= 2.10
- CUDA-capable GPU (recommended)
- ~18 GB disk space for model weights
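
The requirements above can be checked before downloading any weights. The following is a minimal sketch, not a script shipped with EchoGemma: `check_environment` and `parse_major_minor` are hypothetical names, the 18 GB figure is taken from the list above as an approximate threshold, and the weights location (`weights_dir`) is an assumption.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)
MIN_TORCH = (2, 10)
REQUIRED_DISK_GB = 18  # approximate size of the model weights


def parse_major_minor(version: str) -> tuple[int, int]:
    """Turn a version string such as '2.10.0+cu128' into (2, 10)."""
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def check_environment(weights_dir: str = ".") -> list[str]:
    """Hypothetical helper: return a list of unmet or unconfirmed requirements."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python >= 3.10 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if parse_major_minor(torch.__version__) < MIN_TORCH:
            problems.append(f"PyTorch >= 2.10 is required, found {torch.__version__}")
        if not torch.cuda.is_available():
            # A GPU is recommended, not mandatory, so this is only a warning.
            problems.append("No CUDA-capable GPU detected (recommended)")
    free_gb = shutil.disk_usage(weights_dir).free / 1e9
    if free_gb < REQUIRED_DISK_GB:
        problems.append(f"Only {free_gb:.1f} GB free, about {REQUIRED_DISK_GB} GB needed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty list means every listed requirement appears to be met on the current machine.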