| # Audio Input Display Fix | |
| ## Issue | |
| The audio input (microphone button) was not displaying in the ChatInterface multimodal textbox. | |
| ## Root Cause | |
| When `multimodal=True` is set on `gr.ChatInterface`, the textbox is expected to show image and audio buttons automatically. Several things can prevent that: | |
| 1. The buttons might be hidden in a dropdown menu | |
| 2. Browser permissions might be blocking microphone access | |
| 3. The `file_types` parameter might not have been explicitly set | |
| ## Fix Applied | |
| ### 1. Added `file_types` Parameter | |
| Explicitly specified which file types are accepted to ensure audio is enabled: | |
| ```python | |
| gr.ChatInterface( | |
| fn=research_agent, | |
| multimodal=True, | |
| file_types=["image", "audio", "video"], # Explicitly enable image, audio, and video | |
| ... | |
| ) | |
| ``` | |
| **File:** `src/app.py` (line 929) | |
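In some Gradio releases the microphone button on `gr.MultimodalTextbox` is controlled by a `sources` argument rather than by `file_types` alone. If the button stays hidden after the change above, passing a preconfigured textbox to `gr.ChatInterface` may be worth trying. The sketch below assumes the installed Gradio version accepts `sources=["upload", "microphone"]` and a `textbox=` argument; the helper names `textbox_options` and `build_chat` are hypothetical and not part of `src/app.py`.

```python
def textbox_options(enable_microphone: bool = True) -> dict:
    """Keyword arguments for gr.MultimodalTextbox (assumed, version-dependent)."""
    sources = ["upload"]
    if enable_microphone:
        # Assumption: "microphone" is a valid source in the installed version.
        sources.append("microphone")
    return {
        "file_types": ["image", "audio", "video"],
        "sources": sources,
    }


def build_chat(fn):
    """Build a ChatInterface with an explicitly configured textbox."""
    import gradio as gr  # imported lazily so textbox_options works without Gradio

    textbox = gr.MultimodalTextbox(**textbox_options())
    return gr.ChatInterface(fn=fn, multimodal=True, textbox=textbox)
```

Calling `build_chat(research_agent).launch()` would then serve the interface. Check the installed version's `MultimodalTextbox` signature before relying on `sources`.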
| ### 2. Enhanced UI Description | |
| Updated the description to make it clearer where to find the audio input: | |
| - Added explicit instructions about clicking the 📷 and 🎤 icons | |
| - Added a tip about looking for icons in the text input box | |
| - Clarified drag & drop functionality | |
| **File:** `src/app.py` (lines 942-948) | |
| ## How It Works Now | |
| 1. **Audio Recording Button**: The 🎤 microphone icon should appear in the textbox toolbar when `multimodal=True` is set | |
| 2. **File Upload**: Users can drag & drop audio files or click to upload | |
| 3. **Browser Permissions**: The browser prompts for microphone access when the user clicks the audio button | |
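On the Python side, a chat function used with `multimodal=True` receives its message as a dictionary with `"text"` and `"files"` keys. A minimal sketch of separating audio uploads from other attachments follows. It assumes each entry in `"files"` is either a path string or a dictionary with a `"path"` key, which may differ between Gradio versions. The helper name `split_audio_files` is hypothetical.

```python
import mimetypes


def split_audio_files(message: dict) -> tuple[list[str], list[str]]:
    """Return (audio_paths, other_paths) from a multimodal chat message."""
    audio, other = [], []
    for entry in message.get("files") or []:
        # Assumption: entries are path strings, or dicts carrying a "path" key.
        path = entry["path"] if isinstance(entry, dict) else str(entry)
        mime, _ = mimetypes.guess_type(path)
        if mime is not None and mime.startswith("audio/"):
            audio.append(path)
        else:
            other.append(path)
    return audio, other
```

A function such as `research_agent` could call this first and route the audio paths to transcription before handling the text.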
| ## Testing | |
| To verify the fix: | |
| 1. Look for the 🎤 microphone icon in the text input box | |
| 2. Click it to start recording (browser will ask for microphone permission) | |
| 3. Alternatively, drag & drop an audio file into the textbox | |
| 4. Check browser console for any permission errors | |
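For the drag & drop step, a small known-good audio file helps separate upload problems from microphone problems. The following sketch writes a short sine tone as a WAV file using only the standard library. The function name, the default tone, and the sample rate are arbitrary choices for illustration.

```python
import math
import struct
import wave


def write_test_tone(path: str, seconds: float = 1.0, freq_hz: float = 440.0,
                    sample_rate: int = 16000) -> int:
    """Write a mono 16-bit sine tone to `path`; return the number of frames."""
    n_frames = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        # Half amplitude keeps the tone well inside the 16-bit range.
        sample = 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

Running `write_test_tone("tone.wav")` produces a file that can be dropped into the textbox. If the upload works but the microphone button does not, the cause is more likely permissions or configuration than file handling.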
| ## Browser Requirements | |
| - **Chrome/Edge**: Should work with microphone permissions | |
| - **Firefox**: Should work with microphone permissions | |
| - **Safari**: May require additional configuration | |
| - **HTTPS Required**: Microphone access typically requires HTTPS (or localhost) | |
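The HTTPS requirement can be checked against a deployment URL before debugging anything else. The sketch below is a rough approximation of the browser's secure-context rule, not the full specification: it treats HTTPS and HTTP on a loopback host as acceptable. The function name `allows_microphone` is hypothetical.

```python
from urllib.parse import urlparse


def allows_microphone(url: str) -> bool:
    """Rough check: is `url` served over HTTPS or from a loopback host?"""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if parts.scheme == "https":
        return True
    # Browsers generally treat loopback addresses as secure even over HTTP.
    is_loopback = host in {"localhost", "127.0.0.1", "::1"} or host.endswith(".localhost")
    return parts.scheme == "http" and is_loopback
```

For example, `allows_microphone("http://192.168.1.20:7860")` returns `False`: an app served over plain HTTP on a LAN address is a common reason for the microphone prompt never appearing.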
| ## Troubleshooting | |
| If audio input still doesn't appear: | |
| 1. **Check Browser Permissions**: | |
| - Open browser settings | |
| - Check microphone permissions for the site | |
| - Ensure microphone is not blocked | |
| 2. **Check Browser Console**: | |
| - Open Developer Tools (F12) | |
| - Look for permission errors or warnings | |
| - Check for any JavaScript errors | |
| 3. **Try Different Browser**: | |
| - Some browsers have stricter permission policies | |
| - Try Chrome or Firefox if Safari doesn't work | |
| 4. **Check Gradio Version**: | |
| - Ensure `gradio>=6.0.0` is installed | |
| - Update if needed: `pip install --upgrade gradio` | |
| 5. **HTTPS Requirement**: | |
| - Microphone access requires HTTPS (or localhost) | |
| - If deploying, ensure SSL is configured | |
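The version check in step 4 can also be done from Python, which is convenient inside a deployed environment. This is a minimal sketch using `importlib.metadata`. The simple numeric parsing is an assumption that ignores pre-release suffixes, and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string: str) -> tuple[int, ...]:
    """Turn '6.0.1' into (6, 0, 1); stop at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def gradio_meets_minimum(minimum: str = "6.0.0") -> bool:
    """True if an installed gradio is at least `minimum`; False if missing."""
    try:
        installed = version("gradio")
    except PackageNotFoundError:
        return False
    return release_tuple(installed) >= release_tuple(minimum)
```

If `gradio_meets_minimum()` returns `False`, upgrading with the `pip` command from step 4 is the next thing to try.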
| ## Additional Notes | |
| - The audio button is part of the MultimodalTextbox component | |
| - It should appear as an icon in the textbox toolbar | |
| - If it's still not visible, it might be in a dropdown menu (click the "+" or "..." button) | |
| - The `file_types` parameter ensures audio files are accepted for upload | |