---
license: mit
language:
- hi
- en
- bn
- gu
- te
- mr
---

# SYSPIN Hackathon TTS API Documentation
## Overview

This API provides a Text-to-Speech (TTS) service that converts input text into speech audio. It supports multiple Indian languages and offers voice customization through a speaker reference audio file provided by the user.

---
## Endpoint: `/Get_Inference`

* **Method**: `GET`
* **Description**: Generates speech audio from the provided text using the specified language and speaker reference file.

### Query Parameters

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `text` | string | Yes | The input text to be converted into speech. |
| `lang` | string | Yes | The language of the input text. Acceptable values: `bhojpuri`, `bengali`, `english`, `gujarati`, `hindi`, `chhattisgarhi`, `kannada`, `magahi`, `maithili`, `marathi`, `telugu`. |
| `speaker_wav` | file (bytes) | Yes | A reference speaker audio file; must be in WAV format. |
### Available Languages

| Language | Code |
| -------- | ---- |
| chhattisgarhi | hne |
| kannada | kn |
| maithili | mai |
| telugu | te |
| bengali | bn |
| bhojpuri | bho |
| marathi | mr |
| gujarati | gu |
| hindi | hi |
| magahi | mag |
| english | en |
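For client code, the table above can be captured as a simple lookup. A minimal sketch (the dict and helper below are ours, not part of the API, which takes the full language name in `lang`):

```python
# Language names accepted by `lang`, mapped to their codes (from the table above).
LANGUAGE_CODES = {
    "chhattisgarhi": "hne",
    "kannada": "kn",
    "maithili": "mai",
    "telugu": "te",
    "bengali": "bn",
    "bhojpuri": "bho",
    "marathi": "mr",
    "gujarati": "gu",
    "hindi": "hi",
    "magahi": "mag",
    "english": "en",
}

def language_code(name: str) -> str:
    """Return the code for a language name; raises KeyError if unsupported."""
    return LANGUAGE_CODES[name.lower()]
```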

### Responses

* **200 OK**: Returns a WAV audio file as a streaming response containing the synthesized speech.
* **422 Unprocessable Entity**: Returned when:
  * Any of the required query parameters (`text`, `lang`, `speaker_wav`) is missing.
  * The specified `lang` is not supported.
  * The specified `speaker_wav` is not available.
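These 422 conditions can also be checked client-side before sending a request, saving a round trip. A hedged sketch (the helper is ours and mirrors the documented rules, not the server's actual validation code):

```python
import os

# Language names listed in the "Available Languages" table.
SUPPORTED_LANGS = {
    "bhojpuri", "bengali", "english", "gujarati", "hindi", "chhattisgarhi",
    "kannada", "magahi", "maithili", "marathi", "telugu",
}

def check_params(text, lang, speaker_wav_path):
    """Raise ValueError for inputs the server would reject with 422."""
    if not text:
        raise ValueError("'text' is required")
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported 'lang': {lang!r}")
    if not os.path.isfile(speaker_wav_path):
        raise ValueError(f"'speaker_wav' not found: {speaker_wav_path!r}")
```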
## Running the Server

To start the FastAPI server:

```bash
docker build -t your_image_name ./
docker run -d -v /path/to/this/code/dir/:/app/ -p 8080:8080 your_image_name API_main.py
```
## Hosting on a GPU

To run your FastAPI-based Text-to-Speech (TTS) server inside a Docker container with GPU support, follow these steps:

---

## Prerequisites

1. **NVIDIA GPU**: Ensure your system has an NVIDIA GPU installed.
2. **NVIDIA Drivers**: Install the appropriate NVIDIA drivers for your GPU.
3. **Docker**: Install Docker on your system.
4. **NVIDIA Container Toolkit**: Install the NVIDIA Container Toolkit to enable GPU support in Docker containers.

---
## Installation Steps

### 1. Install NVIDIA Drivers

Ensure that NVIDIA drivers compatible with your GPU are installed on your system.

### 2. Install Docker

If Docker is not already installed, follow the official Docker installation guide for your operating system.

### 3. Install NVIDIA Container Toolkit

The NVIDIA Container Toolkit allows Docker containers to access the GPU.

**For Ubuntu:**
```bash
# Add the package repositories
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Update the package lists
sudo apt-get update

# Install the NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit

# Restart the Docker daemon to apply changes
sudo systemctl restart docker
```
**For other operating systems:** Refer to the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) for detailed instructions.

### 4. Verify GPU Access in Docker

To confirm that Docker can access your GPU, run:

```bash
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```

If GPU access is working, this prints the familiar `nvidia-smi` table listing your GPU(s) from inside the container.
## Running Your FastAPI TTS Server with GPU Support

Assuming your FastAPI TTS application is containerized and ready to run:

1. **Build Your Docker Image**

Navigate to the directory containing your `Dockerfile` and build the Docker image:

```bash
docker build -t your_image_name .
```
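The repository's `Dockerfile` itself is not reproduced in this document. Purely as an illustration, a minimal Dockerfile consistent with the run commands shown here (which pass `API_main.py` as the container argument, implying a `python` entrypoint; the base image and `requirements.txt` are assumptions) might look like:

```dockerfile
# Hypothetical sketch -- not the project's actual Dockerfile.
FROM python:3.10-slim

WORKDIR /app

# Dependencies are baked into the image; the code itself is bind-mounted
# at runtime via `-v /path/to/this/code/dir/:/app/`.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

EXPOSE 8080

# `docker run ... your_image_name API_main.py` supplies the script to run.
ENTRYPOINT ["python"]
```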
2. **Run the Docker Container with GPU Support**

Start the container with GPU access enabled:

```bash
docker run --gpus all -p 8080:8080 -v /path/to/this/code/dir/:/app/ your_image_name API_main.py
```
## Example API Call

```python
import requests

# Base URL of the running API
base_url = 'http://localhost:8080/Get_Inference'

# Path to the reference speaker WAV file
wav_path = 'path/to/wavfile.wav'

# Query parameters: a Kannada test sentence and its language name
params = {
    'text': 'ಮಾದರಿಯು ಸರಿಯಾಗಿ ಕಾರ್ಯನಿರ್ವಹಿಸುತ್ತಿದೆಯೇ ಎಂದು ಖಚಿತಪಡಿಸಿಕೊಳ್ಳಲು ಬಳಸಲಾಗುವ ಪರೀಕ್ಷಾ ವಾಕ್ಯ ಇದು.',
    'lang': 'kannada',
}

# Send the GET request with the speaker reference file attached
with open(wav_path, 'rb') as audio_file:
    response = requests.get(base_url, params=params, files={'speaker_wav': audio_file})

# Check if the request was successful
if response.status_code == 200:
    # Save the audio content to a file
    with open('output.wav', 'wb') as f:
        f.write(response.content)
    print("Audio saved as 'output.wav'")
else:
    # Print the error details
    print(f"Request failed with status code {response.status_code}")
    print("Response:", response.text)
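The example above holds the whole response in memory before writing it out. Since the endpoint returns a streaming response, large outputs can be written to disk in chunks instead. A sketch (the helper name is ours; it works with any object exposing requests' `iter_content` interface, such as a response obtained with `stream=True`):

```python
def save_streaming_wav(response, out_path, chunk_size=8192):
    """Write a streaming HTTP response body to disk chunk by chunk."""
    with open(out_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
```

To use it with the example above, pass `stream=True` to `requests.get` and call `save_streaming_wav(response, 'output.wav')` in place of `f.write(response.content)`.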