Text to Speech Endpoint

Overview

The Text-to-Speech endpoint generates audio from a text input using a pretrained voice.

Sample Generation

Example 1

Prompt

In the ancient land of Eldoria, where the skies were painted with shades of mystic hues and the forests whispered secrets of old, there existed a dragon named Zephyros. Unlike the fearsome tales of dragons that plagued human hearts with terror, Zephyros was a creature of wonder and wisdom, revered by all who knew of his existence.

Generated Output

Request

curl --request POST 'https://modelslab.com/api/v1/enterprise/text_to_speech/make'

Make a POST request to the `https://modelslab.com/api/v1/enterprise/text_to_speech/make` endpoint and pass the required parameters in the request body.

Body Attributes

Parameter	Description	Values
key	The API key used to authenticate your request.	String
prompt	The text prompt that describes the audio to be generated.	Text
voice_id	The ID of a trained voice; only trained voices are accepted. The list of trained voices gives the available IDs.	`string` trained voice ID
language	The language for the voice. Defaults to `english` if not specified.	`american english`, `british english`, `spanish`, `japanese`, `mandarin chinese`, `french`, `brazilian portuguese`, `hindi`, `italian`
output_format	The format of the generated audio. Defaults to `wav` if not specified.	`wav` or `mp3`
speed	Playback speed of the generated audio. Defaults to `1.0`.	Integral value
emotion	Whether to enable emotion support. Currently only available in English. Defaults to `false`.	`true` or `false`
temp	Whether to return temporary links, which stay valid for 24 hours. This can help if access to certain storage sites is blocked. Defaults to `false`.	`true` or `false`
webhook	A URL where the API will send a POST request once the audio generation is complete.	URL
track_id	An ID returned in the API response, used to identify webhook requests.	Integral value

Note: To use emotion tags in your prompt with `language: en`, you must also set `emotion: true`.
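The parameter rules above can be checked before a request is sent. The following is a minimal sketch, not part of the API: `build_tts_payload` is a hypothetical helper, and the set of values it treats as English (`english`, `en`, `american english`, `british english`) is an assumption, since the table and the note name the language differently.

```python
# Values taken from the Body Attributes table.
LISTED_LANGUAGES = {
    "american english", "british english", "spanish", "japanese",
    "mandarin chinese", "french", "brazilian portuguese", "hindi", "italian",
}
# Assumption: the documentation does not spell out which values count as
# English for emotion support; this sketch treats these four as English.
ENGLISH_VALUES = {"english", "en", "american english", "british english"}
OUTPUT_FORMATS = {"wav", "mp3"}


def build_tts_payload(key, prompt, voice_id, language="english",
                      output_format="wav", speed=1.0, emotion=False,
                      temp=False, webhook=None, track_id=None):
    """Assemble a request body from the documented parameters and defaults."""
    if language not in LISTED_LANGUAGES | ENGLISH_VALUES:
        raise ValueError(f"unsupported language: {language!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    if emotion and language not in ENGLISH_VALUES:
        raise ValueError("emotion is only available for English")

    payload = {
        "key": key,
        "prompt": prompt,
        "voice_id": voice_id,
        "language": language,
        "output_format": output_format,
        "speed": speed,
        "emotion": emotion,
        "temp": temp,
    }
    # Optional fields are only sent when the caller provides them.
    if webhook is not None:
        payload["webhook"] = webhook
    if track_id is not None:
        payload["track_id"] = track_id
    return payload
```

Calling `build_tts_payload("YOUR_KEY", "Hello", "madison", language="french", emotion=True)` raises `ValueError` locally, which surfaces the English-only restriction before any request is made.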

Emotion Support

Emotion support is currently available only in English (en). When emotion is enabled, you can use special tags in your text prompt to add expressive elements to the generated speech.

Available Emotion-Supported Voices

Female Voices

Tara
Leah
Jess
Mia
Zoe

Male Voices

Supported Emotion Tags

The following emotion tags can be added to speech prompts to enhance expressiveness:

Tag	Description
`<laugh>`	Adds a laughing effect
`<chuckle>`	A soft chuckle for a subtle humorous tone
`<sigh>`	Expresses disappointment, relief, or tiredness
`<cough>`	Simulates a short cough
`<sniffle>`	Mimics a sniffle, indicating sadness or a cold
`<groan>`	Adds a groaning effect for frustration or discomfort
`<yawn>`	Simulates yawning to express boredom or tiredness
`<gasp>`	Expresses shock or surprise
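Because only the tags in the table are listed as supported, it can help to scan a prompt for anything else before sending it. This is a minimal sketch under stated assumptions: `find_tags` and `unsupported_tags` are hypothetical helpers, and the check is deliberately strict (lowercase tag names only), since the documentation does not say whether tags are case-sensitive.

```python
import re

# Tag names taken from the Supported Emotion Tags table.
SUPPORTED_TAGS = {
    "laugh", "chuckle", "sigh", "cough",
    "sniffle", "groan", "yawn", "gasp",
}

# Matches simple angle-bracket tags such as <sigh>.
TAG_PATTERN = re.compile(r"<([A-Za-z]+)>")


def find_tags(prompt):
    """Return every tag name found in the prompt, in order of appearance."""
    return TAG_PATTERN.findall(prompt)


def unsupported_tags(prompt):
    """Return the sorted, de-duplicated tag names that are not in the table."""
    return sorted({tag for tag in find_tags(prompt) if tag not in SUPPORTED_TAGS})


prompt = "Well <sigh> that was close <gasp> <giggle>"
print(find_tags(prompt))         # ['sigh', 'gasp', 'giggle']
print(unsupported_tags(prompt))  # ['giggle']
```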

Example

Body

{
  "key": "",
  "prompt": "Build next-generation AI products without worrying about GPUs",
  "language": "american english",
  "voice_id": "madison",
  "speed": 1,
  "emotion": false
}

Request

JS
PHP
NODE
PYTHON
JAVA

// JS (browser): send the request with fetch
const myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");

const raw = JSON.stringify({
  "key": "", // your API key
  "prompt": "Build next-generation AI products without worrying about GPUs",
  "language": "american english",
  "voice_id": "madison",
  "speed": 1,
  "emotion": false
});

const requestOptions = {
  method: 'POST',
  headers: myHeaders,
  body: raw,
  redirect: 'follow'
};

fetch("https://modelslab.com/api/v1/enterprise/text_to_speech/make", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log('error', error));

<?php

// PHP: request body; set "key" to your API key
$payload = [
  "key" => "",
  "prompt" => "Build next-generation AI products without worrying about GPUs",
  "language" => "american english",
  "voice_id" => "madison",
  "speed" => 1,
  "emotion" => false
];

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://modelslab.com/api/v1/enterprise/text_to_speech/make',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'POST',
  CURLOPT_POSTFIELDS => json_encode($payload),
  CURLOPT_HTTPHEADER => array(
    'Content-Type: application/json'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;

// Node.js: requires the `request` package from npm
var request = require('request');
var options = {
  'method': 'POST',
  'url': 'https://modelslab.com/api/v1/enterprise/text_to_speech/make',
  'headers': {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "key": "", // your API key
    "prompt": "Build next-generation AI products without worrying about GPUs",
    "language": "american english",
    "voice_id": "madison",
    "speed": 1,
    "emotion": false
  })
};

request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

# Python: requires the `requests` package
import requests
import json

url = "https://modelslab.com/api/v1/enterprise/text_to_speech/make"

payload = json.dumps({
    "key": "",  # your API key
    "prompt": "Build next-generation AI products without worrying about GPUs",
    "language": "american english",
    "voice_id": "madison",
    "speed": 1,
    "emotion": False
})

headers = {
    "Content-Type": "application/json"
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

// Java: uses the OkHttp client library; the API key goes in the "key" field of the body
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
MediaType mediaType = MediaType.parse("application/json");
RequestBody body = RequestBody.create(mediaType, "{\n    \"key\":\"\",\n    \"prompt\":\"Build next-generation AI products without worrying about GPUs\",\n    \"language\":\"american english\",\n    \"voice_id\":\"madison\",\n    \"speed\":1,\n    \"emotion\":false\n}");
Request request = new Request.Builder()
  .url("https://modelslab.com/api/v1/enterprise/text_to_speech/make")
  .method("POST", body)
  .addHeader("Content-Type", "application/json")
  .build();
Response response = client.newCall(request).execute();

Response

Success
Processing
Error

{
    "status": "success",
    "generationTime": 1.904285192489624,
    "id": 334166,
    "output": [
        "https://pub-3626123a908346a7a8be8d9295f44e26.r2.dev/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
    ],
    "proxy_links": [
        "https://cdn2.stablediffusionapi.com/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
    ],
    "meta": {
        "base64": "no",
        "emotion": "Neutral",
        "filename": "b2dff60e-4636-4178-9a72-04a10a309185.wav",
        "input_sound_clip": [
            "tmp/0-b2dff60e-4636-4178-9a72-04a10a309185.wav"
        ],
        "input_text": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
        "language": "english",
        "speed": 1,
        "temp": "no"
    }
}

{
    "status": "processing",
    "tip": "Your audio is processing in background, you can get this audio using fetch API",
    "eta": 100,
    "message": "Try to fetch request after seconds estimated",
    "fetch_result": "https://modelslab.com/api/v6/voice/fetch/334166",
    "id": 334166,
    "output": [],
    "future_links": [
        "https://pub-3626123a908346a7a8be8d9295f44e26.r2.dev/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
    ],
    "proxy_links": [
        "https://cdn2.stablediffusionapi.com/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
    ],
    "meta": {
        "base64": "no",
        "emotion": "Neutral",
        "filename": "b2dff60e-4636-4178-9a72-04a10a309185.wav",
        "input_sound_clip": [
            "tmp/0-b2dff60e-4636-4178-9a72-04a10a309185.wav"
        ],
        "input_text": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
        "language": "english",
        "speed": 1,
        "temp": "no"
    }
}

{
    "status": "error",
    "message": "Error message"
}
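The three response shapes above can be handled with one small dispatcher plus a wait loop for the `processing` case. This is a sketch, not an official client: `interpret_response` and `wait_for_audio` are hypothetical helpers, and `fetch` is a caller-supplied function that requests the `fetch_result` URL and returns the decoded JSON, because the request format of the fetch endpoint is not described in this section. Falling back to `proxy_links` when `output` is empty is also an assumption.

```python
import time


def interpret_response(body):
    """Map a decoded JSON response onto a small, uniform result dict."""
    status = body.get("status")
    if status == "success":
        # Assumption: proxy_links is a usable fallback when output is empty.
        links = body.get("output") or body.get("proxy_links") or []
        return {"state": "done", "links": links}
    if status == "processing":
        return {
            "state": "pending",
            "fetch_url": body.get("fetch_result"),
            "eta": body.get("eta"),
        }
    if status == "error":
        raise RuntimeError(body.get("message", "unknown error"))
    raise ValueError(f"unexpected status: {status!r}")


def wait_for_audio(first_response, fetch, max_attempts=5, sleep=time.sleep):
    """Poll via `fetch` until the audio is ready, then return its links."""
    result = interpret_response(first_response)
    attempts = 0
    while result["state"] == "pending":
        if attempts >= max_attempts:
            raise TimeoutError("audio was not ready after polling")
        # Wait for the estimated time (in seconds) before asking again.
        sleep(result["eta"] or 1)
        result = interpret_response(fetch(result["fetch_url"]))
        attempts += 1
    return result["links"]
```

Passing `sleep` and `fetch` as arguments keeps the loop testable without network access or real delays; in real use, `fetch` would wrap whatever HTTP client the application already uses.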
