TTS (Premium)

Available in Classic and VPC

Synthesizes speech from the text you provide, applying parameters such as tone, speed, and emotion.

Request

This section describes the request format. The method and URI are as follows:

Method URI
POST /tts

Request headers

For information about the headers common to all CLOVA Voice APIs, see Common CLOVA Voice headers.

Request body

You can include the following data in the body of your request:

Field Type Required Description
speaker String Required Voice type to use for speech synthesis
text String Required Text to be converted to speech
  • Only UTF-8 encoded text is supported.
  • Text in symbols or parentheses is not converted.
  • Maximum character limit by language
    • Korean, Japanese, Chinese, Taiwanese: 2,000 characters
    • English, Spanish: 3,000 characters
volume Integer Optional Speech volume
  • -5 to 5 (default: 0)
    • -5: 0.5x the normal volume
    • 0: normal volume
    • 5: 1.5x the normal volume
speed Integer Optional Speech speed
  • -5 to 10 (default: 0)
    • -5: 2.0x speed (duration: 0.5x, fast)
    • 0: original speed
    • 10: 0.5x speed (duration: 2.0x, slow)
pitch Integer Optional Speech pitch
  • -5 to 5 (default: 0)
    • -5: 1.2x the normal pitch (higher)
    • 0: normal pitch
    • 5: 0.8x the normal pitch (lower)
emotion Integer Optional Emotion level of speech
  • Supported voices: nara | vara | vmikyung | vdain | vyuna | vgoeun | vdaeseong
  • 0 to 3 (default: 0)
    • 0: neutral
    • 1: sad
    • 2: happy
    • 3: angry (not supported for nara)
emotion-strength Integer Optional Emotion intensity of speech
  • Supported voices: vara | vmikyung | vdain | vyuna | vgoeun | vdaeseong
  • 0 to 2 (default: 1)
    • 0: weak
    • 1: normal
    • 2: strong
format String Optional Speech file format
  • mp3 (default) | wav
sampling-rate Integer Optional Sampling rate of speech
  • Only supported for wav format
  • 8000 | 16000 | 24000 (default) | 48000
    • Exception: mijin supports only 16000.
alpha Integer Optional Tone
  • -5 to 5 (default: 0)
    • If higher than 0: high tone
    • If lower than 0: low tone
end-pitch Integer Optional End-pitch processing of speech
  • Supported voices: clara | matt | meimei | liangliang | chiahua | kuanlin | carmen | jose | all voices starting with d- (Example: dara)
  • -5 to 5 (default: 0)
    • If higher than 0: high end-pitch
    • If lower than 0: low end-pitch
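
The ranges and allowed values in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: build_tts_params is a hypothetical helper name, and the assumption that sampling-rate should be rejected unless format is wav is a client-side choice based on the "Only supported for wav format" note.

```python
# Allowed integer ranges, copied from the request body table.
RANGES = {
    "volume": (-5, 5),
    "speed": (-5, 10),
    "pitch": (-5, 5),
    "emotion": (0, 3),
    "emotion-strength": (0, 2),
    "alpha": (-5, 5),
    "end-pitch": (-5, 5),
}
FORMATS = {"mp3", "wav"}
SAMPLING_RATES = {8000, 16000, 24000, 48000}


def build_tts_params(speaker, text, **options):
    """Hypothetical helper: validate options and return a dict of form fields."""
    if not speaker or not text:
        raise ValueError("speaker and text are required")
    params = {"speaker": speaker, "text": text}
    for name, value in options.items():
        # Python keyword names cannot contain hyphens: emotion_strength -> emotion-strength
        field = name.replace("_", "-")
        if field in RANGES:
            low, high = RANGES[field]
            if not isinstance(value, int) or not low <= value <= high:
                raise ValueError(f"{field} must be an integer from {low} to {high}")
        elif field == "format":
            if value not in FORMATS:
                raise ValueError("format must be mp3 or wav")
        elif field == "sampling-rate":
            if value not in SAMPLING_RATES:
                raise ValueError("sampling-rate must be 8000, 16000, 24000, or 48000")
            # Assumption: treat a sampling rate without format=wav as a client error.
            if options.get("format", "mp3") != "wav":
                raise ValueError("sampling-rate is only supported for wav format")
        else:
            raise ValueError(f"unknown field: {field}")
        params[field] = value
    return params
```

For example, build_tts_params("nara", "Hello", speed=-1, format="wav", sampling_rate=8000) returns a dictionary whose keys match the field names in the table, while speed=11 raises ValueError.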

List of speech synthesis voices

The following table lists the voices available for speech synthesis.

Field Name Language Description
dara_ang Ara (angry) Korean Female
jinho Jinho Korean Male
mijin Mijin Korean Female
napple Neulbom Korean Female
nara_call Ara (agent) Korean Female
nara Ara Korean Female
nbora Bora Korean Female
ndaeseong Daeseong Korean Male
ndain Dain Korean Child (female)
ndonghyun Donghyun Korean Male
nes_c_hyeri Hyeri Korean Female
nes_c_kihyo Kihyo Korean Male
nes_c_mikyung Mikyung Korean Female
nes_c_sohyun Sohyun Korean Female
neunseo Eunseo Korean Female
neunwoo Eunwoo Korean Male
neunyoung Eunyoung Korean Female
ngaram Garam Korean Child (female)
ngoeun Goeun Korean Female
ngyeongjun Gyeongjun Korean Male
nhajun Hajun Korean Child (male)
nheera Heera Korean Female
nian Ian Korean Male
nihyun Ihyun Korean Female
njaewook Jaewook Korean Male
njangj Dream Korean Female
njihun Jihun Korean Male
njihwan Jihwan Korean Male
njinho Jinho Korean Male
njiwon Jiwon Korean Female
njiyun Jiyun Korean Female
njonghyeok Jonghyeok Korean Male
njonghyun Jonghyun Korean Male
njooahn Jooahn Korean Male
njoonyoung Joonyoung Korean Male
nkitae Kitae Korean Male
nkyunglee Kyunglee Korean Female
nkyungtae Kyungtae Korean Male
nkyuwon Kyuwon Korean Male
nmammon Demon Mammon Korean Male
nmeow Meow Korean Child (female)
nmijin Mijin Korean Female
nminjeong Minjeong Korean Female
nminsang Minsang Korean Male
nminseo Minseo Korean Female
nminyoung Minyoung Korean Female
nmovie Movie Choi Korean Male
noyj Bomdal Korean Female
nraewon Raewon Korean Male
nreview Review Park Korean Male
nsabina Witch Sabina Korean Female
nsangdo Sangdo Korean Male
nseonghoon Seonghoon Korean Male
nseungpyo Seungpyo Korean Male
nshasha Shasha Korean Female
nsinu Sinu Korean Male
nsiyoon Siyoon Korean Male
nsujin Sujin Korean Female
nsunhee Sunhee Korean Female
nsunkyung Sunkyung Korean Female
ntaejin Taejin Korean Male
ntiffany Kiseo Korean Female
nwontak Wontak Korean Male
nwoof Woof Korean Child (male)
nwoosik Woosik Korean Male
nyeji Yeji Korean Female
nyejin Yejin Korean Female
nyounghwa Movie Jeong Korean Female
nyoungil Youngil Korean Male
nyoungmi Youngmi Korean Female
nyujin Yujin Korean Female
nyuna Yuna Korean Female
vara Ara (Pro) Korean Female
vdaeseong Daeseong (Pro) Korean Male
vdain Dain (Pro) Korean Female
vdonghyun Donghyun (Pro) Korean Male
vgoeun Goeun (Pro) Korean Female
vhyeri Hyeri (Pro) Korean Female
vian Ian (Pro) Korean Male
vmikyung Mikyung (Pro) Korean Female
vyuna Yuna (Pro) Korean Female
dara-danna Ara & Anna Korean + English (U.S.) Female
dsinu-matt Sinu & Matt Korean + English (U.S.) Male
liangliang Liangliang Chinese Male
meimei Meimei Chinese Female
dayumu Ayumu Japanese Male
ddaiki Daiki Japanese Male
deriko Eriko Japanese Female
dhajime Hajime Japanese Male
dmio Mio Japanese Female
dnaomi Naomi Japanese Female
dnaomi_formal Naomi (news) Japanese Female
dnaomi_joyful Naomi (happy) Japanese Female
driko Riko Japanese Female
dsayuri Sayuri Japanese Female
dtomoko Tomoko Japanese Female
nnaomi Naomi Japanese Female
nsayuri Sayuri Japanese Female
ntomoko Tomoko Japanese Female
shinji Shinji Japanese Male
clara Clara English Female
danna Anna English Female
djoey Joey English Female
matt Matt English Male
carmen Carmen Spanish Female
jose Jose Spanish Male
chiahua Chiahua Taiwanese Female
kuanlin Kuanlin Taiwanese Male
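
Several request fields depend on the chosen voice or its language. The sketch below collects those constraints from the request body table: the voices that support emotion and emotion-strength, and the character limit per language. The function name check_voice_options is hypothetical, the caller is assumed to pass the language shown in the voice table, and counting characters with len() is an assumption, since this page does not state how characters are counted. No limit is given for the "Korean + English (U.S.)" voices, so none is checked for them.

```python
# Voices listed in the emotion and emotion-strength rows of the request body table.
EMOTION_VOICES = {"nara", "vara", "vmikyung", "vdain", "vyuna", "vgoeun", "vdaeseong"}
EMOTION_STRENGTH_VOICES = {"vara", "vmikyung", "vdain", "vyuna", "vgoeun", "vdaeseong"}

# Maximum characters by language, from the text row.
MAX_CHARS = {
    "Korean": 2000,
    "Japanese": 2000,
    "Chinese": 2000,
    "Taiwanese": 2000,
    "English": 3000,
    "Spanish": 3000,
}


def check_voice_options(speaker, language, text, emotion=None, emotion_strength=None):
    """Return a list of problems; an empty list means no conflict was found."""
    problems = []
    limit = MAX_CHARS.get(language)  # None when the page gives no limit
    # Assumption: one character corresponds to one Python str element.
    if limit is not None and len(text) > limit:
        problems.append(f"text exceeds {limit} characters for {language}")
    if emotion is not None and speaker not in EMOTION_VOICES:
        problems.append(f"emotion is not supported for {speaker}")
    if emotion == 3 and speaker == "nara":
        problems.append("emotion 3 (angry) is not supported for nara")
    if emotion_strength is not None and speaker not in EMOTION_STRENGTH_VOICES:
        problems.append(f"emotion-strength is not supported for {speaker}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every conflict at once.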

Request example

The request example is as follows:

curl --location --request POST 'https://naveropenapi.apigw.ntruss.com/tts-premium/v1/tts' \
--header 'X-NCP-APIGW-API-KEY-ID: {Client ID issued when registering the app}' \
--header 'X-NCP-APIGW-API-KEY: {Client secret issued when registering the app}' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'speaker=nara' \
--data-urlencode 'text=Hello' \
--data-urlencode 'volume=0' \
--data-urlencode 'speed=-1' \
--data-urlencode 'pitch=1' \
--data-urlencode 'emotion=2' \
--data-urlencode 'emotion-strength=1' \
--data-urlencode 'format=wav' \
--data-urlencode 'sampling-rate=8000' \
--data-urlencode 'alpha=0' \
--data-urlencode 'end-pitch=0'
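
The same request can be built with the Python standard library. This is a sketch that mirrors the curl example above, not an official client: the function names make_tts_request and synthesize are hypothetical, and the client ID and secret are placeholders you supply.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the curl example.
TTS_URL = "https://naveropenapi.apigw.ntruss.com/tts-premium/v1/tts"


def make_tts_request(client_id, client_secret, fields):
    """Build (but do not send) a POST request with a form-encoded body."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        method="POST",
        headers={
            "X-NCP-APIGW-API-KEY-ID": client_id,
            "X-NCP-APIGW-API-KEY": client_secret,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def synthesize(client_id, client_secret, fields, timeout=30):
    """Send the request and return the raw audio bytes from the response body."""
    request = make_tts_request(client_id, client_secret, fields)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling synthesize(client_id, client_secret, {"speaker": "nara", "text": "Hello", "format": "wav"}) performs the network call; urlopen raises urllib.error.HTTPError for non-success status codes.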

Response

This section describes the response format.

Response body

The response body includes the following data:

Field Type Required Description
Responded TTS audio Binary - Binary voice data in MP3 or WAV format

Response status codes

For information about the HTTP status codes common to all CLOVA Voice APIs, see Common CLOVA Voice response status codes.

Response example

The response example is as follows:

{Binary voice data in MP3 or WAV format}
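
Because the body is raw audio rather than text, it should be written to a file in binary mode. The sketch below shows one way to do that. The names save_audio and looks_like_wav are hypothetical, and the WAV check assumes the returned WAV data begins with a standard RIFF/WAVE header, which this page does not state.

```python
from pathlib import Path


def save_audio(audio, stem, fmt="mp3"):
    """Write audio bytes to '<stem>.<fmt>' and return the resulting path."""
    if fmt not in ("mp3", "wav"):
        raise ValueError("fmt must be mp3 or wav")
    path = Path(f"{stem}.{fmt}")
    path.write_bytes(audio)  # binary write; the data must not be decoded as text
    return path


def looks_like_wav(audio):
    """True if the bytes start with a RIFF container header tagged WAVE."""
    return audio[:4] == b"RIFF" and audio[8:12] == b"WAVE"
```

The fmt argument should match the format value sent in the request, which defaults to mp3.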