STT (Speech-to-Text)


Available in Classic and VPC

Accepts speech data in MP3, AAC, AC3, OGG, FLAC, or WAV format, together with the language to use for speech recognition, and returns the recognition result as text.

Request

The request format for the endpoint is as follows:

Method URI
POST /stt

Request headers

For headers common to all CLOVA Speech Recognition (CSR) APIs, see Common CLOVA Speech Recognition (CSR) headers.

Request query parameters

The following describes the request query parameters.

Field   Type     Required   Description
lang    String   Required   Language of the converted text
  • Kor | Eng | Jpn | Chn
    • Kor: Korean
    • Eng: English
    • Jpn: Japanese
    • Chn: Chinese (Simplified)
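
As a minimal sketch of how a client might assemble the request URL from the table above, the following Python helper validates `lang` against the four listed codes before appending it as a query parameter. The names `build_stt_url`, `STT_URL`, and `SUPPORTED_LANGS` are illustrative and not part of the API. The base URL is taken from the sample request in this document, and treating the codes as case-sensitive is an assumption.

```python
from urllib.parse import urlencode

# Base URL as it appears in the sample request of this document.
STT_URL = "https://naveropenapi.apigw.ntruss.com/recog/v1/stt"

# Language codes listed in the query parameter table.
SUPPORTED_LANGS = {
    "Kor": "Korean",
    "Eng": "English",
    "Jpn": "Japanese",
    "Chn": "Chinese (Simplified)",
}


def build_stt_url(lang: str) -> str:
    """Return the STT URL with a validated `lang` query parameter.

    Assumption: codes are matched case-sensitively, exactly as listed.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(
            f"unsupported lang {lang!r}; expected one of {sorted(SUPPORTED_LANGS)}"
        )
    return f"{STT_URL}?{urlencode({'lang': lang})}"
```

Rejecting an unknown code locally avoids spending a request on a value the table does not list.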

Request body

The following describes the request body.

Field                   Type     Required   Description
Voice data to convert   Binary   Required   Binary voice data in MP3, AAC, AC3, OGG, FLAC, or WAV format
  • Playback time up to 60 seconds
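
The 60-second limit can be checked on the client before uploading. The sketch below does this for WAV data only, using Python's standard `wave` module; the other listed formats would need a third-party decoder and are not covered. The function names and the `MAX_SECONDS` constant are hypothetical, and whether the service measures playback time in exactly this way is an assumption.

```python
import io
import wave

# Limit stated in the request body table.
MAX_SECONDS = 60.0


def wav_duration_seconds(data: bytes) -> float:
    """Return the playback time of PCM WAV data, in seconds."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_wav_length(data: bytes) -> None:
    """Raise ValueError if the WAV data is longer than MAX_SECONDS."""
    duration = wav_duration_seconds(data)
    if duration > MAX_SECONDS:
        raise ValueError(
            f"audio is {duration:.1f} s long; the limit is {MAX_SECONDS:.0f} s"
        )
```

Longer recordings would have to be split or trimmed before being sent; how to do that is outside the scope of this sketch.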

Request example

The following is a sample request.

curl --location --request POST 'https://naveropenapi.apigw.ntruss.com/recog/v1/stt?lang=Kor' \
--header 'X-NCP-APIGW-API-KEY-ID: {Client ID issued when registering the app}' \
--header 'X-NCP-APIGW-API-KEY: {Client secret issued when registering the app}' \
--header 'Content-Type: application/octet-stream' \
--data-binary '@{file}'
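
For readers calling the API from code rather than the command line, the same request can be sketched in Python with the standard `urllib.request` module. The function `build_stt_request` and its parameter names are illustrative only. The URL and header names are copied from the sample above, and the credential values are placeholders supplied by the caller.

```python
import urllib.request


def build_stt_request(
    audio: bytes, lang: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl sample."""
    url = f"https://naveropenapi.apigw.ntruss.com/recog/v1/stt?lang={lang}"
    headers = {
        "X-NCP-APIGW-API-KEY-ID": client_id,
        "X-NCP-APIGW-API-KEY": client_secret,
        "Content-Type": "application/octet-stream",
    }
    # The raw audio bytes are sent as the request body, unmodified.
    return urllib.request.Request(url, data=audio, headers=headers, method="POST")
```

Sending the request would be a matter of passing the returned object to `urllib.request.urlopen`; that call is left out here so the sketch runs without network access or real credentials.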

Response

The following describes the response format.

Response body

The following describes the response body.

Field   Type     Required   Description
text    String   -          Text converted from the voice data

Response status codes

For response status codes common to all CLOVA Speech Recognition (CSR) APIs, see Common CLOVA Speech Recognition (CSR) response status codes.

Response example

The following is a sample response.

{
    "text": "Hello,"
}
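
A client would typically pull the `text` field out of this body. The following Python sketch does so with the standard `json` module; `extract_text` is a hypothetical helper name, and because the shape of error responses is not described in this section, the helper simply raises when `text` is absent.

```python
import json
from typing import Union


def extract_text(body: Union[bytes, str]) -> str:
    """Return the `text` field from an STT response body."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or "text" not in payload:
        # Assumption: a successful response always carries `text`.
        raise ValueError("response body has no 'text' field")
    return payload["text"]
```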