CLOVA Speech Recognition (CSR) overview


Available in Classic and VPC

CLOVA Speech Recognition (CSR) is a NAVER Cloud Platform service that converts human speech into text. It provides RESTful APIs for the speech recognition features used in assistant applications, chatbots, voice memos, and more. For mobile environments, it provides Android and iOS SDKs that receive voice input from users.

Note

CLOVA Speech lets you upload long audio or video files and check the speech recognition results. The CSR (CLOVA Speech Recognition) service, by contrast, is optimized for recognizing short, command-style speech of up to one minute.

Common CLOVA Speech Recognition (CSR) settings

The following describes commonly used request and response formats in CLOVA Speech Recognition APIs.

Request

The following describes the common request format.

API URL

The request API URL is as follows.

https://naveropenapi.apigw.ntruss.com/recog/v1

Note

See the Mobile SDK documentation for platform-specific ways to use the APIs in a mobile environment.

Request headers

The following describes the request headers.

Field | Required | Description
x-ncp-apigw-api-key-id | Required | Client ID issued after application registration in NAVER Cloud Platform console
x-ncp-apigw-api-key | Required | Client secret issued after application registration in NAVER Cloud Platform console
Content-Type | Required | Request data format: application/octet-stream

Note

For information on how to register an application in the NAVER Cloud Platform console to obtain authentication information (client ID, client secret) required to use the API, see CLOVA Speech Recognition (CSR) User Guide.
After registering the application, click the [Edit] button in the console and make sure that the API you want to use is selected. If it is not selected, requests return a 429 (Quota Exceed) error.
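
As a minimal sketch of how the API URL and the request headers above fit together, the following Python function builds (but does not send) a request that carries raw audio bytes. The `/stt` path suffix and the `Kor` language value are assumptions for illustration and are not stated in this overview; the `lang` query parameter is taken from the STT004 and STT005 error descriptions. The function name `build_stt_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as given in the "API URL" section.
BASE_URL = "https://naveropenapi.apigw.ntruss.com/recog/v1"


def build_stt_request(audio: bytes, client_id: str, client_secret: str,
                      lang: str = "Kor", path: str = "/stt") -> Request:
    """Build a POST request with the three required headers (nothing is sent)."""
    # ASSUMPTION: the "/stt" path and the "Kor" value are illustrative only;
    # check the STT API page for the actual path and supported languages.
    url = f"{BASE_URL}{path}?{urlencode({'lang': lang})}"
    headers = {
        "x-ncp-apigw-api-key-id": client_id,         # client ID from the console
        "x-ncp-apigw-api-key": client_secret,        # client secret from the console
        "Content-Type": "application/octet-stream",  # the only accepted format
    }
    return Request(url, data=audio, headers=headers, method="POST")
```

To actually send the request, the returned object could be passed to `urllib.request.urlopen`. Building the request separately makes it possible to inspect the headers before any network call is made.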

Response

The following describes the common response format.

Response status codes

The following describes the response status codes.

  • STT (Speech-to-Text)
HTTP status code | Code | Message | Description
413 | STT000 | Request Entity Too Large | Voice data exceeds the allowed size (up to 3 MB)
413 | STT001 | Exceed Sound Data length | Voice data exceeds the allowed length (up to 60 seconds)
400 | STT002 | Invalid Content Type | Content-Type is something other than application/octet-stream
400 | STT003 | Empty Sound Data | No voice data was sent
400 | STT004 | Empty Language | Language (lang) parameter is missing
400 | STT005 | Invalid Language | Language (lang) parameter value is not supported
500 | STT006 | Failed to pre-processing | Error during speech recognition preprocessing. Check that the voice data is a valid WAV, MP3, or FLAC file
400 | STT007 | Too Short Sound Data | Voice data is too short (400 ms or less)
500 | STT998 | Failed to STT | Error during speech recognition. Contact Support
500 | STT999 | Internal Server Error | Internal server error. Contact Support
  • Mobile SDK
HTTP status code | Code | Message | Description
- | 10 | ERROR_NETWORK_INITIALIZE | Error initializing network resources
- | 11 | ERROR_NETWORK_FINALIZE | Error releasing network resources
- | 12 | ERROR_NETWORK_READ | Error receiving network data. Typically a timeout caused by a slow network environment on the client device
- | 13 | ERROR_NETWORK_WRITE | Error sending network data. Typically a timeout caused by a slow network environment on the client device
- | 14 | ERROR_NETWORK_NACK | Error on the speech recognition server. Typically a timeout because a slow network environment kept the client device from sending voice packets to the server in time
- | 15 | ERROR_INVALID_PACKET | Error due to sending invalid packets
- | 20 | ERROR_AUDIO_INITIALIZE | Error initializing audio resources. Verify audio permissions
- | 21 | ERROR_AUDIO_FINALIZE | Error releasing audio resources
- | 22 | ERROR_AUDIO_RECORD | Error during voice input (recording). Verify audio permissions
- | 30 | ERROR_SECURITY | Authentication permission error
- | 40 | ERROR_INVALID_RESULT | Recognition result error
- | 41 | ERROR_TIMEOUT | No voice was sent to the server for a certain period, or no recognition result was received
- | 42 | ERROR_NO_CLIENT_RUNNING | A speech recognition event was detected while the client was not performing speech recognition
- | 50 | ERROR_UNKNOWN_EVENT | An undefined event was detected inside the client
- | 60 | ERROR_VERSION | Protocol version error
- | 61 | ERROR_CLIENTINFO | Client information error
- | 62 | ERROR_SERVER_POOL | Not enough servers available for speech recognition
- | 63 | ERROR_SESSION_EXPIRED | Speech recognition server session expired
- | 64 | ERROR_SPEECH_SIZE_EXCEEDED | Speech packet size exceeded
- | 65 | ERROR_EXCEED_TIME_LIMIT | Error in timestamp for authentication
- | 66 | ERROR_WRONG_SERVICE_TYPE | Invalid service type
- | 67 | ERROR_WRONG_LANGUAGE_TYPE | Invalid language type
- | 70 | ERROR_OPENAPI_AUTH | Error when authenticating with Open API. The client ID, or the registered package name (Android) or Bundle ID (iOS), is invalid
- | 71 | ERROR_QUOTA_OVERFLOW | The set API call limit (quota) is exhausted
  • Other errors and inquiries
Symptom or inquiry | Cause or solution
UnsatisfiedLinkError occurs
  • The CSR API provides libraries built for armeabi and armeabi-v7a
  • If any of the libraries used by the app you're developing don't support armeabi and armeabi-v7a, you may encounter this error
Android fatal signal 11 (SIGSEGV) error occurs
  • Resources must be prepared before accepting voice input through the CSR API
  • Make sure initialize() and release() are called before calling recognize()
"" (null) is returned as the recognition result
  • Can occur if the user speaks in a very low voice, or if the voice is not recognized due to ambient noise
  • Although this is extremely rare, it is recommended to throw an exception when the recognition result is null (empty)
Audio file recognition
  • The CSR API does not support audio file recognition
Does not work well on low-end smartphones
  • Supported on devices with Android SDK version 10 or higher, or iOS version 8 or higher

Note

For response status codes common to NAVER Cloud Platform, see Ncloud API response status codes.
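
Several of the STT errors listed above (STT000, STT001, STT003, STT006, STT007) describe limits that a client can check before sending anything. The following is a minimal sketch of such a pre-check for WAV data only, using Python's standard `wave` module. The function name `precheck_wav` is hypothetical, reading "3 MB" as 3 × 1024 × 1024 bytes is an assumption, and the server's own validation may differ from this local approximation.

```python
import io
import wave

MAX_BYTES = 3 * 1024 * 1024  # "up to 3 MB"; ASSUMPTION: MB interpreted as MiB
MAX_SECONDS = 60.0           # STT001: longer than 60 seconds is rejected
MIN_SECONDS = 0.4            # STT007: 400 ms or less is rejected


def precheck_wav(data: bytes):
    """Return the documented STT code this payload would likely trigger, or None."""
    if not data:
        return "STT003"  # Empty Sound Data
    if len(data) > MAX_BYTES:
        return "STT000"  # Request Entity Too Large
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            seconds = wav.getnframes() / wav.getframerate()
    except (wave.Error, EOFError, ZeroDivisionError):
        return "STT006"  # not a readable WAV file; preprocessing would likely fail
    if seconds > MAX_SECONDS:
        return "STT001"  # Exceed Sound Data length
    if seconds <= MIN_SECONDS:
        return "STT007"  # Too Short Sound Data
    return None          # no documented limit violated locally
```

A return value of `None` only means that none of these local checks failed. The server can still respond with other codes, such as STT004 or STT005 for the `lang` parameter, or STT998 and STT999 for server-side errors.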

CLOVA Speech Recognition API

The following describes the APIs provided by the CLOVA Speech Recognition service.

API | Description
STT (Speech-to-Text) | Convert speech to text
Mobile SDK | Convert speech to text in a mobile environment

CLOVA Speech Recognition related resources

NAVER Cloud Platform provides a variety of related resources to help users better understand CLOVA Speech Recognition APIs.