Mobile SDK
Available in Classic and VPC
The CSR API is provided as Android and iOS SDKs that let you select the language to be used for speech recognition, input voice data in MP3, AAC, AC3, OGG, FLAC, or WAV format, and convert it to text.
Preparation
For a description of prerequisites for the Mobile SDK, see Common CLOVA Speech Recognition (CSR) settings.
Use API
CSR APIs are provided through SDKs for Android and iOS. This section describes how to use the CSR API for each platform.
Request
Android API
Here's how to use the Android API.
Add the following to the `app/build.gradle` file.

```
repositories {
    jcenter()
}

dependencies {
    compile 'com.naver.speech.clientapi:naverspeech-ncp-sdk-android:1.1.6'
}
```
Configure the Android manifest file (AndroidManifest.xml) as follows.
- Package name: The value of the `package` attribute of the `manifest` element must be the same as the Android app package name registered in the NAVER Cloud Platform console.
- Set permissions: The user's voice input is recorded through the microphone and the recorded data is sent to the server, so be sure to set the `android.permission.INTERNET` and `android.permission.RECORD_AUDIO` permissions.

```
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.naver.naverspeech.client"
          android:versionCode="1"
          android:versionName="1.0">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
</manifest>
```
- (Optional) Add the following code to the `proguard-rules.pro` file. This code makes the app lighter and more secure.

```
-keep class com.naver.speech.clientapi.SpeechRecognizer {
    protected private *;
}
```
NAVER Open API supports Android SDK version 10 or later, so set the `minSdkVersion` value in your `build.gradle` file accordingly.
- The client runs through a series of events: preparation, recording, intermediate result output, endpoint extraction, and final result output.
- The application developer implements the `SpeechRecognitionListener` interface to define the behavior to be performed when each of these events occurs.
See https://github.com/NaverCloudPlatform/naverspeech-sdk-android for more information on the API.
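The listener contract above can be sketched without the SDK. The interface below is a simplified stand-in for `SpeechRecognitionListener` (the real one lives in `com.naver.speech.clientapi`); `onReady` and `onRecord` are named in this guide, while `onPartialResult`, `onEndPointDetected`, and `onResult` are assumed names used here only to illustrate the event flow.

```java
// Simplified stand-in for the SDK's SpeechRecognitionListener.
// The real interface has more callbacks and different signatures.
interface RecognitionListener {
    void onReady();                    // preparation finished; safe to speak
    void onRecord(short[] speech);     // a chunk of recorded audio
    void onPartialResult(String text); // intermediate recognition result
    void onEndPointDetected();         // end of utterance detected
    void onResult(String text);        // final recognition result
}

// An app-side implementation that records the event flow in order,
// mirroring: preparation -> recording -> intermediate result ->
// endpoint extraction -> final result.
class LoggingListener implements RecognitionListener {
    final java.util.List<String> events = new java.util.ArrayList<>();

    public void onReady()                    { events.add("ready"); }
    public void onRecord(short[] speech)     { events.add("record"); }
    public void onPartialResult(String text) { events.add("partial:" + text); }
    public void onEndPointDetected()         { events.add("endpoint"); }
    public void onResult(String text)        { events.add("final:" + text); }
}
```

In a real app, each callback would update UI state instead of appending to a list; the ordering of the events is the part that carries over.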
iOS API
Here's how to use the iOS API.
Clone the Example for iOS or download it as a ZIP file and unzip it.
```
git clone https://github.com/NaverCloudPlatform/naverspeech-sdk-ios.git
```

or

```
wget https://github.com/NaverCloudPlatform/naverspeech-sdk-ios/archive/ncp.zip
unzip ncp.zip
```
In the iOS example, add the `framework/NaverSpeech.framework` directory to the Embedded Binaries of the app you are developing. Set the iOS Bundle Identifier as follows.
- Bundle Identifier: Must be the same as the iOS Bundle ID registered in NAVER Cloud Platform console.
- Set permissions: The user's voice input is recorded through the microphone and the recorded data is sent to the server, so set the `key` value as follows.

```
<key>NSMicrophoneUsageDescription</key>
<string></string>
```
- NAVER Open API provides the iOS framework as a universal (fat) binary. Because of this, the Enable Bitcode option in Build Settings is not supported, so set it to No.
- NAVER Open API supports iOS version 8 or later, so set the Deployment Target value accordingly.
- The client runs through a series of events: preparation, recording, intermediate result output, endpoint extraction, and final result output.
- The application developer implements the `NSKRecognizerDelegate` protocol to perform the desired action when these events occur.
For more information about the API, see the NAVER Speech documentation.
UX considerations
In general, users tend to start speaking as soon as they press the speech recognition button. However, when the `recognize()` method that initiates speech recognition is called, the app first has to allocate memory for recognition, acquire the microphone, connect to the speech recognition server, and authenticate, so the beginning of the user's utterance may be missed. The app should therefore tell the user that it is okay to speak only after all preparations are complete. This can be handled as follows.
- When everything is ready, the `onReady` callback method is called.
- Until the `onReady` callback method is called, display a message such as "We're getting ready." or a UI indication that preparation is in progress.
- Once the `onReady` callback method is called, display a message such as "Speak now." or a UI indication that recognition is available.
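Gating the prompt on `onReady` can be kept in a small piece of state. The `RecognitionPrompt` class below is a hypothetical helper, not an SDK class; only the `onReady` timing it models comes from this guide.

```java
// Hypothetical helper tracking whether speech recognition is ready
// and which message the UI should currently show.
class RecognitionPrompt {
    private volatile boolean ready = false;

    // Call this from the onReady callback.
    void markReady() { ready = true; }

    // The message to display at this moment.
    String message() {
        return ready ? "Speak now." : "We're getting ready.";
    }
}
```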
- (Android API) The callback methods of `SpeechRecognitionListener`, such as `onReady` and `onRecord`, are called from a worker thread, so they must be registered and used through a Handler.
- (iOS API) Once the `cancel()` method is called, the delegate methods are no longer invoked. Therefore, any work that must be done when speech recognition finishes has to be performed separately after calling the `cancel()` method.
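On Android, the hop from the SDK's worker thread back to the UI thread goes through a Handler. The sketch below models that pattern in plain Java, using a single-thread executor as a stand-in for the main-looper Handler (the Handler class itself is Android-only and not available here); `CallbackMarshaller` and its method names are illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Stand-in for posting work from the SDK worker thread to the UI thread.
// A single-thread executor plays the role of the Android main-looper Handler.
class CallbackMarshaller {
    private final ExecutorService mainThread = Executors.newSingleThreadExecutor();

    // Called from the worker-thread callback (e.g. onRecord, onResult);
    // queues the UI work onto the "main" thread.
    void post(Runnable uiWork) { mainThread.submit(uiWork); }

    // Flush pending work and stop; returns true when everything ran.
    boolean shutdownAndWait() {
        mainThread.shutdown();
        try {
            return mainThread.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The design point carried over from the note above is that the callback itself never touches UI state; it only posts a task to the thread that owns the UI.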
Response status codes
For response status codes common to all CLOVA Speech Recognition (CSR) APIs, see Common CLOVA Speech Recognition (CSR) response status codes.