Available in Classic and VPC
You may experience the following situations while using the CLOVA Speech service. Check the questions and answers and act accordingly.
Guide to managing gRPC connections
I would like to know if there is a recommended connection management method for gRPC connections.
Solution
- When you finish using voice recognition, we recommend closing the gRPC connection to release the channel.
  - You can close the channel with the standard calls your gRPC library provides, such as channel.shutdown or channel.close. For a detailed example, see Real-time streaming recognition.
  - You can keep the channel connected by maintaining a single stream over which voice recognition requests are made continuously.
    - Example: For a live broadcast, terminate the gRPC connection when the broadcast ends.
- We recommend applying timeout logic based on the transmission and reception of voice recognition data.
  - If the connection is kept open with no data being sent or received, problems may occur when acquiring a gRPC channel.
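The timeout recommendation above can be sketched as a small idle timer. This is only an illustration: the IdleTimer class, the close_if_idle helper, and the 30-second value in the usage note are hypothetical and are not part of the CLOVA Speech API. The close_channel callback stands in for whatever close call your gRPC library provides.

```python
import time


class IdleTimer:
    """Tracks how long it has been since audio was sent or a result was received."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        """Call after every chunk sent and every response received."""
        self._last_activity = self._clock()

    def expired(self):
        """Return True once no traffic has been seen for timeout_seconds."""
        return self._clock() - self._last_activity >= self.timeout_seconds


def close_if_idle(timer, close_channel):
    """Invoke the close callback when the timer has expired (hypothetical helper)."""
    if timer.expired():
        close_channel()
        return True
    return False
```

For example, you could create IdleTimer(30), call touch() from both the send and receive paths, and periodically call close_if_idle(timer, channel.close). The appropriate timeout depends on your service and is not specified in this guide.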
Whether gRPC service has connection lifetime limit
I would like to know if there is a limit to the connection lifetime (the duration of the connection between the server and client) for the gRPC service.
Solution
The gRPC service has a connection lifetime limit of 100 hours. Because the connection may also be dropped earlier due to network problems, we recommend implementing retry logic for stable service use.
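One common way to implement such retry logic is exponential backoff. The following is a minimal sketch, not an official client: run_with_retry, start_stream, and all default values are hypothetical. It retries on Python's built-in ConnectionError by default; with the Python gRPC library you would likely pass retry_on=(grpc.RpcError,) instead.

```python
import time


def run_with_retry(start_stream, max_attempts=5, base_delay=1.0, max_delay=30.0,
                   retry_on=(ConnectionError,), sleep=time.sleep):
    """Call start_stream() until it returns, retrying on connection errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return start_stream()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Double the wait after each failure, capped at max_delay seconds.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay)
```

Here start_stream is assumed to open a new channel and run one recognition stream, so each retry starts from a fresh connection.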
Whether gRPC service pausing is supported
I would like to know if the gRPC service supports pausing.
Solution
The gRPC service does not support pausing. However, you can achieve the same effect with the Recognize API: send a request with the epFlag entry in the extraContents field set to true, and then do not make another Recognize request for a period of time. For more information about the epFlag entry, see CLOVA Speech live streaming API.
- If you send a Recognize request without setting epFlag to true and do not send another request for a certain period of time, the server processes the buffered Recognize request based on the internally configured unvoiceTime (10 seconds) and returns the response results.
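As a rough sketch of the pause approach, the helper below prepares extraContents values for a run of audio chunks and sets epFlag to true only on the last chunk before the pause. It assumes that extraContents is sent as a JSON string and that seqId increases by one per request. Both are assumptions made for illustration, and the function names are hypothetical. Check CLOVA Speech live streaming API for the actual request layout.

```python
import json


def build_extra_contents(seq_id, ep_flag):
    """Serialize the epFlag and seqId entries for one Recognize request."""
    return json.dumps({"seqId": seq_id, "epFlag": ep_flag})


def extra_contents_before_pause(chunk_count, first_seq_id=1):
    """Return one extraContents value per chunk; only the last sets epFlag to true."""
    return [
        build_extra_contents(first_seq_id + i, i == chunk_count - 1)
        for i in range(chunk_count)
    ]
```

After sending the last value, the client simply stops sending Recognize requests until it is ready to resume.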
Usage of epFlag and seqId in extraContents field
I would like to know the usage of epFlag and seqId in the extraContents field of the Recognize API.
Solution
You can use them for pausing purposes, or to check if you have received a complete response to a request you sent.
Whether all responses have been received
I would like to know if I have received all the responses to the request I sent.
Solution
When calling the Recognize API, you can use the epFlag and seqId entries in the extraContents field. If you send a Recognize request with the epFlag entry set to true and the seqId entry set to any non-zero value, you can verify its result by comparing the epFlag and seqId values in the Recognize response. For more information, see the JSON response body of Recognize response.
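The comparison described above might look like the following sketch. The function name is hypothetical, and the location of the fields is an assumption: the code looks for epFlag and seqId inside a transcription object if one exists, and at the top level otherwise. Confirm the actual layout in the JSON response body reference before relying on it.

```python
import json


def is_response_for_request(response_body, sent_seq_id):
    """Return True if the response carries epFlag true and the seqId that was sent."""
    data = json.loads(response_body)
    # Assumption: the fields sit under "transcription" or at the top level.
    fields = data.get("transcription", data)
    return fields.get("epFlag") is True and fields.get("seqId") == sent_seq_id
```

A client could call this on each response after sending the final request and treat the first match as confirmation that all results for that request have arrived.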
Whether epFlag value is required to be true before calling Close API
I would like to know if it is mandatory to set the epFlag entry to true in the Recognize API's extraContents before calling the Close API.
Solution
It is not mandatory to set the epFlag entry to true. However, we recommend setting it to true if you want to receive the result of the last Recognize request quickly. For more information about the epFlag entry, see CLOVA Speech live streaming API.
Recognize API sound source data format
I would like to know about the sound source data format for the Recognize API.
Solution
Currently, we only support PCM (headerless raw wave) format at 16 kHz, 1 channel, 16 bits per sample.
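At 16 kHz, 1 channel, and 16 bits (2 bytes) per sample, one second of audio is 16,000 × 1 × 2 = 32,000 bytes. If your audio is stored as a WAV file, a sketch like the one below, using Python's standard wave module, can check the format and strip the header before the data is sent. The function and constant names are hypothetical.

```python
import io
import wave

SAMPLE_RATE_HZ = 16000
CHANNELS = 1
SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
BYTES_PER_SECOND = SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES  # 32000


def wav_to_raw_pcm(wav_bytes):
    """Strip the WAV header and return raw PCM, rejecting unsupported formats."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        params = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
        if params != (SAMPLE_RATE_HZ, CHANNELS, SAMPLE_WIDTH_BYTES):
            raise ValueError(f"unsupported audio format: {params}")
        return wav.readframes(wav.getnframes())
```

Audio in any other sample rate, channel count, or bit depth would need to be converted before calling the Recognize API.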