Available in Classic and VPC
Generate evidence-based RAG answers with the RAG Reasoning model, which is trained on answer formats such as source citations that increase credibility and citation index notation. RAG Reasoning calls the engine in the function calling format. You can specify one or more RAG functions, and the LLM autonomously selects the best function for the context to generate the search-augmented answer. Using it together with Reranker gives more stable results.
Request
This section describes the request format. The method and URI are as follows:
Method | URI |
---|---|
POST | /v1/api-tools/rag-reasoning |
Request headers
The following describes the request headers.
Field | Required | Description |
---|---|---|
Authorization | Required | API key for authentication. Example: Bearer nv-************ |
X-NCP-CLOVASTUDIO-REQUEST-ID | Optional | Request ID for the request |
Content-Type | Required | Request data format |
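As a minimal sketch, the headers in the table can be assembled as follows. The helper name build_headers is hypothetical, and the application/json value for Content-Type is taken from the curl examples on this page rather than from the table itself.

```python
from typing import Dict, Optional


def build_headers(api_key: str, request_id: Optional[str] = None) -> Dict[str, str]:
    """Assemble request headers (hypothetical helper, not part of any SDK)."""
    headers = {
        # Required: API key sent as a Bearer token.
        "Authorization": f"Bearer {api_key}",
        # Required: value assumed from the curl examples on this page.
        "Content-Type": "application/json",
    }
    # Optional: only sent when the caller supplies a request ID.
    if request_id is not None:
        headers["X-NCP-CLOVASTUDIO-REQUEST-ID"] = request_id
    return headers
```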
Request body
You can include the following data in the body of your request:
Field | Type | Required | Description |
---|---|---|---|
messages | Array | Required | Conversation messages |
topP | Double | Optional | Sample generated token candidates based on cumulative probability |
topK | Integer | Optional | Sample K high-probability candidates from the pool of generated token candidates |
maxTokens | Integer | Optional | Maximum number of generated tokens |
temperature | Double | Optional | Degree of diversity for the generated tokens (higher values generate more diverse sentences). temperature ≤ 1.00 (default: 0.50) |
repetitionPenalty | Double | Optional | Degree of penalty for generating the same token (the higher the setting, the less likely it is to generate the same result repeatedly). repetitionPenalty ≤ 2.0 (default: 1.1) |
stop | Array | Optional | Token generation stop character |
seed | Integer | Optional | Controls how consistent the output is across repeated runs. seed ≤ 4294967295: the seed value for results you want to generate consistently, or a user-specified seed value |
includeAiFilters | Boolean | Optional | Whether to display the AI filter results (degree of the generated results in categories such as profanity, degradation/discrimination/hate, sexual harassment/obscenity, etc.) |
tools | Array | Required | List of tools available for function calling: tools |
toolChoice | String or Object | Optional | Function calling tool call behavior |
toolChoice.type | String | Optional | Tool type to be called by the function calling model |
toolChoice.function | Object | Optional | Tool to be called by the function calling model |
toolChoice.function.name | String | Optional | Name of the tool to be called by the function calling model |
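The following sketch shows one way to assemble a request body and check the upper bounds stated in the table before sending. The function name build_request_body and the constants are hypothetical. Only the limits that appear in the table are checked, so any lower bounds enforced by the service are not covered here.

```python
from typing import Any, Dict, List, Optional

# Upper bounds taken from the request body table above.
MAX_TEMPERATURE = 1.0
MAX_REPETITION_PENALTY = 2.0
MAX_SEED = 4294967295


def build_request_body(
    messages: List[Dict[str, Any]],
    tools: List[Dict[str, Any]],
    *,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    seed: Optional[int] = None,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body (hypothetical helper); optional fields are omitted when unset."""
    if not messages:
        raise ValueError("messages is required")
    if not tools:
        raise ValueError("tools is required")
    body: Dict[str, Any] = {"messages": messages, "tools": tools}
    if temperature is not None:
        if temperature > MAX_TEMPERATURE:
            raise ValueError("temperature must be <= 1.00")
        body["temperature"] = temperature
    if repetition_penalty is not None:
        if repetition_penalty > MAX_REPETITION_PENALTY:
            raise ValueError("repetitionPenalty must be <= 2.0")
        body["repetitionPenalty"] = repetition_penalty
    if seed is not None:
        if seed > MAX_SEED:
            raise ValueError("seed must be <= 4294967295")
        body["seed"] = seed
    if max_tokens is not None:
        body["maxTokens"] = max_tokens
    return body
```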
messages
The following describes messages.
Field | Type | Required | Description |
---|---|---|---|
role | Enum | Required | Role of conversation messages |
content | String | Required | Conversation message content |
toolCalls | Array | Conditional | Tool call information from the assistant. If role is tool, enter it as the assistant's toolCalls request. |
toolCallId | String | Conditional | Tool ID |
If the role is tool, the content of messages should include the list of documents retrieved from the search database or search API (search_result). Include id: {unique ID of the document} and doc: {original document retrieved} in the search result (search_result) so that they can be used for citation marks in RAG answers. See below for an example.
{
  "role": "tool",
  "content": "{\"search_result\": [{\"id\": \"doc-1493058999\", \"doc\": \"Login with NAVER ID is only available for individual members. Business members can't use this feature.\"}, ...]}"
}
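Because content is a JSON document serialized into a string, escaping by hand is error-prone. A minimal sketch, assuming the {"search_result": [...]} shape used in the Step 2 request example on this page; the helper name build_tool_message is hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple


def build_tool_message(tool_call_id: str, documents: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Wrap retrieved documents in a role: tool message (hypothetical helper).

    documents is a list of (unique document ID, original document text) pairs.
    """
    search_result = [{"id": doc_id, "doc": text} for doc_id, text in documents]
    return {
        "role": "tool",
        # json.dumps handles quote escaping; ensure_ascii=False keeps non-ASCII text readable.
        "content": json.dumps({"search_result": search_result}, ensure_ascii=False),
        "toolCallId": tool_call_id,
    }
```

Passing the result through json.loads on the content field should give back the original list, which is a quick way to confirm the escaping is intact.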
tools
The following describes tools.
Field | Type | Required | Description |
---|---|---|---|
type | String | Required | Tool type |
function | Object | Required | Call function information |
function.name | String | Required | Function name |
function.description | String | Required | Function description |
function.parameters | Object | Required | Parameters passed when using the function |
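A minimal sketch of a single tools entry, mirroring the structure of the ncloud_cs_retrieval definition in the request examples on this page. The helper name build_retrieval_tool is hypothetical, and the single query parameter is an assumption carried over from those examples.

```python
from typing import Any, Dict


def build_retrieval_tool(name: str, description: str, query_description: str) -> Dict[str, Any]:
    """Build one tools entry with a single string parameter named query (hypothetical helper)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            # Parameter schema shaped like the one in the request examples.
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": query_description},
                },
                "required": ["query"],
            },
        },
    }
```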
toolCalls
The following describes toolCalls.
Field | Type | Required | Description |
---|---|---|---|
id | String | - | Tool identifier |
type | String | - | Tool type |
function | Object | - | Call function information |
function.name | String | - | Function name |
function.arguments | Object | - | Parameters passed when using the function |
Request example
The request example is as follows:
- Step 1. Enter your query with role: user and call the best function to generate the answer (Check response)

  curl --location --request POST 'https://clovastudio.stream.ntruss.com/v1/api-tools/rag-reasoning' \
  --header 'Authorization: Bearer <access_token>' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "messages": [
      {
        "content": "How to rent an A100 GPU",
        "role": "user"
      }
    ],
    "tools": [
      {
        "function": {
          "description": "This is the tool you use to do Ncloud-related searches.\nUse the tool by breaking up your query if you need to ask multiple questions.\nIf you cannot find information, you can use the tool again with suggested_queries as a reference without giving a final answer.",
          "name": "ncloud_cs_retrieval",
          "parameters": {
            "properties": {
              "query": {
                "description": "Refine and enter the search keywords from the user.",
                "type": "string"
              }
            },
            "required": [
              "query"
            ],
            "type": "object"
          }
        },
        "type": "function"
      }
    ],
    "toolChoice": "auto",
    "maxTokens": 1024
  }'
- Step 2. Send a request with role: tool to generate the final answer (Check response)

  curl --location --request POST 'https://clovastudio.stream.ntruss.com/v1/api-tools/rag-reasoning' \
  --header 'Authorization: Bearer <access_token>' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "messages": [
      {
        "content": "How to rent an A100 GPU",
        "role": "user"
      },
      {
        "role": "assistant",
        "content": "",
        "toolCalls": [
          {
            "id": "call_enTEYb0kWBjOwtkngbl7FGTm",
            "type": "function",
            "function": {
              "name": "ncloud_cs_retrieval",
              "arguments": {
                "query": "How to rent an A100 GPU"
              }
            }
          }
        ]
      },
      {
        "content": "{\"search_result\": [{\"id\": \"doc-179\", \"doc\": \"GPU A100 can only be created in KR-1. When creating an A100, select a subnet in KR-1. Up to 5 GPU servers can be created for corporate members only.\"}, {\"id\": \"doc-248\", \"doc\": \"GPU A100 servers can be created in the Services > Compute > Server menu. For more information, see the Create server guide.\"}, {\"id\": \"doc-156\", \"doc\": \"For individual members who need more GPU servers or need to create a GPU server, please refer to the FAQ and contact Support.\"}]}",
        "role": "tool",
        "toolCallId": "call_enTEYb0kWBjOwtkngbl7FGTm"
      }
    ],
    "tools": [
      {
        "function": {
          "description": "This is the tool you use to do Ncloud-related searches.\nUse the tool by breaking up your query if you need to ask multiple questions.\nIf you cannot find information, you can use the tool again with suggested_queries as a reference without giving a final answer.",
          "name": "ncloud_cs_retrieval",
          "parameters": {
            "properties": {
              "query": {
                "description": "Refine and enter the search keywords from the user.",
                "type": "string"
              }
            },
            "required": [
              "query"
            ],
            "type": "object"
          }
        },
        "type": "function"
      }
    ]
  }'
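The two steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the service's client: send and retrieve are hypothetical callables supplied by the caller (send posts a body to the endpoint and returns the parsed JSON response, retrieve runs the search and returns a list of id/doc dictionaries), and the max_rounds cap is an arbitrary safeguard in case the model keeps requesting tools.

```python
import json
from typing import Any, Callable, Dict, List


def answer_with_rag(
    question: str,
    tools: List[Dict[str, Any]],
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    retrieve: Callable[[str, Dict[str, Any]], List[Dict[str, str]]],
    max_rounds: int = 3,
) -> str:
    """Run the Step 1 / Step 2 exchange until the model returns a final answer."""
    messages: List[Dict[str, Any]] = [{"role": "user", "content": question}]
    for _ in range(max_rounds):
        response = send({"messages": messages, "tools": tools})
        message = response["result"]["message"]
        tool_calls = message.get("toolCalls") or []
        if not tool_calls:
            # No tool requested: content holds the final, citation-tagged answer.
            return message["content"]
        # Echo the assistant's toolCalls back, as in the Step 2 request.
        messages.append(
            {"role": "assistant", "content": message.get("content", ""), "toolCalls": tool_calls}
        )
        for call in tool_calls:
            docs = retrieve(call["function"]["name"], call["function"]["arguments"])
            messages.append(
                {
                    "role": "tool",
                    "toolCallId": call["id"],
                    "content": json.dumps({"search_result": docs}, ensure_ascii=False),
                }
            )
    raise RuntimeError("no final answer within max_rounds")
```

Injecting send keeps the HTTP client out of the sketch, so the flow can be exercised with a stub before any credentials are involved.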
Response
This section describes the response format.
Response headers
The following describes the response headers.
Headers | Required | Description |
---|---|---|
Content-Type | - | Response data format |
Response body
The response body includes the following data:
Field | Type | Required | Description |
---|---|---|---|
status | Object | - | Response status |
result | Object | - | Response result |
result.message | ChatMessage | - | Conversation message list |
result.message.role | Enum | - | Role of conversation messages |
result.message.content | String | - | Content of conversation messages |
result.message.thinkingContent | String | - | Decision flow in model |
result.message.toolCalls | Array | - | toolCalls |
result.usage | Object | - | Token usage |
result.usage.completionTokens | Integer | - | Generated token count |
result.usage.promptTokens | Integer | - | Number of input (prompt) tokens |
result.usage.totalTokens | Integer | - | Total number of tokens |
toolCalls
The following describes toolCalls.
Field | Type | Required | Description |
---|---|---|---|
id | String | - | Tool identifier |
type | String | - | Tool type |
function | Object | - | Call function information |
function.name | String | - | Function name |
function.arguments | Object | - | Parameters passed when using the function |
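A caller usually needs to tell a tool request apart from a final answer. The sketch below does so by checking whether toolCalls is present. The function name classify_response is hypothetical, and treating status code "20000" as the success value is an assumption based on the response examples on this page.

```python
from typing import Any, Dict, List, Tuple, Union


def classify_response(response: Dict[str, Any]) -> Tuple[str, Union[str, List[Dict[str, Any]]]]:
    """Return ("tool_calls", [...]) or ("answer", text) for a parsed response body."""
    status = response.get("status", {})
    # Assumption: "20000" is the success code, as seen in the examples.
    if status.get("code") != "20000":
        raise RuntimeError(f"request failed: {status}")
    message = response["result"]["message"]
    tool_calls = message.get("toolCalls") or []
    if tool_calls:
        return ("tool_calls", tool_calls)
    return ("answer", message.get("content", ""))
```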
Response example
The response example is as follows:
- Example response to Step 1. (Check response)

  {
    "status": {
      "code": "20000",
      "message": "OK"
    },
    "result": {
      "message": {
        "role": "assistant",
        "content": "",
        "thinkingContent": "The user has inquired about \"how to rent an A100 GPU\". To find the answer to this question, you need to use the tool \"ncloud_cs_retrieval\" to retrieve relevant information.",
        "toolCalls": [
          {
            "id": "call_enTEYb0kWBjOwtkngbl7FGTm",
            "type": "function",
            "function": {
              "name": "ncloud_cs_retrieval",
              "arguments": {
                "query": "How to rent an A100 GPU"
              }
            }
          }
        ]
      },
      "usage": {
        "promptTokens": 135,
        "completionTokens": 84,
        "totalTokens": 219
      }
    }
  }
- Example response to Step 2. (LLM returns the final answer) (Check response)

  {
    "status": {
      "code": "20000",
      "message": "OK"
    },
    "result": {
      "message": {
        "role": "assistant",
        "content": "To rent an A100 GPU, <doc-248>you can create a GPU A100 server from the Services > Compute > Server menu in the NAVER Cloud Platform console.</doc-248> However, <doc-179>GPU A100 can only be created in KR-1, and you must select a subnet in KR-1 when creating an A100.</doc-179> Also, <doc-179>up to 5 GPU servers can be created for corporate members only.</doc-179> If you need more GPU servers or are an individual member who needs to create a GPU server, <doc-156>please refer to the FAQ and contact Support.</doc-156>"
      },
      "usage": {
        "promptTokens": 332,
        "completionTokens": 146,
        "totalTokens": 478
      }
    }
  }
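The final answer wraps each cited span in a tag named after the document id from search_result. A minimal sketch for pulling those citations out or stripping them for display, assuming the tag format seen in the example response. The regular expression and function names are illustrative and not part of the API.

```python
import re
from typing import List, Tuple

# Matches <some-id>cited text</some-id>; the closing tag must repeat the opening id.
CITATION = re.compile(r"<([^<>/\s]+)>(.*?)</\1>", re.DOTALL)


def extract_citations(answer: str) -> List[Tuple[str, str]]:
    """Return (document id, cited text) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in CITATION.finditer(answer)]


def strip_citation_tags(answer: str) -> str:
    """Remove the citation tags and keep the cited text."""
    return CITATION.sub(lambda m: m.group(2), answer)
```

The extracted ids can be matched against the id values sent in the role: tool message to link each sentence back to its source document.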
Failure
The following is a sample response upon a failed call.