Create training

Available in Classic and VPC

Create a training task. Once the training is complete, you can call the Chat Completions API or the Completions API to generate conversational sentences and check the answers that CLOVA Studio produces.

Request

This section describes the request format. The method and URI are as follows:

Method	URI
POST	/tuning/v2/tasks

Request headers

The following describes the request headers.

Field	Required	Description
`Authorization`	Required	API key for authentication, e.g., `Bearer nv-************`
`X-NCP-CLOVASTUDIO-REQUEST-ID`	Optional	Request ID for the request
`Content-Type`	Required	Request data format `multipart/form-data`

Request body

You can include the following data in the body of your request:

Field	Type	Required	Description
`name`	String	Optional	Training name; auto-generated from the training creation date if omitted
`model`	String	Required	Model name to use for tuning
`tuningType`	String	Optional	Tuning method: `PEFT` (default)
`taskType`	String	Optional	Training type: `GENERATION` (default) \| `CLASSIFICATION`. `GENERATION`: generation; `CLASSIFICATION`: classification
`trainEpochs`	String	Optional	Number of training epochs; 1 ≤ `trainEpochs` ≤ 20 (default: 8)
`learningRate`	String	Optional	Learning rate (the degree to which model parameters are updated during tuning); 1.0E-6 ≤ `learningRate` ≤ 1 (default: 1.0E-4)
`trainingDatasetFilePath`	String	Conditional	Path to the training dataset file; required when the file is uploaded to the Object Storage service
`trainingDatasetBucket`	String	Conditional	Name of the bucket where the training dataset file is uploaded; required when `trainingDatasetFilePath` is entered
`trainingDatasetAccessKey`	String	Conditional	Access key for accessing the training dataset file; required when `trainingDatasetFilePath` is entered
`trainingDatasetSecretKey`	String	Conditional	Secret key for accessing the training dataset file; required when `trainingDatasetFilePath` is entered
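
The rules in the table above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the official API: `build_task_fields` and `DATASET_FIELDS` are hypothetical names, only a subset of the fields is covered, and treating the four dataset fields as an all-or-nothing group is a simplifying assumption.

```python
from typing import Dict, Optional

# Field names are taken from the request body table; this helper is hypothetical.
DATASET_FIELDS = (
    "trainingDatasetFilePath",
    "trainingDatasetBucket",
    "trainingDatasetAccessKey",
    "trainingDatasetSecretKey",
)


def build_task_fields(
    model: str,
    train_epochs: int = 8,
    task_type: str = "GENERATION",
    name: Optional[str] = None,
    dataset: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Validate the documented constraints and return form fields as strings."""
    if not model:
        raise ValueError("model is required")
    if not 1 <= train_epochs <= 20:
        raise ValueError("trainEpochs must be between 1 and 20")
    if task_type not in ("GENERATION", "CLASSIFICATION"):
        raise ValueError("taskType must be GENERATION or CLASSIFICATION")

    fields = {"model": model, "taskType": task_type, "trainEpochs": str(train_epochs)}
    if name:
        fields["name"] = name
    if dataset:
        # Assumption: when a dataset path is given, all four dataset fields are needed.
        missing = [key for key in DATASET_FIELDS if not dataset.get(key)]
        if missing:
            raise ValueError("missing dataset fields: " + ", ".join(missing))
        fields.update({key: dataset[key] for key in DATASET_FIELDS})
    return fields
```

All values are returned as strings because the table lists every request field with the String type.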

Request example

The request example is as follows:

curl --location --request POST 'https://clovastudio.apigw.ntruss.com/tuning/v2/tasks' \
--header 'Authorization: Bearer {API Key}' \
--header 'Content-Type: multipart/form-data' \
--header 'X-NCP-CLOVASTUDIO-REQUEST-ID: {Request ID}' \
--form 'name=generation_task' \
--form 'model=HCX-003' \
--form 'tuningType=PEFT' \
--form 'taskType=GENERATION' \
--form 'trainEpochs=8' \
--form 'learningRate=1e-5' \
--form 'trainingDatasetFilePath=root_path/sub_path/file_name' \
--form 'trainingDatasetBucket=bucket_name' \
--form 'trainingDatasetAccessKey=access_key' \
--form 'trainingDatasetSecretKey=secret_key'
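
The same request can be assembled in Python. The sketch below uses only the standard library and follows the `multipart/form-data` content type from the request headers table. `encode_multipart` and `build_request` are hypothetical helper names, and the code only builds the request object; sending it is left to the caller.

```python
import urllib.request
import uuid
from typing import Dict

# URL taken from the curl example above.
API_URL = "https://clovastudio.apigw.ntruss.com/tuning/v2/tasks"


def encode_multipart(fields: Dict[str, str], boundary: str) -> bytes:
    """Encode plain text fields as a multipart/form-data body."""
    lines = []
    for key, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{key}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")


def build_request(api_key: str, fields: Dict[str, str], request_id: str = "") -> urllib.request.Request:
    """Build (but do not send) the POST request for creating a training."""
    boundary = uuid.uuid4().hex
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    if request_id:  # optional header
        headers["X-NCP-CLOVASTUDIO-REQUEST-ID"] = request_id
    body = encode_multipart(fields, boundary)
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Sending is left to the caller, for example:
# with urllib.request.urlopen(build_request(api_key, fields)) as response:
#     raw_body = response.read().decode("utf-8")
```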

Response

This section describes the response format.

Response headers

The following describes the response headers.

Field	Required	Description
Content-Type	-	Response data format `application/json`

Response body

The response body includes the following data:

Field	Type	Required	Description
`result`	Object	-	Response result
`result.id`	String	-	Training ID
`result.name`	String	-	Training name
`result.model`	String	-	Tuning model name
`result.method`	String	-	Tuning method `LoRA`
`result.taskType`	String	-	Training type: `GENERATION` \| `CLASSIFICATION`. `GENERATION`: generation; `CLASSIFICATION`: classification
`result.trainEpochs`	Integer	-	Number of epochs trained on the model
`result.learningRate`	Double	-	Learning rate (the degree to which model parameters are updated during tuning)
`result.status`	String	-	Training status: `WAIT` \| `RUNNING` \| `FAILED` \| `SUCCEEDED`. `WAIT`: training pending; `RUNNING`: training in progress; `FAILED`: training stopped; `SUCCEEDED`: training completed
`result.statusInfo`	Object	-	Training progress details
`result.createdClientType`	String	-	Type of client that requested the training: `API` \| `WEB`. `API`: API client; `WEB`: web client
`result.createdDate`	String	-	Training creation date (ISO 8601 format)
`result.updatedDate`	String	-	Training modification date (ISO 8601 format)

`statusInfo`

The following describes statusInfo.

Field	Type	Required	Description
`label`	Array	-	When the training type is `CLASSIFICATION`: the user data labels. When the training type is `GENERATION`: `null`
`dataRows`	Integer	-	Number of data rows
`numOfTokens`	Integer	-	Number of data tokens
`currStep`	Integer	-	Current training step
`totalTrainSteps`	Integer	-	Total number of training steps
`currEpoch`	Integer	-	Current epoch
`totalTrainEpochs`	Integer	-	Total number of training epochs
`estimatedTime`	Integer	-	Estimated run time, derived by multiplying the average time per epoch by the total number of training epochs
`trainLoss`	Double	-	Training loss
`sendWeightSuccess`	Boolean	-	Whether to save training results: `false` \| `true`. `false`: don't save; `true`: save
`failureReason`	String	-	Reason for training failure (`FAILED`)
`message`	String	-	Detailed message for training failure (`FAILED`) reason
`endDatetime`	String	-	Training end date (in ISO 8601 format)
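
As an illustration of how the step counters in `statusInfo` might be used, the following Python sketch computes a completion fraction. `training_progress` is a hypothetical helper, and returning `None` while the counters are still `null` is an assumption about how a client would want to handle a pending training.

```python
from typing import Any, Dict, Optional


def training_progress(status_info: Dict[str, Any]) -> Optional[float]:
    """Return the completed fraction in [0, 1], or None if steps are not reported yet."""
    curr_step = status_info.get("currStep")
    total_steps = status_info.get("totalTrainSteps")
    # Both counters can be null (None) before training starts; avoid dividing by zero.
    if curr_step is None or not total_steps:
        return None
    return min(curr_step / total_steps, 1.0)
```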

`failureReason`, `message`

The following describes training failure reasons (failureReason) and the detailed message (message) for each training failure reason.

Training failure reason	Message details	Description
`file.extension`	`Unavailable file extension. Please check the file extension again.`	The data file extension does not match the requested `tuningType`
`file.size`	`Exceeded the disk usage limit. Please check if the file size is {limit} or less.`	The expected file size for the training request is exceeded
`file.encoding`	`Unsupported charset`	The file is not encoded in UTF-8-sig
`file.format`	`Invalid json format. {reason}`	Unable to decode the dataset file (.json/.jsonl)
`file.format`	`Invalid dataset: required field empty. {column}`	The dataset file does not have the required columns
`file.format`	`Invalid dataset: unexpected column. {column}`	The dataset file has unexpected columns
`file.format`	`Invalid dataset: duplicate columns. {column}`	The dataset file contains duplicate columns (column names are not case-sensitive)
`file.format`	`Invalid dataset: column order`	The column order in the dataset file deviates from System_Prompt, C_ID, T_ID, Text, and Completion
`file.format`	`Invalid dataset: {column}`	The C_ID (or T_ID) in the dataset file does not satisfy the pattern of starting at 0 and incrementing by 1, or the value is empty
`file.error`	-	File read error
`resource.timeout`	-	Response timeout due to GPU acquisition failure; retry required
`clops.error`	-	CLOps error while training
`train.unknown`	-	Non-file related error while training
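
A client might use the failure reasons above to decide how to react to a `FAILED` training. The Python sketch below is one possible approach, not an official recommendation: `describe_failure` and `RETRYABLE_REASONS` are hypothetical names, and grouping every `file.*` reason as a dataset problem is an assumption based on the descriptions in the table.

```python
from typing import Any, Dict, Optional

# "resource.timeout" is the only reason the table marks as "retry required".
RETRYABLE_REASONS = {"resource.timeout"}


def describe_failure(status_info: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Summarize the failure fields of a statusInfo object, or return None if absent."""
    reason = status_info.get("failureReason")
    if reason is None:
        return None  # no failure recorded
    return {
        "reason": reason,
        # Assumption: reasons starting with "file." point at the dataset file itself.
        "dataset_problem": reason.startswith("file."),
        # Retrying the same request unchanged only makes sense for retryable reasons.
        "retry_as_is": reason in RETRYABLE_REASONS,
        "message": status_info.get("message") or "",
    }
```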

Response example

The response example is as follows:

Succeeded

The following is a sample response upon a successful call.

{
    "status": {
        "code": "20000",
        "message": "OK"
    },
    "result": {
        "id": "czf9fbky",
        "name": "230821-130704",
        "model": "HCX-003",
        "method": "LoRA",
        "taskType": "GENERATION",
        "trainEpochs": 8,
        "learningRate": 1.0E-4,
        "status": "WAIT",
        "statusInfo": {
            "label": null,
            "dataRows": null,
            "numOfTokens": null,
            "currStep": null,
            "totalTrainSteps": null,
            "currEpoch": null,
            "totalTrainEpochs": null,
            "estimatedTime": null,
            "trainLoss": null,
            "sendWeightSuccess": null,
            "failureReason": null,
            "message": null,
            "endDatetime": null
        },
        "createdClientType": "API",
        "createdDate": "2023-08-21T13:07:06+0900",
        "updatedDate": "2023-08-21T13:07:06+0900"
    }
}
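
A response of this shape could be parsed as in the following Python sketch. `parse_create_response` is a hypothetical helper, and treating any `status.code` other than `"20000"` as an error is an assumption, since only the success code appears in this sample.

```python
import json
from typing import Tuple


def parse_create_response(body: str) -> Tuple[str, str]:
    """Return (training ID, training status) from a create-training response body."""
    payload = json.loads(body)
    code = payload.get("status", {}).get("code")
    # Assumption: "20000" is the only code that signals success.
    if code != "20000":
        raise RuntimeError(f"unexpected status code: {code}")
    result = payload["result"]
    return result["id"], result["status"]
```

The returned training ID is the value to keep for later calls that refer to this training.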

Failure

The following is a sample response upon a failed call.
