Create training


Available in Classic and VPC

Create a training. After the training is complete, you can call the Chat Completions API or the Completions API to generate text and review the responses that CLOVA Studio produces.

Request

This section describes the request format. The method and URI are as follows:

Method URI
POST /tuning/v2/tasks

Request headers

The following describes the request headers.

Field Required Description
Authorization Required API key for authentication Example: Bearer nv-************
X-NCP-CLOVASTUDIO-REQUEST-ID Optional Request ID for the request
Content-Type Required Request data format
  • multipart/form-data
Note

The HCX-005 and HCX-DASH-002 models do not currently support training.

Request body

You can include the following data in the body of your request:

Field Type Required Description
name String Optional Training name
  • If omitted, a name is generated automatically from the training creation date.
model String Required Model name to use for tuning
tuningType String Optional Tuning method
  • PEFT (default)
trainEpochs String Optional Number of training epochs
  • 1 ≤ trainEpochs ≤ 20 (default: 8)
learningRate String Optional Learning rate (degree to which model parameters are updated during tuning)
  • 1.0E-6 ≤ learningRate ≤ 1 (default: 1.0E-4)
trainingDatasetFilePath String Conditional Path to the dataset file to train on
  • Required when the dataset file is uploaded to the Object Storage service
trainingDatasetBucket String Conditional Name of the bucket where the dataset file to be trained is uploaded
  • Required when entering trainingDatasetFilePath
trainingDatasetAccessKey String Conditional Access key for accessing large dataset file to train on
  • Required when entering trainingDatasetFilePath
trainingDatasetSecretKey String Conditional Secret key for accessing large dataset file to train on
  • Required when entering trainingDatasetFilePath
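
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name validate_training_request is hypothetical, and the learning rate range is read as 1.0E-6 to 1, which is an assumption based on the documented default of 1.0E-4.

```python
from typing import Any

# Fields the table marks as required once trainingDatasetFilePath is set.
DATASET_FIELDS = (
    "trainingDatasetBucket",
    "trainingDatasetAccessKey",
    "trainingDatasetSecretKey",
)


def validate_training_request(params: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems: list[str] = []

    if not params.get("model"):
        problems.append("model is required")

    # Values are sent as strings, so convert before comparing.
    epochs = int(params.get("trainEpochs", 8))
    if not 1 <= epochs <= 20:
        problems.append("trainEpochs must be between 1 and 20")

    # Assumed range: 1.0E-6 <= learningRate <= 1 (default 1.0E-4).
    rate = float(params.get("learningRate", 1.0e-4))
    if not 1.0e-6 <= rate <= 1.0:
        problems.append("learningRate must be between 1.0E-6 and 1")

    if params.get("trainingDatasetFilePath"):
        for field in DATASET_FIELDS:
            if not params.get(field):
                problems.append(f"{field} is required with trainingDatasetFilePath")

    return problems
```

Returning a list instead of raising on the first problem lets a caller report every issue at once. The server remains the authority on what is valid; this check only catches obvious mistakes early.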

Request example

The request example is as follows. Because the body is sent as multipart/form-data, each field is passed with --form; curl then sets the Content-Type header, including the boundary, automatically.

curl --location --request POST 'https://clovastudio.stream.ntruss.com/tuning/v2/tasks' \
--header 'Authorization: Bearer {API Key}' \
--header 'X-NCP-CLOVASTUDIO-REQUEST-ID: {Request ID}' \
--form 'name="generation_task"' \
--form 'model="HCX-003"' \
--form 'tuningType="PEFT"' \
--form 'trainEpochs="8"' \
--form 'learningRate="1e-5"' \
--form 'trainingDatasetFilePath="root_path/sub_path/file_name"' \
--form 'trainingDatasetBucket="bucket_name"' \
--form 'trainingDatasetAccessKey="access_key"' \
--form 'trainingDatasetSecretKey="secret_key"'
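
The same request could be prepared from Python with only the standard library. This is a sketch under assumptions: it sends every parameter as a plain multipart/form-data text field, and the helper names encode_multipart and build_request are hypothetical. Only the URL, header names, and field names come from this page.

```python
import urllib.request
import uuid
from typing import Optional

API_URL = "https://clovastudio.stream.ntruss.com/tuning/v2/tasks"


def encode_multipart(fields: dict[str, str]) -> tuple[bytes, str]:
    """Encode text fields as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    lines: list[bytes] = []
    for name, value in fields.items():
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(b"")  # blank line separates part headers from the value
        lines.append(str(value).encode("utf-8"))
    lines.append(f"--{boundary}--".encode())  # closing boundary
    lines.append(b"")
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def build_request(
    api_key: str, fields: dict[str, str], request_id: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) the POST request for creating a training."""
    body, content_type = encode_multipart(fields)
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": content_type,
    }
    if request_id:
        headers["X-NCP-CLOVASTUDIO-REQUEST-ID"] = request_id
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")
```

To send the request, pass the returned object to urllib.request.urlopen and read the JSON body from the response. Building the request separately from sending it makes the encoding step easy to inspect without network access.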

Response

This section describes the response format.

Response headers

The following describes the response headers.

Headers Required Description
Content-Type - Response data format
  • application/json

Response body

The response body includes the following data:

Field Type Required Description
result Object - Response result
result.id String - Training ID
result.name String - Training name
result.model String - Tuning model name
result.taskType String - Job type
  • GENERATION: generation (default)
  • CLASSIFICATION: classification
result.trainEpochs Integer - Number of training epochs
result.learningRate Double - Learning rate (degree to which model parameters are updated during tuning)
result.status String - Training progress
  • WAIT | RUNNING | FAILED | SUCCEEDED
    • WAIT: training pending
    • RUNNING: training in progress
    • FAILED: training stopped
    • SUCCEEDED: training completed
result.statusInfo Object - Training progress details
result.createdClientType String - Type of client requesting training
  • API | WEB
    • API: API client
    • WEB: web client
result.createdDate String - Training creation date (ISO 8601 format)
result.updatedDate String - Training modification date (ISO 8601 format)

statusInfo

The following describes statusInfo.

Field Type Required Description
label Array -
  • When the training type is CLASSIFICATION: User data labels are displayed.
  • When the training type is GENERATION: null
dataRows Integer - Number of data rows
numOfTokens Integer - Number of tokens in the data
currStep Integer - Current training step
totalTrainSteps Integer - Total number of training steps
currEpoch Integer - Current epoch
totalTrainEpochs Integer - Total number of training epochs
estimatedTime Integer - Estimated run time
  • Derived by multiplying the average time of 1 epoch by the total number of training epochs
trainLoss Double - Training loss
sendWeightSuccess Boolean - Whether the training results were saved
  • false | true
    • false: not saved
    • true: saved
failureReason String - Reason for training failure (FAILED)
message String - Detailed message for training failure (FAILED) reason
endDatetime String - Training end date (in ISO 8601 format)
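
The progress fields above can be combined into a simple progress indicator, and the estimatedTime rule can be written out as arithmetic. The sketch below is illustrative only: both function names are hypothetical, and the unit of estimatedTime is not stated on this page, so the result carries whatever unit the average epoch time uses.

```python
from typing import Any, Optional


def progress_fraction(status_info: dict[str, Any]) -> Optional[float]:
    """Return training progress in [0, 1], or None while the fields are still null."""
    curr = status_info.get("currStep")
    total = status_info.get("totalTrainSteps")
    if curr is None or not total:
        return None
    return min(curr / total, 1.0)


def estimated_time(avg_epoch_time: float, total_train_epochs: int) -> float:
    """Average time of one epoch multiplied by the total number of epochs."""
    return avg_epoch_time * total_train_epochs
```

For example, a training at step 50 of 200 gives a fraction of 0.25, and an average epoch time of 120 over 8 epochs gives an estimate of 960 in the same unit.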

failureReason, message

The following describes training failure reasons (failureReason) and the detailed message (message) for each training failure reason.

Training failure reason Message details Description
file.extension Unavailable file extension. Please check the file extension again. The data file extension does not match the requested tuningType.
file.size Exceeded the disk usage limit. Please check if the file size is {limit} or less. The file exceeds the size limit allowed for the training request.
file.encoding Unsupported charset The file is not encoded as UTF-8-sig.
file.format Invalid json format. {reason} Unable to decode the dataset file (.json/.jsonl)
file.format Invalid dataset: required field empty. {column} The dataset file does not have the required columns.
file.format Invalid dataset: unexpected column. {column} The dataset file has unexpected columns.
file.format Invalid dataset: duplicate columns. {column} The dataset file contains duplicate columns (column names are compared without case sensitivity).
file.format Invalid dataset: column order The column order in the dataset file deviates from System_Prompt, C_ID, T_ID, Text, and Completion.
file.format Invalid dataset: {column} The C_ID (or T_ID) in the dataset file does not satisfy the pattern of starting at 0 and incrementing by 1, or the value is empty.
file.error - The file could not be read during training.
file.noexist - The dataset file does not exist.
File not found - The file cannot be found in Object Storage.
resource.timeout - Response timeout due to GPU acquisition failure; retry required
clops.error - CLOps error while training
train.unknown - Non-file related error while training
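
A client can use failureReason to decide what to do after a FAILED training. The following sketch groups the reasons listed above into three actions. The function name and the action labels ("retry", "fix_dataset", "investigate") are hypothetical; only the failureReason values and the note that resource.timeout requires a retry come from the table.

```python
def suggested_action(failure_reason: str) -> str:
    """Map a failureReason value to a coarse next step for the caller."""
    if failure_reason == "resource.timeout":
        # The table states that a retry is required for this reason.
        return "retry"
    if failure_reason.startswith("file.") or failure_reason == "File not found":
        # Problems with the dataset file itself or its location.
        return "fix_dataset"
    # clops.error, train.unknown, and anything not listed above.
    return "investigate"
```

Matching on the "file." prefix keeps the mapping short and also covers any file-related reasons that may be added later, at the cost of being less explicit than a full lookup table.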

Response example

The response example is as follows:

Succeeded

The following is a sample response upon a successful call.

{
    "status": {
        "code": "20000",
        "message": "OK"
    },
    "result": {
        "id": "czf9fbky",
        "name": "230821-130704",
        "model": "HCX-003",
        "trainEpochs": 8,
        "learningRate": 1.0E-4,
        "status": "WAIT",
        "statusInfo": {
            "label": null,
            "dataRows": null,
            "numOfTokens": null,
            "currStep": null,
            "totalTrainSteps": null,
            "currEpoch": null,
            "totalTrainEpochs": null,
            "estimatedTime": null,
            "trainLoss": null,
            "sendWeightSuccess": null,
            "failureReason": null,
            "message": null,
            "endDatetime": null
        },
        "createdClientType": "API",
        "createdDate": "2023-08-21T13:07:06+0900",
        "updatedDate": "2023-08-21T13:07:06+0900"
    }
}
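
A response like the one above could be read as follows. This is a minimal sketch: the function name parse_training_response is hypothetical, and treating any status.code other than "20000" as an error is an assumption, since only the success code appears in this sample.

```python
import json

# Statuses after which the training no longer changes state.
TERMINAL_STATUSES = {"FAILED", "SUCCEEDED"}


def parse_training_response(raw: str) -> dict:
    """Extract the training ID and status from a Create training response."""
    payload = json.loads(raw)
    code = payload["status"]["code"]
    if code != "20000":  # assumption: any other code indicates a failed call
        raise RuntimeError(f"request failed: {code} {payload['status'].get('message')}")
    result = payload["result"]
    return {
        "id": result["id"],
        "status": result["status"],
        "finished": result["status"] in TERMINAL_STATUSES,
    }


sample = (
    '{"status": {"code": "20000", "message": "OK"}, '
    '"result": {"id": "czf9fbky", "name": "230821-130704", "status": "WAIT"}}'
)
print(parse_training_response(sample))
# prints {'id': 'czf9fbky', 'status': 'WAIT', 'finished': False}
```

A newly created training normally starts in WAIT, so finished is False here. The returned id is the value to keep for later calls that refer to this training.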

Failure

The following is a sample response upon a failed call.