LoRAs

Upload PEFT-trained LoRAs for supported base models. After upload, give the LoRA a name and pass that name or ID in request metadata. If you train LoRAs with Tinker, you can also sample directly from Tinker checkpoints without uploading — see Tinker.

Supported base models

LoRA serving is available for:

Base model	Max rank	Target modules
`moonshotai/Kimi-K2.6`	32	all

If a LoRA is incompatible with the base model or exceeds the rank limit, validation or inference fails.

Adapter requirements

Train with PEFT and export the two standard files:

adapter_config.json
adapter_model.safetensors

Use adapter config values compatible with the base model:

base_model_name_or_path should identify the base model you register in supported_models (e.g. moonshotai/Kimi-K2.6).
peft_type should be "LORA".
task_type should be "CAUSAL_LM".
r (rank) must be no more than the base model’s max rank.
target_modules can include any modules supported by the base model and runtime. Sail does not restrict Kimi K2.6 LoRAs to a fixed target-module allowlist.

Other adapter-config fields (lora_alpha, lora_dropout, bias settings, etc.) are preserved as-is. The adapter_config.json and adapter_model.safetensors files can each be up to 100 GiB.
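
The requirements above can be checked locally before uploading. The following is a minimal sketch, not part of the Sail API: check_adapter_config, check_adapter_config_file, and MAX_RANK are hypothetical names, the limit of 32 is taken from the Kimi K2.6 row of the table above, and the check that r is a positive integer is an assumption about well-formed PEFT configs. Passing this check does not guarantee that server-side validation succeeds.

```python
import json

# Assumption: rank limit for moonshotai/Kimi-K2.6, copied from the table above.
MAX_RANK = 32


def check_adapter_config(config: dict, max_rank: int = MAX_RANK) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if config.get("peft_type") != "LORA":
        problems.append('peft_type should be "LORA"')
    if config.get("task_type") != "CAUSAL_LM":
        problems.append('task_type should be "CAUSAL_LM"')

    rank = config.get("r")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(rank, int) or isinstance(rank, bool) or rank < 1:
        problems.append("r must be a positive integer")
    elif rank > max_rank:
        problems.append(f"r={rank} exceeds the max rank of {max_rank}")

    if not config.get("base_model_name_or_path"):
        problems.append("base_model_name_or_path is missing")
    return problems


def check_adapter_config_file(path: str, max_rank: int = MAX_RANK) -> list[str]:
    """Load adapter_config.json from disk and run the same checks."""
    with open(path, "r", encoding="utf-8") as f:
        return check_adapter_config(json.load(f), max_rank)
```

For example, check_adapter_config_file("my-adapter/adapter_config.json") returns an empty list when none of these checks fail.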

Add a LoRA

Create a LoRA by uploading the config file and the weights file, then calling POST /v1/loras. The response includes one validation record for each requested model.

1. Upload the two adapter files

Use POST /v1/files (multipart):

from openai import OpenAI

client = OpenAI(base_url="https://api.sailresearch.com/v1", api_key="YOUR_KEY")

with open("my-adapter/adapter_config.json", "rb") as f:
    cfg = client.files.create(file=f, purpose="lora")

with open("my-adapter/adapter_model.safetensors", "rb") as f:
    wts = client.files.create(file=f, purpose="lora")

2. Create the LoRA

import requests

resp = requests.post(
    "https://api.sailresearch.com/v1/loras",
    headers={"Authorization": "Bearer YOUR_KEY"},
    json={
        "name": "funnier-v1",
        "supported_models": ["moonshotai/Kimi-K2.6"],
        "config_file_id": cfg.id,
        "weights_file_id": wts.id,
        # optional
        "display_name": "Funnier v1",
        "description": "Fine-tuned on a standup-comedy corpus to write punchier, joke-forward replies.",
    },
    timeout=30,
)
resp.raise_for_status()
lora = resp.json()
print(lora["id"], lora["status"])  # e.g. 3fa85f64-... pending_validation

Each supported_models entry must be a known Sail model ID. Naming rules for the name field:

2–64 characters
lowercase alphanumeric or dashes ([a-z0-9-])
must start and end with an alphanumeric character
unique within your organization (duplicate → 409)

The file IDs you pass must belong to the same organization as your API key.
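
The naming rules above can be checked client-side before calling POST /v1/loras. This is a sketch under the rules as listed; is_valid_lora_name is a hypothetical helper, and uniqueness within your organization can only be determined by the server.

```python
import re

# 2-64 characters from [a-z0-9-], starting and ending with an alphanumeric:
# one leading character, up to 62 middle characters, one trailing character.
_LORA_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{0,62}[a-z0-9]")


def is_valid_lora_name(name: str) -> bool:
    """Return True if name satisfies the length and character rules."""
    return _LORA_NAME_RE.fullmatch(name) is not None
```

With this helper, "funnier-v1" is accepted, while "Funnier", "-funnier", and a single-character name are rejected.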

Validation flow

POST /v1/loras creates one validation record per supported_models entry. Each record appears in the response under validations:

{
  "id": "3fa85f64-...",
  "name": "funnier-v1",
  "status": "pending_validation",
  "supported_models": ["moonshotai/Kimi-K2.6"],
  "validations": [
    {
      "model": "moonshotai/Kimi-K2.6",
      "status": "pending",
      "response_id": "resp_..."
    }
  ]
}

Sail validates the LoRA on each model in supported_models. Poll GET /v1/loras/{name} until validation finishes for the model you want to use.

Validation status	Meaning
`pending`	The validation task has been created.
`running`	The validation request is in progress.
`succeeded`	The LoRA loaded and completed a validation request for that model.
`failed`	The LoRA is not usable for that model. `result_code` and `result_message` describe the failure.
`unverified`	Sail could not complete validation automatically.

Read validations for model-specific status. Requests using a LoRA are rejected only when the latest validation for that model is failed; a LoRA whose latest record is pending, running, or unverified remains usable. If you later add a model with PATCH /v1/loras/{name}, Sail creates validation records for newly added models that have not already passed validation.
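
The polling described above might look like the following sketch. wait_for_validation is a hypothetical helper, not part of any Sail SDK. It takes a fetch_lora callable (for example, one that wraps GET /v1/loras/{name} and returns the parsed JSON) so the loop itself makes no assumptions about the HTTP client. Two assumptions are made here: succeeded, failed, and unverified are treated as statuses that will not change on their own, and when several records exist for one model the last one in validations is treated as the latest.

```python
import time
from typing import Callable

# Assumption: these statuses are treated as final for polling purposes.
FINAL_STATUSES = {"succeeded", "failed", "unverified"}


def wait_for_validation(
    fetch_lora: Callable[[], dict],
    model: str,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the validation record for `model` reaches a final status."""
    deadline = time.monotonic() + timeout_s
    while True:
        lora = fetch_lora()
        record = None
        for validation in lora.get("validations", []):
            if validation.get("model") == model:
                # Assumption: the last matching record is the latest one.
                record = validation
        if record is not None and record.get("status") in FINAL_STATUSES:
            return record
        if time.monotonic() >= deadline:
            raise TimeoutError(f"validation for {model} did not finish in time")
        sleep(interval_s)
```

A caller could pass something like lambda: requests.get(url, headers=headers, timeout=30).json() as fetch_lora, then inspect the returned record's status and, on failure, its result_code and result_message.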

3. Fetch LoRAs

# by name
curl -H "Authorization: Bearer $SAIL_API_KEY" https://api.sailresearch.com/v1/loras/funnier-v1

# by id
curl -H "Authorization: Bearer $SAIL_API_KEY" https://api.sailresearch.com/v1/loras/3fa85f64-...

# list all loras for your org
curl -H "Authorization: Bearer $SAIL_API_KEY" https://api.sailresearch.com/v1/loras

Use a LoRA

Pass the LoRA’s name (or its UUID) as metadata.lora on any Responses, Chat Completions, or Messages request. model must be one of the LoRA’s supported_models:

response = client.responses.create(
    model="moonshotai/Kimi-K2.6",
    input=[{"role": "user", "content": "Write a one-liner about a centrifuge that's having a bad day."}],
    metadata={
        "lora": "funnier-v1",
        "completion_window": "priority",
    },
    background=True,
)

Chat Completions:

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2.6",
    messages=[{"role": "user", "content": "Tell me a joke about Go channels."}],
    extra_body={"metadata": {"lora": "funnier-v1", "completion_window": "priority"}},
)

Constraints on LoRA requests

completion_window cannot be asap. Use priority, standard, or flex — see Completion Windows for what each means.
model must be in the LoRA’s supported_models list. Requesting a different base model returns 400.
A failed model validation blocks that model. GET /v1/loras/{name} includes validations[].result_message when validation fails.
The LoRA must belong to your organization. Names are scoped per-org; two orgs can independently own a LoRA called funnier-v1.
You can use either the LoRA’s name or its UUID in metadata.lora.
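
The constraints above can be checked before sending a request. The sketch below assumes lora is the JSON object returned by GET /v1/loras/{name}; build_lora_metadata is a hypothetical helper, and the assumption that the last record for a model in validations is the latest one is not confirmed by this page. The server remains the authority on whether a request is accepted.

```python
# Completion windows allowed for LoRA requests, per the list above (asap is excluded).
ALLOWED_WINDOWS = {"priority", "standard", "flex"}


def build_lora_metadata(lora: dict, model: str, completion_window: str = "standard") -> dict:
    """Return the metadata dict for a LoRA request, or raise ValueError."""
    if completion_window not in ALLOWED_WINDOWS:
        raise ValueError(f"completion_window {completion_window!r} is not allowed for LoRA requests")
    if model not in lora.get("supported_models", []):
        raise ValueError(f"{model} is not in the LoRA's supported_models")

    latest = None
    for validation in lora.get("validations", []):
        if validation.get("model") == model:
            # Assumption: the last matching record is the latest one.
            latest = validation
    if latest is not None and latest.get("status") == "failed":
        message = latest.get("result_message", "no result_message provided")
        raise ValueError(f"validation failed for {model}: {message}")

    return {"lora": lora["name"], "completion_window": completion_window}
```

The returned dict can be passed as metadata to client.responses.create, or as extra_body={"metadata": ...} to client.chat.completions.create, as in the examples earlier on this page.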
