Supported base models
LoRA serving is currently in a limited pilot with select partners. If you’d
like to fine-tune and serve LoRAs on Sail, get in
touch.
| Base model | Max rank | Target modules |
|---|---|---|
moonshotai/Kimi-K2.6 | 32 | all |
Adapter requirements
Train with PEFT and export the two standard files:adapter_config.jsonadapter_model.safetensors
base_model_name_or_pathshould identify the base model you register insupported_models(e.g.moonshotai/Kimi-K2.6).peft_typeshould be"LORA".task_typeshould be"CAUSAL_LM".r(rank) must be no more than the base model’s max rank.target_modulescan include any modules supported by the base model and runtime. Sail does not restrict Kimi K2.6 LoRAs to a fixed target-module allowlist.
lora_alpha, lora_dropout, bias settings, etc.) are preserved as-is.
The adapter_config.json and adapter_model.safetensors files can each be up to 100 GiB.
Add a LoRA
Create a LoRA in three steps: upload the config file, upload the weights file, then callPOST /v1/loras. The response includes validation records for each requested model.
1. Upload the two adapter files
UsePOST /v1/files (multipart):
2. Create the LoRA
supported_models entry must be a known Sail model ID.
Naming rules for the name field:
- 2–64 characters
- lowercase alphanumeric or dashes (
[a-z0-9-]) - must start and end with an alphanumeric character
- unique within your organization (duplicate →
409)
Validation flow
POST /v1/loras creates one validation record per supported_models entry. Each record appears in the response under validations:
supported_models. Poll GET /v1/loras/{name} until validation finishes for the model you want to use.
| Validation status | Meaning |
|---|---|
pending | The validation task has been created. |
running | The validation request is in progress. |
succeeded | The LoRA loaded and completed a validation request for that model. |
failed | The LoRA is not usable for that model. result_code and result_message describe the failure. |
unverified | Sail could not complete validation automatically. |
validations for model-specific status. Requests using a LoRA are rejected only when the latest validation for that model is failed; pending, running, and unverified records remain usable.
If you later add a model with PATCH /v1/loras/{name}, Sail creates validation records for newly added models that have not already succeeded validation.
3. Fetch LoRAs
Use a LoRA
Pass the LoRA’s name (or its UUID) asmetadata.lora on any Responses, Chat Completions, or Messages request. model must be one of the LoRA’s supported_models:
Constraints on LoRA requests
completion_windowcannot beasap. Usepriority,standard, orflex— see Completion Windows for what each means.modelmust be in the LoRA’ssupported_modelslist. Requesting a different base model returns400.- A failed model validation blocks that model.
GET /v1/loras/{name}includesvalidations[].result_messagewhen validation fails. - The LoRA must belong to your organization. Names are scoped per-org; two orgs can independently own a LoRA called
funnier-v1. - You can use either the LoRA’s name or its UUID in
metadata.lora.