Skip to main content
In SambaStack, you can deploy your own custom or fine-tuned checkpoints for inference in the same manner as deploying standard model offerings, with a few additional steps to prepare your checkpoint for use in the platform. Once prepared and deployed, custom checkpoints behave just like any other checkpoint you deploy on SambaStack.

Overview

Deploying a custom checkpoint involves four high-level actions:
  1. Convert your checkpoint into a SambaNova-compatible format using the Checkpoint Conversion Tool.
  2. Upload your converted checkpoint to your private Google Cloud Storage bucket configured with read permissions granted to your SambaNova-provided service account.
  3. Register your checkpoint by creating a Model Manifest.
  4. Reference the checkpoint in your deployment Bundle by providing its path and specifying a name for it.
Before starting this workflow, ensure you have completed the checkpoint conversion process. See the Checkpoint Conversion Tool page for instructions.

Prerequisites

Before deploying a custom checkpoint, ensure you have:
  • A converted checkpoint in SambaNova-compatible format (see Checkpoint Conversion Tool)
  • Access to a Google Cloud Storage (GCS) bucket
  • Your SambaNova-provided service account JSON file
  • kubectl configured with access to your SambaStack cluster
  • Familiarity with Model deployment concepts including Bundles and Bundle Templates

Supported models for custom checkpoints

Custom checkpoint deployment is supported for a growing set of base models in SambaStack. On the SambaStack models page, the Features and optimizations section includes a field called Import checkpoint. Models marked Import checkpoint: Yes allow you to deploy your own custom or fine-tuned checkpoint for that model family.

Steps to deploy a custom checkpoint

1

Convert your checkpoint

Custom or fine-tuned checkpoints must be converted into a format optimized for SambaNova’s SN40L hardware before they can be deployed. SambaNova provides a Checkpoint Conversion Tool, delivered as a Docker container that you can run locally. The tool generates converted checkpoint artifacts that can then be uploaded and deployed for inference on SambaStack.To begin, follow the instructions in the Download and set up section of the Checkpoint Conversion Tool documentation. Setup is complete once you have downloaded the conversion tool container and synced the model metadata with your specific SambaStack instance.After setup, use the steps described in the Convert and validate checkpoint section of the Checkpoint Conversion Tool documentation to convert your custom checkpoint into the SambaNova-compatible format.
2

Configure GCS bucket permissions

SambaStack uses Google Cloud Storage (GCS) to store checkpoints and other SambaStack artifacts. For custom checkpoints, you’ll store the converted checkpoint artifacts in your own GCS bucket. To make these artifacts available to SambaStack during deployment, your SambaNova-provided service account needs read access to your bucket.
This is a one-time setup step. After permissions are in place, you can upload any number of custom checkpoints to your bucket and use them directly in your deployments.

Identifying your service account

Your service account information is provided as a JSON file. Locate the client_email field - this is the identity that needs read access to your bucket. For example:
{
  "type": "service_account",
  "project_id": "example-project-id",
  "private_key_id": "example-private-key-id",
  "private_key": "-----BEGIN PRIVATE KEY-----\n<private key contents>\n-----END PRIVATE KEY-----\n",
  "client_email": "ss-artifacts-reader@example-project-id.iam.gserviceaccount.com",
  "client_id": "12345678901234567890",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/ss-artifacts-reader%40example-project-id.iam.gserviceaccount.com",
  "universe_domain": "googleapis.com"
}

Granting Storage Object Viewer role

To allow SambaStack to access your custom checkpoints, grant the service account the Storage Object Viewer role on your bucket. This provides read-only access to objects without allowing writes or modifications.Using the Google Cloud Console:
  1. Open the Google Cloud Console.
  2. Navigate to Storage → Buckets, and select the bucket you plan to use.
  3. Go to the Permissions tab.
  4. Click + Add principal.
  5. In the New principals field, enter your service account’s client_email.
  6. In the Role dropdown, choose: Cloud Storage → Storage Object Viewer
  7. Click Save.
Using the gcloud CLI:Before running the commands below, identify:
  • <BUCKET_NAME> — the name of your GCS bucket
  • <SERVICE_ACCOUNT_EMAIL> — the client_email value from your service account JSON
  • <PROJECT_ID> — the Google Cloud project that owns the bucket
To grant the Storage Object Viewer role at the bucket level:
gcloud storage buckets add-iam-policy-binding gs://<BUCKET_NAME> \
    --member="serviceAccount:<SERVICE_ACCOUNT_EMAIL>" \
    --role="roles/storage.objectViewer" \
    --project=<PROJECT_ID>
To verify that the role was successfully applied:
gcloud storage buckets get-iam-policy gs://<BUCKET_NAME> \
    --project=<PROJECT_ID>
You should see an entry resembling:
bindings:
- members:
  - serviceAccount:<SERVICE_ACCOUNT_EMAIL>
  role: roles/storage.objectViewer
For additional guidance, see Google’s IAM documentation:
3

Upload your converted checkpoint

After converting your checkpoint, upload the directory containing the converted checkpoint files to your GCS bucket.
This step may take a while depending on the size of your checkpoint.
Using the Google Cloud Console:
  1. Open the Google Cloud Console.
  2. Navigate to Storage → Buckets and select the bucket you’ve configured for custom checkpoints.
  3. Click Upload folder (or Upload files, depending on your structure).
  4. Select the directory containing your converted checkpoint artifacts.
  5. Wait for the upload to complete; the structure should remain intact.
Using the gcloud CLI:You can upload the entire converted checkpoint directory recursively with:
gcloud storage cp -r <LOCAL_CONVERTED_CHECKPOINT_DIR> gs://<BUCKET_NAME>/<DESTINATION_PREFIX>/

Verifying the upload

After uploading, verify that all files were transferred successfully:
gsutil ls gs://<BUCKET_NAME>/<DESTINATION_PREFIX>/
Confirm that the checkpoint directory contains:
  • All safetensors files (e.g., model-00001-of-000NN.safetensors)
  • Configuration files (config.json, tokenizer.json, etc.)
  • The DONE file indicating successful conversion
4

Register your checkpoint with a Model Manifest

Register your checkpoint by creating a Model Manifest. The Model Manifest stores relevant information about your checkpoint such as the name to use in API requests, supported languages, and a description of the checkpoint.

Model Manifest structure

Below is an example Model Manifest:
apiVersion: sambanova.ai/v1alpha1
kind: Model
metadata:
  name: My-Custom-Llama3.1-8B
spec:
  aliases:
  - my-custom-llama3.1-8b
  - My-Custom-Llama3.1-8B
  metadata:
    architecture: Llama 3.1
    category:
    - General
    - Instruct
    github_link: https://huggingface.co/path/to/My-Custom-Llama3.1-8B
    hf_link: 'https://huggingface.co/path/to/My-Custom-Llama3.1-8B'
    languages:
    - English
    - German
    - French
    - Italian
    - Portuguese
    - Hindi
    - Spanish
    - Thai
    license: llama3.1
    name: Salesforce/My-Custom-Llama3.1-8B
    overview: A description or overview of My-Custom-Llama3.1-8B. Typically this is the description found in a modelcard.
    status: active
    vocabulary_size: 128256
  name: My-Custom-Llama3.1-8B
  owner: jane@doe.ai
  price:
    input_tokens: 10
    output_tokens: 20
  public: true
  tokenizer:
    endpointUrl: ''
    path: ./Meta-Llama-3.1-8B-Instruct_tokenizer
The name field (e.g., My-Custom-Llama3.1-8B) will be the name you use in subsequent steps and in API requests.

Configuring the tokenizer field

The tokenizer field can be set to the base model used for your custom checkpoint. For instance, if your custom checkpoint is fine-tuned from Meta-Llama-3.1-70B-Instruct, set the tokenizer path as follows:
tokenizer:
  endpointUrl: ''
  path: ./Meta-Llama-3.1-70B-Instruct_tokenizer
The tokenizer field in the Model Manifest is only used for running checks on the inputs to calculate sequence length requirements prior to generation time.

Applying the Model Manifest

After creating your Model Manifest, apply it using:
kubectl apply -f your_model_manifest_filename.yaml
5

Reference the checkpoint in your deployment Bundle

Deploying a custom checkpoint follows the same overall workflow as deploying any standard model in SambaStack. The most straightforward approach is to start from an existing Bundle Template that uses the same model architecture as your custom checkpoint and then modify the relevant fields to point to your custom artifacts.
If you are unfamiliar with Bundle Templates, model deployment, or how to choose a template for your use-case, see the Model deployment page.

Understanding Bundle structure

The objects in the Bundle definition to modify are:
  1. checkpoints — defines checkpoint aliases and their GCS source paths
  2. models — maps model names to checkpoints and templates
Below is an example Bundle for the base Meta-Llama-3.1-8B-Instruct model:
apiVersion: sambanova.ai/v1alpha1
kind: Bundle
metadata:
  name: 8b-3dot1-full
spec:
  checkpoints:
    LLAMA3_8B_3_1_CKPT:      
      source: gs://service-account-gcs-bucket/.../ckpts/meta-llama3-8b-instruct
      toolSupport: true
  models:
    Meta-Llama-3.1-8B-Instruct:
      checkpoint: LLAMA3_8B_3_1_CKPT
      template: Meta-Llama-3.1-8B-Instruct
  secretNames:
  - sambanova-artifact-reader
  template: 8b-3dot1-full

Updating the checkpoints section

In the spec → checkpoints section:
  • Change the checkpoint key to a name (alias) for your custom checkpoint
  • Update the source field to the GCS path of your converted checkpoint directory
  • Set toolSupport to true or false depending on whether your fine-tuned checkpoint is configured for tool-use
For example:
checkpoints:
  CUSTOM_CHECKPOINT_ALIAS:      
    source: gs://your-gcs-bucket/path/to/custom/ckpt/dir
    toolSupport: false

Updating the models section

In the spec → models section:
  • Change the model key to the name you defined in the Model Manifest (this is the name you will use in inference API calls)
  • Set the checkpoint field to the alias you defined above
  • Leave template as the original Model Template
For example:
models:
  My-Custom-Llama3.1-8B:
    checkpoint: CUSTOM_CHECKPOINT_ALIAS
    template: Meta-Llama-3.1-8B-Instruct
The template field should reflect the base model. For instance, if your custom checkpoint is fine-tuned from Meta-Llama-3.3-70B-Instruct, then the template field should be Meta-Llama-3.3-70B-Instruct.

Applying the Bundle

Once you’ve updated the Bundle configuration, apply it:
kubectl apply -f your_bundle_filename.yaml
After your deployment is running, use the model name you defined (e.g., My-Custom-Llama3.1-8B) in your inference API requests.

Verifying your deployment

After applying the Bundle, verify that your custom checkpoint deployment is successful:
  1. Check deployment status:
    kubectl get bundles
    kubectl describe bundle <your-bundle-name>
    
  2. Verify the model is available:
    kubectl get models
    
  3. Test with a sample inference request using your custom model name.

Troubleshooting

Common issues

IssuePossible CauseSolution
Deployment fails with permission errorService account lacks read access to GCS bucketVerify Storage Object Viewer role is granted (see Step 2)
Model not found in API requestsModel name mismatch between Manifest and BundleEnsure the model name in the models section matches the Model Manifest name field
Checkpoint files not foundIncorrect GCS path in BundleVerify the source path matches your uploaded checkpoint location
Inference errorsCheckpoint incompatible with templateEnsure your custom checkpoint uses the same architecture as the specified template

Verifying GCS access

If you suspect permission issues, verify that your service account can access the checkpoint:
gcloud auth activate-service-account --key-file=<path-to-service-account-json>
gsutil ls gs://<BUCKET_NAME>/<CHECKPOINT_PATH>/

Next steps