SambaStack Deploying custom checkpoints

In SambaStack, you can deploy your own custom or fine-tuned checkpoints for inference in the same manner as deploying standard model offerings, with a few additional steps to prepare your checkpoint for use in the platform. Once prepared and deployed, custom checkpoints behave just like any other checkpoint you deploy on SambaStack.

Overview

Deploying a custom checkpoint involves four high-level actions:

Convert your checkpoint into a SambaNova-compatible format using the Checkpoint Conversion Tool.
Upload your converted checkpoint to your private Google Cloud Storage bucket configured with read permissions granted to your SambaNova-provided service account.
Register your checkpoint by creating a Model Manifest.
Reference the checkpoint in your deployment Bundle by providing its path and specifying a name for it.

Before starting this workflow, ensure you have completed the checkpoint conversion process. See the Checkpoint Conversion Tool page for instructions.

Prerequisites

Before deploying a custom checkpoint, ensure you have:

A converted checkpoint in SambaNova-compatible format (see Checkpoint Conversion Tool)
Access to a Google Cloud Storage (GCS) bucket
Your SambaNova-provided service account JSON file
kubectl configured with access to your SambaStack cluster
Familiarity with Model deployment concepts including Bundles and Bundle Templates

Supported models for custom checkpoints

Custom checkpoint deployment is supported for a growing set of base models in SambaStack. On the SambaStack models page, the Features and optimizations section includes a field called Import checkpoint. Models marked Import checkpoint: Yes allow you to deploy your own custom or fine-tuned checkpoint for that model family.

Steps to deploy a custom checkpoint

Convert your checkpoint

Custom or fine-tuned checkpoints must be converted into a format optimized for SambaNova’s SN40L hardware before they can be deployed. SambaNova provides a Checkpoint Conversion Tool, delivered as a Docker container that you can run locally. The tool generates converted checkpoint artifacts that can then be uploaded and deployed for inference on SambaStack.To begin, follow the instructions in the Download and set up section of the Checkpoint Conversion Tool documentation. Setup is complete once you have downloaded the conversion tool container and synced the model metadata with your specific SambaStack instance.After setup, use the steps described in the Convert and validate checkpoint section of the Checkpoint Conversion Tool documentation to convert your custom checkpoint into the SambaNova-compatible format.

Configure GCS bucket permissions

SambaStack uses Google Cloud Storage (GCS) to store checkpoints and other SambaStack artifacts. For custom checkpoints, you’ll store the converted checkpoint artifacts in your own GCS bucket. To make these artifacts available to SambaStack during deployment, your SambaNova-provided service account needs read access to your bucket.

This is a one-time setup step. After permissions are in place, you can upload any number of custom checkpoints to your bucket and use them directly in your deployments.

Identifying your service account

Your service account information is provided as a JSON file. Locate the client_email field - this is the identity that needs read access to your bucket. For example:

{
  "type": "service_account",
  "project_id": "example-project-id",
  "private_key_id": "example-private-key-id",
  "private_key": "-----BEGIN PRIVATE KEY-----\n<private key contents>\n-----END PRIVATE KEY-----\n",
  "client_email": "ss-artifacts-reader@example-project-id.iam.gserviceaccount.com",
  "client_id": "12345678901234567890",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/ss-artifacts-reader%40example-project-id.iam.gserviceaccount.com",
  "universe_domain": "googleapis.com"
}

Granting Storage Object Viewer role

To allow SambaStack to access your custom checkpoints, grant the service account the Storage Object Viewer role on your bucket. This provides read-only access to objects without allowing writes or modifications.Using the Google Cloud Console:

Open the Google Cloud Console.
Navigate to Storage → Buckets, and select the bucket you plan to use.
Go to the Permissions tab.
Click + Add principal.
In the New principals field, enter your service account’s client_email.
In the Role dropdown, choose: Cloud Storage → Storage Object Viewer
Click Save.

Using the gcloud CLI:Before running the commands below, identify:

<BUCKET_NAME> — the name of your GCS bucket
<SERVICE_ACCOUNT_EMAIL> — the client_email value from your service account JSON
<PROJECT_ID> — the Google Cloud project that owns the bucket

To grant the Storage Object Viewer role at the bucket level:

gcloud storage buckets add-iam-policy-binding gs://<BUCKET_NAME> \
    --member="serviceAccount:<SERVICE_ACCOUNT_EMAIL>" \
    --role="roles/storage.objectViewer" \
    --project=<PROJECT_ID>

To verify that the role was successfully applied:

gcloud storage buckets get-iam-policy gs://<BUCKET_NAME> \
    --project=<PROJECT_ID>

You should see an entry resembling:

bindings:
- members:
  - serviceAccount:<SERVICE_ACCOUNT_EMAIL>
  role: roles/storage.objectViewer

For additional guidance, see Google’s IAM documentation:

Upload your converted checkpoint

After converting your checkpoint, upload the directory containing the converted checkpoint files to your GCS bucket.

This step may take a while depending on the size of your checkpoint.

Using the Google Cloud Console:

Open the Google Cloud Console.
Navigate to Storage → Buckets and select the bucket you’ve configured for custom checkpoints.
Click Upload folder (or Upload files, depending on your structure).
Select the directory containing your converted checkpoint artifacts.
Wait for the upload to complete; the structure should remain intact.

Using the gcloud CLI:You can upload the entire converted checkpoint directory recursively with:

gcloud storage cp -r <LOCAL_CONVERTED_CHECKPOINT_DIR> gs://<BUCKET_NAME>/<DESTINATION_PREFIX>/

Verifying the upload

After uploading, verify that all files were transferred successfully:

gsutil ls gs://<BUCKET_NAME>/<DESTINATION_PREFIX>/

Confirm that the checkpoint directory contains:

All safetensors files (e.g., model-00001-of-000NN.safetensors)
Configuration files (config.json, tokenizer.json, etc.)
The DONE file indicating successful conversion

Register your checkpoint with a Model Manifest

Register your checkpoint by creating a Model Manifest. The Model Manifest stores relevant information about your checkpoint such as the name to use in API requests, supported languages, and a description of the checkpoint.

Model Manifest structure

Below is an example Model Manifest:

apiVersion: sambanova.ai/v1alpha1
kind: Model
metadata:
  name: My-Custom-Llama3.1-8B
spec:
  aliases:
  - my-custom-llama3.1-8b
  - My-Custom-Llama3.1-8B
  metadata:
    architecture: Llama 3.1
    category:
    - General
    - Instruct
    github_link: https://huggingface.co/path/to/My-Custom-Llama3.1-8B
    hf_link: 'https://huggingface.co/path/to/My-Custom-Llama3.1-8B'
    languages:
    - English
    - German
    - French
    - Italian
    - Portuguese
    - Hindi
    - Spanish
    - Thai
    license: llama3.1
    name: Salesforce/My-Custom-Llama3.1-8B
    overview: A description or overview of My-Custom-Llama3.1-8B. Typically this is the description found in a modelcard.
    status: active
    vocabulary_size: 128256
  name: My-Custom-Llama3.1-8B
  owner: jane@doe.ai
  price:
    input_tokens: 10
    output_tokens: 20
  public: true
  tokenizer:
    endpointUrl: ''
    path: ./Meta-Llama-3.1-8B-Instruct_tokenizer

The name field (e.g., My-Custom-Llama3.1-8B) will be the name you use in subsequent steps and in API requests.

Configuring the tokenizer field

The tokenizer field can be set to the base model used for your custom checkpoint. For instance, if your custom checkpoint is fine-tuned from Meta-Llama-3.1-70B-Instruct, set the tokenizer path as follows:

tokenizer:
  endpointUrl: ''
  path: ./Meta-Llama-3.1-70B-Instruct_tokenizer

The tokenizer field in the Model Manifest is only used for running checks on the inputs to calculate sequence length requirements prior to generation time.

Applying the Model Manifest

After creating your Model Manifest, apply it using:

kubectl apply -f your_model_manifest_filename.yaml

Reference the checkpoint in your deployment Bundle

Deploying a custom checkpoint follows the same overall workflow as deploying any standard model in SambaStack. The most straightforward approach is to start from an existing Bundle Template that uses the same model architecture as your custom checkpoint and then modify the relevant fields to point to your custom artifacts.

If you are unfamiliar with Bundle Templates, model deployment, or how to choose a template for your use-case, see the Model deployment page.

Understanding Bundle structure

The objects in the Bundle definition to modify are:

checkpoints — defines checkpoint aliases and their GCS source paths
models — maps model names to checkpoints and templates

Below is an example Bundle for the base Meta-Llama-3.1-8B-Instruct model:

apiVersion: sambanova.ai/v1alpha1
kind: Bundle
metadata:
  name: 8b-3dot1-full
spec:
  checkpoints:
    LLAMA3_8B_3_1_CKPT:      
      source: gs://service-account-gcs-bucket/.../ckpts/meta-llama3-8b-instruct
      toolSupport: true
  models:
    Meta-Llama-3.1-8B-Instruct:
      checkpoint: LLAMA3_8B_3_1_CKPT
      template: Meta-Llama-3.1-8B-Instruct
  secretNames:
  - sambanova-artifact-reader
  template: 8b-3dot1-full

Updating the checkpoints section

In the spec → checkpoints section:

Change the checkpoint key to a name (alias) for your custom checkpoint
Update the source field to the GCS path of your converted checkpoint directory
Set toolSupport to true or false depending on whether your fine-tuned checkpoint is configured for tool-use

For example:

checkpoints:
  CUSTOM_CHECKPOINT_ALIAS:      
    source: gs://your-gcs-bucket/path/to/custom/ckpt/dir
    toolSupport: false

Updating the models section

In the spec → models section:

Change the model key to the name you defined in the Model Manifest (this is the name you will use in inference API calls)
Set the checkpoint field to the alias you defined above
Leave template as the original Model Template

For example:

models:
  My-Custom-Llama3.1-8B:
    checkpoint: CUSTOM_CHECKPOINT_ALIAS
    template: Meta-Llama-3.1-8B-Instruct

The template field should reflect the base model. For instance, if your custom checkpoint is fine-tuned from Meta-Llama-3.3-70B-Instruct, then the template field should be Meta-Llama-3.3-70B-Instruct.

Applying the Bundle

Once you’ve updated the Bundle configuration, apply it:

kubectl apply -f your_bundle_filename.yaml

After your deployment is running, use the model name you defined (e.g., My-Custom-Llama3.1-8B) in your inference API requests.

Verifying your deployment

After applying the Bundle, verify that your custom checkpoint deployment is successful:

Check deployment status:

kubectl get bundles
kubectl describe bundle <your-bundle-name>

Verify the model is available:
```
kubectl get models
```
Test with a sample inference request using your custom model name.

Troubleshooting

Common issues

Issue	Possible Cause	Solution
Deployment fails with permission error	Service account lacks read access to GCS bucket	Verify Storage Object Viewer role is granted (see Step 2)
Model not found in API requests	Model name mismatch between Manifest and Bundle	Ensure the model name in the `models` section matches the Model Manifest `name` field
Checkpoint files not found	Incorrect GCS path in Bundle	Verify the `source` path matches your uploaded checkpoint location
Inference errors	Checkpoint incompatible with template	Ensure your custom checkpoint uses the same architecture as the specified template

Verifying GCS access

If you suspect permission issues, verify that your service account can access the checkpoint:

gcloud auth activate-service-account --key-file=<path-to-service-account-json>
gsutil ls gs://<BUCKET_NAME>/<CHECKPOINT_PATH>/

Next steps

To deploy custom checkpoints with speculative decoding, see Deploying with speculative decoding
For monitoring and observability, see SambaStack Monitoring

Overview

Installation

Service Administration

Hardware Administration

Reference Architectures

Resources

Deploying custom checkpoints

Overview

Prerequisites

Supported models for custom checkpoints

Steps to deploy a custom checkpoint

Convert your checkpoint

Configure GCS bucket permissions

Identifying your service account

Granting Storage Object Viewer role

Upload your converted checkpoint

Verifying the upload

Register your checkpoint with a Model Manifest

Model Manifest structure

Configuring the tokenizer field

Applying the Model Manifest

Reference the checkpoint in your deployment Bundle

Understanding Bundle structure

Updating the checkpoints section

Updating the models section

Applying the Bundle

Verifying your deployment

Troubleshooting

Common issues

Verifying GCS access

Next steps

Overview

Installation

Service Administration

Hardware Administration

Reference Architectures

Resources

​Overview

​Prerequisites

​Supported models for custom checkpoints

​Steps to deploy a custom checkpoint

Convert your checkpoint

Configure GCS bucket permissions

​Identifying your service account

​Granting Storage Object Viewer role

Upload your converted checkpoint

​Verifying the upload

Register your checkpoint with a Model Manifest

​Model Manifest structure

​Configuring the tokenizer field

​Applying the Model Manifest

Reference the checkpoint in your deployment Bundle

​Understanding Bundle structure

​Updating the checkpoints section

​Updating the models section

​Applying the Bundle

​Verifying your deployment

​Troubleshooting

​Common issues

​Verifying GCS access

​Next steps

Overview

Prerequisites

Supported models for custom checkpoints

Steps to deploy a custom checkpoint

Identifying your service account

Granting Storage Object Viewer role

Verifying the upload

Model Manifest structure

Configuring the tokenizer field

Applying the Model Manifest

Understanding Bundle structure

Updating the checkpoints section

Updating the models section

Applying the Bundle

Verifying your deployment

Troubleshooting

Common issues

Verifying GCS access

Next steps