Overview
Deploying a custom checkpoint involves four high-level actions:- Convert your checkpoint into a SambaNova-compatible format using the Checkpoint Conversion Tool.
- Upload your converted checkpoint to your private Google Cloud Storage bucket configured with read permissions granted to your SambaNova-provided service account.
- Register your checkpoint by creating a Model Manifest.
- Reference the checkpoint in your deployment Bundle by providing its path and specifying a name for it.
Before starting this workflow, ensure you have completed the checkpoint conversion process. See the Checkpoint Conversion Tool page for instructions.
Prerequisites
Before deploying a custom checkpoint, ensure you have:- A converted checkpoint in SambaNova-compatible format (see Checkpoint Conversion Tool)
- Access to a Google Cloud Storage (GCS) bucket
- Your SambaNova-provided service account JSON file
kubectlconfigured with access to your SambaStack cluster- Familiarity with Model deployment concepts including Bundles and Bundle Templates
Supported models for custom checkpoints
Custom checkpoint deployment is supported for a growing set of base models in SambaStack. On the SambaStack models page, theFeatures and optimizations section includes a field called Import checkpoint. Models marked Import checkpoint: Yes allow you to deploy your own custom or fine-tuned checkpoint for that model family.
Steps to deploy a custom checkpoint
1
Convert your checkpoint
Custom or fine-tuned checkpoints must be converted into a format optimized for SambaNova’s SN40L hardware before they can be deployed. SambaNova provides a Checkpoint Conversion Tool, delivered as a Docker container that you can run locally. The tool generates converted checkpoint artifacts that can then be uploaded and deployed for inference on SambaStack.To begin, follow the instructions in the Download and set up section of the Checkpoint Conversion Tool documentation. Setup is complete once you have downloaded the conversion tool container and synced the model metadata with your specific SambaStack instance.After setup, use the steps described in the Convert and validate checkpoint section of the Checkpoint Conversion Tool documentation to convert your custom checkpoint into the SambaNova-compatible format.
2
Configure GCS bucket permissions
SambaStack uses Google Cloud Storage (GCS) to store checkpoints and other SambaStack artifacts. For custom checkpoints, you’ll store the converted checkpoint artifacts in your own GCS bucket. To make these artifacts available to SambaStack during deployment, your SambaNova-provided service account needs read access to your bucket.To verify that the role was successfully applied:You should see an entry resembling:For additional guidance, see Google’s IAM documentation:
This is a one-time setup step. After permissions are in place, you can upload any number of custom checkpoints to your bucket and use them directly in your deployments.
Identifying your service account
Your service account information is provided as a JSON file. Locate theclient_email field - this is the identity that needs read access to your bucket. For example:Granting Storage Object Viewer role
To allow SambaStack to access your custom checkpoints, grant the service account the Storage Object Viewer role on your bucket. This provides read-only access to objects without allowing writes or modifications.Using the Google Cloud Console:- Open the Google Cloud Console.
- Navigate to Storage → Buckets, and select the bucket you plan to use.
- Go to the Permissions tab.
- Click + Add principal.
- In the New principals field, enter your service account’s
client_email. - In the Role dropdown, choose: Cloud Storage → Storage Object Viewer
- Click Save.
<BUCKET_NAME>— the name of your GCS bucket<SERVICE_ACCOUNT_EMAIL>— theclient_emailvalue from your service account JSON<PROJECT_ID>— the Google Cloud project that owns the bucket
3
Upload your converted checkpoint
After converting your checkpoint, upload the directory containing the converted checkpoint files to your GCS bucket.Using the Google Cloud Console:Confirm that the checkpoint directory contains:
This step may take a while depending on the size of your checkpoint.
- Open the Google Cloud Console.
- Navigate to Storage → Buckets and select the bucket you’ve configured for custom checkpoints.
- Click Upload folder (or Upload files, depending on your structure).
- Select the directory containing your converted checkpoint artifacts.
- Wait for the upload to complete; the structure should remain intact.
Verifying the upload
After uploading, verify that all files were transferred successfully:- All
safetensorsfiles (e.g.,model-00001-of-000NN.safetensors) - Configuration files (
config.json,tokenizer.json, etc.) - The
DONEfile indicating successful conversion
4
Register your checkpoint with a Model Manifest
Register your checkpoint by creating a Model Manifest. The Model Manifest stores relevant information about your checkpoint such as the name to use in API requests, supported languages, and a description of the checkpoint.The
Model Manifest structure
Below is an example Model Manifest:name field (e.g., My-Custom-Llama3.1-8B) will be the name you use in subsequent steps and in API requests.Configuring the tokenizer field
Thetokenizer field can be set to the base model used for your custom checkpoint. For instance, if your custom checkpoint is fine-tuned from Meta-Llama-3.1-70B-Instruct, set the tokenizer path as follows:The tokenizer field in the Model Manifest is only used for running checks on the inputs to calculate sequence length requirements prior to generation time.
Applying the Model Manifest
After creating your Model Manifest, apply it using:5
Reference the checkpoint in your deployment Bundle
Deploying a custom checkpoint follows the same overall workflow as deploying any standard model in SambaStack. The most straightforward approach is to start from an existing Bundle Template that uses the same model architecture as your custom checkpoint and then modify the relevant fields to point to your custom artifacts.
If you are unfamiliar with Bundle Templates, model deployment, or how to choose a template for your use-case, see the Model deployment page.
Understanding Bundle structure
The objects in the Bundle definition to modify are:checkpoints— defines checkpoint aliases and their GCS source pathsmodels— maps model names to checkpoints and templates
Meta-Llama-3.1-8B-Instruct model:Updating the checkpoints section
In thespec → checkpoints section:- Change the checkpoint key to a name (alias) for your custom checkpoint
- Update the
sourcefield to the GCS path of your converted checkpoint directory - Set
toolSupporttotrueorfalsedepending on whether your fine-tuned checkpoint is configured for tool-use
Updating the models section
In thespec → models section:- Change the model key to the name you defined in the Model Manifest (this is the name you will use in inference API calls)
- Set the
checkpointfield to the alias you defined above - Leave
templateas the original Model Template
The
template field should reflect the base model. For instance, if your custom checkpoint is fine-tuned from Meta-Llama-3.3-70B-Instruct, then the template field should be Meta-Llama-3.3-70B-Instruct.Applying the Bundle
Once you’ve updated the Bundle configuration, apply it:Verifying your deployment
After applying the Bundle, verify that your custom checkpoint deployment is successful:-
Check deployment status:
-
Verify the model is available:
- Test with a sample inference request using your custom model name.
Troubleshooting
Common issues
| Issue | Possible Cause | Solution |
|---|---|---|
| Deployment fails with permission error | Service account lacks read access to GCS bucket | Verify Storage Object Viewer role is granted (see Step 2) |
| Model not found in API requests | Model name mismatch between Manifest and Bundle | Ensure the model name in the models section matches the Model Manifest name field |
| Checkpoint files not found | Incorrect GCS path in Bundle | Verify the source path matches your uploaded checkpoint location |
| Inference errors | Checkpoint incompatible with template | Ensure your custom checkpoint uses the same architecture as the specified template |
Verifying GCS access
If you suspect permission issues, verify that your service account can access the checkpoint:Next steps
- To deploy custom checkpoints with speculative decoding, see Deploying with speculative decoding
- For monitoring and observability, see SambaStack Monitoring
