October 2025
Release Date: October 7, 2025

New features and enhancements
API Reference Documentation Upgrade

Upgraded the API Reference documentation with a fully automated, Swagger-style interface that dynamically syncs with the OpenAPI specification.

- Interactive, auto-generated API documentation with full model definitions.
- Real-time updates to stay aligned with the latest OpenAPI spec.
- Up-to-date usage examples in Python and TypeScript, powered by the official SambaNova SDKs.
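Since the documented API follows an OpenAI-compatible OpenAPI specification, a chat-completions request body can be assembled as below. This is a minimal sketch: the endpoint URL and model name are illustrative assumptions, not quoted from these release notes.

```python
import json

# Assumed OpenAI-compatible chat-completions path (illustrative, not
# confirmed by the release notes).
API_URL = "https://api.sambanova.ai/v1/chat/completions"

def build_chat_request(model: str, prompt: str, stream: bool = False) -> str:
    """Return the JSON body for an OpenAI-style chat completion call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return json.dumps(payload)

# Model name below is a placeholder; pick one from the model list.
body = build_chat_request("Meta-Llama-3.1-8B-Instruct", "Hello!")
print(body)
```

The body would then be POSTed to the endpoint with an `Authorization: Bearer <key>` header, matching the Python and TypeScript examples the upgraded reference generates.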
June 2025
Release Date: June 17, 2025

New features and enhancements
Japanese Language Support

Added Japanese language support for SambaNova documentation.

- Go to the SambaNova documentation homepage.
- Open the language dropdown menu and select Japanese.
May 2025
Release Date: May 28, 2025

New features and enhancements
AWS Marketplace Integration

SambaCloud is now available as a SaaS offering on AWS Marketplace.

- Subscribe using your AWS account and streamline billing through AWS.
- PrivateLink support for secure, low-latency connections between your AWS VPC and SambaCloud—no public internet exposure.
- Fast onboarding with models like Llama 4, DeepSeek-R1 671B, and Whisper.
- Up to 10x faster inference vs. GPUs with SambaNova’s RDU architecture.
- Privacy-first architecture: SambaNova never stores your data.
- Support for fine-tuned model deployment without code changes.
April 2025
New features and enhancements
New Models

- Qwen3-32B (April 29, 2025)
  - Added as a Preview model. High-capacity, multilingual LLM with strong performance across question answering, summarization, reasoning, and coding.
  - Available via Playground and API.
- Whisper-Large-V3 (April 18, 2025)
  - OpenAI’s latest large-scale automatic speech recognition (ASR) model with enhanced transcription accuracy, improved multilingual support, and better robustness against noisy audio.
  - Available via API.
- Llama-4-Maverick-17B-128E-Instruct (April 9, 2025)
  - 400B parameter mixture-of-experts model with 17B active parameters and 128 experts.
  - Added as a Preview model. Competitive with Gemma 3, Gemini 2.0 Flash, and Mistral 3.1.
  - Available via Playground and API.
- Llama-4-Scout-17B-16E-Instruct (April 7, 2025)
  - 109B parameter mixture-of-experts model with 17B active parameters and 16 experts.
  - Added as a Preview model. Competitive with Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1.
  - Available via Playground and API.
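Whisper-Large-V3 is exposed via API; a transcription call of the OpenAI audio-API shape would upload an audio file plus form fields like those built below. The endpoint path and field names follow the OpenAI convention and are assumptions here, not details taken from these notes.

```python
from typing import Optional

# Assumed OpenAI-style transcription path (illustrative only).
TRANSCRIPTION_URL = "https://api.sambanova.ai/v1/audio/transcriptions"

def build_transcription_fields(model: str, language: Optional[str] = None) -> dict:
    """Form fields that accompany the uploaded audio file in a
    multipart/form-data transcription request."""
    fields = {"model": model, "response_format": "json"}
    if language:
        # ISO-639-1 hint, e.g. "ja"; Whisper can also auto-detect.
        fields["language"] = language
    return fields

print(build_transcription_fields("Whisper-Large-V3", language="en"))
```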
Deprecations
Release Date: April 14, 2025

The following models have been deprecated and are scheduled for removal from active endpoints:

- Llama-3.1-Swallow-70B-Instruct-v0.3
- Llama-3.1-Tulu-3-405B
- Llama-3.2-11B-Vision-Instruct
- Llama-3.2-90B-Vision-Instruct
- Meta-Llama-3.1-70B-Instruct
- Qwen2.5-72B-Instruct
- Qwen2.5-Coder-32B-Instruct
March 2025
New features and enhancements
New Models

- DeepSeek-V3-0324 (March 27, 2025)
  - First open-source non-reasoning model to outperform proprietary non-reasoning models.
  - Major boost in reasoning performance, stronger front-end development skills, and smarter tool-use capabilities.
  - Added as a Preview model. Available via Playground and API.
- E5-Mistral-7B-Instruct (March 20, 2025)
  - Embedding model with a Mistral architecture backbone.
  - Recommended for English use. Available via API.
  - See the Embeddings capabilities and endpoint documentation.
- QwQ-32B (March 6, 2025)
  - State-of-the-art reasoning model from the Alibaba Qwen team.
  - Delivers performance comparable to the 671B-parameter DeepSeek-R1 with far fewer parameters.
  - Integrates advanced agent-related features for critical thinking and external tool usage.
- Llama-3.1-Swallow-8B-Instruct-v0.3 and Llama-3.1-Swallow-70B-Instruct-v0.3 (March 5, 2025)
  - Japanese-optimized language models developed through continual pre-training of Meta’s Llama 3.1.
  - Trained on 200B tokens from diverse sources including web corpora, technical content, and multilingual Wikipedia.
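For the E5-Mistral-7B-Instruct embeddings endpoint mentioned above, a request body can be sketched as follows, assuming an OpenAI-compatible `/v1/embeddings` path (an assumption, not quoted from the notes). Note that E5-family models conventionally expect `query:` / `passage:` prefixes on input text.

```python
import json

def build_embeddings_request(texts: list, model: str = "E5-Mistral-7B-Instruct") -> str:
    """JSON body for an OpenAI-style embeddings call.

    The model identifier and endpoint shape are illustrative assumptions.
    """
    payload = {"model": model, "input": texts}
    return json.dumps(payload)

# E5 models are typically prompted with "query: " / "passage: " prefixes.
print(build_embeddings_request(["query: how fast is RDU inference?"]))
```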
API improvements and fixes
Model List Endpoint

Release Date: March 18, 2025

Added the model list endpoint that provides information about currently available models in SambaCloud.

Context Length Increases

- DeepSeek-R1-Distill-Llama-70B (March 13, 2025): Increased to 128k tokens.
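A response from the model list endpoint can be parsed as below. Per the OpenAI list convention, responses carry a `data` array of model entries; the sample payload and its fields are illustrative, only the endpoint's existence comes from these notes.

```python
import json

# Illustrative sample of an OpenAI-style model-list response; field
# names beyond "data"/"id" are assumptions for the sketch.
sample_response = json.dumps({
    "object": "list",
    "data": [
        {"id": "DeepSeek-R1-Distill-Llama-70B", "object": "model"},
        {"id": "QwQ-32B", "object": "model"},
    ],
})

def list_model_ids(raw: str) -> list:
    """Extract model identifiers from a model-list response body."""
    return [entry["id"] for entry in json.loads(raw)["data"]]

print(list_model_ids(sample_response))
```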
February 2025
New features and enhancements
New Models

- DeepSeek-R1 (February 13, 2025)
  - 671B parameter MoE model with performance comparable to OpenAI’s o1 across mathematics, coding, and reasoning.
  - Added as a Preview model. SambaCloud delivers the fastest DeepSeek-R1 deployment in the world.
- Llama-3.1-Tulu-3-405B (February 4, 2025)
  - Open-source model from the Allen Institute for AI (AI2) that outperforms DeepSeek-V3.
  - Trained using Reinforcement Learning with Verifiable Rewards (RLVR).
  - Competitive with or superior to GPT-4o and DeepSeek-V3, with a notable advantage in safety benchmarks.
- DeepSeek-R1-Distill-Llama-70B (January 30, 2025)
  - Fine-tuned on Llama 3.3 70B using samples generated by DeepSeek-R1.
  - Outperforms GPT-4o, o1-mini, and Claude-3.5-Sonnet across AIME, MATH-500, GPQA, and LiveCodeBench.

Context Length Increases

- Llama-3.3-70B (February 25, 2025): Increased to 128k tokens. Available as a production model.
- DeepSeek-R1 (February 21, 2025): Increased to 8k tokens. Available as a preview model.

For API access and higher rate limits for DeepSeek-R1, please complete this form to join the waitlist.
December 2024
New features and enhancements
New Models

- Llama 3.3 70B (December 11, 2024)
  - Delivers performance comparable to Llama 3.1 405B.
  - Competes closely with OpenAI’s GPT-4o and Google’s Gemini Pro 1.5.
- QwQ 32B Preview (December 11, 2024)
  - Experimental reasoning model from Alibaba’s Qwen team with 32.5B parameters.
  - Excels in mathematics and programming: 65.2% on GPQA, 50.0% on AIME, 90.6% on MATH-500, 50.0% on LiveCodeBench.
- Qwen2.5 72B (December 5, 2024)
  - 72B-parameter model trained on 18T tokens. Supports context lengths up to 128k tokens.
  - Supports 29+ languages including English, Chinese, French, and Spanish.
- Qwen2.5 Coder 32B (December 5, 2024)
  - 32B-parameter model tailored for code-related tasks across 92 programming languages.
  - HumanEval score of 92.7%, matching GPT-4o coding capability.
- Llama Guard 3 8B (December 5, 2024)
  - Fine-tuned for content safety classification, aligned with the 14-hazard MLCommons standardized taxonomy.

Context Length Increases

- Llama 3.2 1B: Increased from 4k to 16k.
- Llama 3.1 70B: Increased from 64k to 128k.
- Llama 3.1 405B: Increased from 8k to 16k.
October 2024
New features and enhancements
New Models

- Llama 3.2 11B and 90B (October 29, 2024)
  - Multimodality support for text and image inputs.
- Llama 3.2 1B and 3B (October 1, 2024)
  - Available to all tiers at the fastest inference speed.
API improvements and fixes
Automatic Sequence Length Routing

Release Date: October 10, 2024

Automatic routing based on sequence length; there is no need to change model names to specify different sequence lengths.

Context Length Increases

- Llama 3.1 8B (October 10, 2024): Increased from 8k to 16k.
- Llama 3.1 70B (October 10, 2024): Increased from 8k to 64k.
User experience improvements
- How to Use API guide with example curl code for text and image inputs.
- Streamlined access to updated code snippets.
- New Clear Chat option in Playground.
- New UI components with tooltips.
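The How to Use API guide covers curl examples for text and image inputs; the message structure behind such a multimodal call can be sketched as below, assuming the OpenAI vision-style content schema (the schema and model choice are assumptions, not quoted from the guide).

```python
import base64
import json

def build_multimodal_message(prompt: str, image_bytes: bytes) -> dict:
    """One user message combining text and an inline base64 image,
    in the OpenAI vision-style content-parts format (assumed)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }

# Placeholder bytes stand in for a real PNG file read from disk.
msg = build_multimodal_message("Describe this chart.", b"\x89PNG...")
print(json.dumps(msg)[:80])
```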
Updated AI starter kits
- Multimodal Retriever: Chart, image, and figure understanding with advanced retrieval combining visual and textual data.
- Llama-3.1-Instruct-o1: Enhanced reasoning with Llama-3.1-405B, hosted on Hugging Face Spaces.
September 2024
Release Date: September 10, 2024

New features and enhancements
SambaCloud Public Launch

- Public launch of the SambaCloud portal, API, and community.
- Access to Llama 3.1 8B, 70B, and 405B at full precision and 10x faster inference compared to GPUs.
- Launched with two tiers: free and enterprise (paid).
