SambaNova vision models support multimodal inputs, allowing users to process both text and images. These models analyze images and generate context-aware text responses. Learn how to query SambaNova vision models using either the SambaNova or OpenAI Python client.
Make a query with an image
On SambaNova, vision model requests follow OpenAI's multimodal input format, which accepts both text and image inputs in a structured payload. The call is similar to Text Generation, but differs by including an encoded image file, referenced via the image_path variable. A helper function converts this image into a base64 string, allowing it to be passed alongside the text in the request.
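The helper might look like the sketch below (the name encode_image is illustrative, not a SambaNova-provided function):

```python
import base64

def encode_image(image_path: str) -> str:
    """Read the image file and return its contents as a base64 string."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")
```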
Step 1
Make a new Python file and copy the code below.
This example uses the Llama-4-Maverick-17B-128E-Instruct model.
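A minimal sketch of that file, using the OpenAI Python client; the prompt text and image path are illustrative, so replace them with your own:

```python
import base64
import openai

# Path to the image you want the model to analyze (illustrative).
image_path = "path/to/your/image.jpg"

def encode_image(path: str) -> str:
    """Read the image file and return its contents as a base64 string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

base64_image = encode_image(image_path)

# Placeholders replaced in Step 2 below.
client = openai.OpenAI(
    api_key="your-sambanova-api-key",
    base_url="your-sambanova-base-url",
)

response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                # Text and image parts travel together in one content array.
                {"type": "text", "text": "What do you see in this image?"},
                {
                    # The base64 string is passed inline as a data URI.
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```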
Step 2
Use your SambaNova API key and base URL from the API keys and URLs page to replace the string fields "your-sambanova-api-key" and "your-sambanova-base-url" in the construction of the client.
