This page describes the metrics exposed in Prometheus format by the inference router and related services. These are numeric time series intended for dashboards, SLOs, and alerts.
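Since all of these metrics are served in the Prometheus text exposition format, a consumer only needs to split each sample line into a metric name, a label set, and a value. The sketch below is a minimal, stdlib-only parser for a single sample line; it is illustrative only (it ignores comments, timestamps, and label-value escaping), and the `queue_length` sample shown is a hypothetical example of what the router might expose.

```python
import re

def parse_sample(line: str):
    """Parse one sample line of the Prometheus text exposition format.

    Returns (metric_name, labels_dict, value). Minimal sketch: it does
    not handle comment lines, timestamps, or escaped label values.
    """
    m = re.match(r'^(\w+)(?:\{(.*)\})?\s+(\S+)$', line)
    if not m:
        raise ValueError(f"unparseable sample: {line!r}")
    name, label_body, value = m.groups()
    labels = {}
    if label_body:
        # Label pairs look like key="value", separated by commas.
        for key, val in re.findall(r'(\w+)="([^"]*)"', label_body):
            labels[key] = val
    return name, labels, float(value)

# Hypothetical sample, as the router might expose it per model and QoS.
name, labels, value = parse_sample('queue_length{model="m1",qos="high"} 12')
```

In practice you would use an existing Prometheus client library rather than hand-rolling this, but the line format itself is stable and easy to inspect with `curl` against a metrics endpoint.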

Inference router metrics

Inference router metrics describe queues, scheduling, and request lifecycle in the core inference layer.

Inference router metrics table

| Metric | Category | Prometheus name | Description | Granularity |
| --- | --- | --- | --- | --- |
| Queue length | Queue | `queue_length` | Number of requests currently queued in the router. | Per model, QoS, and/or user |
| Max queue wait time | Queue | `queue_max_wait_seconds` | Maximum age (seconds) of any request currently in the queue. | Per model, QoS |
| Customer queue length | Queue | `customer_queue_length` | Queue length per customer per model. | Per user, model |
| Submitted requests | Traffic | `submitted_total` | Total number of requests submitted to the router. | Per model, QoS, user, status |
| Completed requests | Traffic | `completed_total` | Total number of completed requests, labeled with completion status (success, error, etc.). | Per model, QoS, user, status |
| Response codes | Traffic | `response_code_total` | Count of HTTP responses by status code. | Per HTTP code, route, user |
| Response latency | Latency | `response_duration_ms` | End-to-end response latency in milliseconds (often as a histogram or summary). | Per model, QoS, customer |
| Connection state | Workers | `connection_state_ratio` | Fraction of workers in each state (idle, busy, draining, unhealthy, etc.). | Per worker state, model, pool |
| Active users | Adoption | `active_users` | Number of active users observed by the router. | Global and/or per user |
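Several of these metrics are most useful in aggregated form, e.g. an error ratio derived from `completed_total` broken down by the `status` label. A dashboard would normally compute this with a PromQL expression; the stdlib-only sketch below shows the same aggregation over already-parsed samples, with hypothetical label values (`m1`, `m2`) used purely for illustration.

```python
from collections import defaultdict

# Hypothetical parsed completed_total samples: (labels, counter value).
samples = [
    ({"model": "m1", "status": "success"}, 950.0),
    ({"model": "m1", "status": "error"}, 50.0),
    ({"model": "m2", "status": "success"}, 100.0),
]

def error_ratio_by_model(samples):
    """Aggregate completed_total samples into a per-model error ratio.

    Mirrors a PromQL-style sum-by-label aggregation: total completions
    and error completions are summed per model, then divided.
    """
    totals = defaultdict(float)
    errors = defaultdict(float)
    for labels, value in samples:
        totals[labels["model"]] += value
        if labels["status"] == "error":
            errors[labels["model"]] += value
    return {model: errors[model] / totals[model] for model in totals}

ratios = error_ratio_by_model(samples)
# ratios["m1"] == 0.05 (50 errors out of 1000 completions)
```

Note that `completed_total` is a monotonically increasing counter, so real SLO queries should be computed over rate-adjusted windows (e.g. PromQL `rate(...)`) rather than raw cumulative values as in this simplified sketch.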
Metric names and label sets may evolve over time. Refer to the release notes for changes to the metric schema.