Build sub-agents with Codex and SambaNova
This is a guide to wiring the OpenAI Codex CLI to a SambaNova-hosted model via the Responses API and trying it on three real demos, including one that uses an MCP server for live library docs.
What it does
Codex CLI is built around two ideas worth abusing: [model_providers.*] lets you point it at any OpenAI-compatible endpoint, and --profile lets you swap the active model and provider with one flag. Together they make the planner / executor pattern from the SambaNova blog feel native: one profile for the frontier planner (gpt-5, o3, etc.), another for MiniMax-M2.7 as the cheap, fast executor. SambaNova exposes a /v1/responses endpoint that matches Codex’s wire_api = "responses" exactly, so no LiteLLM proxy is required.
Prerequisites
- Node.js ≥ 18 on PATH.
- Codex CLI installed: npm i -g @openai/codex (or brew install --cask codex).
- SambaNova API key exported as SAMBANOVA_API_KEY.
- (For Demo 2 & 3) an OpenAI API key registered with Codex for the frontier planner profile; see below.
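The checklist above can be verified in one pass. The following is a minimal sketch, not part of the original guide: the function name preflight is hypothetical, and it only reports what is missing rather than installing anything.

```shell
# Preflight sketch: report every prerequisite instead of stopping at the first failure.
preflight() {
  status=0
  if command -v node >/dev/null 2>&1; then
    # Prints the Node.js major version, e.g. 20
    major="$(node -p 'process.versions.node.split(".")[0]')"
    if [ "$major" -ge 18 ]; then
      echo "ok: node $major"
    else
      echo "fail: node $major is older than 18"; status=1
    fi
  else
    echo "fail: node not found on PATH"; status=1
  fi
  if command -v codex >/dev/null 2>&1; then
    echo "ok: codex on PATH"
  else
    echo "fail: codex not found on PATH"; status=1
  fi
  if [ -n "${SAMBANOVA_API_KEY:-}" ]; then
    echo "ok: SAMBANOVA_API_KEY is set"
  else
    echo "fail: SAMBANOVA_API_KEY is not exported"; status=1
  fi
  return "$status"
}

preflight || echo "fix the items marked fail before continuing"
```

The key check only confirms the variable is non-empty in the current shell; it does not validate the key against SambaNova.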
Register your OpenAI key for the frontier planner (Demo 2 & 3)
The plan profile uses Codex’s built-in openai provider, which authenticates from ~/.codex/auth.json. Let Codex write that file for you: export your key and pipe it into codex login:
Already logged into Codex with a stale key from a past login? Run codex logout first, then re-run the codex login --with-api-key command above after setting up your OpenAI API key.
Skipping the frontier planner entirely (using plan-sn)? You don’t need an OpenAI key at all: plan-sn goes through the SambaNova provider.
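A minimal sketch of that login step, using the same printenv pipe this guide shows in its gotchas section. The wrapper function name codex_login_from_env is hypothetical; it exists only so a missing key or missing CLI is reported instead of piping an empty string.

```shell
# Sketch: register OPENAI_API_KEY with Codex by piping it on stdin,
# so the key never appears in the command line or shell history.
codex_login_from_env() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not exported in this shell" >&2
    return 1
  fi
  if ! command -v codex >/dev/null 2>&1; then
    echo "codex not found on PATH" >&2
    return 1
  fi
  printenv OPENAI_API_KEY | codex login --with-api-key
}
```

After exporting the key, run codex_login_from_env once; Codex then stores the credential in ~/.codex/auth.json.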
Wire up SambaNova
Codex reads ~/.codex/config.toml. Add a SambaNova provider and two profiles, one for planning and one for execution:
~/.codex/config.toml (append; don’t replace your existing block):
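One way to append the provider and profiles is sketched below. Treat the values as assumptions to check against your account: the base_url is assumed to be SambaNova's API root, "gpt-5" stands in for whichever frontier model you plan with, and the plan-sn profile is the optional SambaNova-only planner mentioned later in this guide. CODEX_HOME is honored if you have relocated Codex's config directory.

```shell
# Sketch: append a SambaNova provider and the profiles used in this guide.
CONFIG_DIR="${CODEX_HOME:-$HOME/.codex}"
CONFIG_FILE="$CONFIG_DIR/config.toml"
mkdir -p "$CONFIG_DIR"

# Skip if the provider already exists; duplicate TOML tables fail to parse.
if grep -q '^\[model_providers\.sambanova\]' "$CONFIG_FILE" 2>/dev/null; then
  echo "sambanova provider already present in $CONFIG_FILE"
else
  cat >> "$CONFIG_FILE" <<'EOF'

[model_providers.sambanova]
name = "SambaNova"
base_url = "https://api.sambanova.ai/v1"   # assumed endpoint root
env_key = "SAMBANOVA_API_KEY"
wire_api = "responses"

# Frontier planner: goes through Codex's built-in openai provider.
[profiles.plan]
model = "gpt-5"                            # assumed; any frontier model id
model_provider = "openai"

# SambaNova-only planner, no frontier dependency.
[profiles.plan-sn]
model = "gpt-oss-120b"
model_provider = "sambanova"

# Cheap, fast executor.
[profiles.execute-sn]
model = "MiniMax-M2.7"
model_provider = "sambanova"
approval_policy = "on-request"
sandbox_mode = "workspace-write"
EOF
fi
```

The guard only looks for the provider table, so if you already define a profile with one of these names, merge by hand instead.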
wire_api = "responses" is what makes this work directly: SambaNova’s /v1/responses endpoint matches the OpenAI Responses API shape Codex sends. If you’ve seen older guides recommend LiteLLM as a proxy, you don’t need it.
Then create one demo workspace and reuse it across all three demos:
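A minimal sketch of the shared demo workspace. The directory name is hypothetical; any empty directory works. Initializing git is optional here and is included only so the executor's edits are easy to review and revert.

```shell
# Hypothetical workspace path; pick any empty directory you like.
DEMO_DIR="$HOME/codex-sambanova-demo"
mkdir -p "$DEMO_DIR"
cd "$DEMO_DIR" || exit 1

# Start a git repo (when git is available) so each demo's changes show up in `git diff`.
if command -v git >/dev/null 2>&1 && [ ! -d .git ]; then
  git init -q
fi
```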
Demo 1 — SambaNova end-to-end
A pet-friendly “hello world” landing page, built and verified entirely by MiniMax-M2.7 via the execute-sn profile.
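A sketch of what the Demo 1 invocation might look like. The prompt wording is hypothetical; only the --profile execute-sn invocation style and the index.html file name come from this guide. The closing "confirm" instruction follows the tip about telling the executor to verify its own work.

```shell
# Hypothetical prompt; adjust the wording to taste.
DEMO1_PROMPT='Create index.html: a pet-friendly "hello world" landing page with a heading, a short welcome line, and inline CSS. Then open index.html and confirm it contains the heading.'

# Run from inside the demo workspace.
if command -v codex >/dev/null 2>&1; then
  codex --profile execute-sn "$DEMO1_PROMPT"
else
  echo "codex not found on PATH" >&2
fi
```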
Demo 2 — Frontier plans, SambaNova executes
The architect/builder split: the plan profile (frontier model) writes a precise PLAN.md; execute-sn (MiniMax-M2.7) carries it out. PLAN.md is the artifact that crosses the boundary: reproducible, swappable, reviewable.
Step 1 — plan with the frontier model:
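A sketch of a planning call. The prompt is hypothetical (the pet-card fields in particular are an assumption); the point is that the planner is told to produce PLAN.md and nothing else.

```shell
# Hypothetical planning prompt: ask for a plan file, not application code.
PLAN_PROMPT='Do not write any application code. Write PLAN.md for adding a grid of pet cards (name, species, date added) to index.html. List the files to touch, the exact steps in order, and how to verify each step.'

if command -v codex >/dev/null 2>&1; then
  codex --profile plan "$PLAN_PROMPT"
else
  echo "codex not found on PATH" >&2
fi
```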
Open PLAN.md and review it. Edit it freely; that’s the point of materializing the plan.
Step 2 — hand PLAN.md to SambaNova:
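A sketch of the handoff. The function name run_executor and the prompt wording are hypothetical; the guard simply refuses to start the executor when there is no plan to execute.

```shell
# Sketch: run the executor only when a non-empty PLAN.md exists in the current directory.
run_executor() {
  if [ ! -s PLAN.md ]; then
    echo "PLAN.md is missing or empty; run the planner step first" >&2
    return 1
  fi
  if ! command -v codex >/dev/null 2>&1; then
    echo "codex not found on PATH" >&2
    return 1
  fi
  codex --profile execute-sn "Implement PLAN.md exactly as written. When done, open index.html and confirm each verification step in the plan passes."
}
```

Call run_executor from the demo workspace after reviewing the plan.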
For the executor, PLAN.md is its entire spec. Tweak PLAN.md and rerun the same --profile execute-sn command, or swap the planner profile for plan-sn (no frontier) without rewriting the plan.
Want SambaNova on both sides? codex --profile plan-sn … for the planner uses gpt-oss-120b instead of OpenAI. Useful when you don’t want any frontier dependency.
Demo 3 — MCP-fed planning with live library docs
Demo 2, plus an MCP server. The planner uses Context7 to fetch current docs for a library, bakes them into PLAN.md, and MiniMax executes. This solves the “model trained on stale docs” problem without writing custom retrieval.
Install Context7 as an MCP server
Get a free API key from context7.com/dashboard and export it, then register the server in ~/.codex/config.toml:
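A sketch of the registration, with several assumptions to verify against the Context7 README: that the server is launched with npx from the @upstash/context7-mcp package, and that it reads its key from a CONTEXT7_API_KEY environment variable. The env line is written only when that variable is exported, which matches the keyless free-tier setup.

```shell
# Sketch: register Context7 as an MCP server in Codex's config.
# Before running: export CONTEXT7_API_KEY="..."   (optional; omit for the free tier)
CONFIG_FILE="${CODEX_HOME:-$HOME/.codex}/config.toml"
mkdir -p "$(dirname "$CONFIG_FILE")"

if grep -q '^\[mcp_servers\.context7\]' "$CONFIG_FILE" 2>/dev/null; then
  echo "context7 already registered in $CONFIG_FILE"
else
  {
    printf '\n[mcp_servers.context7]\n'
    printf 'command = "npx"\n'
    printf 'args = ["-y", "@upstash/context7-mcp"]\n'      # assumed package name
    if [ -n "${CONTEXT7_API_KEY:-}" ]; then
      # Writes the key's current value into the file; keep config.toml private.
      printf 'env = { CONTEXT7_API_KEY = "%s" }\n' "$CONTEXT7_API_KEY"
    fi
  } >> "$CONFIG_FILE"
fi
```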
Free tier works without the key — drop the env line and you’ll just hit lower rate limits.
The MCP server is now available to every profile. Confirm:
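Two quick checks, sketched under the assumption that your Codex version provides the codex mcp list subcommand: the first confirms the table exists in config.toml, the second asks Codex which servers it has registered.

```shell
CONFIG_FILE="${CODEX_HOME:-$HOME/.codex}/config.toml"

# Config-level check: is the context7 table present?
if grep -q '^\[mcp_servers\.context7\]' "$CONFIG_FILE" 2>/dev/null; then
  MCP_STATUS="configured"
else
  MCP_STATUS="not configured"
fi
echo "context7: $MCP_STATUS"

# Runtime check (assumed subcommand): list the MCP servers Codex knows about.
if command -v codex >/dev/null 2>&1; then
  codex mcp list
fi
```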
The Context7 entry should expose two tools: resolve-library-id and query-docs.
The task
Stamp each pet card from Demo 2 with a human-readable “Added X days ago” label, computed at page load with date-fns (formatDistanceToNow). date-fns is a good Context7 target: its v2→v3 rewrite changed how it’s imported (tree-shakeable named exports, a new UMD cdn.min.js build) and v4 added time-zone support — so models routinely emit stale default-import patterns that don’t run.
Step 1 — plan profile fetches current docs and writes the plan
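A sketch of the planning call for this task. The prompt wording is hypothetical; the tool names, the date-fns function, and the label text come from this guide.

```shell
# Hypothetical prompt: point the planner at the Context7 tools, then ask for PLAN.md only.
DOCS_PLAN_PROMPT='Use the Context7 tools (resolve-library-id, then query-docs) to fetch the current date-fns docs. Then write PLAN.md for stamping each pet card in index.html with an "Added X days ago" label, computed at page load with formatDistanceToNow. Quote the exact import or script tag and the call signature from the docs. Do not write application code.'

if command -v codex >/dev/null 2>&1; then
  codex --profile plan "$DOCS_PLAN_PROMPT"
else
  echo "codex not found on PATH" >&2
fi
```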
The planner calls resolve-library-id → query-docs, gets today’s API, and writes a plan grounded in current docs.
Step 2 — hand to SambaNova
Run the same --profile execute-sn command as in Demo 2. PLAN.md already contains the resolved API, so MCP access stays on the (more expensive) planner side, where it pays off.
Step 3 — verify
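A minimal static check, sketched under the assumption that the executor put both the date-fns reference and the formatDistanceToNow call in index.html itself. The function name verify_page is hypothetical. It complements, rather than replaces, opening the page in a browser and reading the labels.

```shell
# Sketch: confirm the page references date-fns and calls formatDistanceToNow.
verify_page() {
  page="${1:-index.html}"
  [ -f "$page" ] || { echo "fail: $page not found"; return 1; }
  grep -q 'formatDistanceToNow' "$page" || { echo "fail: formatDistanceToNow is not used"; return 1; }
  grep -q 'date-fns' "$page" || { echo "fail: date-fns is not referenced"; return 1; }
  echo "ok: $page references date-fns and calls formatDistanceToNow"
}
```

Run verify_page index.html from the demo workspace, then load the page and check that each pet card shows an “Added X days ago” label.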
Why this matters
This is MCP-fed planning made concrete: the frontier planner has the right context, the SambaNova executor stays cheap and tool-light, and PLAN.md is the boundary.
Tips
- --profile is the whole knob. Don’t override --model and --provider on the CLI; they bypass the profile and stop being reproducible.
- One repo, many profiles. Add a [projects."/abs/path/to/repo"] block with trust_level = "trusted" to skip the “trust this folder?” prompt for known dirs.
- approval_policy = "on-request" is the right default for the executor: the model asks before destructive shell calls. Drop to "never" only inside throwaway sandboxes.
- Tell the executor to verify (“open index.html and confirm…”) or it will edit and stop.
Common gotchas
401 Unauthorized on SambaNova. env_key = "SAMBANOVA_API_KEY" resolves at the time Codex spawns its HTTP client, so the var must be exported in the shell that launches codex — not just set in a .env. echo $SAMBANOVA_API_KEY before you run.
“Quota exceeded. Check your plan and billing details.” on the plan profile. The openai provider authenticates from ~/.codex/auth.json, not from $OPENAI_API_KEY. A curl that works with your env key proves nothing here: Codex is sending whatever key (or ChatGPT login) is stored in auth.json, which may be stale or out of credits. Fix: codex logout then printenv OPENAI_API_KEY | codex login --with-api-key, and confirm with codex doctor that the stored auth matches the key you intend to bill.
wire_api must be responses. Codex only supports wire_api = "responses"; the older wire_api = "chat" value was removed and now errors at startup. SambaNova’s /v1/responses endpoint matches what Codex sends. If a specific model returns 404 on /v1/responses, that model isn’t served over the Responses API — switch to one that is (MiniMax-M2.7, gpt-oss-120b) rather than changing wire_api.
model not found. Use the bare SambaNova model id (MiniMax-M2.7, DeepSeek-V3.1, gpt-oss-120b) — Codex prepends nothing. The sambanova/ prefix you may have seen in opencode/AI SDK configs is not used here.
Context7 tools missing. mcp_servers is loaded once at startup — if you edited config.toml mid-session, exit and rerun.
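To separate key problems from Codex configuration problems, it can help to call the Responses endpoint directly. The sketch below assumes the SambaNova API root is https://api.sambanova.ai/v1 and uses the standard Responses request shape (model plus input); the function name sambanova_status is hypothetical. It prints only the HTTP status code.

```shell
# Sketch: hit /v1/responses directly and print the HTTP status code.
sambanova_status() {
  model="${1:-MiniMax-M2.7}"
  if [ -z "${SAMBANOVA_API_KEY:-}" ]; then
    echo "SAMBANOVA_API_KEY is not exported in this shell" >&2
    return 1
  fi
  curl -sS -o /dev/null -w '%{http_code}\n' \
    https://api.sambanova.ai/v1/responses \
    -H "Authorization: Bearer $SAMBANOVA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"$model\", \"input\": \"Say hello\"}"
}
```

Run sambanova_status (or sambanova_status gpt-oss-120b). A 401 points at the key, and a 404 suggests the model is not served over the Responses API, matching the gotchas above.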
Composing with MCP servers
Codex profiles + MCP servers compose cleanly because profiles only swap the model, not the tool surface. Every profile sees every registered MCP server:
| Profile | Model | Best for | MCP access |
|---|---|---|---|
| plan | frontier (gpt-5, o3, …) | reading, reasoning, calling MCP | full |
| plan-sn | gpt-oss-120b | same, no frontier dependency | full |
| execute-sn | MiniMax-M2.7 | 50–200 turns of edits + tests | full (often unused) |
Three patterns follow:
1. MCP-fed planning. The plan profile calls MCP servers, writes PLAN.md, hands to execute-sn.
2. MCP-driven handoff. After execute-sn finishes, run a follow-up --profile plan call that uses a GitHub or Slack MCP server to open a PR / post a summary — the executor never needs those credentials.
3. Shell-CLI tools inside the executor. execute-sn has shell access under sandbox_mode = "workspace-write". Any CLI on PATH (gh, git, aws, …) is fair game — tell it in the prompt:
Example: codex --profile execute-sn "Implement PLAN.md, run npm test, then run 'gh pr create --fill' to open a draft PR."
References
- Codex CLI docs — developers.openai.com/codex
- Codex config reference — config-reference
- Codex profiles & advanced config — config-advanced
- SambaNova Cloud — cloud.sambanova.ai
- SambaNova Responses API blog — build faster coding agents
- Context7 MCP — github.com/upstash/context7
- Model Context Protocol — modelcontextprotocol.io

