If you are using the new `ollama launch codex` flow to run remote inference from your local terminal, you might have run into a familiar friction point: **The Metadata Warning.** When you run:

```bash
ollama launch codex --model nemotron-3-super:cloud
```

Codex often greets you with:

```text
Model metadata for `nemotron-3-super:cloud` not found. Defaulting to fallback metadata; this can degrade performance and cause issues.
```

This happens because Codex expects a static profile in your `~/.codex/config.toml` for every model it interacts with. But in a world where cloud model slugs like `nemotron-3`, `qwen-3.5`, and `glm-5` are moving at lightspeed, maintaining static profiles is a losing game. I wanted a "Clean Path": a way to treat the launcher input as dynamic metadata so I never have to touch a config file again.

## The Problem: Static vs. Ephemeral

The traditional way to fix this is to hand-craft a profile:

```toml
# ~/.codex/config.toml
[[profiles]]
name = "nemotron-3-super:cloud"
model_context_window = 131072
auto_compact_token_limit = 80000
```

But as soon as you switch to `qwen-3.5:397b-cloud`, you're back at the drawing board. The ownership boundary is the **launcher handoff**, not the static config.

## The Solution: The "Launcher-as-Profile" Pattern

Instead of fighting the configuration, I built a lightweight global wrapper for `codex`. When Ollama launches a session, the wrapper intercepts the call, fetches the metadata from the local Ollama API, and synthesizes an ephemeral profile on the fly.

### How it works under the hood

The wrapper looks for signals that it's being launched in an Ollama-backed session (like `OPENAI_API_KEY=ollama`). It then:

1. **Extracts the Model Slug:** It pulls the exact `--model` string passed by Ollama.
2. **Queries the Local Gateway:** It hits `http://127.0.0.1:11434/api/show` to see what Ollama knows about that model.
3. **Applies Intelligent Heuristics:** If cloud metadata is sparse, it uses family-aware logic:
   * `nemotron*` models get a **128k** context window.
   * `qwen3*` models get a **256k** context window.
   * Generic cloud models default to a safe **64k** or **128k**.
4. **Injects Runtime Config:** It uses Codex's `-c` flag to pass these values directly to the binary, bypassing the need for a persistent file.

```bash
# What the wrapper actually executes:
codex-openai --oss -m nemotron-3-super:cloud \
  -c model_context_window=131072 \
  -c auto_compact_token_limit=78643 \
  -c tool_output_token_limit=12000
```

## Why this is the "Senior" Move

As a developer, your terminal should be a **Tactical Command Deck**, not a maintenance chore. By moving the logic into a wrapper, we achieve:

* **Zero Config Churn:** No more `config.toml` sprawl.
* **Infrastructure Sovereignty:** You control how context windows are calculated per family.
* **Future-Proofing:** If Ollama starts exposing richer metadata, we update the wrapper in one place, and every model benefits immediately.

## Setting it up

If you're already running my `ollama-codex-wrapper.mjs`, you've probably noticed your logs look much cleaner. For those who want to implement this:

1. Move your original `codex` binary to `codex-openai`.
2. Place the wrapper script at `/usr/local/bin/codex`.
3. Ensure your wrapper caches generated catalogs in `~/.cache/ollama-launch/` for performance.

## Testing with Nemotron 3

Once installed, the launch is seamless.
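Before testing, it helps to know what "correct" looks like. The family-aware fallback from step 3 boils down to a small lookup; here's a minimal sketch of that shape (illustrative names, not the wrapper's actual source, and the 60% compaction ratio is inferred from the command shown earlier):

```js
// A sketch of the family-aware fallback (step 3 above). Function names
// and the compaction ratio are assumptions, not copied from the wrapper.
function guessContextWindow(slug) {
  const s = slug.toLowerCase();
  if (s.startsWith("nemotron")) return 131072; // 128k for the Nemotron family
  if (s.startsWith("qwen3")) return 262144;    // 256k for Qwen3
  if (s.includes("cloud")) return 131072;      // generic cloud default
  return 65536;                                // conservative 64k floor
}

function autoCompactLimit(contextWindow) {
  // Reserve ~40% headroom: floor(131072 * 0.6) = 78643, which matches
  // the -c auto_compact_token_limit value in the example above.
  return Math.floor(contextWindow * 0.6);
}

console.log(guessContextWindow("nemotron-3-super:cloud")); // 131072
console.log(autoCompactLimit(131072));                     // 78643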
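You can also inspect the raw metadata the wrapper starts from by hitting the same endpoint it queries. A quick Node one-off (the keys under `model_info` vary by architecture, and cloud models may omit the context length entirely; that's exactly the gap the heuristics fill):

```js
// show-metadata.mjs — preview what Ollama reports for a model.
// Usage: node show-metadata.mjs nemotron-3-super:cloud
const model = process.argv[2] ?? "nemotron-3-super:cloud";

const res = await fetch("http://127.0.0.1:11434/api/show", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model }),
});
if (!res.ok) throw new Error(`Ollama returned ${res.status}`);

const info = await res.json();

// `details.family` drives the family-aware heuristics; the context
// length key (e.g. "llama.context_length") differs per architecture.
const family = info.details?.family ?? "unknown";
const ctxKey = Object.keys(info.model_info ?? {}).find((k) =>
  k.endsWith(".context_length")
);
console.log(`family:  ${family}`);
console.log(`context: ${ctxKey ? info.model_info[ctxKey] : "not reported"}`);
```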
You can even verify the dynamic overrides with a debug flag:

```bash
CODEX_OLLAMA_WRAPPER_DEBUG=1 ollama launch codex --model nemotron-3-super:cloud
```

You'll see the wrapper instantly detect the Nemotron family and assign the correct 128k context window: no warnings, no degradation, just pure agentic speed.

***

## Frequently Asked Questions

### Does this affect my OpenAI/Anthropic profiles?

No. The wrapper only activates when it detects an `ollama` provider signal. Your standard API profiles remain untouched.

### Can I still override the context window manually?

Yes! I built in environment overrides: `CODEX_OLLAMA_CONTEXT_WINDOW=262144` will always take precedence over the heuristics.

### Why not just use `ollama launch claude`?

Claude Code is fantastic, but Codex remains a powerhouse for focused refactoring and developer-grade code generation workflows. Having both in your toolkit means you can pick the right agent for the job.

### What is the benefit of using Ollama for local models?

Using [Ollama](https://ollama.com) allows you to run massive models like [NVIDIA Nemotron](https://nvidia.com) or [Qwen3](https://huggingface.co/Qwen) without needing to manage complex configurations. It bridges the gap between local hardware and cloud intelligence.

### Is this compatible with other CLI tools?

Yes. The same launcher-as-profile pattern works for any CLI that accepts runtime configuration overrides.

***

*The goal isn't just to use AI; it's to build the infrastructure that makes using AI feel like magic. 💚*