Local models
Local is doable, but Fased works best with enough context and model quality for tool-heavy, security-sensitive conversations. Small or heavily quantized models can truncate context, miss policy cues, or produce weaker tool decisions. Use the strongest model your hardware can run reliably, keep hosted fallback models configured when you need higher reliability, and review local-model exposure with the same care as any private endpoint (see Security).
Advanced local stack: LM Studio
Load a strong model in LM Studio, enable the local server, and connect Fased through the first-class LM Studio provider in Agent > Models.
- Install LM Studio: https://lmstudio.ai
- In LM Studio, download the largest model you can run reliably, start the server, and confirm http://127.0.0.1:1234/api/v1/models lists it.
- Keep the model loaded; cold-load adds startup latency.
- Adjust contextWindow/maxTokens if your LM Studio build differs.
- If LM Studio auth is disabled, leave the token blank in Agent > Models.
- Keep models.mode: "merge" so fallbacks stay available.
Hybrid config: hosted primary, local fallback
Local-first with hosted safety net
Swap the primary and fallback order; keep the same providers block and models.mode: "merge" so you can fall back to Sonnet or Opus when the local box is down.
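To make the ordering concrete, here is a minimal Python sketch of what "primary first, then fallbacks" means. It only illustrates the idea and is not Fased's routing code; the model IDs and the availability probe are hypothetical placeholders.

```python
from typing import Callable, Optional, Sequence


def pick_model(order: Sequence[str], is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first model in `order` that the probe reports as available."""
    for model_id in order:
        if is_available(model_id):
            return model_id
    return None  # nothing reachable; the caller decides how to fail


# Hypothetical IDs for illustration only; use the IDs shown in Agent > Models.
LOCAL_FIRST = ["lmstudio/local-model", "anthropic/sonnet", "anthropic/opus"]
HOSTED_FIRST = ["anthropic/sonnet", "anthropic/opus", "lmstudio/local-model"]


def available_when_local_down(model_id: str) -> bool:
    # Simulated probe: anything served by the local box is unreachable.
    return not model_id.startswith("lmstudio/")
```

With the local box simulated as down, `pick_model(LOCAL_FIRST, available_when_local_down)` returns `"anthropic/sonnet"`. Swapping to `HOSTED_FIRST` changes only which entry is tried first, not which providers are configured.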
Regional hosting / data routing
- Hosted MiniMax/Kimi/GLM variants also exist on OpenRouter with region-pinned endpoints (e.g., US-hosted). Pick the regional variant there to keep traffic in your chosen jurisdiction while still using models.mode: "merge" for Anthropic/OpenAI fallbacks.
- Local-only remains the strongest privacy path; hosted regional routing is the middle ground when you need provider features but want control over data flow.
Registry-Supported Local Routes
Normal setup should use Agent > Models in the Control UI. Add or select one of these provider surfaces there:
- Ollama for a native local/cloud/hybrid Ollama server.
- LM Studio for the local server on localhost:1234.
- vLLM for a vLLM server.
- LiteLLM for a LiteLLM gateway.
- Custom Provider for SGLang or any other private OpenAI-compatible endpoint.
Ollama
For the normal UI path: … /v1 unless you deliberately choose Custom Provider for compatibility testing.
Other OpenAI-compatible Local Proxies
vLLM, LiteLLM, SGLang, OAI-proxy, or custom gateways work if they expose an OpenAI-style /v1 endpoint. Replace the provider block above with your endpoint and model ID.
Keep models.mode: "merge" so hosted models stay available as fallbacks.
Private, loopback, and LAN model provider URLs require request.allowPrivateNetwork: true.
Public hosted providers do not need this flag.
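As a rough way to tell which base URLs fall under that rule, the Python sketch below classifies a URL's host as loopback, private, or link-local using only the standard library. It is an assumption about how such a check could look, not Fased's actual logic; `looks_private` is a hypothetical helper, and DNS names other than localhost are treated as public because they are not resolved here.

```python
import ipaddress
from urllib.parse import urlparse


def looks_private(base_url: str) -> bool:
    """Best-effort guess: does the URL point at a loopback, private, or link-local host?"""
    host = urlparse(base_url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {base_url!r}")
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: it cannot be classified without resolving it,
        # so this sketch assumes it is public.
        return False
    return addr.is_loopback or addr.is_private or addr.is_link_local
```

For example, `looks_private("http://127.0.0.1:1234/v1")` and `looks_private("http://192.168.1.20:8000/v1")` are both true, so those providers would need request.allowPrivateNetwork: true, while a public hostname such as `https://api.example.com/v1` is not flagged.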
Troubleshooting
- Gateway can reach the proxy? curl http://127.0.0.1:1234/v1/models
- LM Studio model unloaded? Reload; cold start is a common “hanging” cause.
- Context errors? Lower contextWindow or raise your server limit.
- Safety: local models skip provider-side filters; keep agents narrow and compaction on to limit prompt injection blast radius.
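The reachability check in the first item can also be scripted. The following Python sketch performs the same request as the curl command and lists the model IDs, assuming the server returns the usual OpenAI-style body (`{"data": [{"id": ...}]}`); the function names are illustrative, and the default base URL is the LM Studio address used above.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(base_url: str = "http://127.0.0.1:1234/v1", timeout: float = 5.0) -> list[str]:
    """GET <base_url>/models and return the IDs.

    Raises urllib.error.URLError (a subclass of OSError) if the server is unreachable.
    """
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return model_ids(json.load(resp))
```

Calling `list_models()` from the gateway host either returns the IDs the server currently exposes or raises an `OSError`, which points to a network or server problem rather than a model problem. An empty list suggests the server is up but no model is loaded.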