Jan CLI
Available since Jan 0.7.8 (March 2026).
The jan CLI lets you serve local AI models and launch autonomous agents from your terminal: no cloud account, no usage fees, full privacy.
```
$ jan --help
Jan runs local AI models (LlamaCPP / MLX) and exposes them via an
OpenAI-compatible API, then wires AI coding agents like Claude Code
directly to your own hardware - no cloud account, no usage fees, full privacy.
Models downloaded in the Jan desktop app are automatically available here.

Usage: jan <COMMAND>

Commands:
  serve    Load a local model and expose it at localhost:6767/v1 (auto-detects LlamaCPP or MLX)
  launch   Start a local model, then launch an AI agent with it pre-wired (env vars set automatically)
  threads  List and inspect conversation threads saved by the Jan app
  models   List and load models installed in the Jan data folder
  agent    pi-mono-style minimal agent
  help     Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help (see a summary with '-h')
  -V, --version  Print version

Examples:
  jan launch claude                                   # pick a model, then run Claude Code against it
  jan launch claude --model janhq/Jan-code-4b-gguf    # use a specific model
  jan launch openclaw --model janhq/Jan-code-4b-gguf  # wire openclaw to a local model
  jan serve janhq/Jan-code-4b-gguf                    # expose a model at localhost:6767/v1
  jan serve janhq/Jan-code-4b-gguf --fit              # auto-fit context to available VRAM
  jan serve janhq/Jan-code-4b-gguf --detach           # run in the background
  jan models list                                     # show all installed models
```
Models downloaded in the Jan desktop app are automatically available to the CLI.
Installation
Jan CLI is installed automatically when you launch the Jan desktop app for the first time; no extra steps are needed. You can uninstall or reinstall it at any time from Settings > General > Jan CLI.
The CLI binary is installed at ~/.local/bin/jan on macOS/Linux. Make sure this path is in your $PATH to use the jan command from any terminal.
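If your shell can't find the jan command, add the install directory to your PATH. A minimal sketch for macOS/Linux (adjust the profile file to your shell, e.g. ~/.zshrc for zsh):

```shell
# Make ~/.local/bin visible to the current shell session
export PATH="$HOME/.local/bin:$PATH"

# Persist it for future sessions (bash shown; use ~/.zshrc for zsh):
# echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
```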
The Jan desktop app helps you manage inference backends (LlamaCPP, MLX) and models, or you can manage them from the CLI. Both share the same data folder, so models installed via either interface are available to both.
Quick Start
Getting started takes a single command:
```shell
jan launch
```
Jan will ask you to pick an agent (Claude Code, OpenClaw, opencode), then automatically download Jan's foundation model, set it up, and wire it to the agent for you. No config files, no API keys, no cloud: your agent runs entirely on your own hardware.
Commands
jan serve
Load a local model and expose it at localhost:6767/v1 as an OpenAI-compatible API. Auto-detects LlamaCPP or MLX.
```shell
jan serve [MODEL_ID] [OPTIONS]
```
| Option | Description | Default |
|---|---|---|
| MODEL_ID | Model ID to load (omit to pick interactively). Can be a local model ID or a HuggingFace repo ID (e.g., unsloth/Qwen3.5-9B-GGUF) | – |
| --port | Port to listen on (0 = random free port) | 6767 |
| --n-gpu-layers | GPU layers to offload (-1 = all, 0 = CPU only) | -1 |
| --ctx-size | Context window size in tokens | 32768 |
| --fit | Auto-fit context to available VRAM | – |
| --api-key | API key required by clients | "" |
| -d, --detach | Run in background, print PID | – |
| --threads | CPU threads for inference (0 = auto) | 0 |
| --embedding | Treat model as an embedding model | – |
| -v, --verbose | Print full server logs | – |
When no model ID is provided, an interactive selector is shown. If no models are installed yet, Jan will automatically download its default foundation model to get you started:
```
$ jan serve
─── Select Model ───
Choose a model:
> janhq/Jan-v3-4B-base-instruct-gguf  [LlamaCPP]
  sentence-transformer-mini           [LlamaCPP]
  Jan-v3-4B-base-instruct-4bit        [MLX]
```
Examples:
```shell
jan serve                              # pick a model interactively
jan serve qwen3.5-35b-a3b              # serve a specific model
jan serve qwen3.5-35b-a3b --fit        # auto-fit context to available VRAM
jan serve qwen3.5-35b-a3b --detach     # run in background
jan serve qwen3.5-35b-a3b --port 8080  # serve on a custom port
jan serve unsloth/Qwen3.5-9B-GGUF      # download and serve a HuggingFace model
```
If you pass a HuggingFace repo ID (e.g., unsloth/Qwen3.5-9B-GGUF) for a model that isn't downloaded yet, Jan downloads it from HuggingFace automatically.
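To confirm the server is reachable before pointing a client at it, you can query the model-listing endpoint. A minimal sketch, assuming the default port and the standard OpenAI-style /v1/models path:

```shell
# List the models the local server currently exposes
curl http://localhost:6767/v1/models
```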
LlamaCPP Server Arguments
You can pass additional arguments to the underlying LlamaCPP server using environment variables with the LLAMA_ARG_ prefix:
```shell
export LLAMA_ARG_HOST=0.0.0.0     # bind to specific host
export LLAMA_ARG_N_GPU_LAYERS=32  # set GPU layers
export LLAMA_ARG_CTX_SIZE=8192    # set context size
export LLAMA_ARG_THREADS=8        # set CPU threads
jan serve qwen3.5-35b-a3b
```
Any LlamaCPP server flag can be passed this way: just prefix the flag name with LLAMA_ARG_.
jan launch
Start a local model, then launch an AI agent with it pre-wired: environment variables are set automatically so the agent connects to your local model.
```shell
jan launch [PROGRAM] [OPTIONS]
```
| Option | Description | Default |
|---|---|---|
| PROGRAM | Agent to launch: claude, openclaw, opencode (omit to pick interactively) | – |
| --model | Model ID to load (omit to pick interactively) | – |
| --ctx-size | Context window size in tokens | 4096 |
| --fit | Auto-fit context to available VRAM | on for claude |
| --port | Port for the model server | 6767 |
| --api-key | API key (exported as OPENAI_API_KEY and ANTHROPIC_AUTH_TOKEN) | jan |
| -v, --verbose | Print full server logs | – |
When no agent or model is specified, interactive selectors are shown. If no models are installed, Jan will automatically download its default foundation model before launching the agent:
```
$ jan launch
─── Select Agent ───
Choose an agent to launch:
> Claude Code  - Anthropic's AI coding agent
  OpenClaw     - Open-source autonomous AI agent  [not installed]
```
Examples:
```shell
jan launch claude                            # pick a model, then run Claude Code
jan launch claude --model qwen3.5-35b-a3b    # use a specific model with Claude Code
jan launch openclaw --model qwen3.5-35b-a3b  # wire OpenClaw to a local model
jan launch opencode --model qwen3.5-35b-a3b  # wire opencode to a local model
```
jan models
List and manage models installed in the Jan data folder.
```shell
jan models list             # list all installed models
jan models load <MODEL_ID>  # serve a model (alias for jan serve)
jan models load-mlx <ID>    # load an MLX model (macOS / Apple Silicon only)
```
jan threads
List and inspect conversation threads saved by the Jan desktop app.
```shell
jan threads list                  # list all threads
jan threads get <ID>              # get a thread's metadata
jan threads messages <THREAD_ID>  # list all messages in a thread
jan threads delete <ID>           # permanently delete a thread
```
Common Workflows
Serve a model for use with any OpenAI-compatible client:
```shell
jan serve jan-code-4b --fit
```
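Once the model is being served, any OpenAI-compatible client can talk to it. A minimal curl sketch, assuming the default port; the model name and prompt are illustrative, and /v1/chat/completions is the standard OpenAI-style path:

```shell
# Send a chat completion request to the local server
curl http://localhost:6767/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jan-code-4b",
    "messages": [{"role": "user", "content": "Write a haiku about local AI."}]
  }'
```

If you started the server with --api-key, add the matching `-H "Authorization: Bearer <key>"` header.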
Launch Claude Code against a local model:
```shell
jan launch claude --model jan-code-4b
```
Run a model in the background:
```shell
jan serve jan-code-4b --detach
```
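With --detach the server keeps running after the command returns. A sketch for capturing the printed PID so you can stop the server later (this assumes the PID is all that --detach writes to stdout):

```shell
# Start the server in the background and remember its PID
PID=$(jan serve jan-code-4b --detach)

# ...use the model...

# Stop the background server when done
kill "$PID"
```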
List all installed models:
```shell
jan models list
```
Troubleshooting
For common issues, including Windows-specific problems such as jan opening the desktop app instead of the CLI, see the Troubleshooting guide.