Installation

Prerequisites

Before setting up Jan Server, ensure you have the following components installed:

Required Components

Important: Windows and macOS users can only run mock servers for development. Real LLM model inference with vLLM is only supported on Linux systems with NVIDIA GPUs.

  1. Docker Desktop

  2. Minikube

  3. Helm

  4. kubectl
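
After installing these tools, confirm each one is available on your PATH:


docker --version
minikube version
helm version
kubectl version --client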

Optional: NVIDIA GPU Support (for Real LLM Models)

If you plan to run real LLM models (not mock servers) and have an NVIDIA GPU:

  1. Install NVIDIA Container Toolkit: Follow the official NVIDIA Container Toolkit installation guide

  2. Configure Minikube for GPU support: Follow the official minikube GPU tutorial for complete setup instructions.
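
As a quick sanity check before starting Minikube, you can confirm the driver and container toolkit are working. The CUDA image tag below is only an example; any available tag will do:


# Driver check on the host
nvidia-smi
# Container toolkit check: the GPU should also be visible from inside a container
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi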

Quick Start

Local Development Setup

Option 1: Mock Server Setup (Recommended for Development)

  1. Start Minikube and configure Docker:


    minikube start
    eval $(minikube docker-env)

  2. Build and deploy all services:


    ./scripts/run.sh

  3. Access the services (see the sketch below for reaching the API gateway).
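
A minimal sketch of step 3, using the service name installed by the Helm chart (the same commands appear in the Port Forwarding and Verify Installation sections below):


# Forward the API gateway port, then hit the health endpoint from a second terminal
kubectl port-forward svc/jan-server-jan-api-gateway 8080:8080
curl http://localhost:8080/health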

Option 2: Real LLM Setup (Requires NVIDIA GPU)

  1. Start Minikube with GPU support:


    minikube start --gpus all
    eval $(minikube docker-env)

  2. Configure GPU memory utilization (if you have limited GPU memory):

    GPU memory utilization is configured in the vLLM Dockerfile. See the vLLM CLI documentation for all available arguments. A hedged example of the relevant flag appears after this list.

    To modify GPU memory utilization, edit the vLLM launch command in:

    • apps/jan-inference-model/Dockerfile (for Docker builds)
    • Helm chart values (for Kubernetes deployment)
  3. Build and deploy all services:


    # For GPU setup, modify run.sh to use GPU-enabled minikube
    # Edit scripts/run.sh and change "minikube start" to "minikube start --gpus all"
    ./scripts/run.sh
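
Referring back to step 2: below is a sketch of the kind of launch command you would edit. The model name and flag values are assumptions, not the repository's actual settings; check apps/jan-inference-model/Dockerfile for the real command.


# Hypothetical vLLM launch command with a reduced GPU memory fraction and a shorter context window
vllm serve janhq/Jan-v1-4B --gpu-memory-utilization 0.85 --max-model-len 8192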

Production Deployment

For production deployments, modify the Helm values in charts/umbrella-chart/values.yaml and deploy using:


helm install jan-server ./charts/umbrella-chart
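
If you prefer to keep production overrides out of the chart itself, standard Helm flags work here as well (the values file name and namespace below are only examples):


# Install with overrides from a separate values file into a dedicated namespace
helm install jan-server ./charts/umbrella-chart \
  -f production-values.yaml \
  --namespace jan-server --create-namespace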

Manual Installation

Build Docker Images

Build both required Docker images:


# Build API Gateway
docker build -t jan-api-gateway:latest ./apps/jan-api-gateway
# Build Inference Model
docker build -t jan-inference-model:latest ./apps/jan-inference-model

The inference model image downloads the Jan-v1-4B model from Hugging Face during the build. This requires an internet connection and a multi-gigabyte download (the model weights are approximately 2.4GB).
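
The images must end up in minikube's Docker daemon rather than your host's, so point your shell at minikube before building and confirm the images afterwards:


# Run this before the builds above so the images land inside minikube's Docker daemon
eval $(minikube docker-env)
# After building, both images should be listed
docker images | grep jan-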

Deploy with Helm

Install the Helm chart:


# Update Helm dependencies
helm dependency update ./charts/umbrella-chart
# Install Jan Server
helm install jan-server ./charts/umbrella-chart

Port Forwarding

Forward the API gateway port so you can access it from your local machine:


kubectl port-forward svc/jan-server-jan-api-gateway 8080:8080

Verify Installation

Check that all pods are running:


kubectl get pods

Expected output:


NAME                                 READY   STATUS    RESTARTS
jan-server-jan-api-gateway-xxx       1/1     Running   0
jan-server-jan-inference-model-xxx   1/1     Running   0
jan-server-postgresql-0              1/1     Running   0

Test the API gateway:


curl http://localhost:8080/health

Uninstalling

To remove Jan Server:


helm uninstall jan-server

To stop minikube:


minikube stop

Troubleshooting

Common Issues and Solutions

1. LLM Pod Not Starting (Pending Status)

Symptoms: The jan-server-jan-inference-model pod stays in Pending status.

Diagnosis Steps:


# Check pod status
kubectl get pods
# Get detailed pod information (replace with your actual pod name)
kubectl describe pod jan-server-jan-inference-model-<POD_ID>

Common Error Messages and Solutions:

Error: "Insufficient nvidia.com/gpu"

0/1 nodes are available: 1 Insufficient nvidia.com/gpu. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.

Solution for Real LLM Setup:

  1. Ensure you have an NVIDIA GPU and drivers installed
  2. Install NVIDIA Container Toolkit (see Prerequisites section)
  3. Start minikube with GPU support:

    minikube start --gpus all
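
After restarting minikube, you can confirm the node actually advertises GPU capacity (minikube is the default single-node name):


# nvidia.com/gpu should appear under the node's Capacity and Allocatable sections
kubectl describe node minikube | grep -i "nvidia.com/gpu"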

Error: vLLM Pod Keeps Restarting

# Check pod logs to see the actual error
kubectl logs jan-server-jan-inference-model-<POD_ID>

Common vLLM startup issues:

  1. CUDA Out of Memory: Modify vLLM arguments in Dockerfile to reduce memory usage
  2. Model Loading Errors: Check if model path is correct and accessible
  3. GPU Not Detected: Ensure NVIDIA Container Toolkit is properly installed
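
To check item 3 directly, run nvidia-smi inside the inference pod (replace the pod ID as in the logs command above):


# If GPU passthrough is working, nvidia-smi lists the GPU from inside the container
kubectl exec -it jan-server-jan-inference-model-<POD_ID> -- nvidia-smi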

2. Helm Issues

Symptoms: Helm commands fail or charts won't install.

Solutions:


# Update Helm dependencies
helm dependency update ./charts/umbrella-chart
# Check Helm status
helm list
# Uninstall and reinstall
helm uninstall jan-server
helm install jan-server ./charts/umbrella-chart

3. Common Development Issues

Pods in ImagePullBackOff state

  • Ensure Docker images were built in the minikube environment
  • Run eval $(minikube docker-env) before building images
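
A short recovery sketch for this case; the second deployment name is assumed from the pod names shown earlier:


# Rebuild inside minikube's Docker daemon, then restart the deployments so pods pick up the local images
eval $(minikube docker-env)
docker build -t jan-api-gateway:latest ./apps/jan-api-gateway
docker build -t jan-inference-model:latest ./apps/jan-inference-model
kubectl rollout restart deployment/jan-server-jan-api-gateway
kubectl rollout restart deployment/jan-server-jan-inference-model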

Port forwarding connection refused

  • Verify the service is running: kubectl get svc
  • Check pod status: kubectl get pods
  • Review logs: kubectl logs deployment/jan-server-jan-api-gateway

Inference model download fails

  • Ensure internet connectivity during Docker build
  • The Jan-v1-4B model is approximately 2.4GB

Resource Requirements

Minimum System Requirements:

  • 8GB RAM
  • 20GB free disk space
  • 4 CPU cores

Recommended System Requirements:

  • 16GB RAM
  • 50GB free disk space
  • 8 CPU cores
  • GPU support (for faster inference)

The inference model requires significant memory. Ensure your minikube cluster has adequate resources allocated.
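
A sketch of starting minikube with the recommended allocation (adjust the values to what your machine can spare):


# Allocate 8 CPUs, 16 GB RAM, and 50 GB of disk to the minikube node
minikube start --cpus 8 --memory 16384 --disk-size 50g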