GPU Worker Setup
Connect a GPU server, cloud VM, or local workstation to the CQ Hub as a job worker. This is the foundation of GPU Anywhere — zero config, any OS, encrypted relay.
Architecture
```
Your laptop              CQ Hub (cloud)            Worker (GPU/CPU)
───────────              ──────────────            ────────────────
cq hub submit ─────────► job queue     ◄────────── cq serve
(code snapshot +         (distributes)             (polls queue,
 job spec)                                          runs job,
                                                    uploads results)
```

Workers are stateless — no project config is needed on the worker machine. The job carries everything: code snapshot, environment variables, and artifact declarations.
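Because the job carries everything, you can picture a submitted job as a self-contained spec. The shape below is purely illustrative — the field names are hypothetical, not CQ's actual wire format:

```yaml
# Hypothetical job spec — field names are illustrative, not CQ's real schema
command: python train.py
snapshot: sha256:9f2a...     # content-addressed code snapshot stored in Drive
env:                         # environment variables travel with the job
  WANDB_MODE: offline
requires_gpu: true           # route only to GPU-capable workers
tags: [gpu, a100]            # optional worker tag constraints
artifacts:                   # declared outputs to upload when the job finishes
  - checkpoints/*.pt
```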
3-Step Quick Start
Step 1: Install CQ on the Worker Machine
```shell
curl -fsSL https://raw.githubusercontent.com/PlayIdea-Lab/cq/main/install.sh | sh
```

This works on Linux (x86_64, ARM64), macOS, and Windows/WSL2. Docker and the NVIDIA Container Toolkit are detected and configured automatically if present.
Step 2: Authenticate
```shell
cq auth login           # GitHub OAuth — use the same account as your laptop
```

Or, for headless machines (no browser):

```shell
cq auth login --device  # Device code flow — enter the code on another device
```

Step 3: Start

```shell
cq serve                # Starts Hub worker + MCP + relay + cron in one process
```

The worker is now connected. Jobs submitted from your laptop arrive automatically.
What cq serve Starts
cq serve is the all-in-one entry point. It replaces running individual components separately.
| Component | Included |
|---|---|
| Hub worker (job polling) | Yes |
| MCP server | Yes |
| Relay (NAT traversal) | Yes |
| Cron scheduler | Yes |
| pg_notify real-time | Yes (when cloud.direct_url is set) |
Run as a Service
Linux (systemd) — recommended
```shell
cq serve start   # Start worker in background
cq serve enable  # Auto-start on boot (systemd/launchd/Task Scheduler)
systemctl status cq-worker
```

Check logs:

```shell
journalctl -fu cq-worker
```

Manual systemd unit, if you prefer (save as /etc/systemd/system/cq-worker.service):
```ini
[Unit]
Description=CQ Hub Worker
After=network.target docker.service

[Service]
User=ubuntu
SupplementaryGroups=docker
WorkingDirectory=/opt/gpu-worker
ExecStart=/usr/local/bin/cq serve
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

macOS (launchd)
```shell
cq serve start   # Start worker in background
cq serve enable  # Register launchd plist for auto-start
```

Docker Compose
```shell
curl -sSL https://github.com/PlayIdea-Lab/cq/releases/latest/download/gpu-worker.tar.gz | tar xz
cat > .env <<EOF
C5_HUB_URL=https://<hub-host>:8585
C5_API_KEY=sk-worker-<your-key>
EOF
docker compose up -d
docker compose logs -f
```

Kubernetes
CQ workers run natively in K8s. The official container image is published to ghcr.io on every release.
```shell
docker pull ghcr.io/playidea-lab/cq-gpu-worker:latest
docker pull ghcr.io/playidea-lab/cq-gpu-worker:v1.58-cuda12.8
```

Deployment manifest (GPU worker with health probes):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cq-gpu-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cq-gpu-worker
  template:
    metadata:
      labels:
        app: cq-gpu-worker
    spec:
      containers:
        - name: worker
          image: ghcr.io/playidea-lab/cq-gpu-worker:latest
          env:
            - name: C5_API_KEY
              valueFrom:
                secretKeyRef:
                  name: cq-secrets
                  key: api-key
          ports:
            - containerPort: 8081
          startupProbe:
            httpGet:
              path: /startup
              port: 8081
            failureThreshold: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
            periodSeconds: 15
            timeoutSeconds: 3
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8081
            periodSeconds: 10
          resources:
            limits:
              nvidia.com/gpu: "1"
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
```

Health probes (v1.58+):
| Endpoint | Purpose | 200 when |
|---|---|---|
| `/startup` | Startup | Worker initialization complete |
| `/healthz` | Liveness | Heartbeat file updated within 60s |
| `/readyz` | Readiness | Worker can accept new jobs |
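The liveness rule (heartbeat file updated within 60s) boils down to an mtime check. The sketch below illustrates the semantics only — it is not CQ's implementation, and the file path is hypothetical:

```python
import os
import time

HEARTBEAT_MAX_AGE = 60  # seconds, per the /healthz rule above

def is_live(heartbeat_path: str, max_age: float = HEARTBEAT_MAX_AGE) -> bool:
    """Healthy iff the heartbeat file exists and was touched within max_age seconds."""
    try:
        age = time.time() - os.path.getmtime(heartbeat_path)
    except OSError:  # file missing or unreadable -> not healthy
        return False
    return age <= max_age
```

A freshly touched file passes; a file last updated two minutes ago (or a missing one) fails, which is what makes Kubernetes restart the pod.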
Graceful shutdown: On SIGTERM, the worker returns the in-progress job to the Hub queue before exiting (within 10s). Set terminationGracePeriodSeconds: 30 in the pod spec.
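The shutdown behavior can be pictured as a minimal signal handler — a sketch of the semantics (trap SIGTERM, hand the in-flight job back, stop taking work), not the worker's actual code:

```python
import os
import signal

state = {"in_progress": "job-abc123", "returned": None}  # job ID is illustrative

def on_sigterm(signum, frame):
    # Return the in-flight job to the Hub queue so another worker can pick
    # it up, then stop accepting new work. The real worker does this within 10s.
    state["returned"], state["in_progress"] = state["in_progress"], None

signal.signal(signal.SIGTERM, on_sigterm)

# Simulate Kubernetes sending SIGTERM to the pod's main process:
os.kill(os.getpid(), signal.SIGTERM)
```

After the signal, the job has been handed back and the worker holds nothing — which is why a 30-second grace period is comfortably enough.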
Image tags: v{version}-cuda{cuda_version} (e.g., v1.58-cuda12.8). Use latest for the most recent stable build.
Helm chart support is planned for a future release. For now, use the manifest above or kustomize.
Real-Time Job Delivery
By default, workers poll for jobs every 30 seconds. For sub-second delivery, configure a direct database connection:
```yaml
# ~/.c4/config.yaml
cloud:
  direct_url: "postgresql://..."  # Direct Supabase connection string
```

With direct_url set, the worker uses PostgreSQL `LISTEN 'new_job'` — jobs arrive instantly.
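The fallback logic is simple: when cloud.direct_url is present the worker can LISTEN; otherwise it polls. A sketch of that decision over the parsed config (illustrative, not CQ's source):

```python
def delivery_mode(config: dict) -> str:
    """Pick the job-delivery strategy from a parsed ~/.c4/config.yaml dict."""
    if config.get("cloud", {}).get("direct_url"):
        return "listen"  # PostgreSQL LISTEN 'new_job' -> sub-second delivery
    return "poll"        # no direct connection -> poll every 30 seconds
```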
Submitting Jobs
From your laptop, in Claude Code:
```python
# MCP tool
cq_hub_submit(command="python train.py")
```

Or from the terminal:

```shell
cq hub submit --run "python train.py"
```

CQ snapshots the current directory to Drive (content-addressable, automatic dedup) and posts the job to the Hub. No Git required.
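Content-addressable storage is what makes the dedup automatic: an object's address is derived from its bytes, so unchanged files are stored once no matter how often you submit. A minimal sketch of the idea (not CQ's actual Drive format):

```python
import hashlib

def address(data: bytes) -> str:
    # The address depends on content alone — not path, name, or timestamp —
    # so identical files map to the same object.
    return "sha256:" + hashlib.sha256(data).hexdigest()

store: dict[str, bytes] = {}  # stand-in for Drive's object store

def put(data: bytes) -> str:
    addr = address(data)
    store.setdefault(addr, data)  # no-op if the object already exists
    return addr
```

Calling put() twice with the same bytes returns the same address and stores one object — that is the dedup.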
GPU Detection
Workers automatically detect GPU capabilities:
- If `nvidia-smi` is found, the worker registers as GPU-capable
- Jobs with `requires_gpu: true` are routed only to GPU workers
- If `nvidia-smi` is not found, the worker starts in CPU-only mode (no action needed)
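Detection amounts to a PATH lookup — roughly the following, shown as an illustration rather than CQ's actual probe:

```python
import shutil

def worker_capabilities() -> dict:
    """Register as GPU-capable iff nvidia-smi is on PATH; otherwise CPU-only."""
    has_gpu = shutil.which("nvidia-smi") is not None
    return {"gpu": has_gpu, "mode": "gpu" if has_gpu else "cpu-only"}
```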
Routing Jobs to Specific Workers
By worker ID
```shell
cq hub submit --target worker-abc123 python train.py
```

By capability

```shell
cq hub submit --capability cuda python train.py
```

By tags

```shell
cq hub submit --tags gpu,a100 python train.py
```

Declare tags in caps.yaml on the worker:

```yaml
tags:
  - gpu
  - a100
  - datacenter-us
```

Monitoring
```shell
cq hub workers           # Active workers
cq hub workers --all     # Include offline workers
cq hub list              # Recent jobs
cq hub status <job_id>   # Job status
cq hub watch <job_id>    # Live job output
cq hub log <job_id>      # Job logs
cq hub summary           # Hub stats
```

Maintenance
Remove zombie workers
Workers offline for 24+ hours are pruned automatically. Manual cleanup:
```shell
cq hub workers prune            # Remove offline workers
cq hub workers prune --dry-run  # Preview
```

Version gate
If the Hub requires a minimum worker version:
```shell
cq update            # Update binary
cq hub worker start  # Restart worker
```

Authentication Reference
| Method | How |
|---|---|
| Session (default) | cq auth login — stored at ~/.c4/session.json, used automatically |
| API key | export C5_API_KEY=sk-worker-<key> |
| Device code | cq auth login --device — for headless machines |
Key prefixes:
| Prefix | Scope |
|---|---|
| `sk-worker-*` | Poll and complete jobs only |
| `sk-user-*` | Submit and query jobs only |
| (none) | Full access |
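The prefix convention maps mechanically to allowed operations. A sketch of that mapping (illustrative — actual enforcement happens on the Hub side):

```python
def key_scope(api_key: str) -> set[str]:
    """Derive allowed operations from the key prefix, per the table above."""
    if api_key.startswith("sk-worker-"):
        return {"poll", "complete"}          # workers pull and finish jobs
    if api_key.startswith("sk-user-"):
        return {"submit", "query"}           # users submit and inspect jobs
    return {"submit", "query", "poll", "complete"}  # no prefix: full access
```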
Troubleshooting
| Symptom | Fix |
|---|---|
| `nvidia-smi` not found | Worker runs in CPU-only mode automatically — no action needed |
| Auth error | Re-run cq auth login or cq auth login --device |
| Worker shows offline | Verify the process is running on the worker (e.g. `ps aux \| grep cq`) and restart it with cq serve start |
| Job stuck | Check cq hub log <job_id> and worker logs |
| --non-interactive needed in CI | Pass the --non-interactive flag to cq hub worker init |
| WSL2 relay drops | CQ sets SO_KEEPALIVE automatically — no config needed. Ensure cq serve (not cq hub worker start) is used, as it includes the keepalive-aware relay component |
Next Steps
- Knowledge Loop — accumulate experiment results into reusable AI knowledge
- Tiers — understand Free/Pro/Team feature sets