
GPU Worker Setup

Connect a GPU server, cloud VM, or local workstation to the CQ Hub as a job worker. This is the foundation of GPU Anywhere — zero config, any OS, encrypted relay.

Architecture

```
Your laptop              CQ Hub (cloud)          Worker (GPU/CPU)
───────────              ──────────────          ────────────────
cq hub submit  ────────► job queue        ◄────  cq serve
(code snapshot +         (distributes)           (polls queue,
 job spec)                                        runs job,
                                                  uploads results)
```

Workers are stateless — no project config needed on the worker machine. The job carries everything: code snapshot, environment variables, and artifact declarations.
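
For orientation, a submitted job's payload might look roughly like the sketch below. The field names here are hypothetical, invented for illustration; the real schema is internal to CQ.

```yaml
# Hypothetical job payload — field names are illustrative, not the real schema
job_id: job-7f3a
snapshot: sha256:9c1e4b7d    # content-addressed code snapshot in Drive (illustrative)
command: python train.py
env:
  WANDB_MODE: offline
requires_gpu: true
artifacts:
  - checkpoints/*.pt
```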

3-Step Quick Start

Step 1: Install CQ on the Worker Machine

```sh
curl -fsSL https://raw.githubusercontent.com/PlayIdea-Lab/cq/main/install.sh | sh
```

This works on Linux (x86_64, ARM64), macOS, and Windows/WSL2. Docker and NVIDIA Container Toolkit are detected and configured automatically if present.

Step 2: Authenticate

```sh
cq auth login    # GitHub OAuth — use the same account as your laptop
```

Or, for headless machines (no browser):

```sh
cq auth login --device    # Device code flow — enter code on another device
```

Step 3: Start

```sh
cq serve    # Starts Hub worker + MCP + relay + cron in one process
```

The worker is now connected. Jobs submitted from your laptop arrive automatically.

What cq serve Starts

cq serve is the all-in-one entry point. It replaces running individual components separately.

| Component | Included |
|---|---|
| Hub worker (job polling) | Yes |
| MCP server | Yes |
| Relay (NAT traversal) | Yes |
| Cron scheduler | Yes |
| pg_notify real-time | Yes (when cloud.direct_url is set) |

Run as a Service

```sh
cq serve start      # Start worker in background
cq serve enable     # Auto-start on boot (systemd/launchd/Task Scheduler)
systemctl status cq-worker
```

Check logs:

```sh
journalctl -fu cq-worker
```

Manual systemd unit (if you prefer):

```ini
[Unit]
Description=CQ Hub Worker
After=network.target docker.service

[Service]
User=ubuntu
SupplementaryGroups=docker
WorkingDirectory=/opt/gpu-worker
ExecStart=/usr/local/bin/cq serve
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

macOS (launchd)

```sh
cq serve start      # Start worker in background
cq serve enable     # Register launchd plist for auto-start
```

Docker Compose

```sh
curl -sSL https://github.com/PlayIdea-Lab/cq/releases/latest/download/gpu-worker.tar.gz | tar xz

cat > .env <<EOF
C5_HUB_URL=https://<hub-host>:8585
C5_API_KEY=sk-worker-<your-key>
EOF

docker compose up -d
docker compose logs -f
```

Kubernetes

CQ workers run natively in K8s. The official container image is published to ghcr.io on every release.

```sh
docker pull ghcr.io/playidea-lab/cq-gpu-worker:latest
docker pull ghcr.io/playidea-lab/cq-gpu-worker:v1.58-cuda12.8
```

Deployment manifest (GPU worker with health probes):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cq-gpu-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cq-gpu-worker
  template:
    metadata:
      labels:
        app: cq-gpu-worker
    spec:
      containers:
        - name: worker
          image: ghcr.io/playidea-lab/cq-gpu-worker:latest
          env:
            - name: C5_API_KEY
              valueFrom:
                secretKeyRef:
                  name: cq-secrets
                  key: api-key
          ports:
            - containerPort: 8081
          startupProbe:
            httpGet:
              path: /startup
              port: 8081
            failureThreshold: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
            periodSeconds: 15
            timeoutSeconds: 3
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8081
            periodSeconds: 10
          resources:
            limits:
              nvidia.com/gpu: "1"
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
```

Health probes (v1.58+):

| Endpoint | Purpose | 200 when |
|---|---|---|
| /startup | Startup | Worker initialization complete |
| /healthz | Liveness | Heartbeat file updated within 60s |
| /readyz | Readiness | Worker can accept new jobs |
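
The liveness check is essentially a heartbeat-file freshness test. A minimal sketch of the same idea in shell (the file path is hypothetical; the real worker implements this internally):

```sh
# Liveness as heartbeat freshness: healthy if the file was touched in the last 60s.
HEARTBEAT_FILE="${HEARTBEAT_FILE:-/tmp/cq-heartbeat}"
touch "$HEARTBEAT_FILE"   # the worker loop would do this on every iteration

now=$(date +%s)
# GNU stat first, BSD stat as fallback
mtime=$(stat -c %Y "$HEARTBEAT_FILE" 2>/dev/null || stat -f %m "$HEARTBEAT_FILE")
age=$(( now - mtime ))

if [ "$age" -le 60 ]; then
  echo "healthy"
else
  echo "stale (${age}s old)"
fi
```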

Graceful shutdown: On SIGTERM, the worker returns the in-progress job to the Hub queue before exiting (within 10s). Set terminationGracePeriodSeconds: 30 in the pod spec.
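
The shutdown behavior can be pictured as a signal trap: on SIGTERM, hand the in-flight job back, then exit cleanly. A rough shell sketch, where requeue_job is a hypothetical stand-in for the worker's internal "return to queue" call:

```sh
# Sketch of graceful shutdown: trap SIGTERM and requeue the in-flight job.
CURRENT_JOB="job-123"

requeue_job() {   # hypothetical stand-in for "return job to the Hub queue"
  echo "requeued: $1"
}

handle_term() {
  requeue_job "$CURRENT_JOB"
  # a real worker would now exit 0 so the supervisor sees a clean shutdown
}
trap handle_term TERM

# Simulate the signal Kubernetes sends at pod termination:
kill -TERM $$
```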

Image tags: v{version}-cuda{cuda_version} (e.g., v1.58-cuda12.8). Use latest for the most recent stable build.

Helm chart support is planned for a future release. For now, use the manifest above or kustomize.


Real-Time Job Delivery

By default, workers poll for jobs every 30 seconds. For sub-second delivery, configure a direct database connection:

```yaml
# ~/.c4/config.yaml
cloud:
  direct_url: "postgresql://..."    # Direct Supabase connection string
```

With direct_url, the worker uses PostgreSQL LISTEN 'new_job' — jobs arrive instantly.

Submitting Jobs

From your laptop, in Claude Code:

```sh
# MCP tool
cq_hub_submit(command="python train.py")
```

Or from the terminal:

```sh
cq hub submit --run "python train.py"
```

CQ snapshots the current directory to Drive (content-addressable, automatic dedup) and posts the job to the Hub. No Git required.
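
"Content-addressable" means each blob is stored under the hash of its bytes, so resubmitting unchanged code uploads nothing new. A toy sketch of the idea; the store layout here is invented for illustration, not CQ's actual Drive format:

```sh
# Toy content-addressable store: a blob's key is the SHA-256 of its contents,
# so identical snapshots map to the same key and are stored only once.
STORE=$(mktemp -d)

put_blob() {
  hash=$(sha256sum "$1" | cut -d' ' -f1)
  if [ -e "$STORE/$hash" ]; then
    echo "dedup hit: $hash"
  else
    cp "$1" "$STORE/$hash"
    echo "stored: $hash"
  fi
}

printf 'print("hello")\n' > train.py
put_blob train.py    # first submit: stored
put_blob train.py    # unchanged resubmit: dedup hit, no new upload
```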

GPU Detection

Workers automatically detect GPU capabilities:

  • If nvidia-smi is found, the worker registers as GPU-capable
  • Jobs with requires_gpu: true are only routed to GPU workers
  • If nvidia-smi is not found, the worker starts in CPU-only mode (no action needed)
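
The detection step above amounts to probing for nvidia-smi on PATH, roughly like this sketch (the MODE variable is illustrative; the query flag is standard nvidia-smi):

```sh
# Rough sketch of capability detection: GPU mode if nvidia-smi is on PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
  MODE="gpu"
  GPU_NAME=$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n1)
else
  MODE="cpu-only"
fi
echo "worker mode: $MODE"
```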

Routing Jobs to Specific Workers

By worker ID

```sh
cq hub submit --target worker-abc123 python train.py
```

By capability

```sh
cq hub submit --capability cuda python train.py
```

By tags

```sh
cq hub submit --tags gpu,a100 python train.py
```

Declare tags in caps.yaml on the worker:

```yaml
tags:
  - gpu
  - a100
  - datacenter-us
```
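
Routing by tags is effectively a subset check: a worker is eligible only if every tag the job requests appears in its caps.yaml tag list. A small pure-shell sketch of that check (illustrative, not CQ's scheduler code):

```sh
# A worker matches when every tag requested by the job is in the worker's tag set.
WORKER_TAGS="gpu a100 datacenter-us"

matches() {          # usage: matches "gpu,a100"
  for tag in $(echo "$1" | tr ',' ' '); do
    case " $WORKER_TAGS " in
      *" $tag "*) ;;            # tag present: keep checking
      *) return 1 ;;            # any missing tag makes the worker ineligible
    esac
  done
  return 0
}

matches "gpu,a100" && echo "eligible"
matches "gpu,h100" || echo "not eligible"
```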

Monitoring

```sh
cq hub workers              # Active workers
cq hub workers --all        # Include offline workers
cq hub list                 # Recent jobs
cq hub status <job_id>      # Job status
cq hub watch <job_id>       # Live job output
cq hub log <job_id>         # Job logs
cq hub summary              # Hub stats
```

Maintenance

Remove zombie workers

Workers offline for 24+ hours are pruned automatically. Manual cleanup:

```sh
cq hub workers prune              # Remove offline workers
cq hub workers prune --dry-run    # Preview
```

Version gate

If the Hub requires a minimum worker version:

```sh
cq update               # Update binary
cq hub worker start     # Restart worker
```

Authentication Reference

| Method | How |
|---|---|
| Session (default) | cq auth login — stored at ~/.c4/session.json, used automatically |
| API key | export C5_API_KEY=sk-worker-<key> |
| Device code | cq auth login --device — for headless machines |

Key prefixes:

| Prefix | Scope |
|---|---|
| sk-worker-* | Poll and complete jobs only |
| sk-user-* | Submit and query jobs only |
| (none) | Full access |
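
The scopes above could be derived from a simple prefix match. An illustrative sketch only; the scope names are invented here, not CQ's internal terms:

```sh
# Illustrative only: map an API key's prefix to its scope, per the table above.
key_scope() {
  case "$1" in
    sk-worker-*) echo "poll+complete" ;;
    sk-user-*)   echo "submit+query" ;;
    *)           echo "full" ;;
  esac
}

key_scope "sk-worker-abc123"   # poll+complete
```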

Troubleshooting

| Symptom | Fix |
|---|---|
| nvidia-smi not found | Worker runs in CPU-only mode automatically — no action needed |
| Auth error | Re-run cq auth login or cq auth login --device |
| Worker shows offline | Run `ps aux \| grep cq` to confirm the process is running, then restart with cq serve |
| Job stuck | Check cq hub log <job_id> and the worker logs |
| --non-interactive needed in CI | Pass the --non-interactive flag to cq hub worker init |
| WSL2 relay drops | CQ sets SO_KEEPALIVE automatically — no config needed. Use cq serve (not cq hub worker start), which includes the keepalive-aware relay |

Next Steps

  • Knowledge Loop — accumulate experiment results into reusable AI knowledge
  • Tiers — understand Free/Pro/Team feature sets