Features
Drop-in Replacement
Fully compatible with the OpenAI v1 API, including streaming. Any client that supports the OpenAI API — Cursor, Continue, Open WebUI — works by simply switching the base URL.
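For example, with the official OpenAI Python SDK only the base URL and key change; the relay URL, API key, and model name below are placeholders, not fixed values:

```python
from openai import OpenAI

# Same client, same code: only the base URL and API key change.
# "https://relay.example.com/v1" and the model name are placeholders.
client = OpenAI(
    base_url="https://relay.example.com/v1",  # your relay endpoint
    api_key="sk-your-group-key",              # key issued from the dashboard
)

stream = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Hello from the GPU pool!"}],
    stream=True,  # streaming works exactly as with the upstream API
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```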
Dual-Layer Load Balancing
Layer 1 filters by group, model, and health status; Layer 2 selects the optimal node via weighted scoring. The hot-model-first strategy gives the highest weight to nodes that already have the model in VRAM, avoiding 10-60 s cold starts.
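A minimal sketch of how such two-layer selection could be expressed (the field names and weights below are illustrative assumptions, not the project's actual scoring formula):

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    healthy: bool
    groups: set[str]
    available_models: set[str]   # models this node can serve at all
    loaded_models: set[str]      # models currently resident in VRAM
    free_vram_gb: float
    gpu_util: float              # 0.0 - 1.0
    latency_ms: float

def pick_node(nodes: list[Node], group: str, model: str) -> Node | None:
    # Layer 1: hard filters (group membership, model availability, health).
    candidates = [
        n for n in nodes
        if n.healthy and group in n.groups and model in n.available_models
    ]
    if not candidates:
        return None

    # Layer 2: weighted scoring. Hot-model-first: a node that already holds
    # the model in VRAM skips the 10-60 s cold start, so it dominates the score.
    def score(n: Node) -> float:
        hot_bonus = 100.0 if model in n.loaded_models else 0.0
        return (hot_bonus
                + 2.0 * n.free_vram_gb
                - 50.0 * n.gpu_util
                - 0.1 * n.latency_ms)

    return max(candidates, key=score)
```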
Multi-Backend Support
Auto-scans ports to detect LM Studio, Ollama, and generic OpenAI backends. The API translator unifies formats; NDJSON is auto-converted to SSE with zero manual configuration.
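A rough sketch of what port-based detection can look like; the probed ports and endpoints (1234 for LM Studio, 11434 for Ollama, /v1/models, /api/tags) are common defaults used here as assumptions, not the agent's actual logic:

```python
import requests

# Common default ports: LM Studio (1234), Ollama (11434), plus generic guesses.
CANDIDATE_PORTS = [1234, 11434, 8000, 8080]

def detect_backend(port: int, timeout: float = 0.5) -> str | None:
    base = f"http://127.0.0.1:{port}"
    try:
        # Ollama answers on its native API.
        if requests.get(f"{base}/api/tags", timeout=timeout).ok:
            return "ollama"
    except requests.RequestException:
        pass
    try:
        # LM Studio and generic backends expose the OpenAI-style /v1/models.
        if requests.get(f"{base}/v1/models", timeout=timeout).ok:
            return "openai-compatible"
    except requests.RequestException:
        pass
    return None

if __name__ == "__main__":
    found = {p: b for p in CANDIDATE_PORTS if (b := detect_backend(p))}
    print(found)  # e.g. {11434: 'ollama', 1234: 'openai-compatible'}
```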
NAT Traversal
Providers behind NAT/firewalls can still connect. A WSS reverse tunnel carries both heartbeat and request forwarding — no frp or public IP needed.
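Conceptually, the provider dials out to the relay over WSS and the same socket carries heartbeats and forwarded requests; the URL and message schema in this sketch are illustrative assumptions:

```python
import asyncio
import json
import websockets  # the provider dials OUT, so no inbound port or public IP is needed

RELAY_WSS = "wss://relay.example.com/agent"  # placeholder relay endpoint

async def run_agent():
    async with websockets.connect(RELAY_WSS) as ws:
        async def heartbeat():
            while True:
                await ws.send(json.dumps({"type": "heartbeat", "vram_free_gb": 18.2}))
                await asyncio.sleep(10)

        hb = asyncio.create_task(heartbeat())
        try:
            # The relay forwards consumer requests down the same socket.
            async for raw in ws:
                msg = json.loads(raw)
                if msg.get("type") == "inference_request":
                    # Hand off to the local backend (LM Studio / Ollama) and stream back.
                    await ws.send(json.dumps({
                        "type": "inference_response",
                        "request_id": msg["request_id"],
                        "chunk": "...model output...",
                    }))
        finally:
            hb.cancel()

asyncio.run(run_agent())
```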
Group-Based Trust
Based on the mutual-aid model — trust comes from existing social relationships. Group isolation keeps strangers off your GPU; no complex reputation system needed.
Real-Time Metrics
Collects real-time GPU utilization, VRAM usage, temperature, and latency via nvidia-smi / rocm-smi. The dashboard shows the status of every node in the group at a glance.
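On NVIDIA GPUs the raw numbers can be pulled with a single nvidia-smi query; this standalone sketch shows the idea (the agent's actual sampling code may differ):

```python
import subprocess

def read_gpu_metrics() -> list[dict]:
    """Query utilization, VRAM, and temperature for every NVIDIA GPU."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=utilization.gpu,memory.used,memory.total,temperature.gpu",
        "--format=csv,noheader,nounits",
    ], text=True)
    metrics = []
    for line in out.strip().splitlines():
        util, mem_used, mem_total, temp = [v.strip() for v in line.split(",")]
        metrics.append({
            "gpu_util_pct": float(util),
            "vram_used_mib": float(mem_used),
            "vram_total_mib": float(mem_total),
            "temperature_c": float(temp),
        })
    return metrics

print(read_gpu_metrics())
```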
Architecture
Consumer, Relay, Provider — three cleanly separated layers. The cloud relay handles auth and scheduling; the Desktop Agent handles local inference.
Relay: FastAPI async architecture handling the API gateway, authentication, scheduling, and WebSocket management
FastAPI · PostgreSQL · Redis
Desktop Agent: Python core + Tauri shell; auto-detects backends, GPU monitoring, API translation
Python · Tauri · WebSocket
Dashboard: group management, node monitoring, API key management, usage statistics
SvelteKit · TailwindCSS · shadcn
Supported Backends
Supports mainstream local inference engines with auto-detection and auto-translation, fully transparent to consumers.
LM Studio: native OpenAI v1 compatible
Ollama: NDJSON → SSE auto-conversion (see the sketch below)
Generic OpenAI: any OpenAI-compatible endpoint
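To illustrate the Ollama case, here is a rough sketch of mapping one NDJSON streaming line to an OpenAI-style SSE event; the exact field mapping in the real translator is assumed, not documented here:

```python
import json

def ndjson_line_to_sse(line: str, model: str) -> str:
    """Map one Ollama /api/chat NDJSON line to an OpenAI-style SSE chunk."""
    msg = json.loads(line)
    if msg.get("done"):
        return "data: [DONE]\n\n"
    chunk = {
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{
            "index": 0,
            "delta": {"content": msg.get("message", {}).get("content", "")},
            "finish_reason": None,
        }],
    }
    return f"data: {json.dumps(chunk)}\n\n"

# Example: one line as streamed by Ollama, re-emitted as SSE.
print(ndjson_line_to_sse('{"message": {"content": "Hel"}, "done": false}', "llama3"))
```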
Security
Who is it for
Pool idle GPUs, unified API for development
80% lower inference cost
Switch between models, A/B test different backends
Zero-wait experiments
Pool compute with friends to run larger models
4090 × N compute pool
Data stays in group, local inference with zero uploads
Fully offline deployable
Build your compute mutual-aid circle. Let idle GPUs reach their full potential.