Austin, Texas · Apple Silicon Training Cluster

Fine-tune LLMs on the largest unified memory in the cloud.

Per-hour pricing, US data residency, no spot-market roulette.

Talk to an engineer → See the fleet

A clean aisle inside the Deliany training cluster in East Austin — matte-black racks of Apple Silicon hardware on both sides, polished concrete floor, and the downtown Austin skyline with Frost Bank's owl-eye crown visible through a floor-to-ceiling glass wall at golden hour

Frameworks we train with — and the models we train every week

MLX

PyTorch

Hugging Face

Weights & Biases

Ollama

What it's for

Three training jobs Apple Silicon now does as well as anything else.

Adapters

LoRA and QLoRA on 70B-class models. Quantize only if you want to.

Llama 3.3 70B, Mistral Large, Qwen 2.5 72B: load them in fp16 on a single Mac Studio and train adapters in MLX. A 10K-example LoRA finishes in an evening.
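
For a sense of why adapter training is light next to the base model, here is a minimal sketch of how many trainable parameters LoRA adds to a single weight matrix. The 8192 x 8192 shape and rank 16 are illustrative assumptions, not the settings of any Deliany recipe.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors: A (rank x d_in) and B (d_out x rank).
    """
    return rank * d_in + d_out * rank


# Illustrative assumptions: a square 8192 x 8192 projection and rank 16.
d_in, d_out, rank = 8192, 8192, 16
full = d_in * d_out
adapter = lora_params(d_in, d_out, rank)

print(f"full matrix:  {full:,} params")
print(f"LoRA adapter: {adapter:,} params ({adapter / full:.2%} of the full matrix)")
```

The base weights stay frozen, so only the adapter factors need gradients and optimizer state.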

Talk to an engineer →

Full SFT

Train smaller models end-to-end. Or continue pretraining one.

Mistral 7B, Qwen 2.5 7B, Phi-4, Gemma 2 9B — full supervised fine-tuning in MLX, including every parameter, every optimizer state, every gradient. 50K examples in well under an afternoon.
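
As a rough guide to what "every parameter, every optimizer state, every gradient" costs in memory, the sketch below adds up bytes per parameter. It assumes fp16 weights and gradients plus an Adam-style optimizer keeping two fp32 moments per parameter; those byte counts are assumptions for illustration, and activations and framework overhead are left out.

```python
def full_sft_state_gb(
    params_billion: float,
    weight_bytes: int = 2,     # assumed: fp16 weights
    grad_bytes: int = 2,       # assumed: fp16 gradients
    optimizer_bytes: int = 8,  # assumed: two fp32 Adam moments per parameter
) -> float:
    """Approximate memory for weights + gradients + optimizer state, in GB.

    Uses 1 GB = 1e9 bytes. Activations and framework overhead are excluded.
    """
    return params_billion * (weight_bytes + grad_bytes + optimizer_bytes)


# Nominal sizes taken from the model names, not measured parameter counts.
for name, size in [("Mistral 7B", 7), ("Gemma 2 9B", 9)]:
    print(f"{name}: ~{full_sft_state_gb(size):.0f} GB of training state")
```

Changing the defaults shows how quickly the total moves with precision and optimizer choices.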

Talk to an engineer →

Alignment

DPO, ORPO, KTO — the pass after your SFT.

Preference-tuning loops on the same machine your SFT lived on. We maintain the MLX training recipes and the eval glue. You bring the preference pairs.
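
For readers new to preference tuning, here is a minimal sketch of the standard DPO loss for one preference pair, written in plain Python rather than MLX. It is not Deliany's recipe; the log-probabilities and the `beta` value are made-up inputs for illustration.

```python
import math


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained or under the frozen reference model.
    """
    margin = beta * ((policy_chosen - ref_chosen)
                     - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Made-up log-probs: the policy already favors the chosen response slightly
# more than the reference does.
print(f"{dpo_loss(-12.0, -15.0, -13.0, -14.0):.4f}")
```

When the policy matches the reference exactly, the margin is zero and the loss equals log 2. It falls as the policy prefers the chosen response more strongly than the reference does.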

Send us a brief →

Why Deliany

The cheapest way to fine-tune a 70B-class model. Honestly.

Unified memory at scale

Memory-bound runs above 32B don't have to be quantized or sharded. Apple Silicon's unified memory architecture lets the whole model live in one coherent address space — and the math gets very friendly very quickly.

MLX-native, recipes maintained

MLX ships with mixed precision, gradient checkpointing, distributed training, and PEFT helpers. We maintain the production recipes for every flow we run.

Direct engineering support

Every account gets a shared Slack Connect channel that routes to the engineers who maintain the recipes and own the fleet. No tier-1 scripts. No bots.

Transparent pricing

Hourly, monthly, or per-run. Quoted in plain English. No spot market. No commit minimums. No surprise invoices.

How it works

Two ways in. Both ship by Friday.

DIY

Rent the node, run your script.

Spin up an Apple Silicon node or a Studio. SSH in, run your MLX script, pull the adapter down when it's done. Resumable checkpoints and Hugging Face cache mounts come standard.

Talk to an engineer →

Managed

Send a dataset, get an adapter back.

Hand us a JSONL and a one-line training config (we have a starter for every flow). We run, monitor, eval on a held-out slice, and hand back a tested adapter with a brief writeup. Flat per-run pricing.
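
Before sending a dataset, it can help to check that every line parses. The sketch below validates a JSONL file line by line. The `prompt` and `completion` field names are a hypothetical schema chosen for illustration; the format a given flow expects may differ.

```python
import json

# Hypothetical schema: the real field names depend on the training flow.
REQUIRED_KEYS = {"prompt", "completion"}


def validate_jsonl(lines):
    """Return a list of (line_number, problem) tuples; empty means clean."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "expected a JSON object"))
            continue
        missing = REQUIRED_KEYS - set(record)
        if missing:
            problems.append((number, f"missing keys: {sorted(missing)}"))
    return problems


sample = [
    '{"prompt": "Summarize this ticket.", "completion": "Customer cannot log in."}',
    '{"prompt": "This record has no completion."}',
    "not json",
]
for number, problem in validate_jsonl(sample):
    print(f"line {number}: {problem}")
```

To check a real file, pass an open file handle, for example `validate_jsonl(open("train.jsonl"))`, where `train.jsonl` is a placeholder name.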

Send us a brief →

Start a fine-tune today

Ready to try Apple Silicon for your next training run?

Pick a tier or send us a dataset. Self-serve nodes spin up in under an hour. Managed runs typically finish the same day.

Talk to an engineer → Send a brief to sales@