BUYER'S COMPARISON
Self-hosted AI vs the frontier APIs
An honest feature matrix for teams who can't send their data to OpenAI and aren't sure whether to roll their own with Ollama, rent a single-tenant deployment from Together, or hire a builder. Updated May 2026.
If you're reading this, you've probably already had the conversation: "We can't put production data into ChatGPT, but we can't afford a frontier-tier private deployment either." What you actually need is a real architecture, on real hardware, with a real audit trail.
This page compares CPLT against the providers most often shortlisted for that need. We've tried to be honest — when a competitor wins on something, we say so. The goal is to help you choose correctly, not to talk you into anything.
Last fact-checked May 2026 against publicly published pricing and documentation. If anything's wrong or out of date, tell us — we'll fix it.
The matrix
| Feature | CPLT build & handover | OpenAI Enterprise | Anthropic Claude | Together AI Dedicated | Anyscale Endpoints | Ollama (DIY) |
|---|---|---|---|---|---|---|
| Deployment Model | ||||||
| Runs on your hardware / VPC | Yes | No | No | VPC option | VPC option | Yes |
| Air-gap deployable | Yes | No | No | No | No | Yes |
| Multi-model behind one API | 6+ providers | OpenAI only | Claude only | Together catalog | Anyscale catalog | Local only |
| Bring your own GPU | Yes | No | No | No | No | Yes |
| Compliance & Sovereignty | ||||||
| Data never leaves your network | Yes | No | No | VPC only | VPC only | Yes |
| EU-hosted by default | Yes | Region opt-in | Region opt-in | US-primary | US-primary | You choose |
| GDPR Art. 28 DPA included | Yes | Yes | Yes | Yes | Yes | N/A: you are the processor |
| Operator-grade DR documentation | Included | SaaS-side only | SaaS-side only | SaaS-side only | SaaS-side only | DIY |
| Full audit log of every request | On your infra | Vendor-side | Vendor-side | Vendor-side | Vendor-side | DIY |
| Cost & Predictability | ||||||
| Pricing model | Fixed-scope project | Per-token + seats | Per-token + seats | Per-hour GPU | Per-hour GPU | Hardware only |
| Predictable monthly run-rate | Yes (own infra) | Usage-driven | Usage-driven | Reserved tiers | Reserved tiers | Yes |
| Typical entry investment | €5K–€10K (Tactical); €15K–€60K (build) | ~$60+/seat/mo + tokens | ~$30+/seat/mo + tokens | $2–8/GPU-hr × 24×7 | $2–8/GPU-hr × 24×7 | Hardware only, no license fee |
| Mandatory retainer / minimums | None | Annual commit | Annual commit | Reserved minimum | Reserved minimum | None |
| Lock-in & Exit | ||||||
| You own the build artefacts | Yes — full handover | No | No | No | No | Yes |
| Swap underlying model without rewrite | LiteLLM router | No | No | In-vendor only | In-vendor only | DIY |
| Operate without the original builder | Yes — runbooks shipped | Yes | Yes | Yes | Yes | Yes |
| Custom MCP / tool integrations | Yes (add-on) | Function calling | Tool use API | No | No | DIY |
| Operator Burden | ||||||
| You operate the GPUs | Yes — you do | No | No | No | No | Yes |
| You patch & maintain the stack | Self-maintaining + your team | No | No | No | No | Fully on you |
| Frontier-class quality on day 1 | 70B open-weight | GPT-class | Claude-class | Frontier OSS | Frontier OSS | 7–70B local |
Sources: vendor public pricing & docs as of May 2026. "Per-token" / "per-seat" figures reflect published list prices and are not audited. CPLT figures reflect typical engagement bands, not committed rates. Your actual scope dictates your actual price.
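To make the "$2–8/GPU-hr × 24×7" cell concrete, here is a rough worked calculation. It is a sketch only: it assumes a single always-on GPU, an average month of 730 hours, and the list-price band quoted in the matrix. The function name `monthly_gpu_cost` is ours, and reserved-tier discounts or minimums are not modelled.

```python
# Back-of-envelope run-rate for hourly GPU pricing.
# Assumption: the GPU is reserved around the clock, so every hour is billed.
HOURS_PER_MONTH = 24 * 365 / 12  # 730 hours in an average month


def monthly_gpu_cost(rate_per_gpu_hour: float, gpus: int = 1) -> float:
    """Cost of `gpus` always-on GPUs for one average month."""
    return rate_per_gpu_hour * HOURS_PER_MONTH * gpus


low = monthly_gpu_cost(2.0)   # bottom of the $2-8/GPU-hr band
high = monthly_gpu_cost(8.0)  # top of the band

print(f"1 GPU, 24x7: ${low:,.0f} to ${high:,.0f} per month")
print(f"1 GPU, 24x7: ${low * 12:,.0f} to ${high * 12:,.0f} per year")
```

Under these assumptions one GPU costs roughly $1,460 to $5,840 per month, or about $17,520 to $70,080 per year. Multiply by the number of GPUs your model actually needs before comparing against a one-off build.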
When to choose which
No vendor wins everything. Honest guidance on when to skip CPLT.
CPLT CHOOSE US
Choose us when: Your data can't leave your network, your finance team needs a fixed-cost line item, and your compliance team needs a deterministic audit trail. You want to own the platform, not rent it indefinitely.
- Regulated mid-market (legal, health, finance, public sector)
- Existing GPUs or budget for hardware
- Internal team that can run a Linux box
- You want exit options, not a deeper integration with a single vendor
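For readers wondering what an audit trail on your own infrastructure might look like in practice, here is a minimal sketch. It is an illustration, not a description of the CPLT implementation. The field names, the hash-chaining approach, and the functions `audit_record` and `verify_chain` are all assumptions chosen for the example.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def audit_record(prev_hash: str, user: str, model: str, prompt: str) -> dict:
    """Build one audit entry chained to the previous entry's hash."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "model": model,
        # Store a digest, not the prompt, so the log carries no payload data.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prev_hash": prev_hash,
    }
    canonical = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["hash"] = hashlib.sha256(canonical).hexdigest()
    return entry


def verify_chain(entries: list[dict], genesis: str = GENESIS) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = genesis
    for e in entries:
        body = {k: v for k, v in e.items() if k != "hash"}
        canonical = json.dumps(body, sort_keys=True).encode("utf-8")
        if e["prev_hash"] != prev:
            return False
        if hashlib.sha256(canonical).hexdigest() != e["hash"]:
            return False
        prev = e["hash"]
    return True
```

Each entry would be appended as one JSON line to a file or table your team controls. An auditor can then run `verify_chain` over the log to confirm that nothing was altered after the fact.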
OpenAI / Anthropic FRONTIER APIS
Choose them when: You need absolute frontier-class quality, you don't have data-sovereignty requirements, and per-token economics work for your usage profile. They're the right answer for a lot of teams — just not for the ones who can't send the data.
- Consumer products with low compliance burden
- Internal tools where leakage risk is low
- Burst usage too small to justify dedicated hardware
- You need GPT-class or Claude-class reasoning specifically
Together / Anyscale DEDICATED OSS
Choose them when: You want frontier OSS models (Llama, Qwen, DeepSeek) without operating GPUs yourself, you're OK with US-region hosting or VPC deployment, and your scale justifies reserved capacity.
- Inference workloads that run 24×7 on a single model
- Engineering team that doesn't want hardware
- Acceptable to be a tenant, even a single-tenant one
- VPC deployment satisfies your compliance posture
Ollama / DIY ROLL YOUR OWN
Choose it when: You have strong infra engineering in-house, you're happy to own the full stack (auth, audit, DR, observability, model routing), and your throughput needs are modest.
- You have a senior platform engineer with capacity
- Single-tenant, single-team, single-model is enough
- You don't need a defensible compliance narrative
- Time-to-deployment is less important than zero spend
What we're not
CPLT is not a SaaS. We don't operate your GPUs, host your model, or take a per-token margin. If you want someone else to be on-call for your inference cluster, hire OpenAI or Anthropic. We build the platform, hand it over, and leave. Your team owns operations from day one.
We're also not a frontier-model lab. The models we deploy are open-weight (Llama, Qwen, DeepSeek, Mistral), routed through LiteLLM so you can swap or add frontier APIs later if your data classification allows. If you genuinely need GPT-5-class reasoning today on data that can't leave your network, no self-hosted option meets that bar — including ours. Wait six months for the open-weight gap to close, or accept a hybrid posture.
And we're a small, focused team. If your procurement requires a Tier-1 vendor with global support and a red phone, we are not that vendor. If it requires a deterministic build, a documented handover, and a price you can sign off on, we are.
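The LiteLLM routing mentioned above can be sketched roughly as follows. This is an illustrative configuration, not the one shipped in a CPLT build. The aliases `default-chat` and `frontier-chat`, the local model tag, the Ollama endpoint, and both helper functions are assumptions made for the example. It requires the `litellm` package only when `make_router` is actually called.

```python
def build_model_list(local_api_base: str, include_frontier: bool = False) -> list[dict]:
    """Return a model list in the shape LiteLLM's Router expects."""
    models = [
        {
            "model_name": "default-chat",  # alias the application calls
            "litellm_params": {
                "model": "ollama/llama3.1:70b",  # hypothetical local open-weight model
                "api_base": local_api_base,
            },
        }
    ]
    if include_frontier:
        # Added later, and only if your data classification allows it.
        models.append(
            {
                "model_name": "frontier-chat",
                # LiteLLM reads OPENAI_API_KEY from the environment for this one.
                "litellm_params": {"model": "gpt-4o"},
            }
        )
    return models


def make_router(model_list: list[dict]):
    """Create the router; imported lazily so the config can be built without litellm."""
    from litellm import Router

    return Router(model_list=model_list)
```

With a router built from `build_model_list("http://localhost:11434")`, application code would call `router.completion(model="default-chat", messages=[...])`. Because the application only ever refers to the alias, pointing `default-chat` at a different model later is a configuration change rather than a rewrite.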
Still deciding?
Download the full Architecture Decision Matrix — an 8-page PDF with build-vs-buy worksheets, hardware sizing tables, and a vendor-neutral RFP template you can use against anyone in this comparison (including us).