If you're reading this, you've probably already had the conversation: "We can't put production data into ChatGPT, but we can't afford a frontier-tier private deployment either." What you actually need is a real architecture, on real hardware, with a real audit trail.

This page compares CPLT against the providers most often shortlisted for that need. We've tried to be honest — when a competitor wins on something, we say so. The goal is to help you choose correctly, not to talk you into anything.

Last fact-checked May 2026 against publicly published pricing and documentation. If anything's wrong or out of date, tell us — we'll fix it.

The matrix

| | CPLT (build & handover) | OpenAI Enterprise | Anthropic Claude | Together AI Dedicated | Anyscale Endpoints | Ollama (DIY) |
|---|---|---|---|---|---|---|
| Deployment Model | | | | | | |
| Runs on your hardware / VPC | Yes | No | No | VPC option | VPC option | Yes |
| Air-gap deployable | Yes | No | No | No | No | Yes |
| Multi-model behind one API | 6+ providers | OpenAI only | Claude only | Together catalog | Anyscale catalog | Local only |
| Bring your own GPU | Yes | No | No | No | No | Yes |
| Compliance & Sovereignty | | | | | | |
| Data never leaves your network | Yes | No | No | VPC only | VPC only | Yes |
| EU-hosted by default | Yes | Region opt-in | Region opt-in | US-primary | US-primary | You choose |
| GDPR Art. 28 DPA included | Yes | Yes | Yes | Yes | Yes | N/A (you are the processor) |
| Operator-grade DR documentation | Included | SaaS-side only | SaaS-side only | SaaS-side only | SaaS-side only | DIY |
| Full audit log of every request | On your infra | Vendor-side | Vendor-side | Vendor-side | Vendor-side | DIY |
| Cost & Predictability | | | | | | |
| Pricing model | Fixed-scope project | Per-token + seats | Per-token + seats | Per-hour GPU | Per-hour GPU | Hardware only |
| Predictable monthly run-rate | Yes (own infra) | Usage-driven | Usage-driven | Reserved tiers | Reserved tiers | Yes |
| Typical entry investment | €5K–€10K (Tactical); €15K–€60K (build) | ~$60+/seat/mo + tokens | ~$30+/seat/mo + tokens | $2–8/GPU-hr × 24×7 | $2–8/GPU-hr × 24×7 | Hardware + 0 license |
| Mandatory retainer / minimums | None | Annual commit | Annual commit | Reserved minimum | Reserved minimum | None |
| Lock-in & Exit | | | | | | |
| You own the build artefacts | Yes (full handover) | No | No | No | No | Yes |
| Swap underlying model without rewrite | LiteLLM router | No | No | In-vendor only | In-vendor only | DIY |
| Operate without the original builder | Yes (runbooks shipped) | Yes | Yes | Yes | Yes | Yes |
| Custom MCP / tool integrations | Yes (add-on) | Function calling | Tool use API | No | No | DIY |
| Operator Burden | | | | | | |
| You operate the GPUs | Yes (you do) | No | No | No | No | Yes |
| You patch & maintain the stack | Self-maintaining + your team | No | No | No | No | Fully on you |
| Frontier-class quality on day 1 | 70B open-weight | GPT-class | Claude-class | Frontier OSS | Frontier OSS | 7–70B local |

Legend: Yes = capability present; ~ = partial / conditional; No = not offered.

Sources: vendor public pricing & docs as of May 2026. "Per-token" / "per-seat" figures reflect published list prices and are not audited. CPLT figures reflect typical engagement bands, not committed rates. Your actual scope dictates your actual price.
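
To put the per-hour and per-seat entry figures from the matrix on a monthly footing, the arithmetic can be sketched as below. This is a rough illustration, not a quote. It assumes an average month of 730 hours (24 × 365 / 12), a single reserved GPU running 24×7, and a hypothetical team of 25 seats. Token charges are left out because they depend entirely on usage, and the function names are ours, chosen for illustration.

```python
# Rough monthly run-rate arithmetic for the list prices quoted in the matrix.
# Assumptions (not vendor data): 730-hour average month, 1 GPU, 25 seats.

HOURS_PER_MONTH = 24 * 365 / 12  # 730.0 hours in an average month


def monthly_gpu_cost(rate_per_gpu_hour: float, gpus: int = 1) -> float:
    """Cost of keeping `gpus` reserved GPUs running 24x7 for one month."""
    return rate_per_gpu_hour * HOURS_PER_MONTH * gpus


def monthly_seat_cost(rate_per_seat: float, seats: int) -> float:
    """Seat licences only; per-token usage is billed on top and excluded here."""
    return rate_per_seat * seats


if __name__ == "__main__":
    low, high = monthly_gpu_cost(2.0), monthly_gpu_cost(8.0)
    print(f"Reserved GPU, 24x7: ${low:,.0f} to ${high:,.0f} per month")

    seats = 25  # hypothetical team size
    print(f"{seats} seats at $60: ${monthly_seat_cost(60, seats):,.0f} per month + tokens")
    print(f"{seats} seats at $30: ${monthly_seat_cost(30, seats):,.0f} per month + tokens")
```

Under these assumptions a single always-on GPU at $2–8 per hour comes to roughly $1,460 to $5,840 per month, before any reserved-tier discount. Swap in your own GPU count, seat count, and negotiated rates before drawing conclusions.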

When to choose which

No vendor wins everything. Here is our honest guidance on when to choose CPLT and when to skip it.

CPLT (choose us)

Choose us when: Your data can't leave your network, your finance team needs a fixed-cost line item, and your compliance team needs a deterministic audit trail. You want to own the platform, not rent it indefinitely.

  • Regulated mid-market (legal, health, finance, public sector)
  • Existing GPUs or budget for hardware
  • Internal team that can run a Linux box
  • You want exit options, not a deeper integration with a single vendor

OpenAI / Anthropic (frontier APIs)

Choose them when: You need absolute frontier-class quality, you don't have data-sovereignty requirements, and per-token economics work for your usage profile. They're the right answer for a lot of teams — just not for the ones who can't send the data.

  • Consumer products with low compliance burden
  • Internal tools where leakage risk is low
  • Burst usage too small to justify dedicated hardware
  • You need GPT-class or Claude-class reasoning specifically

Together / Anyscale (dedicated OSS)

Choose them when: You want frontier OSS models (Llama, Qwen, DeepSeek) without operating GPUs yourself, you're OK with US-region hosting or VPC deployment, and your scale justifies reserved capacity.

  • Inference workloads that run 24×7 on a single model
  • Engineering team that doesn't want hardware
  • Acceptable to be a tenant, even a single-tenant one
  • VPC deployment satisfies your compliance posture

Ollama / DIY (roll your own)

Choose it when: You have strong infra engineering in-house, you're happy to own the full stack (auth, audit, DR, observability, model routing), and your throughput needs are modest.

  • You have a senior platform engineer with capacity
  • Single-tenant, single-team, single-model is enough
  • You don't need a defensible compliance narrative
  • Time-to-deployment is less important than zero spend

What we're not

CPLT is not a SaaS. We don't operate your GPUs, host your model, or take a per-token margin. If you want someone else to be on-call for your inference cluster, hire OpenAI or Anthropic. We build the platform, hand it over, and leave. Your team owns operations from day one.

We're also not a frontier-model lab. The models we deploy are open-weight (Llama, Qwen, DeepSeek, Mistral), routed through LiteLLM so you can swap or add frontier APIs later if your data classification allows. If you genuinely need GPT-5-class reasoning today on data that can't leave your network, no self-hosted option meets that bar, including ours. Either wait six months or so for the open-weight gap to close, or accept a hybrid posture: open-weight models on your own infrastructure for restricted data, and frontier APIs only for data that is allowed to leave.
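
One way to picture a hybrid posture is as a routing rule keyed on data classification. The sketch below is purely illustrative: the three classification levels, the model aliases, and the `pick_model` function are hypothetical names we made up for this example, and it is plain Python, not the LiteLLM API. In a real build a rule like this would sit in front of the router and decide which backend a request is allowed to reach.

```python
from enum import Enum


class DataClass(Enum):
    """Hypothetical data classification levels, for illustration only."""
    PUBLIC = 1      # may leave the network
    INTERNAL = 2    # stays on your infrastructure
    RESTRICTED = 3  # stays on your infrastructure


# Placeholder aliases, not real model identifiers.
LOCAL_MODEL = "local/open-weight-70b"
FRONTIER_MODEL = "external/frontier-api"


def pick_model(data_class: DataClass, needs_frontier: bool) -> str:
    """Return the backend alias a request is allowed to use.

    Only PUBLIC data may be sent to an external frontier API, and only
    when the caller asks for it; everything else stays on the local model.
    """
    if data_class is DataClass.PUBLIC and needs_frontier:
        return FRONTIER_MODEL
    return LOCAL_MODEL


if __name__ == "__main__":
    for dc in DataClass:
        print(dc.name, "->", pick_model(dc, needs_frontier=True))
```

The design choice worth noting is the default: anything not explicitly cleared to leave falls back to the local model, so a missing or wrong label fails closed.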

And we're a small, focused team. If your procurement requires a Tier-1 vendor with global support and a red phone, we are not that vendor. If it requires a deterministic build, a documented handover, and a price you can sign off on, we are.

Still deciding?

Download the full Architecture Decision Matrix — an 8-page PDF with build-vs-buy worksheets, hardware sizing tables, and a vendor-neutral RFP template you can use against anyone in this comparison (including us).
