Architecture — Every Layer Independently Replaceable

Core stack — Included in every deployment

Self-maintaining

SSOT compiler: one config tree renders all containers, units, scripts
Pre-flight checks refuse to render on drift
Sub-2h bare-metal recovery, tested and verified
Operator-grade DR playbook (see /resources for the public summary)

Meaning — When hardware dies, you rebuild from a single command. When a container drifts from the source of truth, the pre-flight check refuses to render until you resolve it. When you need to hand the stack to a new operator, the runbook is already written. Maintenance isn’t a line item in your budget — it’s built into the architecture.
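As a rough illustration of the pre-flight idea, the sketch below hashes each deployed artifact against the digest recorded at the last render and refuses to render again while anything differs. The config shape, the `.env` output format, and every function name here are assumptions made for the example; the actual compiler is not described on this page.

```python
import hashlib
from pathlib import Path


def render(config: dict) -> dict[str, str]:
    """Render every artifact from the single config tree (hypothetical format)."""
    return {
        f"{name}.env": "".join(f"{k}={v}\n" for k, v in sorted(settings.items()))
        for name, settings in config.items()
    }


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_drift(deployed_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return artifacts whose on-disk content no longer matches the last render."""
    drifted = []
    for filename, expected in sorted(manifest.items()):
        path = deployed_dir / filename
        if not path.exists() or digest(path.read_text(encoding="utf-8")) != expected:
            drifted.append(filename)
    return drifted


def preflight_render(config: dict, deployed_dir: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Refuse to render while drift exists; otherwise write files and return the new manifest."""
    drifted = find_drift(deployed_dir, manifest)
    if drifted:
        raise RuntimeError(f"drift detected, refusing to render: {drifted}")
    rendered = render(config)
    deployed_dir.mkdir(parents=True, exist_ok=True)
    for filename, text in rendered.items():
        (deployed_dir / filename).write_text(text, encoding="utf-8")
    return {filename: digest(text) for filename, text in rendered.items()}
```

The point of the design is that a hand-edited file blocks the next render until someone either reverts it or moves the change into the config tree.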

Decoupling & resilience

6+ LLM providers behind one OpenAI-compatible router (LiteLLM)
Provider swap is a config diff — application code unchanged
Per-call routing by cost, latency, capability
Docker Compose, not Kubernetes — right scale for this problem

Meaning — When your LLM provider deprecates a model, your application keeps running. You swap providers with a config change, not a six-week migration. When pricing shifts, you route around it. When a sovereignty review requires EU-only inference, you switch providers without rewriting a line of application code.
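To make per-call routing concrete, here is a minimal sketch that picks the cheapest provider satisfying a call's capability, latency, and region constraints. It is plain Python and does not use LiteLLM's API; the provider names, prices, latencies, and the `pick_provider` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_1k_tokens: float  # USD, illustrative numbers only
    p50_latency_ms: int
    capabilities: frozenset[str]
    region: str


def pick_provider(providers, needs, max_latency_ms=None, region=None):
    """Return the cheapest provider that satisfies every constraint of this call."""
    candidates = [
        p for p in providers
        if needs <= p.capabilities
        and (max_latency_ms is None or p.p50_latency_ms <= max_latency_ms)
        and (region is None or p.region == region)
    ]
    if not candidates:
        raise LookupError("no provider satisfies the routing constraints")
    # Cheapest first; latency breaks ties.
    return min(candidates, key=lambda p: (p.cost_per_1k_tokens, p.p50_latency_ms))


# Hypothetical provider table; in practice this would live in configuration.
PROVIDERS = [
    Provider("provider-a", 0.50, 900, frozenset({"chat", "vision"}), "us"),
    Provider("provider-b", 0.20, 1500, frozenset({"chat"}), "eu"),
    Provider("provider-c", 0.80, 400, frozenset({"chat", "vision"}), "eu"),
]
```

With this table, an EU-only requirement becomes a `region="eu"` argument, and removing a provider means deleting one row rather than touching the calling code.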

Cost control

Per-provider, per-model, per-call cost tracking
Token-level billing attribution
Budget alerts before you burn through credits
Cost dashboard accessible without SSH

Meaning — You see exactly which provider burns the most money per query, per document class, per month. Before a pilot goes to production and costs 10× what you budgeted, you catch it. Before a single runaway workflow drains your credits overnight, the alert fires. Cost control isn’t a feature — it’s why your CFO signs off on the deployment.
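A minimal sketch of per-call cost attribution with a budget alert might look like the following. The `CostLedger` class, the 80% alert threshold, and the prices in the usage note are assumptions for illustration; they do not describe how the deployed dashboard or LiteLLM compute cost.

```python
from collections import defaultdict


class CostLedger:
    """Attribute token cost to (provider, model, workflow) and flag budget pressure."""

    def __init__(self, monthly_budget_usd: float, alert_fraction: float = 0.8):
        self.monthly_budget_usd = monthly_budget_usd
        self.alert_fraction = alert_fraction  # assumed threshold, tune per engagement
        self.by_key = defaultdict(float)
        self.total = 0.0

    def record(self, provider, model, workflow, prompt_tokens, completion_tokens,
               price_in_per_1k, price_out_per_1k) -> float:
        """Add one call to the ledger and return its cost in USD."""
        cost = (prompt_tokens / 1000) * price_in_per_1k \
            + (completion_tokens / 1000) * price_out_per_1k
        self.by_key[(provider, model, workflow)] += cost
        self.total += cost
        return cost

    def alert_due(self) -> bool:
        """True once spend reaches the alert fraction of the monthly budget."""
        return self.total >= self.alert_fraction * self.monthly_budget_usd

    def top_spender(self):
        """Return the (provider, model, workflow) key with the highest spend."""
        return max(self.by_key, key=self.by_key.get)
```

For example, `CostLedger(10.0)` followed by a few `record(...)` calls lets you ask `top_spender()` which workflow is driving spend and check `alert_due()` before the budget is gone.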

Add-ons — Scoped per engagement

ADD-ON

Document intelligence

4 OCR engines: Vision LLM, Surya, Tesseract, gemma4-ocr
Per-document-class routing (not one engine for everything)
Deterministic post-processing pipeline
Audit trail: which engine, why, what it produced

Meaning — Every document class routes to the engine that handles it best. Clean PDFs go to Tesseract. Messy scans go to Vision LLM. Tables go to Surya. When an engine fails on a document class, you route around it instead of retiring the whole pipeline.
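The routing described above can be sketched as a lookup table plus a fallback loop that writes an audit record for every attempt. The primary engine per class follows the text; the fallback order, the class names, and the `run_ocr` helper are assumptions, and the engines are passed in as plain callables so that no real OCR library API is implied.

```python
# Primary engine first (from the text), then assumed fallbacks.
ROUTES = {
    "clean_pdf": ["tesseract", "vision_llm"],
    "messy_scan": ["vision_llm", "surya"],
    "table": ["surya", "vision_llm"],
}


def run_ocr(doc_class, document, engines, audit_log):
    """Try each engine routed for this class in order, logging every attempt."""
    for engine_name in ROUTES[doc_class]:
        try:
            text = engines[engine_name](document)
        except Exception as exc:  # an engine failure triggers the next route
            audit_log.append({
                "doc_class": doc_class,
                "engine": engine_name,
                "status": "failed",
                "detail": str(exc),
            })
            continue
        audit_log.append({
            "doc_class": doc_class,
            "engine": engine_name,
            "status": "ok",
            "chars": len(text),
        })
        return text
    raise RuntimeError(f"all engines failed for document class {doc_class!r}")
```

The audit log then answers the three questions in the spec list: which engine ran, why it was chosen (the route for that class, plus any earlier failures), and what it produced.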

ADD-ON

Custom MCP bridges

Bespoke Model Context Protocol servers for your environment
SAP / ERP automation bridges
Read-only SQL/PostgreSQL execution within your private subnet
Filesystem RAG pipelines for internal document stores
Live web orchestration via privacy-hardened proxies

Meaning — Your AI talks directly to your internal systems — SAP, SQL databases, document stores, web APIs — without exposing them to the public internet. Each MCP server is container-isolated and independently restartable. Custom-built for your stack, your data, your security boundaries.
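One way to picture the read-only SQL bridge is a connection that the database itself refuses to write through, so safety does not depend on inspecting query text. The sketch below uses SQLite from the Python standard library purely as a stand-in so that it runs anywhere; for PostgreSQL the analogous control would be a role with only `SELECT` grants or the `default_transaction_read_only` setting. The helper names and the row cap are hypothetical, and no MCP server wiring is shown.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open the database in read-only mode; the engine rejects any write."""
    return sqlite3.connect(f"{db_path.resolve().as_uri()}?mode=ro", uri=True)


def run_query(conn: sqlite3.Connection, sql: str, max_rows: int = 100) -> list[tuple]:
    """Execute one statement and return at most max_rows rows (assumed cap)."""
    cursor = conn.execute(sql)
    return cursor.fetchmany(max_rows)
```

A write attempt through this connection raises `sqlite3.OperationalError`, which a bridge could surface to the model as a refused tool call.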

ADD-ON

Custom integrations

Single sign-on (SSO) with your identity provider
Custom branding and UI theming for LibreChat
Environment-specific compliance mappings (ISO 27001, SOC 2, NIS2)
Custom agent workflows for your industry vertical
Any environment-specific requirement not covered by the standard stack

Meaning — Your AI stack fits your organization — not the other way around. Authentication, branding, compliance mappings, and agent behavior are tailored to your environment during scoping.

For full custom GRC pipeline deployments, see cplt.tech.

ADD-ON

Observability stack

Prometheus metrics for every container, service, provider call
Grafana dashboards: cost per model, latency per provider, error rates per endpoint
LiteLLM native cost tracking with token-level attribution
Alert routing to email, Slack, PagerDuty, or your existing monitoring stack
Runs on CPLT’s own production stack today

Meaning — When something breaks at 2 AM, you don’t open a Zoom call — you open the dashboard. Cost overruns, model degradation, and silent failures fire alerts before they become incidents. Built on the same Prometheus + Grafana stack that monitors CPLT’s production infrastructure.

ADD-ON

Off-site backup pipeline

Encrypted daily snapshots to S3, Backblaze B2, or your object storage of choice
Three-layer architecture: local volume → host snapshot → encrypted off-site sync
Sub-2h bare-metal rebuild procedure, tested and verified on CPLT’s own production stack
Restore verification scripts — not just “hope the tarball is valid”
Currently running daily on CPLT’s own infrastructure (verified May 2026)

Meaning — Disaster recovery isn’t a slide deck — it’s a procedure that runs every night and a rebuild that completes in under two hours. The same pipeline that protects CPLT’s production stack protects yours.
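A restore verification step, in its simplest form, checks the archive against a recorded checksum, reads every file inside it to surface truncation or corruption, and confirms that the expected members are present. The sketch below assumes the snapshot has already been decrypted to a tar archive and that a SHA-256 digest was stored at backup time; the function names and member paths are hypothetical.

```python
import hashlib
import tarfile
import zlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file in 1 MiB chunks so large snapshots do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_snapshot(archive: Path, expected_sha256: str, required_members: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the snapshot passed."""
    problems = []
    if sha256_of(archive) != expected_sha256:
        problems.append("checksum mismatch")
    names = set()
    try:
        with tarfile.open(archive, "r:*") as tar:
            for member in tar:
                names.add(member.name)
                if member.isfile():
                    # Read each file fully so truncated or corrupt data fails here.
                    with tar.extractfile(member) as fh:
                        while fh.read(1 << 20):
                            pass
    except (tarfile.TarError, OSError, EOFError, zlib.error) as exc:
        problems.append(f"archive unreadable: {exc}")
        return problems
    missing = sorted(required_members - names)
    if missing:
        problems.append(f"missing members: {missing}")
    return problems
```

Running a check like this after every nightly sync turns "the tarball probably works" into a pass or fail result that can feed an alert.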

See it running or scope a deployment?

Discuss the architecture in detail, or scope a sovereign deployment for your infrastructure.

Scope your deployment → Read the DR summary →
