Three sample engagements illustrating the typical scope, timeline, and price range for self-hosted AI deployments in regulated industries.
LEGALTECH
80 people · Berlin
Doc-Q&A over a 12-year case archive
The need: Litigation team wanted ChatGPT-style search across 12 years of pleadings, exhibits, and judgments. US providers were off the table: client data, GDPR Art. 28, German bar association rules.
- Shape: Programmatic (one-shot)
- Stack: LibreChat + Qwen2.5-32B + Qdrant + OCR pipeline + LiteLLM
- Hardware: Existing rack (1× RTX 6000 Ada, 48 GB), already on-prem
- Timeline: 6 weeks (Stage 1 → 3)
- Price band: €18K–€25K fixed
- Add-ons: Advanced OCR (scanned exhibits, handwritten annotations), DR runbook handoff
Outcome: 40k documents indexed, sub-2s retrieval, full audit log of every query. Lawyers can cite the document each answer came from. Continuous: optional, declined.
Sample engagement scope — illustrative, not a delivered case.
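To make the "full audit log of every query" with per-answer citations concrete, here is a minimal sketch in Python. It is an illustration only, not the delivered design: the `Citation` fields, the `audit_record` and `verify_chain` helpers, and the hash-chaining approach are all assumptions introduced for this example.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class Citation:
    doc_id: str   # hypothetical archive identifier
    page: int     # page the retrieved passage came from
    score: float  # retrieval similarity score


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same entry always hashes the same way.
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def audit_record(user, question, citations, prev_hash=""):
    """Build one log entry; prev_hash links it to the previous entry."""
    body = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "question": question,
        "citations": [asdict(c) for c in citations],
        "prev_hash": prev_hash,
    }
    body["hash"] = _digest(body)  # hash covers every field except itself
    return body


def verify_chain(records):
    """Return True if no entry was altered, removed, or reordered."""
    prev = ""
    for rec in records:
        body = {k: v for k, v in rec.items() if k != "hash"}
        if rec["prev_hash"] != prev or _digest(body) != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```

Each entry records who asked what and which documents backed the answer, and the chained hashes mean a later edit to any entry is detectable when the log is verified.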
HEALTHTECH
35 people · Lyon
Patient-record summarisation, on-prem only
The need: Clinical team wanted automated discharge-summary drafting from consultation notes. Patient data cannot leave the hospital network — HDS hosting requirement, MDR Class IIa device implications.
- Shape: Programmatic + Continuous (3 months post-launch)
- Stack: LibreChat + Llama-3.3-70B (4-bit) + structured-output schema + audit logger
- Hardware: 2× H100 supplied by client; physical air-gap, no internet egress
- Timeline: 8 weeks build + 3 months monitored stabilisation
- Price band: €32K build + €1.8K/month Continuous (8h/mo)
- Add-ons: Custom JSON schema validators, observability stack, off-site encrypted backups
Outcome: Every model output is structured, traceable, and auditable. Clinicians edit ~30% of drafts rather than writing from scratch. Zero data egress.
Sample engagement scope — illustrative, not a delivered case.
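A structured-output schema with custom validators could look roughly like the sketch below. The field names in `REQUIRED_FIELDS` and the `validate_summary` helper are hypothetical, chosen only for illustration; a real deployment would more likely use a full JSON Schema validator and a clinically reviewed schema.

```python
import json

# Hypothetical discharge-summary schema: field name -> required Python type.
REQUIRED_FIELDS = {
    "patient_id": str,
    "diagnosis": str,
    "medications": list,
    "follow_up": str,
}


def validate_summary(raw):
    """Parse model output and return (data, errors); data is None on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value must be an object"]

    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif not isinstance(data[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    # Reject fields outside the schema so free-form text cannot slip through.
    for name in sorted(set(data) - set(REQUIRED_FIELDS)):
        errors.append(f"unexpected field: {name}")
    return (data if not errors else None), errors
```

Rejecting a draft that fails validation, rather than passing it to the clinician, is what keeps every stored output machine-checkable; the error list can also be written to the audit logger.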
FINTECH
120 people · Amsterdam
Internal copilot for a regulated trading desk
The need: Compliance and ops wanted internal chat + code-assist on a shared LLM, but DORA + MiFID II prohibited sending order-flow context to a SaaS provider. Existing GitHub Copilot use was already flagged in audit.
- Shape: Programmatic, build-only handoff
- Stack: LibreChat + Qwen2.5-Coder-32B + DeepSeek-V3 (via LiteLLM fallback) + Continue.dev integration
- Hardware: Client-spec'd: 4× L40S in their colo, dual-region replication
- Timeline: 10 weeks (includes IAM integration with their existing Okta + audit pipeline into Splunk)
- Price band: €45K–€60K fixed
- Add-ons: SSO integration, custom MCP server for internal data sources, DORA-aligned DR documentation
Outcome: 95 active users, fully documented for the next DORA audit, zero retainer required. Internal team owns operations from day one.
Sample engagement scope — illustrative, not a delivered case.
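The fallback routing named in the stack can be pictured with the generic sketch below. It deliberately does not use LiteLLM's own API; `complete_with_fallback`, `AllBackendsFailed`, and the backend callables are hypothetical stand-ins that only illustrate the try-primary-then-fallback pattern.

```python
class AllBackendsFailed(RuntimeError):
    """Raised when every configured backend has raised an error."""


def complete_with_fallback(prompt, backends):
    """Try each (name, callable) in order; return (name, text) from the first that works."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Usage with stand-in backends (real ones would call the self-hosted models).
def primary(prompt):
    raise TimeoutError("primary unavailable")


def secondary(prompt):
    return "ok: " + prompt


served_by, text = complete_with_fallback(
    "explain this diff",
    [("qwen2.5-coder-32b", primary), ("deepseek-v3", secondary)],
)
```

Returning the name of the backend that actually served the request makes it easy to record that name in the audit pipeline alongside the prompt.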