Blog — Engineering Notes on Self-Hosted AI

May 25, 2026 · 8 min

€8K Server vs €60K/year OpenAI Enterprise: When Each One Actually Wins

Every CTO evaluating generative AI eventually does the same napkin math:

May 25, 2026 · 8 min

GDPR Article 28 for AI Vendors: What OpenAI's DPA Actually Says (And What It Doesn't)

When enterprise procurement teams start auditing LLM architectures, the conversation hits a brick wall at GDPR Article 28. If your application or your employees are sending European personal data to an external model,…

May 25, 2026 · 11 min

The 3 Hidden Failure Modes of Self-Hosted LLMs (When You Scale Past 25 Users)

Spinning up a local large language model is trivial today: a Docker Compose file, a consumer GPU, and you have a private AI assistant. The self-hosted AI dream is real, until you invite your team to use it.

May 25, 2026 · 9 min

Sovereign OCR: When Scanned PDFs Eat Your AI Pipeline (And Your Compliance Posture)

You bought a private LLM. You routed every chat through your own infrastructure. You wrote a DPA addendum your legal team actually signed. And then you piped your documents through a SaaS OCR API to extract the text —…

May 25, 2026 · 8 min

vLLM vs llama.cpp vs Ollama at 25 Users: What the Published Benchmarks Actually Show

If you run "ollama run llama3.1" on your MacBook, watch tokens fly across the screen at 80 tokens per second (TPS), and conclude you're ready to deploy an enterprise AI API, you're walking into a trap.
