GDPR Article 28 for AI Vendors: What OpenAI's DPA Actually Says (And What It Doesn't)

When enterprise procurement teams start auditing LLM architectures, the conversation hits a brick wall at GDPR Article 28. If your application or your employees are sending European personal data to an external model, you need a Data Processing Addendum (DPA). For organisations chasing bulletproof GDPR compliance, the default move is to upgrade to an OpenAI Enterprise tier and sign their DPA.

But signing a DPA is just paperwork. Understanding the architectural and legal boundaries of that contract is where actual compliance happens — and that's where most teams discover their assumptions don't hold up.

Disclaimer: This post is engineering analysis of publicly available DPA terms and standard interpretations of GDPR. It is not legal advice. Talk to your DPO and qualified counsel before making compliance decisions for your organisation. References are to the OpenAI Enterprise DPA / Business Terms as published in 2025–2026; specific clauses change and you should always validate against the version your procurement team is signing.

Here is a verifiable breakdown of what OpenAI's Enterprise DPA actually guarantees under Article 28, and the critical compliance gaps it leaves entirely on your shoulders. We're not here to talk you out of OpenAI — for many use cases it's the right answer. We're here to make sure you sign with both eyes open.

What the OpenAI Enterprise DPA actually says

Under GDPR Article 28, a processor must act only on the documented instructions of the controller. When you use the OpenAI API, ChatGPT Enterprise, or ChatGPT Team, OpenAI is your data processor. Here's what their published DPA and Business Terms commit to:

1. Zero model training on your data

The biggest fear in enterprise AI is accidental data leakage into public foundation models. OpenAI's Business Terms state that, by default, they do not use your API inputs, outputs, or fine-tuning data to train or improve their models. Your prompts are not pooled into training data. This lines up with the Article 28(3)(a) requirement that the processor act only on your documented instructions: training on your prompts would be processing for OpenAI's own purposes, not yours.

2. The 30-day retention window (and the ZDR override)

By default, OpenAI retains API inputs and outputs for up to 30 days, strictly to monitor for platform abuse and security violations. For organisations with strict data-minimisation and storage-limitation requirements under GDPR Article 5, OpenAI offers a Zero Data Retention (ZDR) policy for eligible API endpoints. When approved and configured, your prompts and completions are not stored at rest on OpenAI's servers at all.

ZDR is not the default and not every endpoint is eligible. Confirm in writing for the specific endpoints your application actually calls.
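
One way to keep that confirmation honest is to check it in code. The sketch below compares the endpoints your application calls against the set you hold written ZDR confirmation for. The names `CONFIRMED_ZDR` and `unconfirmed_endpoints` are hypothetical, and the paths are placeholders only: they say nothing about which endpoints OpenAI actually treats as ZDR-eligible.

```python
# Placeholder set: fill this from your written confirmation, not from this example.
CONFIRMED_ZDR = {"/v1/chat/completions"}


def unconfirmed_endpoints(called, confirmed=CONFIRMED_ZDR):
    """Return the endpoints the app calls that lack written ZDR confirmation."""
    return sorted(set(called) - set(confirmed))


# Hypothetical list of endpoints taken from your own client code or gateway logs.
called_by_app = ["/v1/chat/completions", "/v1/embeddings"]

gaps = unconfirmed_endpoints(called_by_app)
if gaps:
    print("No written ZDR confirmation for:", ", ".join(gaps))
```

Run as a CI check, a non-empty result flags a newly added endpoint before it ships, rather than during an audit.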

3. Verifiable security measures (Article 32)

Article 28(3)(c) requires the processor to take the measures required under Article 32, that is, appropriate technical and organisational security measures. OpenAI's enterprise documentation lists standard controls: AES-256 at rest, TLS 1.2+ in transit, SOC 2 Type 2 audited services, and Enterprise Key Management (EKM) for advanced deployments where you control your own keys.

4. International transfers and data residency

Because OpenAI is a U.S.-based entity, sending EU data to their API triggers GDPR Chapter V (international transfers). The DPA handles this by incorporating EU Standard Contractual Clauses (SCCs). More importantly for strict-compliance buyers, OpenAI now offers data residency: eligible ChatGPT Enterprise and API customers can store sensitive content at rest in specific regions including Europe and the UK.

Note: data residency for storage at rest is not the same as guaranteeing the data is never processed outside that region. Read the residency terms carefully against your specific risk model.

What the OpenAI DPA doesn't say (your liability)

A DPA is a shield for the processor — it does not absolve the controller. The most common failure mode in GDPR LLM compliance is assuming an OpenAI Enterprise contract magically makes your application compliant. It doesn't. Here are the three gaps your team owns regardless of what the DPA says.

1. It doesn't provide your lawful basis (Article 6)

OpenAI's DPA covers how they process the data you send them. It does not cover whether you had the legal right to send that data in the first place. If you're building an AI-driven HR tool or an automated customer support bot, you are wholly responsible for establishing a lawful basis under Article 6(1) (consent, contract, legitimate interests, and so on) before the data ever hits the API.

This is where most "we have a DPA, we're fine" arguments collapse under audit.

2. It doesn't do the heavy lifting for DSARs

Under Article 28(3)(e), a processor must assist the controller in fulfilling data subject rights, not fulfil them on the controller's behalf. If a user requests that their data be deleted (Article 17), OpenAI provides tools like the Enterprise Compliance API to audit workspace conversations. Finding every copy of that user's data and deleting it is orchestration work, and the orchestration is your engineering problem.

If you log AI inputs and outputs in your own PostgreSQL or vector databases (and you almost certainly do, even if just for debugging), you must build the internal infrastructure to find, isolate, and delete that user's data from your own architecture. The DPA does not help you here.
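
A minimal sketch of that internal erasure step, under stated assumptions: SQLite stands in for PostgreSQL so the example runs anywhere, and the table names `prompt_logs` and `embeddings` and the `user_id` column are hypothetical. A real system would also have to cover backups, caches, and any vector index held outside the database.

```python
import sqlite3

# Hypothetical inventory of tables that hold prompt or completion data per user.
TABLES_WITH_USER_DATA = ("prompt_logs", "embeddings")


def erase_user(conn: sqlite3.Connection, user_id: str) -> dict[str, int]:
    """Delete one user's rows from every listed table; return rows deleted per table."""
    deleted = {}
    with conn:  # commits on success, rolls back if any delete raises
        for table in TABLES_WITH_USER_DATA:
            # Table names come from the fixed tuple above, never from user input.
            cur = conn.execute(f"DELETE FROM {table} WHERE user_id = ?", (user_id,))
            deleted[table] = cur.rowcount
    return deleted


def demo() -> dict[str, int]:
    """Build a throwaway in-memory database and erase user 'u1' from it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prompt_logs (user_id TEXT, prompt TEXT)")
    conn.execute("CREATE TABLE embeddings (user_id TEXT, vector BLOB)")
    conn.executemany(
        "INSERT INTO prompt_logs VALUES (?, ?)",
        [("u1", "hello"), ("u1", "again"), ("u2", "other")],
    )
    conn.execute("INSERT INTO embeddings VALUES (?, ?)", ("u1", b"\x00"))
    return erase_user(conn, "u1")


print(demo())
```

The per-table counts are worth persisting: they are the raw material for the erasure record you will be asked to produce later.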

3. It doesn't cover "shadow AI" on consumer accounts

This is the biggest operational blind spot we encounter. The OpenAI Enterprise DPA only applies to business services — API, ChatGPT Enterprise, ChatGPT Team, ChatGPT Edu.

If your employees are pasting sensitive customer data into the free tier of ChatGPT or ChatGPT Plus, there is no Article 28 DPA in place. Depending on the account's settings, that data may be used for model training, and you have disclosed personal data to a third party with no processor contract, which is very hard to defend under GDPR. Most organisations have this happening today and don't know it. The fix is DLP at the network edge, a sanctioned internal tool that's actually pleasant to use, or both.

Where self-hosted AI changes the math

Self-hosting doesn't make GDPR go away — it changes which articles you're optimising for.

When the model runs on your own hardware, in your own network, processed by your own services:

Article 28 simplifies dramatically. You're not using a processor for the inference step at all. The DPA list shrinks rather than grows. You still need DPAs for any sub-processors you do use (cloud hosting, backup providers, OCR APIs if you're not running them locally) — see our post on sovereign OCR for why that's the next compliance gap most teams hit.
Article 32 (security) gets harder, not easier. You now own the encryption, the key management, the access logs, the patch cadence. The vendor isn't doing this for you anymore. This is the trade you're making.
Article 17 (right to erasure) gets cleaner. A self-hosted vector store you control can actually be purged on request — and you can produce an attestation. With a SaaS embedding API, the embeddings sat in someone else's system at some point, and your erasure story has a footnote.
Chapter V (international transfers) often disappears. If the model, the embeddings, the logs, and the operators are all in the EU, there is no third-country transfer and you don't need SCCs. That's not a small thing: SCCs are an ongoing compliance burden, especially post-Schrems II.
Article 6 (lawful basis) is unchanged. Self-hosting doesn't give you the right to process data you weren't allowed to process in the first place.

What a "compliance-defensible" architecture actually looks like

Procurement signing an OpenAI Enterprise DPA is step one of a compliant AI architecture. It provides legal scaffolding — SCCs, zero-training guarantees, SOC 2 — and for many use cases that's sufficient.

But a contract cannot fix a leaky architecture. Sovereign governance, risk, and compliance (GRC) means engineering the perimeter before the data leaves your network. In practice, that means picking one of three postures, or a hybrid of them:

Frontier API + DLP gate. Keep OpenAI/Anthropic, but enforce a sanitisation/redaction layer on every outbound prompt. Cheap to start, expensive to maintain as your data taxonomy evolves.
Hybrid classification gate. Unclassified data goes to a frontier API; regulated data goes to a self-hosted model. Requires a real data-classification pipeline. Best-of-both-worlds when your traffic genuinely splits.
Fully self-hosted on bare metal. Everything stays in. Your DPA list doesn't grow. You take on the operator burden in exchange for the cleanest compliance story. Right answer when your data classification puts everything in the "regulated" bucket anyway.
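
The first two postures can be sketched in a few lines. This is an illustration under loud assumptions, not a production DLP layer: the two regexes catch only obvious emails and IBAN-shaped strings, and the keyword tuple `REGULATED_MARKERS` is a hypothetical stand-in for a real data-classification pipeline. The route labels `self_hosted` and `frontier_api` are likewise invented for the example.

```python
import re

# Deliberately simple patterns; real DLP needs a maintained taxonomy
# (names, national IDs, free-text health data) that regexes alone miss.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IBAN = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")

# Hypothetical markers standing in for a real classifier.
REGULATED_MARKERS = ("diagnosis", "salary", "passport")


def redact(prompt: str) -> str:
    """Replace obvious identifiers with placeholders before a prompt leaves the network."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    return IBAN.sub("[IBAN]", prompt)


def route(prompt: str) -> tuple[str, str]:
    """Return (destination, prompt_to_send) for a single prompt."""
    lowered = prompt.lower()
    if any(marker in lowered for marker in REGULATED_MARKERS):
        # Regulated data stays inside: sent unmodified to the self-hosted model.
        return ("self_hosted", prompt)
    # Everything else may go out, but only after redaction.
    return ("frontier_api", redact(prompt))


print(route("Email jane.doe@example.com about the invoice"))
print(route("Summarise the diagnosis for patient 12"))
```

Even in this toy form the maintenance cost is visible: every new data category means another pattern or marker, which is why the DLP gate gets expensive as your data taxonomy evolves.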

Your compliance is ultimately determined by your infrastructure design, not your vendor's PDF.

What we do here

CPLT engagements always begin by mapping the actual data flow before we recommend an architecture. Most of the time, the procurement team has signed a DPA that covers maybe 60% of the data the application actually touches — and the gap is exactly where the auditor will find the issue.
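
To make the idea of mapping data flows concrete, here is a rough sketch of an inventory that reports which flows have no DPA behind them. The `DataFlow` type, the flow names, and the recipients are all hypothetical, and coverage is counted per flow rather than by data volume, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    name: str        # what moves
    recipient: str   # who receives it
    has_dpa: bool    # is a signed DPA in place with that recipient?


def uncovered(flows):
    """Names of flows that reach a recipient with no DPA in place."""
    return [f.name for f in flows if not f.has_dpa]


def coverage(flows):
    """Fraction of flows covered by a DPA (1.0 for an empty inventory)."""
    return sum(f.has_dpa for f in flows) / len(flows) if flows else 1.0


# Hypothetical inventory for illustration only.
flows = [
    DataFlow("chat prompts", "LLM API", True),
    DataFlow("fine-tuning sets", "LLM API", True),
    DataFlow("support tickets", "helpdesk SaaS", True),
    DataFlow("debug logs", "log aggregation SaaS", False),
    DataFlow("document scans", "OCR API", False),
]

print(f"DPA coverage: {coverage(flows):.0%}")
print("Uncovered flows:", uncovered(flows))
```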

If your team is in the middle of an AI procurement decision and the GDPR conversation is starting to feel uncomfortable, tell us what you're working with. We'll respond within 5 business days with a written scope, or an honest "no" if your situation calls for a different solution.

Comparing options across vendors? See our feature matrix — CPLT vs OpenAI Enterprise, Anthropic, Together, Anyscale, Ollama — across deployment, compliance, cost, and lock-in. Honest, fact-checked, and when a competitor wins on something we say so.