GDPR and AI on Documents: What Every Business Must Know in 2026

You have implemented or are planning to implement an AI assistant to work with corporate documents — contracts, medical records, client files. And somewhere in the background there is a nagging feeling: is this actually legal? Are we violating something? Short answer: AI on documents and GDPR are compatible. But only if you understand three things: where data is stored, who is the data processor, and what level of isolation to choose.

⚡ In Brief

🏢 Who should read this: business owners, lawyers, heads of medical centres in the EU and Ukraine
⚠️ Main risk: most cloud AI services store your documents on servers in the US. For businesses in the EU this is a GDPR violation unless a valid legal basis for the transfer is in place
✅ Solution: self-hosted AI on your own server in Europe — data never leaves your infrastructure
💰 Cost of violation: up to €20M or 4% of annual global turnover, whichever is higher
⏱ Time to deploy a GDPR-compliant solution: 1–3 business days
👇 Below — a detailed breakdown: what GDPR is, what the risks are, what to check, and a 10-question checklist

📚 Contents

What is GDPR and why it applies to AI
What counts as personal data in the context of documents
Where your data is physically stored when using AI services
What "data processor" means and who it is in your case
Self-hosted vs cloud AI: the difference from a GDPR perspective
What happens if you violate GDPR when using AI
Checklist: 10 questions before deploying AI in your business
Frequently asked questions
Conclusions
Want a GDPR-compliant AI solution for your business?

What is GDPR and why it applies to AI

GDPR is the EU General Data Protection Regulation, which has applied since 25 May 2018. It governs the processing of personal data by organisations established in the EU, and by organisations located elsewhere that offer goods or services to people in the EU or monitor their behaviour. If you process data of clients or employees in the EU through AI, GDPR applies to you directly.

Before 2018, data protection in Europe was handled in a patchwork manner: each country had its own law, requirements varied, and large companies could manoeuvre between jurisdictions. GDPR changed this radically — unified rules for the entire EU, a unified system of fines, and a unified approach to what constitutes a violation.

A key point for businesses: GDPR does not only apply to companies based in the EU. Under Article 3, what matters is where the person is, not their citizenship: if you offer goods or services to people in the EU, or monitor their behaviour there, you fall under the regulation even if your office is in Kyiv or New York. This is particularly relevant for Ukrainian businesses working with clients in Germany, Austria or Poland.

Now, AI. When you upload corporate documents to ChatGPT, Notion AI or any other cloud service, you are effectively transferring data to a third party for processing. If those documents contain personal data — and they almost always do — that processing falls under GDPR. The question is no longer "do we use AI?" — it is where exactly that AI processes your data.

In December 2024, the European Data Protection Board (EDPB) published a dedicated opinion on AI models and GDPR (Opinion 28/2024). It states that an AI model trained on personal data cannot automatically be considered anonymous and therefore outside the scope of GDPR: each case requires a separate assessment.

Summary: GDPR is not about large corporations or abstract "data". It is about a specific document containing a client's name that you uploaded to an AI service this morning.

What counts as personal data in the context of documents

Personal data is any information that allows a natural person to be identified, directly or indirectly. In corporate documents, such data appears far more often than it might seem.

Most business owners think of personal data as passports and medical records. In reality the scope is much broader. Here is what typically appears in corporate documents and falls under GDPR:

✔️ Contracts — client name, address, tax ID, signature, contact details
✔️ Invoices and delivery notes — name of a natural person or sole trader, bank details
✔️ HR documents — CVs, employment contracts, salary data, medical examination records
✔️ Medical protocols — diagnoses, prescriptions, medical history (special category data)
✔️ Client correspondence — email, phone number, address, content of communications
✔️ Legal case files — details of the parties, case circumstances, court decisions

A separate topic: special categories of data under Article 9 GDPR. These include data concerning health, racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, sex life or sexual orientation, as well as genetic data and biometric data used to identify a person. Heightened requirements apply to these: processing is prohibited unless one of the exceptions in Article 9(2) applies, such as explicit consent or a case strictly defined by law.

For medical centres this means that virtually every document carries heightened protection. For law firms, most client files contain data that falls under GDPR.

An important nuance: data does not cease to be personal simply because it is in a PDF or Word file. Format is irrelevant — content is what matters. If an AI service reads your PDF containing a client's name, it is processing personal data, and GDPR applies in full.
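To make the point about content concrete, here is a minimal sketch of a pre-upload scan over text already extracted from a document. The three regular expressions and the function name `find_personal_data` are illustrative assumptions, not a complete detector: they catch only a few obvious identifiers and will miss names, addresses and most national ID formats.

```python
import re

# Illustrative patterns only (assumptions for this sketch); real documents
# need far broader coverage and human review.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+\d[\d\s-]{7,14}\d"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def find_personal_data(text: str) -> dict[str, list[str]]:
    """Return the matches found for each category defined in PATTERNS."""
    hits = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits


sample = (
    "Contract with Anna Schmidt, anna.schmidt@example.com, "
    "tel. +49 30 1234567, IBAN DE89370400440532013000."
)
print(find_personal_data(sample))
```

Note that the client's name in the sample is not detected at all, which is exactly why an empty result from a scan like this does not prove a document is free of personal data.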

Summary: if your document workflow contains names, contact details or medical data of natural persons — and it does in almost every business — your documents fall under GDPR when transferred to AI services.

Where your data is physically stored when using AI services

Most popular AI services store uploaded documents and queries on servers in the United States. For businesses in the EU this automatically creates a problem: transferring data outside the EU is governed by Articles 44–49 GDPR and requires specific safeguards.

When you upload a document to a cloud AI service, the following happens: the file is copied to the provider's servers where it is stored for processing. Depending on the terms of use, it may remain there for anywhere from a few hours to several months. Here is where the servers of the most popular services are located:

✔️ OpenAI (ChatGPT, FileSearch) — primarily the US, Microsoft Azure data centres
✔️ Notion AI — the US, AWS infrastructure
✔️ Google (Gemini, NotebookLM) — the US, proprietary data centres
✔️ Microsoft Copilot — depends on the plan; business plans may include EU servers, but not by default

Why is this a problem? GDPR strictly regulates the transfer of personal data outside the EU. Since the landmark Court of Justice of the EU ruling in the Schrems II case in 2020, Standard Contractual Clauses (SCCs) alone are not enough: the exporter must also assess the risks in the specific destination country and add safeguards where needed. The United States has traditionally been considered a problematic jurisdiction because of government data access laws (CLOUD Act, FISA). The EU-US Data Privacy Framework adequacy decision of July 2023 covers only US companies certified under it, so the status of each provider has to be checked.

A striking example: on 22 May 2023, the Irish DPC fined Meta €1.2 billion, a record sum in GDPR history, precisely for the systematic transfer of European users' data to servers in the US without adequate safeguards (source: official EDPB decision).

The practical conclusion for businesses: if you are in the EU and upload documents containing personal data to ChatGPT or Notion AI, you are most likely transferring data to the US, and you need a documented legal basis for that transfer. Even if the provider has signed SCCs, this may be insufficient without a transfer impact assessment (TIA) for the destination country.

How AskYourDocs solves this

AskYourDocs is deployed entirely on your own server — in any EU region of your choice: Germany, Austria, the Netherlands, Poland. Documents, the knowledge base and all queries remain exclusively on your infrastructure. No file and no query is ever transferred outside the European zone.

After the project is handed over, I as the developer have no technical access to your database or documents — you receive full control along with administrator credentials. This is a fundamental difference from any SaaS solution where the provider always retains technical access to your data.

In fully closed-circuit mode (using a local Ollama model), even queries to the LLM never leave your server — the system operates in complete isolation from the internet.
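As a rough illustration of what "queries never leave your server" can mean in code, the sketch below refuses to send a prompt to anything other than a local or private-network address. It is not the AskYourDocs implementation. It assumes Ollama's default local port (11434) and its `/api/generate` endpoint; the model name `llama3` and both function names are placeholders chosen for this example.

```python
import ipaddress
import json
import urllib.request
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """True if the URL host is localhost or a loopback/private IP address."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Any other DNS name is treated as non-local in this sketch.
        return False
    return ip.is_loopback or ip.is_private


def ask_local_model(prompt: str, base_url: str = "http://localhost:11434") -> str:
    """Send a prompt to a local Ollama server; refuse non-local endpoints."""
    if not is_local_endpoint(base_url):
        raise ValueError(f"Refusing to send data to non-local endpoint: {base_url}")
    # "llama3" is a placeholder; use whichever model is installed locally.
    payload = json.dumps({"model": "llama3", "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

A check like this is only a guard in application code. True isolation still depends on the network configuration of the server itself, for example firewall rules that block outbound traffic.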

Summary: "where is data stored?" is the first and most important question before choosing any AI service for document work. For businesses in the EU, the only answer that resolves this question completely is a server in Europe under your control.

What "data processor" means and who it is in your case

GDPR separates roles: the data controller (your company) decides why and what data to process. The data processor (the AI service) does so on your behalf. But responsibility remains with you — as the controller.

This is one of the most important and least understood aspects of GDPR for businesses. Let us work through it with an example.

Imagine a law firm that uploads client contracts to an AI assistant for search. In this setup:

✔️ The law firm = data controller — it decided to upload these documents and is responsible to clients for their data
✔️ The AI service = data processor — it processes data on behalf of the law firm

Under Article 28 GDPR, a Data Processing Agreement (DPA) must be signed between the controller and the processor. Without it the entire arrangement is unlawful, even if everything is technically configured correctly.

What is critical to understand: if the processor (the AI service) violates GDPR, the controller (your company) also bears responsibility. You cannot "shift" responsibility to the provider by signing a DPA. A DPA sets out the conditions of processing but does not relieve you of the obligation to select reliable processors and oversee them.

In practice: OpenAI, Notion and most SaaS providers offer standard DPA documents that can be signed. But the mere existence of a DPA does not resolve the questions of server geolocation and cross-border data transfers — those are separate requirements.

With a self-hosted solution where AI is deployed on your own server, the setup is fundamentally different: there is no external AI processor. Your company processes the data itself, as controller. The question of a DPA with an AI provider simply does not arise.

Summary: before deploying any AI service, check: is there a signed DPA, what does it say about data geolocation, and does this meet your obligations to your clients.

Self-hosted vs cloud AI: the difference from a GDPR perspective

Self-hosted AI is a solution where all components (database, documents, AI model) are deployed on your server. Cloud AI means transferring documents to a provider. From a GDPR perspective the difference is fundamental: in the first case data never leaves your perimeter; in the second, a full chain of requirements applies.

Let us compare the two approaches across key parameters:

Parameter	Cloud AI (ChatGPT, Notion AI)	Self-hosted AI (AskYourDocs)
Where documents are stored	Provider's servers (US)	Your server (anywhere)
Data transfer outside the EU	Yes, automatically	No, if server is in the EU
DPA with provider required	Yes, mandatory	No (no external processor)
Provider access to data	Technically possible	None
Closed circuit (offline)	Not possible	Yes (with Ollama)
GDPR compliance	Requires additional measures	Compliant by default with EU server

There is another important dimension: isolation level. A self-hosted solution can be configured in different ways:

✔️ Hybrid mode — documents and knowledge base on your server, but an external LLM (OpenAI, Mistral) is used to generate responses. Only text fragments — without file names or metadata — are sent to the LLM provider. A good balance between response quality and security.
✔️ Fully closed circuit — all components on your server, including the AI model (e.g. Ollama with Llama or Mistral). No query ever reaches the internet. The mandatory option for healthcare, law firms and public sector organisations.
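The hybrid mode described above can be sketched as follows. The `Chunk` structure and the `build_llm_prompt` function are hypothetical names for this illustration; the idea is simply that the payload sent to the external LLM is assembled from fragment text alone, while file names and other metadata stay on your server.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A retrieved document fragment as stored locally (hypothetical shape)."""
    text: str
    file_name: str
    author: str
    page: int


def build_llm_prompt(question: str, chunks: list[Chunk]) -> str:
    """Build the only payload that leaves the server: question plus bare text."""
    # Deliberately read only chunk.text; file_name, author and page stay local.
    context = "\n---\n".join(chunk.text for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    Chunk("Payment is due within 30 days.", "contract_mueller_2025.pdf", "J. Weber", 4),
]
prompt = build_llm_prompt("What is the payment term?", chunks)
print(prompt)
```

Keep in mind that the fragment text itself can still contain names or other personal data, so hybrid mode reduces what is exposed but does not remove the need to assess the transfer to the LLM provider.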

Which isolation level suits your business depends on your industry, regulatory requirements and the type of documents involved. AskYourDocs helps determine the optimal configuration on the first call: together we analyse what data is being processed, what legal obligations apply, and what level of protection closes all GDPR questions in your specific case — without unnecessary costs for over-engineering.

For businesses in Germany and Austria it is worth factoring in an additional layer: federal data protection legislation (BDSG in Germany, DSG in Austria) may set stricter requirements than baseline GDPR. More on this in the article AI and GDPR in Germany and Austria: requirements for corporate systems.

Summary: self-hosted AI is not simply "more secure" — it eliminates an entire class of GDPR risks associated with transferring data to third parties.

What happens if you violate GDPR when using AI

GDPR fines reach up to €20M or 4% of a company's annual global turnover, whichever is higher. But financial penalties are not the only risk: a ban on data processing, reputational damage and client lawsuits are equally real.

The GDPR penalty system (Article 83) operates in two tiers:

✔️ Less serious violations (technical requirements, documentation, security measures) — up to €10M or 2% of annual turnover
✔️ Serious violations (unlawful data processing, infringement of data subjects' rights, unauthorised cross-border data transfers) — up to €20M or 4% of annual turnover
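The "whichever is higher" rule is easy to misread, so here is the arithmetic as a short sketch. The function name `max_fine_eur` is made up for this example, and the result is only the statutory ceiling for each tier, not a prediction of the fine a regulator would actually impose.

```python
def max_fine_eur(annual_turnover_eur: float, serious: bool) -> float:
    """Upper limit of an Article 83 fine: fixed cap or turnover share, whichever is higher."""
    fixed_cap, share = (20_000_000.0, 0.04) if serious else (10_000_000.0, 0.02)
    return max(fixed_cap, share * annual_turnover_eur)


# Turnover of €50M: 4% is €2M, so the €20M fixed cap is the ceiling.
small_company_cap = max_fine_eur(50_000_000, serious=True)

# Turnover of €1 billion: 4% is €40M, which exceeds the €20M fixed cap.
large_company_cap = max_fine_eur(1_000_000_000, serious=True)

print(small_company_cap, large_company_cap)
```

For most small and mid-sized companies the fixed amount is therefore the relevant ceiling; the percentage only takes over once annual turnover passes €500M.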

Transferring documents containing personal data to ChatGPT or Notion AI without a legal basis for cross-border transfer is a serious violation that falls into the second tier.

Here are real cases from 2024 that illustrate the trend:

✔️ LinkedIn — €310M (October 2024, Ireland) — for behavioural analysis of user data without proper consent. Source: Data Privacy Manager
✔️ OpenAI/ChatGPT — €15M (December 2024, Italy) — for training the model on personal data without adequately informing users
✔️ Clearview AI — €30.5M (September 2024, the Netherlands) — for unlawful collection of biometric data
✔️ Meta — €251M (December 2024, Ireland) — for a data security breach

Total GDPR fines exceeded €5.88 billion by early 2025. Regulators are no longer limiting enforcement to large technology companies — medical institutions, banks, energy companies and retailers are all being fined.

It is important to understand: a regulator can not only impose a fine but also ban data processing altogether. For a company that already depends on an AI system, this can mean a complete halt to operations until the violations are remedied.

Worth mentioning separately is the EU AI Act, which entered into force on 1 August 2024, with its obligations phased in over the following years. For AI systems processing sensitive data (healthcare, legal services, HR), it introduces additional requirements around transparency and documentation. Fines under the AI Act reach up to €35M or 7% of annual global turnover for the most serious violations.

Summary: fines are real, regulators are active, and the trend is towards tightening — not relaxing. It is better to spend a day configuring things correctly than to explain to clients later why their data ended up on servers in the United States.

Checklist: 10 questions before deploying AI in your business

Before entrusting corporate documents to any AI service, ask yourself these 10 questions. If the answer to most of them is "I don't know" — the solution is not ready for deployment.

Assess your AI service against this list:

Block 1: Where and how data is stored

✔️ 1. In which country are the provider's servers located? GDPR compliance requires a server in the EU or a country with an adequacy decision.
✔️ 2. Does the provider retain uploaded documents after processing? If so — for how long and with what access rights?
✔️ 3. Is your data used to train the model? Most enterprise plans prohibit this, but it is worth checking the terms.

Block 2: Legal compliance

✔️ 4. Has a Data Processing Agreement (DPA) been signed? Without one, any transfer of data to third-party processors is unlawful.
✔️ 5. Is there a legal basis for processing the data? This is usually legitimate interest or contractual necessity — but it must be documented.
✔️ 6. Has a DPIA (Data Protection Impact Assessment) been carried out? Required where processing is likely to result in a high risk, which typically includes AI systems processing special categories of data at scale.

Block 3: Technical security

✔️ 7. Is data encrypted in transit and at rest? The minimum standard is TLS 1.2+ in transit and AES-256 at rest.
✔️ 8. Who on the provider's side has technical access to your documents? Staff with access to data is a potential source of leaks.
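For the in-transit half of question 7, a minimal sketch using Python's standard `ssl` module shows how a client can refuse connections below TLS 1.2. It covers outbound connections from your own code only; encryption at rest is configured in the storage layer and is not shown here.

```python
import ssl

# Start from Python's secure defaults (certificate and hostname verification on).
context = ssl.create_default_context()

# Refuse anything older than TLS 1.2 for connections made with this context.
context.minimum_version = ssl.TLSVersion.TLSv1_2
```

The resulting context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=context)`.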

Block 4: Control and accountability

✔️ 9. Can all your data be deleted from the provider's systems on request? The right to erasure is one of the fundamental rights under GDPR.
✔️ 10. What happens to your data if you stop using the service? There must be a clear process for deletion or data portability.
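The rule stated at the start of this section (mostly "I don't know" means not ready) can be recorded in a few lines. The answer values and the function name `too_many_unknowns` are assumptions for this sketch, and the example answers are invented.

```python
from collections import Counter

# One answer per checklist question, in order 1..10 (invented example values).
answers = ["yes", "unknown", "yes", "no", "unknown", "unknown",
           "yes", "unknown", "unknown", "unknown"]


def too_many_unknowns(answers: list[str]) -> bool:
    """True if more than half of the answers are "unknown"."""
    counts = Counter(answers)
    return counts["unknown"] > len(answers) / 2


print(too_many_unknowns(answers))
```

Passing this test is only a first screen: a single "no" on a question such as the DPA or server location can still rule a service out.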

A more detailed checklist with 20 questions for CTOs, lawyers and executives is available in: AI Security Checklist for Business: 20 Questions Before Deployment.

Summary: if a provider cannot clearly answer these 10 questions, it is a signal either that the service is not ready for enterprise use or that information about data processing is being withheld.

Frequently asked questions

Can a small business be fined under GDPR?

Yes. GDPR applies to any organisation regardless of size. Fines for small businesses are usually lower in absolute terms (regulators take company size into account), but the percentages are the same — up to 4% of annual turnover. In addition, a ban on data processing or reputational damage from a data breach can be critical for a small business.

Is ticking "I agree to the terms" in a cloud AI service sufficient?

No. Accepting terms of service is not the same as signing a DPA and does not resolve the question of the legal basis for cross-border data transfers. Corporate use requires a separate data processing agreement.

If we are based in Ukraine, does GDPR apply to us?

It does if you: a) offer goods or services to people in the EU (for example, you have clients there), or b) monitor the behaviour of people in the EU. For companies working with Germany, Austria or Poland, GDPR is effectively mandatory. GDPR also serves as the reference point for Ukraine's Law on Personal Data Protection.

What is a "closed circuit" and who needs it?

A closed circuit is a configuration in which all AI components (database, documents, language model) are deployed on an isolated server with no internet access. No query is ever sent to external services. This is the mandatory level of protection for healthcare institutions, law firms and public sector organisations. More details in the article Closed Circuit with Ollama: AI Without the Internet for Business.

Is there a difference between GDPR and the EU AI Act?

Yes. GDPR governs the processing of personal data and has applied since May 2018. The EU AI Act (in force from August 2024) regulates AI systems by risk level and sets requirements for transparency, documentation and testing. Both regulations can apply simultaneously to the same system. More details in the article AI and GDPR in Germany and Austria: requirements for corporate systems.

Conclusions

🏢 The problem: most cloud AI services store documents on servers in the US. For businesses in the EU this constitutes a GDPR violation unless a valid legal basis for the transfer is in place
⚠️ The risk: fines up to €20M or 4% of turnover, a ban on data processing, reputational damage
✅ The solution: self-hosted AI on a server in the EU — data never leaves your perimeter
📋 The action: assess your current AI service against the 10-question checklist above
🎯 The recommendation: for healthcare and law firms — only a closed circuit with no data transfer to external LLMs

The key takeaway: GDPR and AI are not in conflict — this is a problem with a known solution. The solution is called "data stays on your server".

Want a GDPR-compliant AI solution for your business?

Send us 2–3 of your real documents — and within 30 minutes we will show you a live demonstration: how AI answers questions from your knowledge base, and exactly where your data is physically located in the process. Free of charge. No registration. No obligations.

Want to see the solution in action on the homepage? askyourdocs.org/en/#try-demo