Choosing the Right Server for Your AI Assistant: A 2026 Guide for Non-Technical Leaders

Published: 14.06.2026

The director of a law firm in Vienna asks: "We want self-hosted AI: what should we buy and where should we host it?" Most articles on this topic are either too technical or written for developers. This guide is for decision-makers who don't want to dive into GPU specs. It covers which providers to choose, how much to pay, and what to ask a contractor before signing a contract.

⚡ TL;DR

  • 🏠 Self-hosted servers are necessary when data cannot leave your infrastructure — healthcare, legal, HR, finance
  • 🚫 AWS and Azure in Germany don't solve GDPR issues — they are US companies subject to the CLOUD Act regardless of server location
  • Secure providers for the EU: Hetzner (Nuremberg/Finland), OVH (Strasbourg), Contabo (Munich)
  • 💻 CPU vs. GPU: CPU is for internal tools where 30-90 seconds response time is acceptable. GPU is for public chat or when <10 seconds is required
  • 💰 Real costs: from €4-8/month (CPU, small business) to €184/month (GPU for production AI)
  • 🤝 Post-launch: the contractor sets the system up, then either you or your administrator manage it, or the contractor handles ongoing support

Why You Need a Self-Hosted Server in the First Place

A self-hosted server isn't an end in itself. It's a consequence of a specific requirement: your data cannot be processed on third-party infrastructure. If this requirement doesn't exist, cloud AI might be a cheaper and simpler option.

Most of our clients at AskYourDocs come with a specific question — not "I want a self-hosted server" but "Can we legally use cloud AI?" The answer determines whether a server is needed at all.

Three Reasons to Choose a Self-Hosted Server

Reason 1: Legal Requirements. Medical data, attorney-client privilege, client financial records — all are regulated in such a way that transferring them to third-party AI providers is either impossible or requires extensive, costly legal work. A self-hosted server solves this technically: data physically never leaves your infrastructure.

Reason 2: Predictable Costs. Cloud AI is priced "pay-per-use", so you don't know next month's bill in advance. A self-hosted server offers a fixed bill regardless of usage volume. At roughly 500 or more requests per day, a self-hosted GPU server (€184/month) costs about the same as or less than a GPT-4o-class cloud API ($100–200/month at that volume), and the gap widens as usage grows.

Reason 3: Independence. OpenAI can change prices, terms, or even revoke access. A model on your own server won't change without your knowledge, doesn't depend on decisions made by a US company, and keeps working even if an AI provider has an outage.

When a Self-Hosted Server is NOT Needed: If you're just starting, your documents don't contain personal data, and your usage is less than 200 requests per day — start with a cloud API and migrate to a self-hosted server once you confirm its value. We always recommend starting with the option that allows you to test your hypothesis quickly.

Why AWS and Azure Germany Don't Solve GDPR Issues

The most common mistake we see: a company chooses an "EU region" in AWS or Azure and believes the GDPR problem is solved. This is not the case. The physical location of a server and the legal jurisdiction over data are different things.

We consider this the most important section of the article, and it is one that most AI server guides ignore. So let's explain it in detail, in plain terms.

What is the CLOUD Act and Why It Affects Your Business

In 2018, the US enacted the Clarifying Lawful Overseas Use of Data Act (CLOUD Act). This law allows US law enforcement agencies, acting under a warrant or subpoena, to compel US companies to hand over client data in their possession, custody, or control, regardless of where the servers are physically located.

In simple terms: imagine you rent a safe deposit box at a bank in Frankfurt. But this bank is American. American federal agents can approach the bank in the US with a legal order to open your safe deposit box in Frankfurt, and the bank is obligated to comply, possibly without notifying you. This is how the CLOUD Act works for AWS, Azure, and Google Cloud.

AWS EU-Central-1 (Frankfurt), Azure Germany West Central, Google Cloud Europe-West — all these options are physically located in the EU but are operated by US companies. The CLOUD Act fully applies to them.

Why This Is Critical for Businesses in Austria and Germany

The Austrian Data Protection Authority (DSB) in the Google Analytics case (2022) established the strictest standard in the EU: it's insufficient to claim that "the probability of US intelligence accessing your data is low." A technical impossibility of such access is required. No US cloud provider can offer such a guarantee — by definition.

For medical centers and law firms in Austria and Germany, this means: AWS and Azure Germany are not acceptable solutions, regardless of price or product quality. A provider under EU jurisdiction is necessary.

Provider | Physical Location | Jurisdiction | CLOUD Act | Suitable for EU/AT/DE?
AWS EU-Central-1 | Frankfurt, DE | 🇺🇸 USA | ✅ Applies | ❌ No
Azure Germany West Central | Frankfurt, DE | 🇺🇸 USA | ✅ Applies | ❌ No
Google Cloud Europe-West | Belgium/Netherlands | 🇺🇸 USA | ✅ Applies | ❌ No
Hetzner | Nuremberg/Falkenstein DE, Helsinki FI | 🇩🇪 Germany | ❌ Does not apply | ✅ Yes
OVHcloud | Strasbourg FR, Warsaw PL | 🇫🇷 France | ❌ Does not apply | ✅ Yes
Contabo | Munich DE, Nuremberg DE | 🇩🇪 Germany | ❌ Does not apply | ✅ Yes
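
The rule behind the table can be written down in a few lines: what decides suitability is the jurisdiction of the operating company, not the location of the data center. The sketch below is only an illustration of that rule using the rows above; the `Provider` class and the function names are made up for this example and it is not legal advice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    datacenter: str    # where the servers physically stand
    jurisdiction: str  # country whose law governs the operating company


# Rows copied from the table above.
PROVIDERS = [
    Provider("AWS EU-Central-1", "Frankfurt, DE", "US"),
    Provider("Azure Germany West Central", "Frankfurt, DE", "US"),
    Provider("Google Cloud Europe-West", "Belgium/Netherlands", "US"),
    Provider("Hetzner", "Nuremberg/Falkenstein DE, Helsinki FI", "DE"),
    Provider("OVHcloud", "Strasbourg FR, Warsaw PL", "FR"),
    Provider("Contabo", "Munich DE, Nuremberg DE", "DE"),
]


def cloud_act_applies(provider: Provider) -> bool:
    """The article's rule: a US operator is in scope wherever the servers are."""
    return provider.jurisdiction == "US"


def shortlist(providers):
    """Names of providers that are outside the scope of the CLOUD Act."""
    return [p.name for p in providers if not cloud_act_applies(p)]


print(shortlist(PROVIDERS))
```

Note that the `datacenter` field plays no role in the decision, which is exactly the point of this section.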

Choosing a Region and Provider: Hetzner, OVH, Contabo

For most businesses in AT/DE, we recommend Hetzner as the first choice — offering the best price/quality/GDPR compliance ratio among EU providers. OVH and Contabo are worthy alternatives depending on the task.

Important Update: Hetzner increased prices by 30-37% as of April 1, 2026, due to rising costs for server memory (HBM for GPUs). Even after the increase, Hetzner remains 2.5-3.3 times cheaper than AWS/GCP for equivalent configurations.

Hetzner Online — Our Default Choice

Hetzner Online GmbH is a private company headquartered in Gunzenhausen, Bavaria. Data centers are located in Nuremberg, Falkenstein (Saxony), and Helsinki. They are ISO 27001 certified. Their flat-rate pricing includes traffic (up to 20 TB in EU regions) with no hidden fees. Technical support is primarily via ticket system, without 24/7 phone support.

Ideal for: Most SMEs wanting maximum performance at minimum cost with guaranteed EU jurisdiction. Our choice for 90% of clients.

OVHcloud — The French Alternative

OVHcloud (OVH Groupe SA) is a French company headquartered in Roubaix and one of the largest hosting providers in the EU. Data centers are in Strasbourg, Roubaix, and Warsaw. They offer a broader range of managed services than Hetzner. Prices are slightly higher, but they have managed tiers for those who don't want to manage the server themselves.

Ideal for: Companies needing more support or additional managed services (databases, load balancers). A good alternative if Hetzner isn't suitable for technical reasons.

Contabo — The Cheapest CPU Option

Contabo GmbH is a Munich-based company operating since 2003, offering the most CPU resources for the lowest price on the market. Their €4.50/month starting price for 4 vCPUs / 8 GB RAM is a record for the EU market. They are ISO 27001 certified. Support is via tickets.

Important Note on Contabo GPUs: Their GPU servers are geared towards the enterprise segment (NVIDIA H100, L40S) and start at $790/month — unsuitable for SME AI tasks. For GPU servers, choose Hetzner or Scaleway.

Ideal for: CPU-only deployments for small businesses where cost is critical and GPUs are not needed.

Provider | CPU VPS from | GPU Server from | Location | Support | Best For
Hetzner ⭐ | €3.49/month | €184/month (RTX 4000 Ada 20 GB) | DE, FI | Ticket | Most projects, CPU and GPU
OVHcloud | €3.99/month | from €100/month | FR, PL | Ticket + Phone | Managed services, enhanced support
Contabo | €4.50/month | from $790/month (H100) | DE | Ticket | CPU-only, maximum affordability
Scaleway | €3.99/month | from €150/month | FR | Ticket | GPU alternative in France

CPU vs. GPU: What Really Matters and When a GPU is Necessary

A GPU isn't about being "better" or "more powerful." GPU means "faster." CPU means "slower but cheaper." The question is whether a 30-90 second response is acceptable for your use case, versus 5-10 seconds. For most internal tools, it is.

This is a question we explain at every initial client meeting. Most executives assume a GPU is mandatory. In reality, it depends on who is waiting for the answer and how long they're willing to wait.

A Simple Analogy

A CPU is like an experienced senior lawyer: it thinks methodically and provides an accurate answer, but takes more time. A GPU is like a whole team of assistants working in parallel: they respond much faster because the work is split among them. The answer to a question about a document is the same in both cases; the only difference is how long you wait for it.

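When a CPU is Sufficient

A CPU is sufficient for internal tools where a 30-90 second response is acceptable: an internal FAQ for a small team, an internal Telegram bot, or contract search inside a law firm.

When a GPU is Mandatory

A GPU is mandatory for public-facing chat, or whenever a response in under 10 seconds is required, for example a clinic website chat or answers to patients.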

Scenario | Is CPU Enough? | Recommendation
Internal FAQ for 5-10 employees | ✅ Yes | CPU server, Llama 3.2 8B or Qwen3 14B
Public clinic website chat | ❌ No | GPU 16 GB, Gemma 4 26B or Mistral Small 3
Telegram bot for an internal team | ✅ Yes (if 60 sec response is okay) | CPU or GPU depending on waiting tolerance
Law firm, contract search | ✅ Yes for internal use | CPU to start, GPU if you want Llama 3.3 70B
Medical center, patient responses | ❌ No | GPU is mandatory: patients expect real-time responses
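
For readers who like to see the rule of thumb spelled out, the decision in this section reduces to two questions: is the tool public-facing, and how long can the user wait? The following is a minimal sketch under the assumption that a typical CPU response takes about 60 seconds (the middle of the 30-90 second range quoted above). The function name and the threshold constant are illustrative, not a formal sizing method.

```python
# Assumption: ~60 seconds is taken as the typical CPU response time,
# the midpoint of the 30-90 second range quoted in this guide.
CPU_TYPICAL_SECONDS = 60


def recommend_hardware(max_wait_seconds: float, public_facing: bool) -> str:
    """Return "cpu" or "gpu" for a use case.

    Public-facing tools always get a GPU, because external users expect
    near real-time answers. Internal tools get a CPU when the team accepts
    waiting at least as long as a typical CPU response.
    """
    if public_facing:
        return "gpu"
    return "cpu" if max_wait_seconds >= CPU_TYPICAL_SECONDS else "gpu"


# Scenarios mirroring the table above.
print(recommend_hardware(max_wait_seconds=90, public_facing=False))  # internal FAQ
print(recommend_hardware(max_wait_seconds=10, public_facing=True))   # clinic website chat
print(recommend_hardware(max_wait_seconds=60, public_facing=False))  # internal Telegram bot
```

Under these assumptions the internal FAQ and the Telegram bot land on a CPU and the clinic chat on a GPU, which matches the table.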

What Server Configuration Suits Your Scale

Three parameters determine the required configuration: the number of documents in the system, the number of daily requests, and the model needed for response quality. Everything else is a consequence of these three.

We don't recommend "minimum requirements" without context — it's pointless. Instead, here are four typical scenarios we see with clients.

Scenario | Documents | Requests/Day | Configuration | Model | Provider
Startup / Test (small office, internal FAQ) | up to 200 | up to 50 | CPU-only, 4 vCPU / 16 GB RAM / 100 GB SSD | Llama 3.2 8B or Qwen3 14B | Contabo or Hetzner CX
Production without GPU (internal company tool) | 200–1000 | 50–200 | CPU-only, 8 vCPU / 32 GB RAM / 200 GB SSD | Qwen3 14B or Llama 3.3 70B (slow) | Hetzner CPX or Contabo VPS XL
Production with GPU (public chat, clients/patients) | 500–5000 | 200–500 | GPU 16–20 GB, 32–64 GB RAM / 500 GB SSD | Gemma 4 26B or Mistral Small 3 | Hetzner GEX44 (€184/month)
High Quality (law firm, medical center, maximum accuracy) | 1000+ | 200–500 | GPU 48 GB or 2xGPU, 128 GB RAM / 1 TB SSD | Llama 3.3 70B Q4 | Hetzner Dedicated or Own Server
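
The four scenarios above can be read as a simple decision ladder. The sketch below is one possible reading, not a definitive sizing tool: the ranges in the table overlap, so the cut-off points chosen here (200 documents, 50 and 200 requests per day) are assumptions, and `pick_tier` is a hypothetical helper name.

```python
def pick_tier(documents: int, requests_per_day: int,
              public_facing: bool = False, max_accuracy: bool = False) -> str:
    """Map a workload to one of the four scenarios from the table.

    Assumed cut-offs: overlapping ranges in the table are resolved by
    checking the most demanding scenario first.
    """
    if max_accuracy:
        return "High Quality: GPU 48 GB or 2xGPU, 128 GB RAM, 1 TB SSD"
    if public_facing or requests_per_day > 200:
        return "Production with GPU: GPU 16-20 GB, 32-64 GB RAM, 500 GB SSD"
    if documents > 200 or requests_per_day > 50:
        return "Production without GPU: 8 vCPU, 32 GB RAM, 200 GB SSD"
    return "Startup / Test: 4 vCPU, 16 GB RAM, 100 GB SSD"


print(pick_tier(documents=150, requests_per_day=30))
print(pick_tier(documents=800, requests_per_day=120))
print(pick_tier(documents=3000, requests_per_day=400, public_facing=True))
print(pick_tier(documents=2000, requests_per_day=300, max_accuracy=True))
```

The order of the checks matters: accuracy and public exposure override document counts, because they change what hardware is acceptable regardless of scale.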

Our advice for getting started: Begin with a CPU-only configuration under real-world load for 2-4 weeks. If the speed isn't satisfactory, migrating to a GPU takes 1 day, and your documents will already be in the system. Overpaying for a GPU upfront without a confirmed need is unjustified.

RAM and Disk, Separately

RAM: The model is fully loaded into memory. Llama 3.2 8B requires ~6 GB, Gemma 4 26B requires ~15 GB, and Llama 3.3 70B requires ~43 GB. Always allocate RAM with a ~30% buffer for the operating system and database. Insufficient RAM means the model is partially on disk, resulting in very slow performance.

Disk: The models themselves occupy 5 to 43 GB. Your documents typically take up 1-10 GB, even for large archives (text is very compact). The vector database (pgvector) adds a few more GB. A 200 GB SSD is sufficient for most SMEs.
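
As a worked example of the RAM rule, the snippet below applies the ~30% buffer to the three model sizes quoted above and picks the smallest memory size that fits. The list of memory sizes is an assumption made for illustration (common server steps), not a provider catalogue, and the result is a lower bound rather than a purchase recommendation.

```python
# Approximate memory needed by each model, in GB, as quoted in this section.
MODEL_RAM_GB = {
    "Llama 3.2 8B": 6,
    "Gemma 4 26B": 15,
    "Llama 3.3 70B": 43,
}

BUFFER = 0.30  # ~30% headroom for the operating system and database

# Assumption: common memory steps, used only to round the result up.
STANDARD_RAM_SIZES_GB = [8, 16, 32, 64, 128]


def required_ram_gb(model: str) -> float:
    """Model size plus the 30% buffer."""
    return MODEL_RAM_GB[model] * (1 + BUFFER)


def smallest_fitting_size(model: str) -> int:
    """Smallest standard memory size that covers model plus buffer."""
    need = required_ram_gb(model)
    for size in STANDARD_RAM_SIZES_GB:
        if size >= need:
            return size
    raise ValueError(f"{model} needs more than {STANDARD_RAM_SIZES_GB[-1]} GB")


for name in MODEL_RAM_GB:
    print(f"{name}: needs about {required_ram_gb(name):.1f} GB, "
          f"fits in {smallest_fitting_size(name)} GB")
```

For the largest model, 43 GB plus the buffer comes to about 56 GB, which is why a 64 GB machine is the practical minimum for it.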

Actual Monthly Server Costs

"How much does a server cost?" is an unanswerable question without context. The right question is: "How much does a server for my tasks cost compared to cloud AI?" Here's an honest comparison.

Prices are current as of June 2026. Hetzner increased prices by 30-37% on April 1, 2026, but remains the cheapest GDPR-compliant EU provider for AI tasks.

Current Hetzner Prices (after April 2026 increase)

Configuration | Specs | Price/Month | Suitable For
CX23 (CPU) | 2 vCPU / 4 GB RAM / 40 GB SSD | €3.49 | Testing only, minimal load
CX33 (CPU) | 4 vCPU / 8 GB RAM / 80 GB SSD | €7.99 | Small model, up to 20 requests/day
CX43 (CPU) | 8 vCPU / 16 GB RAM / 160 GB SSD | ~€18 | Qwen3 14B, up to 50 requests/day
CPX51 (CPU) | 16 vCPU / 32 GB RAM / 360 GB SSD | ~€45 | Qwen3 14B fast or Llama 70B slow
GEX44 (GPU) ⭐ | Intel Core i5 / 64 GB RAM / NVIDIA RTX 4000 Ada 20 GB | €184 | Gemma 4 26B or Mistral Small 3, up to 500 requests/day

Comparison with Cloud Alternatives

Option | Monthly Cost | GDPR | Notes
OpenAI GPT-4o mini API (500 requests/day) | ~$12–24 | ⚠️ Risk | Cheap but data is in the US
OpenAI GPT-4o API (500 requests/day) | ~$100–200 | ⚠️ Risk | Expensive and data is in the US
Hetzner CPU + Llama 3.2 8B | €7–18 | ✅ Full Compliance | Slow (~60 sec), but secure and cheap
Hetzner GPU GEX44 + Gemma 4 26B | €184 | ✅ Full Compliance | 5-8 sec response, unlimited requests
AWS/Azure GPU equivalent | $400–600 | ❌ CLOUD Act | 2.5–3x more expensive than Hetzner
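
The figures in the comparison can be turned into a rough break-even estimate: how many requests per day make the fixed-price GPU server cheaper than a pay-per-use API? The sketch below uses only numbers from the table (GPT-4o at $100–200/month for 500 requests/day, GEX44 at €184/month). It assumes a 30-day month, treats euros and dollars as roughly equal, and ignores support and administration costs, so read the output as an order of magnitude.

```python
DAYS_PER_MONTH = 30  # assumption for this estimate


def api_cost_per_request(monthly_cost: float, requests_per_day: int) -> float:
    """Average price of one request, derived from a monthly API bill."""
    return monthly_cost / (requests_per_day * DAYS_PER_MONTH)


def break_even_requests_per_day(server_monthly_cost: float,
                                cost_per_request: float) -> float:
    """Daily volume at which the API bill equals the fixed server bill."""
    return server_monthly_cost / (cost_per_request * DAYS_PER_MONTH)


SERVER_COST = 184  # Hetzner GEX44 per month (EUR and USD treated as equal)

# GPT-4o bill range at 500 requests/day, from the table above.
cheap = api_cost_per_request(100, 500)
expensive = api_cost_per_request(200, 500)

print(round(break_even_requests_per_day(SERVER_COST, expensive)))
print(round(break_even_requests_per_day(SERVER_COST, cheap)))
```

Under these assumptions the break-even point falls somewhere between roughly 460 and 920 requests per day, depending on how expensive the individual API requests are. Below that volume the API is cheaper on price alone, which is why the data-protection requirement, not cost, is usually the deciding factor.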

Hidden Costs Not Visible in Advertising


Who is Responsible for the Server After Launch — And What It Costs You

The most common question after a demo is: "So, who maintains all of this afterwards?" The answer is simple: either your administrator after training, or a contractor for a monthly fee. There is no third option.

The word "server" might sound intimidating to a non-technical manager. In practice, once properly configured, an AI assistant on a server requires far less attention than most people assume.

What Actually Needs "Maintenance" After Launch

There are four things that require attention:

Two Options After System Handover

Option A: Your administrator manages it independently. After project handover, we train one person from your team – usually for 2–3 hours. They can: upload documents, restart the service if needed, and answer team questions. For more complex tasks (updates, configuring a new interface), you can consult us on a one-time basis.

Option B: A contractor provides monthly support. We or another contractor take full technical responsibility: monitoring, updates, incident response, and consultations. This costs between $50 and $200 per month, depending on the scope. This is suitable if your company lacks an IT specialist.

Criterion | Option A: Self-Managed | Option B: Contractor
Cost | $0/mo (administrator's time only) | $50–200/mo
Who's Needed | 1 person with basic IT understanding | No one; the contractor handles it
Response Time for Failure | Depends on administrator's availability | SLA, typically 2–4 hours during business hours
Suitable For | Companies with an IT specialist or active administrator | Companies without IT staff or with critical uptime requirements

Our recommendation: For most SMEs, Option A after brief training is entirely sufficient. Option B is justified for public-facing services (like a website chat for a clinic) where downtime directly impacts customer experience.

What to Ask a Contractor Before Signing a Contract

Most managers don't know what to ask contractors, and end up signing contracts without understanding key details. These eight questions will protect you from unpleasant surprises after launch.

This section is for those who are about to sign a contract for implementing an AI assistant. Whether it's us at AskYourDocs or another contractor, ask these questions before signing.

1. Which server provider is used, and where are the data physically located?

Correct Answer: Specific EU provider name (Hetzner, OVH, Contabo) and a specific data center. "Servers in the EU" without details is insufficient. "AWS Frankfurt" is the wrong answer for GDPR-sensitive data.

2. Which model will be installed, and why this specific one?

Correct Answer: The specific model name (e.g., "Gemma 4 26B via Ollama") with an explanation of why this model is suitable for your task. If the contractor cannot explain their model choice, they don't understand the architecture.

3. Who has access to the server after handover?

Correct Answer: After handover, only your administrators should have access. The contractor should not have persistent access without your request. A contractor who keeps a "backdoor" for support without your knowledge poses a legal risk.

4. What happens to the data if you cease cooperation?

Correct Answer: Since the server is yours or rented in your name, you simply continue paying the hosting provider, and the system keeps running. If the server is rented under the contractor's name, demand a transfer before signing the contract.

5. What is the guaranteed response quality, and how is it verified?

Correct Answer: The contractor should describe the acceptance testing process — specific questions, quality criteria for responses, and what happens if the quality is not met. Guarantees of "it will respond well" without metrics are empty words.

6. How much does it cost to update documents after launch?

Correct Answer: Updating documents via the admin panel should be simple and free for you. If the contractor charges a fee for each new document upload, it's either poor architecture or a manipulative practice.

7. What is included in the deployment price, and what costs extra?

Correct Answer: A clear list: server, software installation, document upload, interface configuration, administrator training — what's included and what's billed separately. It's important to clarify: does the price cover setting up a Telegram bot, WhatsApp, or just a web chat?

8. Does the contractor have specific experience with local LLMs and GDPR?

Correct Answer: Specific case studies or references to real clients (even anonymized). A contractor who is implementing Ollama for the "first time" while promising GDPR compliance is a risk to your business. This is not an area to learn on your project.
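
To make question 5 concrete, here is what a minimal acceptance test could look like: a list of questions, the keywords a correct answer must contain, and a pass rate you can write into the contract. This is a sketch under simple assumptions, not how any particular contractor tests. The questions, keywords, and the `demo_assistant` stand-in are invented for illustration; in a real test, `ask` would call the deployed assistant, and keyword matching would likely be complemented by human review.

```python
from typing import Callable, Dict, List

# Hypothetical acceptance cases: each pairs a question with keywords
# that a correct answer must mention. Real cases come from your documents.
CASES: List[Dict] = [
    {"question": "How many vacation days do employees get?",
     "must_contain": ["25", "days"]},
    {"question": "Who approves travel expenses?",
     "must_contain": ["department head"]},
]


def answer_passes(answer: str, must_contain: List[str]) -> bool:
    """True if every required keyword appears in the answer (case-insensitive)."""
    text = answer.lower()
    return all(keyword.lower() in text for keyword in must_contain)


def pass_rate(ask: Callable[[str], str], cases: List[Dict]) -> float:
    """Fraction of cases whose answer contains all required keywords."""
    passed = sum(answer_passes(ask(c["question"]), c["must_contain"]) for c in cases)
    return passed / len(cases)


def demo_assistant(question: str) -> str:
    """Stand-in for the real assistant, with canned answers for illustration."""
    canned = {
        "How many vacation days do employees get?": "Employees get 25 days per year.",
        "Who approves travel expenses?": "Expenses are approved by the finance team.",
    }
    return canned.get(question, "")


rate = pass_rate(demo_assistant, CASES)
print(f"Pass rate: {rate:.0%}")
```

In this demo the stand-in answers one of two questions correctly, so the pass rate is 50%. An agreed threshold (for example, a minimum pass rate on a fixed question set) gives both sides a metric instead of a promise.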

Conclusions

Want to discuss a configuration for your specific needs? In 30 minutes, we can determine the right server, model, and cost for your particular scenario — without unnecessary technical jargon.


Sources: Hetzner Cloud for AI Projects 2026 · Hetzner Cloud Review 2026 · EDPB — European Data Protection Board · DSB — Datenschutzbehörde Austria · GDPR Local — EU AI Act Summary