The 5 Pillars of Technical Implementation
🎯 Learning Objective
You will know the 5 technical pillars of GEO, be able to implement each pillar practically, and understand what files, code structures and configurations need to be created.
Lesson 3.1: Pillar 1 — Crawler Control (robots.txt for AI)
For decades, robots.txt has been the standard way to tell web crawlers which areas of a site they may visit. What has changed: a new generation of AI crawlers now has to be addressed explicitly.
The Most Important AI Crawlers
| Crawler | Operator | Function |
|---|---|---|
| GPTBot | OpenAI | Training + web browsing for ChatGPT |
| ChatGPT-User | OpenAI | Real-time browsing in chat sessions |
| ClaudeBot | Anthropic | Claude training + browsing |
| PerplexityBot | Perplexity | Real-time crawling for search queries |
| Google-Extended | Google | Gemini training |
| Meta-ExternalAgent | Meta | Meta AI training |
Best Practice — robots.txt Template
# AI Engine Crawlers — Allowed
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
# Sitemap: https://example.com/sitemap.xml
# llms.txt: https://example.com/llms.txt
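Before deploying a template like the one above, it can help to check how a parser actually interprets it. The following is a minimal sketch using Python's standard-library `urllib.robotparser`. The `ROBOTS_TXT` sample, the `AI_CRAWLERS` list and the `check_access` helper are illustrative names chosen for this example, and the wildcard group with `/private/` is an assumed addition to show the difference between named and unnamed crawlers.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content (assumption: two AI crawlers are named
# explicitly, everything else falls under the wildcard group).
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /private/
"""

# Crawler tokens taken from the table in this lesson.
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def check_access(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Return, per AI crawler, whether the given path may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in AI_CRAWLERS}


if __name__ == "__main__":
    for agent, allowed in check_access(ROBOTS_TXT, "/private/report.html").items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Crawlers without their own group fall back to the `User-agent: *` rules, which is why naming each AI crawler explicitly makes the intended behavior easier to verify.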
The 3 Most Common Mistakes
- 404 on robots.txt: AI crawlers crawl conservatively → less data, fewer citations.
- Global `Disallow: /`: blocks everything, often a relic from development.
- Network-level bot protection: Cloudflare/Akamai block bots before they reach the robots.txt.
Practice Tip
Test with curl -A "GPTBot/1.2" -I https://example.com/. A 403 Forbidden response means a network-level layer is blocking the bot.
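The same check can be scripted for several crawlers at once. This is a rough sketch using only Python's standard library. The user-agent strings are simplified placeholders (real crawlers send longer strings), and the helper names `probe` and `interpret_status` are hypothetical.

```python
import urllib.error
import urllib.request

# Simplified user-agent tokens (assumption: real crawlers send longer strings).
USER_AGENTS = ["GPTBot/1.2", "ClaudeBot/1.0", "PerplexityBot/1.0"]


def interpret_status(status: int) -> str:
    """Translate an HTTP status code into a rough diagnosis."""
    if 200 <= status < 300:
        return "reachable"
    if status == 403:
        return "blocked (likely network-level bot protection)"
    if status == 404:
        return "not found"
    if status == 429:
        return "rate limited"
    return f"unexpected status {status}"


def probe(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Send a HEAD request with the given User-Agent and return the status code."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code
```

Looping over `USER_AGENTS` and passing each result of `probe("https://example.com/", agent)` to `interpret_status` gives a quick overview. A bot-protection layer may also treat a spoofed user agent differently from the real crawler, so a clean result here is an indication, not proof.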
Lesson 3.2: Pillar 2 — llms.txt — The AI Business Card
The /llms.txt is a proposed standard (llmstxt.org, Jeremy Howard, fast.ai). It provides language models with a high-density fact briefing — like a sitemap.xml for search engines, but for LLMs.
⚠️ Status Note: llms.txt is a community convention, not an official web standard (no W3C, no IETF RFC). As of 2026, there is no public confirmation that OpenAI, Anthropic, or Google systematically consume this file in production crawls. Nevertheless, it has practical value: it forces a fact-based distillation of your brand that also serves as a foundation for other GEO measures. Implement llms.txt as part of a holistic strategy — not as a standalone solution.
Structure of a Professional llms.txt
# Brand Name
> One-sentence mission with maximum semantic density.
## About Us
Short profile: founding year, location, industry, team size.
## Products / Services
- **Product A**: Fact-based description, USP, price range
- **Product B**: Technical specifications, results
## Expertise & Credentials
ISO 9001, awards, reference clients
## FAQ
- **What does [Product] cost?** From €X/month.
- **Who is it suitable for?** [Specific target audience].
## Contact
Website, email, location, management
The 4 Golden Rules
- Fact density over marketing speak. "120 employees at 3 locations" instead of "a dynamic team".
- Citable sentences. Every sentence must work 1:1 in an AI answer.
- Consistency with the website. Contradictions → AI trusts no source.
- Maximum 4,000 tokens (approx. 2-3 pages).
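Some of these rules can be checked mechanically. The sketch below is one possible linter, not an official llms.txt tool. The token count uses a crude four-characters-per-token heuristic (an assumption, not a real tokenizer), and the `FILLER_PHRASES` list is purely illustrative.

```python
MAX_TOKENS = 4000      # limit taken from the rules above
CHARS_PER_TOKEN = 4    # rough heuristic (assumption), not a real tokenizer

# Marketing phrases to flag (illustrative list, extend as needed).
FILLER_PHRASES = ["dynamic team", "passion", "innovative solutions", "world-class"]


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN


def lint_llms_txt(text: str) -> list[str]:
    """Return a list of warnings for an llms.txt draft."""
    warnings = []
    lines = text.splitlines()
    if not any(line.startswith("# ") for line in lines):
        warnings.append("missing H1 title line ('# Brand Name')")
    if not any(line.startswith("> ") for line in lines):
        warnings.append("missing one-sentence summary ('> ...')")
    tokens = estimate_tokens(text)
    if tokens > MAX_TOKENS:
        warnings.append(f"about {tokens} tokens, above the {MAX_TOKENS} limit")
    lowered = text.lower()
    for phrase in FILLER_PHRASES:
        if phrase in lowered:
            warnings.append(f"marketing phrase found: '{phrase}'")
    return warnings
```

Consistency with the website and the citability of individual sentences still need a human review; a script like this only catches the mechanical issues.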
Lesson 3.3: Pillar 3 — Structured Data (JSON-LD Schema)
JSON-LD is a W3C standard and the preferred format for machine-readable, structured data. It provides an unambiguous "translation" of the page contents.
Analogy: HTML without JSON-LD is like a book without a table of contents. The information is in there somewhere, but you have to read everything.
Required Schemas
| Schema Type | Purpose | Priority |
|---|---|---|
| Organization | Who is the company? | 🔴 Required |
| WebSite | What is the website? | 🔴 Required |
| BreadcrumbList | Page structure | 🔴 Required |
| FAQPage | Common questions | 🔴 Required |
| Person | Experts | 🟡 High |
| Product / Service | Offerings | 🟡 High |
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Beispiel GmbH",
  "url": "https://beispiel.de",
  "foundingDate": "2005-03-15",
  "numberOfEmployees": { "@type": "QuantitativeValue", "value": 120 },
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Düsseldorf",
    "addressCountry": "DE"
  },
  "sameAs": [
    "https://linkedin.com/company/beispiel",
    "https://de.wikipedia.org/wiki/Beispiel_GmbH"
  ]
}
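Hand-writing JSON-LD for every page is error-prone, so it is often generated. As an illustration for the FAQPage type, here is a minimal Python sketch that builds the markup from question/answer pairs and wraps it in the script element used to embed JSON-LD in HTML. The function names and the sample answer are placeholders invented for this example.

```python
import json


def build_faq_jsonld(faqs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage structure from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


def to_script_tag(data: dict) -> str:
    """Serialize JSON-LD and wrap it in an embeddable script element."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so the JSON cannot terminate the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Placeholder question and answer, for illustration only.
    faq = build_faq_jsonld([("Who is it suitable for?", "Industrial manufacturers.")])
    print(to_script_tag(faq))
```

Generating the markup from the same data source that feeds the visible FAQ helps keep the structured data and the page content consistent.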
Lesson 3.4: Pillar 4 — Visual GEO (Optimizing Images for AI)
Modern AI models are multimodal: GPT-4o, Gemini and Claude can understand image content. Alt texts are therefore an active AI signal.
The Alt-Text Formula
[Subject] + [Context] + [Material/Property] + [Brand]
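For larger image catalogs, the formula can be applied programmatically. The helper below is a small sketch under the assumption that the four parts are available as separate fields, for example from a product database. The function name `build_alt_text` is hypothetical.

```python
def build_alt_text(subject: str, context: str = "", material: str = "", brand: str = "") -> str:
    """Join the non-empty parts of the alt-text formula with commas."""
    if not subject.strip():
        raise ValueError("an alt text needs at least a subject")
    parts = [subject, context, material, brand]
    return ", ".join(part.strip() for part in parts if part.strip())


if __name__ == "__main__":
    print(build_alt_text(
        subject="Vitamin C Glow Serum",
        context="30ml glass vial with pipette",
        brand="Example Cosmetics",
    ))
```

Empty fields are skipped, so an incomplete record still yields a valid, if weaker, alt text.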
| Alt Text | Quality |
|---|---|
| "Serum" | ❌ Useless |
| "Vitamin C serum bottle" | ⚠️ Better |
| "Vitamin C Glow Serum, 30ml glass vial with pipette, Example Cosmetics" | ✅ Excellent |
Lesson 3.5: Pillar 5 — Answer Density
AI agents preferentially extract answers from the first 50–100 words after the H1. Placing empty marketing phrases here wastes the most valuable citable space.
Bad Example (80% of all websites)
"Welcome to Example Inc. We are a dynamic company that has been working with passion and innovation for many years..."
Good Example (Answer Density optimized)
"Beispiel GmbH has been the leading DACH provider of industrial filtration technology since 2005. 120 employees at 3 locations, certified to ISO 9001 and ISO 14001. HEPA 14 filters with 99.995% separation efficiency."
Practice Exercise
Open any company website. Copy the first 100 words after the H1. How many citable facts can you find? Write an optimized version.
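The exercise can be roughly approximated in code. The sketch below takes the first 100 words of a text and counts tokens that contain a digit as a crude proxy for citable facts such as years, headcounts and certifications. That proxy is an assumption made for this example; it misses non-numeric facts and is no substitute for reading the text.

```python
import re


def first_words(text: str, limit: int = 100) -> str:
    """Return the first `limit` whitespace-separated words of the text."""
    return " ".join(text.split()[:limit])


def count_fact_signals(text: str) -> int:
    """Count tokens containing a digit, a crude proxy for citable facts."""
    return sum(1 for token in text.split() if re.search(r"\d", token))


def fact_density(text: str, limit: int = 100) -> float:
    """Share of digit-bearing tokens among the first `limit` words."""
    intro = first_words(text, limit)
    words = intro.split()
    if not words:
        return 0.0
    return count_fact_signals(intro) / len(words)
```

Comparing `fact_density` for the current intro and a rewritten version gives a quick before/after signal for the exercise.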
📝 Quiz: Module 3
Test your understanding — 5 questions, 70% to pass.
Question 1: What happens when robots.txt returns a 404?
Question 2: What are the 4 golden rules for an effective llms.txt?
Question 3: What is the alt-text formula for Visual GEO?
Question 4: Which JSON-LD schema types are required?
Question 5: Why are the first 100 words after the H1 so important?
About the Author
Sascha Deforth — GEO Practitioner and Founder of TrueSource AI. Specialized in AI Visibility Optimization with 200+ audits completed.