Module 3 of 7

The 5 Pillars of Technical Implementation

⏱ ~120 min 📖 5 lessons 📝 5 quiz questions

🎯 Learning Objective

You will know the 5 technical pillars of GEO, be able to implement each pillar in practice, and understand which files, code structures and configurations need to be created.

Lesson 3.1: Pillar 1 — Crawler Control (robots.txt for AI)

For decades, robots.txt has been the standard way to tell web crawlers which areas of a site they may visit. What has changed is that a new generation of AI crawlers now needs to be addressed explicitly.

The Most Important AI Crawlers

Crawler	Operator	Function
`GPTBot`	OpenAI	Crawling content for model training
`ChatGPT-User`	OpenAI	User-initiated, real-time browsing in chat sessions
`ClaudeBot`	Anthropic	Crawling content for Claude model training
`PerplexityBot`	Perplexity	Indexing pages for Perplexity search results
`Google-Extended`	Google	robots.txt token controlling use of content for Gemini training (not a separate crawler)
`Meta-ExternalAgent`	Meta	Meta AI training

Best Practice — robots.txt Template

# AI Engine Crawlers — Allowed
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

Sitemap: https://example.com/sitemap.xml
# llms.txt: https://example.com/llms.txt
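
Before deploying a template like this, it can help to check how a parser actually reads it. The following is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `allowed_crawlers` and the test URL are illustrative assumptions, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Copy of the user-agent groups from the template above.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def allowed_crawlers(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, per AI crawler token, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


# Hypothetical page used only for the check.
for agent, ok in allowed_crawlers(ROBOTS_TXT, "https://example.com/pricing").items():
    print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

Passing a file that contains only `User-agent: *` and `Disallow: /` to the same helper reports every crawler as blocked, which is a quick way to catch a leftover development rule.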

The 3 Most Common Mistakes

404 on robots.txt: Some AI crawlers may treat the missing file as ambiguous and crawl conservatively, which means less data and fewer citations.
Global Disallow: /: Blocks everything, often a leftover from development.
Network-level bot protection: Cloudflare/Akamai block bots before they ever reach the robots.txt.

Practice Tip

Test with curl -A "GPTBot/1.2" -I https://example.com/ and check the status line: a 403 Forbidden suggests that network-level protection is blocking the bot.
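
The same check can be scripted when several URLs or user agents need testing. This is a minimal sketch with Python's standard library. The function names `head_status` and `interpret` are illustrative assumptions, and the result is only an indication, since some firewalls also verify crawler IP ranges and will treat a request from your own machine differently.

```python
import urllib.error
import urllib.request


def head_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Send a HEAD request with the given User-Agent and return the HTTP status."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; the code is what we need.
        return error.code


def interpret(status: int) -> str:
    """Map a status code to a rough diagnosis (heuristic, not a guarantee)."""
    if status == 403:
        return "blocked: likely network-level bot protection"
    if 200 <= status < 400:
        return "reachable"
    return f"inconclusive: HTTP {status}"
```

Calling `interpret(head_status("https://example.com/", "GPTBot/1.2"))` reproduces the curl test above.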

Lesson 3.2: Pillar 2 — llms.txt — The AI Business Card

The /llms.txt file is a proposed standard (llmstxt.org, Jeremy Howard, fast.ai). It gives language models a dense, fact-focused briefing, much like sitemap.xml does for search engines.

⚠️ Status Note: llms.txt is a community convention, not an official web standard (no W3C, no IETF RFC). As of 2026, there is no public confirmation that OpenAI, Anthropic, or Google systematically consume this file in production crawls. Nevertheless, it has practical value: it forces a fact-based distillation of your brand that also serves as a foundation for other GEO measures. Implement llms.txt as part of a holistic strategy — not as a standalone solution.

Structure of a Professional llms.txt

# Brand Name
> One-sentence mission with maximum semantic density.

## About Us
Short profile: founding year, location, industry, team size.

## Products / Services
- **Product A**: Fact-based description, USP, price range
- **Product B**: Technical specifications, results

## Expertise & Credentials
ISO 9001, awards, reference clients

## FAQ
- **How much does [Product] cost?** From €X/month.
- **Who is it suitable for?** [Specific target audience].

## Contact
Website, email, location, management

The 4 Golden Rules

Fact density over marketing speak. "120 employees at 3 locations" instead of "a dynamic team".
Citable sentences. Every sentence should work verbatim in an AI answer.
Consistency with the website. Contradictions between llms.txt and the site give an AI reason to trust neither source.
Maximum 4,000 tokens (approx. 2-3 pages).
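
Some of these rules can be checked mechanically. The sketch below tests the basic structure shown earlier (H1, blockquote summary, `##` sections) and estimates the token count. The 4-characters-per-token ratio is a rough assumption for English text, not an exact tokenizer, and `check_llms_txt` is a hypothetical helper name.

```python
MAX_TOKENS = 4000        # budget from the rules above
CHARS_PER_TOKEN = 4      # assumption: rough average for English text


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]

    if not lines or not lines[0].startswith("# "):
        problems.append("first line should be an H1 with the brand name")
    if not any(line.startswith("> ") for line in lines):
        problems.append("missing one-sentence summary in a blockquote")
    if not any(line.startswith("## ") for line in lines):
        problems.append("no '## ' sections found")

    estimated_tokens = len(text) // CHARS_PER_TOKEN
    if estimated_tokens > MAX_TOKENS:
        problems.append(
            f"about {estimated_tokens} tokens, over the {MAX_TOKENS} budget"
        )
    return problems


sample = "# Brand Name\n> One-sentence mission.\n\n## About Us\nFounded 2005.\n"
print(check_llms_txt(sample))
```

Fact density and consistency with the website still need a human review; the script only covers the formal parts.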

Lesson 3.3: Pillar 3 — Structured Data (JSON-LD Schema)

JSON-LD is a W3C Recommendation and the format Google recommends for machine-readable, structured data. It provides an unambiguous "translation" of the page contents.

Analogy: HTML without JSON-LD is like a book without a table of contents. The information is in there somewhere, but you have to read everything.

Required Schemas

Schema Type	Purpose	Priority
`Organization`	Who is the company?	🔴 Required
`WebSite`	What is the website?	🔴 Required
`BreadcrumbList`	Page structure	🔴 Required
`FAQPage`	Common questions	🔴 Required
`Person`	Experts	🟡 High
`Product` / `Service`	Offerings	🟡 High

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Beispiel GmbH",
  "url": "https://beispiel.de",
  "foundingDate": "2005-03-15",
  "numberOfEmployees": { "@type": "QuantitativeValue", "value": 120 },
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Düsseldorf",
    "addressCountry": "DE"
  },
  "sameAs": [
    "https://linkedin.com/company/beispiel",
    "https://de.wikipedia.org/wiki/Beispiel_GmbH"
  ]
}
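
To take effect, a JSON-LD object like the one above is usually embedded in the page inside a `<script type="application/ld+json">` element. The following is a minimal sketch of generating that markup from Python data. The helper names `organization_jsonld` and `to_script_tag` are illustrative assumptions, and only a subset of the properties is shown.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str]) -> dict:
    """Build a minimal schema.org Organization object."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": same_as,
    }


def to_script_tag(data: dict) -> str:
    """Serialize data as JSON-LD wrapped in a script element."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


org = organization_jsonld(
    "Beispiel GmbH",
    "https://beispiel.de",
    ["https://linkedin.com/company/beispiel"],
)
print(to_script_tag(org))
```

Generating the markup from one data source also helps keep the schema consistent with llms.txt and the visible page content.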

Lesson 3.4: Pillar 4 — Visual GEO (Optimizing Images for AI)

Modern AI models such as GPT-4o, Gemini and Claude are multimodal and can interpret image content. Alt texts give them an explicit, text-based description of each image and are therefore an active AI signal.

The Alt-Text Formula

[Subject] + [Context] + [Material/Property] + [Brand]

Alt-Text	Quality
`"Serum"`	❌ Useless
`"Vitamin C serum bottle"`	⚠️ Better
`"Vitamin C Glow Serum, 30ml glass vial with pipette, Example Cosmetics"`	✅ Excellent
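
For product catalogs with many images, the formula can be applied from structured product data. This is a minimal sketch, assuming the four slots are available as separate fields. `build_alt_text` is a hypothetical helper, and joining the slots with commas is one possible convention, not a fixed rule.

```python
def build_alt_text(subject: str, context: str, material: str, brand: str) -> str:
    """Join the formula's slots into one alt text, skipping empty slots."""
    if not subject.strip():
        raise ValueError("subject is required")
    parts = [part.strip() for part in (subject, context, material, brand)]
    return ", ".join(part for part in parts if part)


alt = build_alt_text(
    subject="Vitamin C Glow Serum",
    context="30ml vial with pipette",
    material="glass",
    brand="Example Cosmetics",
)
print(alt)
```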

Lesson 3.5: Pillar 5 — Answer Density

AI agents tend to extract answers from the first 50–100 words after the H1. Placing empty marketing phrases here wastes the most valuable citable space.

Bad Example (80% of all websites)

"Welcome to Example Inc. We are a dynamic company that has been working with passion and innovation for many years..."

Good Example (Answer Density optimized)

"Beispiel GmbH has been the leading DACH provider of industrial filtration technology since 2005. 120 employees at 3 locations, ISO 9001 and ISO 14001 certified. HEPA 14 filters with 99.995% separation efficiency."

Practice Exercise

Open any company website. Copy the first 100 words after the H1. How many citable facts can you find? Write an optimized version.
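
The first step of the exercise can be partly automated. The sketch below extracts the first 100 words after the H1 from an HTML string and counts words that contain a digit as a rough proxy for citable facts. That proxy is an assumption for illustration, and the class and function names are hypothetical. Judging whether a sentence is actually citable remains a manual task.

```python
import re
from html.parser import HTMLParser


class AfterH1Text(HTMLParser):
    """Collects visible text that appears after the first closing </h1> tag."""

    def __init__(self):
        super().__init__()
        self.after_h1 = False
        self.skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "h1":
            self.after_h1 = True
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.after_h1 and not self.skip_depth:
            self.chunks.append(data)


def first_words_after_h1(html: str, limit: int = 100) -> list[str]:
    """Return up to limit words of visible text following the H1."""
    parser = AfterH1Text()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks).split()[:limit]


def count_numeric_facts(words: list[str]) -> int:
    # Heuristic (assumption): a word containing a digit hints at a concrete fact.
    return sum(1 for word in words if re.search(r"\d", word))


page = (
    "<html><body><h1>Beispiel GmbH</h1>"
    "<p>Founded in 2005. 120 employees at 3 locations.</p></body></html>"
)
words = first_words_after_h1(page)
print(len(words), count_numeric_facts(words))
```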

📝 Quiz: Module 3

Test your understanding — 5 questions, 70% to pass.

Question 1: What happens when robots.txt returns a 404?

AI crawlers crawl the entire website
AI crawlers crawl conservatively and collect less data
The website is deindexed immediately
It has no effect

Without a robots.txt, AI crawlers interpret the situation as "unclear" and may avoid entire areas.

Question 2: What are the 4 golden rules for an effective llms.txt?

Keywords, backlinks, meta tags, alt texts
Short, creative, dynamic, innovative
HTML, CSS, JavaScript, PHP
Fact density, citable sentences, consistency, max. 4,000 tokens

The 4 rules: fact density over marketing speak, citable sentences, consistency with the website, maximum 4,000 tokens.

Question 3: What is the alt-text formula for Visual GEO?

Subject + Context + Material/Property + Brand
Keyword + Keyword + Keyword
File name + Size + Format
"Image" + Number

The Alt-Text Formula: [Subject] + [Context] + [Material/Property] + [Brand], for maximum information density.

Question 4: Which JSON-LD schema types are required?

Recipe, Event, HowTo
Organization alone is sufficient
Organization, WebSite, BreadcrumbList, FAQPage
There are no required schemas

The base set: Organization, WebSite, BreadcrumbList, FAQPage, plus Person and Product/Service as high priority.

Question 5: Why are the first 100 words after the H1 so important?

Google only displays the first 100 words
AI agents preferentially extract answers from there
Browsers only render the first 100 words
The reasons are purely aesthetic

Answer Density: AI agents preferentially extract answers from the first 50–100 words, so marketing phrases there waste citable space.

About the Author

Sascha Deforth, GEO Practitioner and Founder of TrueSource AI. Specialized in AI Visibility Optimization with 200+ audits completed.