Academy Module 3
Module 3 of 6

The 5 Pillars of Technical Implementation

⏱ ~120 min · 📖 5 lessons · 📝 5 quiz questions

🎯 Learning Objective

You will know the 5 technical pillars of GEO, be able to implement each pillar practically, and understand what files, code structures and configurations need to be created.

Lesson 3.1: Pillar 1 — Crawler Control (robots.txt for AI)

For decades, robots.txt has been the standard way to tell web crawlers which areas of a site they may visit. What has changed: a new generation of AI crawlers now needs to be addressed explicitly.

The Most Important AI Crawlers

Crawler | Operator | Function
GPTBot | OpenAI | Training + web browsing for ChatGPT
ChatGPT-User | OpenAI | Real-time browsing in chat sessions
ClaudeBot | Anthropic | Claude training + browsing
PerplexityBot | Perplexity | Real-time crawling for search queries
Google-Extended | Google | Gemini training
Meta-ExternalAgent | Meta | Meta AI training

Best Practice — robots.txt Template

# AI Engine Crawlers — Allowed
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

Sitemap: https://example.com/sitemap.xml
# llms.txt: https://example.com/llms.txt
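
Before deploying a template like this, it can help to confirm that the rules say what you intend. The following is a minimal sketch using Python's standard-library robots.txt parser; the constant ROBOTS_TXT is a shortened copy of the template, the helper name check_access is made up for this example, and example.com is a placeholder domain.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the template; example.com is a placeholder domain.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def check_access(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Return, per crawler name, whether the given rules allow fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


# The template should allow every listed crawler ...
result = check_access(ROBOTS_TXT)
# ... while a leftover global block should deny all of them.
blocked = check_access("User-agent: *\nDisallow: /")

print(result)
print(blocked)
```

This only checks how the rules parse. It says nothing about whether a firewall or CDN lets the crawler through in the first place.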

The 3 Most Common Mistakes

  1. 404 on robots.txt: AI crawlers receive no explicit signal and may crawl conservatively, which means less data and fewer citations.
  2. Global "Disallow: /": blocks everything; often a leftover from the development or staging phase.
  3. Network-level bot protection: Cloudflare, Akamai and similar services can block bots before they ever reach the robots.txt.

Practice Tip

Test with: curl -A "GPTBot/1.2" -I https://example.com/ (a 403 Forbidden response suggests that network-level protection is blocking the bot).
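
The three mistakes can also be checked together once the HTTP results are known. The sketch below deliberately leaves out the network requests so it runs offline: it assumes you already have the status code and body of /robots.txt and the status code of the homepage when requested with a bot user agent (for example from the curl call above). The function name diagnose and its parameters are illustrative, not part of any tool.

```python
from urllib.robotparser import RobotFileParser


def diagnose(robots_status: int, robots_body: str, bot_status: int,
             agent: str = "GPTBot") -> list[str]:
    """Map observed HTTP results onto the three common mistakes.

    robots_status / robots_body: response for /robots.txt
    bot_status: status code of the homepage when requested with the bot's user agent
    """
    findings = []
    if robots_status == 404:
        findings.append("robots.txt missing (404)")
    elif robots_status == 200:
        parser = RobotFileParser()
        parser.parse(robots_body.splitlines())
        if not parser.can_fetch(agent, "/"):
            findings.append(f"robots.txt disallows / for {agent}")
    if bot_status == 403:
        findings.append("403 for bot user agent: possible network-level block")
    return findings


# Example: a leftover global block combined with a firewall rule.
print(diagnose(200, "User-agent: *\nDisallow: /", 403))
```

A 403 can have other causes as well (for example geo-blocking), so treat the last finding as a hint to check the CDN or firewall configuration rather than as proof.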

Lesson 3.2: Pillar 2 — llms.txt — The AI Business Card

The /llms.txt file is a proposed standard (llmstxt.org, Jeremy Howard, fast.ai). It is a Markdown file served from the site root that gives language models a dense, fact-focused briefing: what sitemap.xml is for search engines, llms.txt is meant to be for LLMs.

Structure of a Professional llms.txt

# Brand Name
> One-sentence mission with maximum semantic density.

## About Us
Short profile: founding year, location, industry, team size.

## Products / Services
- **Product A**: Fact-based description, USP, price range
- **Product B**: Technical specifications, results

## Expertise & Credentials
ISO 9001, awards, reference clients

## FAQ
- **How much does [Product] cost?** From €X/month.
- **Who is it suitable for?** [Specific target audience].

## Contact
Website, email, location, management

The 4 Golden Rules

  1. Fact density over marketing speak. "120 employees at 3 locations" instead of "a dynamic team".
  2. Citable sentences. Every sentence should be quotable verbatim in an AI answer.
  3. Consistency with the website. If llms.txt and the website contradict each other, an AI system has no reason to trust either source.
  4. Maximum 4,000 tokens (approx. 2-3 pages).
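
Two of these rules, the basic structure and the token limit, can be checked mechanically. The sketch below is a rough linter, not a validator for the llmstxt.org proposal. It assumes about four characters per token, which is only a rule of thumb: real tokenizers differ by model and by language. The function name check_llms_txt and the sample text are invented for this example.

```python
def check_llms_txt(text: str, max_tokens: int = 4000) -> dict:
    """Run a few structural checks on an llms.txt draft."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumption: ~4 characters per token. This is a heuristic, not a real tokenizer.
    estimated_tokens = len(text) // 4
    return {
        "has_title": bool(lines) and lines[0].startswith("# "),
        "has_summary": any(line.startswith("> ") for line in lines),
        "sections": [line[3:] for line in lines if line.startswith("## ")],
        "estimated_tokens": estimated_tokens,
        "within_budget": estimated_tokens <= max_tokens,
    }


# Hypothetical draft, shortened from the structure shown earlier.
SAMPLE = """\
# Brand Name
> One-sentence mission with maximum semantic density.

## About Us
Founded 2005, Düsseldorf, 120 employees.

## Contact
https://example.com
"""

report = check_llms_txt(SAMPLE)
print(report)
```

Fact density, citability and consistency with the website still need a human read-through; a script like this only catches formal slips.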

Lesson 3.3: Pillar 3 — Structured Data (JSON-LD Schema)

JSON-LD is a W3C standard and the format Google recommends for machine-readable structured data. It provides an unambiguous "translation" of the page contents.

Analogy: HTML without JSON-LD is like a book without a table of contents. The information is in there somewhere, but you have to read everything.

Required Schemas

Schema Type | Purpose | Priority
Organization | Who is the company? | 🔴 Required
WebSite | What is the website? | 🔴 Required
BreadcrumbList | Page structure | 🔴 Required
FAQPage | Common questions | 🔴 Required
Person | Experts | 🟡 High
Product / Service | Offerings | 🟡 High

Example: Organization Schema

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Beispiel GmbH",
  "url": "https://beispiel.de",
  "foundingDate": "2005-03-15",
  "numberOfEmployees": { "@type": "QuantitativeValue", "value": 120 },
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Düsseldorf",
    "addressCountry": "DE"
  },
  "sameAs": [
    "https://linkedin.com/company/beispiel",
    "https://de.wikipedia.org/wiki/Beispiel_GmbH"
  ]
}
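
The Organization example covers one of the required types. FAQPage follows the same pattern, with each question wrapped in a Question object that carries an acceptedAnswer. The sketch below builds that structure from question/answer pairs and wraps it in the script element that goes into the page's HTML. The helper names faq_jsonld and to_script_tag are invented for this example, and the question text is a placeholder.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def to_script_tag(data: dict) -> str:
    """Serialize JSON-LD into a <script type="application/ld+json"> element."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so text inside the JSON cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder content; real answers should match the visible FAQ text on the page.
pairs = [("Who is it suitable for?", "Placeholder answer copied from the page.")]
print(to_script_tag(faq_jsonld(pairs)))
```

Generating the markup from the same data that renders the visible FAQ keeps both in sync, which supports the consistency rule from Lesson 3.2.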

Lesson 3.4: Pillar 4 — Visual GEO (Optimizing Images for AI)

Modern AI models are multimodal: GPT-4o, Gemini and Claude can understand image content. Alt texts are therefore an active AI signal.

The Alt-Text Formula

[Subject] + [Context] + [Material/Property] + [Brand]

Alt Text | Quality
"Serum" | ❌ Useless
"Vitamin C serum bottle" | ⚠️ Better
"Vitamin C Glow Serum, 30 ml glass bottle with dropper, MIRI Cosmetics" | ✅ Excellent

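When alt texts are generated from product data, the formula can be applied as a small helper. This is a minimal sketch: the function name build_alt_text and the way the example values are split across the four slots are assumptions made for illustration.

```python
def build_alt_text(subject: str, context: str, prop: str, brand: str) -> str:
    """Join the four formula parts with commas, skipping any that are empty."""
    parts = [p.strip() for p in (subject, context, prop, brand) if p and p.strip()]
    return ", ".join(parts)


alt = build_alt_text(
    subject="Vitamin C Glow Serum",
    context="30 ml",
    prop="glass bottle with dropper",
    brand="MIRI Cosmetics",
)
print(alt)
```

Skipping empty parts avoids stray commas when a product record has no material or brand field.
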
Lesson 3.5: Pillar 5 — Answer Density

AI agents preferentially extract answers from the first 50–100 words after the H1. Placing empty marketing phrases here wastes the most valuable citable space.

Bad Example (80% of all websites)

"Welcome to Example Inc. We are a dynamic company that has been working with passion and innovation for many years..."

Good Example (Answer Density optimized)

"Beispiel GmbH has been the leading DACH provider of industrial filtration technology since 2005. 120 employees at 3 locations, ISO 9001 and ISO 14001 certified. HEPA-14 filters with 99.995% filtration efficiency."

Practice Exercise

Open any company website. Copy the first 100 words after the H1. How many citable facts can you find? Write an optimized version.
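
For a quick first pass at this exercise, the counting can be roughly automated. The sketch below takes the first 100 words of a text and counts words that contain a digit (years, headcounts, certifications, percentages). This is a crude proxy chosen for illustration: it misses facts without numbers, such as locations or product names. The function names and sample strings are invented for this example.

```python
import re


def first_words(text: str, limit: int = 100) -> str:
    """Return the first `limit` whitespace-separated words of `text`."""
    return " ".join(text.split()[:limit])


def count_fact_signals(text: str) -> int:
    """Count words containing a digit: a rough proxy for citable facts."""
    return sum(1 for word in text.split() if re.search(r"\d", word))


# Shortened versions of the bad and good examples from this lesson.
BAD = ("Welcome to Example Inc. We are a dynamic company that has been "
       "working with passion and innovation for many years")
GOOD = ("Beispiel GmbH has been the leading DACH provider of industrial "
        "filtration technology since 2005. 120 employees at 3 locations, "
        "ISO 9001 and ISO 14001 certified. HEPA-14 filters with 99.995% "
        "filtration efficiency.")

for label, intro in (("bad", BAD), ("good", GOOD)):
    print(label, count_fact_signals(first_words(intro)))
```

A low count is a prompt to reread the opening paragraph, not a verdict on its own.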

📝 Quiz: Module 3

Test your understanding — 5 questions, 70% to pass.

Question 1: What happens when robots.txt returns a 404?

  • AI crawlers crawl the entire website
  • AI crawlers crawl conservatively, collecting less data
  • The website is deindexed immediately
  • It has no effect
Explanation: Without a robots.txt, AI crawlers treat the situation as "unclear" and may avoid entire sections.

Question 2: What are the 4 golden rules for an effective llms.txt?

  • Keywords, backlinks, meta tags, alt texts
  • Short, creative, dynamic, innovative
  • HTML, CSS, JavaScript, PHP
  • Fact density, citable sentences, consistency, max. 4,000 tokens
Explanation: The 4 rules are fact density over marketing speak, citable sentences, consistency with the website, and a maximum of 4,000 tokens.

Question 3: What is the alt-text formula for Visual GEO?

  • Subject + context + material/property + brand
  • Keyword + keyword + keyword
  • File name + size + format
  • "Image" + number
Explanation: The alt-text formula is [Subject] + [Context] + [Material/Property] + [Brand], which maximizes information density.

Question 4: Which JSON-LD schema types are required?

  • Recipe, Event, HowTo
  • Organization alone is sufficient
  • Organization, WebSite, BreadcrumbList, FAQPage
  • There are no required schemas
Explanation: The base set is Organization, WebSite, BreadcrumbList and FAQPage, with Person and Product/Service as high priority.

Question 5: Why are the first 100 words after the H1 so important?

  • Google only displays the first 100 words
  • AI agents preferentially extract answers from there
  • Browsers only render the first 100 words
  • The reasons are purely aesthetic
Explanation: Answer density means AI agents preferentially extract answers from the first 50–100 words, so marketing phrases placed there waste citation space.

About the Author

Sascha Deforth, GEO Practitioner and Founder of TrueSource AI. Specialized in AI Visibility Optimization with 200+ audits completed.