Academy Module 3
Module 3 of 7

The 5 Pillars of Technical Implementation

⏱ ~120 min · 📖 5 lessons · 📝 5 quiz questions

🎯 Learning Objective

You will know the 5 technical pillars of GEO, be able to implement each pillar in practice, and understand which files, code structures and configurations need to be created.

Lesson 3.1: Pillar 1 — Crawler Control (robots.txt for AI)

For decades, robots.txt has been the standard way to tell web crawlers which areas of a site they may visit. What has changed: a whole new generation of AI crawlers now has to be addressed explicitly.

The Most Important AI Crawlers

Crawler | Operator | Function
GPTBot | OpenAI | Training + web browsing for ChatGPT
ChatGPT-User | OpenAI | Real-time browsing in chat sessions
ClaudeBot | Anthropic | Claude training + browsing
PerplexityBot | Perplexity | Real-time crawling for search queries
Google-Extended | Google | Gemini training
Meta-ExternalAgent | Meta | Meta AI training

Best Practice — robots.txt Template

# AI Engine Crawlers — Allowed
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# Sitemap location (a real robots.txt directive; replace with your own URL)
Sitemap: https://example.com/sitemap.xml
# llms.txt: https://example.com/llms.txt
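
A template like the one above can be checked before deployment. The following is a minimal sketch, assuming Python's standard-library `urllib.robotparser` is an acceptable stand-in for how a crawler reads the file; real AI crawlers may interpret edge cases differently. The function name `check_ai_access` and the constant names are illustrative, not part of any official tool.

```python
from urllib.robotparser import RobotFileParser

# Illustrative copy of the template above (example.com is a placeholder).
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
"""

# A typical leftover from a development environment (mistake 2 below).
GLOBAL_DISALLOW = "User-agent: *\nDisallow: /\n"

AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def check_ai_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, per AI crawler, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


if __name__ == "__main__":
    for label, content in [("template", ROBOTS_TXT), ("global disallow", GLOBAL_DISALLOW)]:
        print(label)
        for agent, allowed in check_ai_access(content, "https://example.com/").items():
            print(f"  {agent}: {'allowed' if allowed else 'blocked'}")
```

Running the check against both strings makes the difference visible: the template allows every listed crawler, while the global disallow blocks all of them.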

The 3 Most Common Mistakes

  1. robots.txt returns 404: AI crawlers may crawl conservatively, which means less data and fewer citations.
  2. Global "Disallow: /": blocks everything and is often a leftover from development.
  3. Network-level bot protection: Cloudflare or Akamai can block bots before they ever reach robots.txt.

Practice Tip

Run curl -A "GPTBot/1.2" -I https://example.com/ to test. A 403 Forbidden response means the network layer is blocking that user agent.
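
The same check can be scripted. This is a rough sketch using only the Python standard library; the helper names `probe` and `interpret_status` and the status-code grouping are assumptions made for illustration. Note that it only sends a spoofed User-Agent header, so it reveals user-agent-based blocking, not rules that depend on the crawler's real IP ranges.

```python
import urllib.error
import urllib.request


def interpret_status(status: int) -> str:
    """Map an HTTP status code to a rough crawler-access diagnosis."""
    if 200 <= status < 300:
        return "reachable"
    if status in (401, 403, 429):
        return "blocked or throttled at the network or application layer"
    if status == 404:
        return "not found"
    return "inconclusive"


def probe(url: str, user_agent: str = "GPTBot/1.2", timeout: float = 10.0) -> int:
    """Send a HEAD request with the given User-Agent and return the status code."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still the answer we want.
        return error.code


# Example (performs a real network request, so it is left commented out):
# code = probe("https://example.com/")
# print(code, interpret_status(code))
```

Comparing the result for an AI user agent with the result for a regular browser user agent shows whether the block is specific to bots.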

Lesson 3.2: Pillar 2 — llms.txt — The AI Business Card

The /llms.txt file is a proposed standard (llmstxt.org, Jeremy Howard, fast.ai). It gives language models a dense fact briefing, comparable to what sitemap.xml provides for search engines.

⚠️ Status Note: llms.txt is a community convention, not an official web standard (no W3C, no IETF RFC). As of 2026, there is no public confirmation that OpenAI, Anthropic, or Google systematically consume this file in production crawls. Nevertheless, it has practical value: it forces a fact-based distillation of your brand that also serves as a foundation for other GEO measures. Implement llms.txt as part of a holistic strategy — not as a standalone solution.

Structure of a Professional llms.txt

# Brand Name
> One-sentence mission with maximum semantic density.

## About Us
Short profile: founding year, location, industry, team size.

## Products / Services
- **Product A**: Fact-based description, USP, price range
- **Product B**: Technical specifications, results

## Expertise & Credentials
ISO 9001, awards, reference clients

## FAQ
- **How much does [Product] cost?** From €X/month.
- **Who is it suitable for?** [Specific target audience].

## Contact
Website, email, location, management

The 4 Golden Rules

  1. Fact density over marketing speak. "120 employees at 3 locations" instead of "a dynamic team".
  2. Citable sentences. Every sentence must work verbatim in an AI answer.
  3. Consistency with the website. If the two contradict each other, the AI trusts neither source.
  4. Maximum 4,000 tokens (approx. 2-3 pages).
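
Some of these rules can be approximated with a small lint script. The sketch below is illustrative only: the four-characters-per-token ratio is a rough rule of thumb for English text rather than a real tokenizer, the filler phrase list is a hypothetical sample, and the function names are made up for this example.

```python
import re

MAX_TOKENS = 4000      # limit from rule 4
CHARS_PER_TOKEN = 4    # assumption: rough average for English text

# Assumption: a small, illustrative blocklist of marketing filler (rule 1).
FILLER_PHRASES = ["dynamic team", "passion", "innovative", "world-class"]


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: character count / CHARS_PER_TOKEN, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def lint_llms_txt(text: str) -> list[str]:
    """Return human-readable warnings for an llms.txt draft."""
    warnings = []
    if not text.lstrip().startswith("# "):
        warnings.append("missing top-level '# Brand Name' heading")
    if not re.search(r"^> .+", text, flags=re.MULTILINE):
        warnings.append("missing '> ' one-sentence mission line")
    tokens = estimate_tokens(text)
    if tokens > MAX_TOKENS:
        warnings.append(f"about {tokens} tokens, above the {MAX_TOKENS} limit")
    lowered = text.lower()
    for phrase in FILLER_PHRASES:
        if phrase in lowered:
            warnings.append(f"marketing filler: '{phrase}'")
    return warnings


if __name__ == "__main__":
    draft = "# Beispiel GmbH\n> Industrial filter technology since 2005.\n"
    print(lint_llms_txt(draft) or "no warnings")
```

Rules 2 and 3 (citable sentences, consistency with the website) still need a human review; a script like this only catches the mechanical issues.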

Lesson 3.3: Pillar 3 — Structured Data (JSON-LD Schema)

JSON-LD is a W3C standard and the preferred format for machine-readable, structured data. It provides an unambiguous "translation" of the page contents.

Analogy: HTML without JSON-LD is like a book without a table of contents. The information is in there somewhere, but you have to read everything.

Required Schemas

Schema Type | Purpose | Priority
Organization | Who is the company? | 🔴 Required
WebSite | What is the website? | 🔴 Required
BreadcrumbList | Page structure | 🔴 Required
FAQPage | Common questions | 🔴 Required
Person | Experts | 🟡 High
Product / Service | Offerings | 🟡 High

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Beispiel GmbH",
  "url": "https://beispiel.de",
  "foundingDate": "2005-03-15",
  "numberOfEmployees": { "@type": "QuantitativeValue", "value": 120 },
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Düsseldorf",
    "addressCountry": "DE"
  },
  "sameAs": [
    "https://linkedin.com/company/beispiel",
    "https://de.wikipedia.org/wiki/Beispiel_GmbH"
  ]
}
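
The other required types follow the same pattern. As a hedged sketch, the Python snippet below builds a schema.org FAQPage object from question/answer pairs and wraps it in the script element that JSON-LD is normally embedded in. The helper names and the sample question and answer are hypothetical.

```python
import json


def faq_page_jsonld(faqs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


def to_script_tag(data: dict) -> str:
    """Serialize data as JSON-LD inside a <script> element for the page head."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a value can never close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical FAQ entry, for illustration only.
    faqs = [("Who is it suitable for?", "Industrial plants with cleanroom requirements.")]
    print(to_script_tag(faq_page_jsonld(faqs)))
```

Generating the markup from the same data that renders the visible FAQ keeps the structured data consistent with the page content.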

Lesson 3.4: Pillar 4 — Visual GEO (Optimizing Images for AI)

Modern AI models are multimodal: GPT-4o, Gemini and Claude can understand image content. Alt texts are therefore an active AI signal.

The Alt-Text Formula

[Subject] + [Context] + [Material/Property] + [Brand]

Alt Text | Quality
"Serum" | ❌ Useless
"Vitamin C serum bottle" | ⚠️ Better
"Vitamin C Glow Serum, 30ml glass vial with pipette, Example Cosmetics" | ✅ Excellent

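The alt-text formula above lends itself to a small helper when alt texts are generated from product data. This is a minimal sketch; the function names, the comma separator and the image path are assumptions for illustration, and empty parts are simply skipped.

```python
from html import escape


def build_alt_text(subject: str, context: str, prop: str, brand: str) -> str:
    """Join [Subject] + [Context] + [Material/Property] + [Brand], skipping empty parts."""
    parts = [part.strip() for part in (subject, context, prop, brand)]
    return ", ".join(part for part in parts if part)


def img_tag(src: str, alt: str) -> str:
    """Render an <img> element with safely escaped attribute values."""
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    alt = build_alt_text(
        subject="Vitamin C Glow Serum",
        context="",
        prop="30ml glass vial with pipette",
        brand="Example Cosmetics",
    )
    # "/img/serum.jpg" is a placeholder path.
    print(img_tag("/img/serum.jpg", alt))
```
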
Lesson 3.5: Pillar 5 — Answer Density

AI agents preferentially extract answers from the first 50–100 words after the H1. Placing empty marketing phrases here wastes the most valuable citable space.

Bad Example (80% of all websites)

"Welcome to Example Inc. We are a dynamic company that has been working with passion and innovation for many years..."

Good Example (Answer Density optimized)

"Beispiel GmbH has been the leading DACH provider of industrial filter technology since 2005. 120 employees at 3 locations, ISO 9001 and ISO 14001 certified. HEPA 14 filters with 99.995% filtration efficiency."
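
The contrast between the two examples can be made measurable. The sketch below uses a deliberately crude proxy: it treats every number in the first 100 words (years, headcounts, percentages, norm numbers) as one citable fact. That proxy, the regular expression and the function names are assumptions for illustration; facts without digits are not counted.

```python
import re

# Assumption: a "citable fact" is approximated by a number-like token.
FACT_PATTERN = re.compile(r"\b\d[\d.,]*%?")


def first_words(text: str, limit: int = 100) -> str:
    """Return the first `limit` whitespace-separated words of text."""
    return " ".join(text.split()[:limit])


def count_numeric_facts(text: str, limit: int = 100) -> int:
    """Count number-like tokens within the first `limit` words."""
    return len(FACT_PATTERN.findall(first_words(text, limit)))


BAD_INTRO = (
    "Welcome to Example Inc. We are a dynamic company that has been "
    "working with passion and innovation for many years..."
)
GOOD_INTRO = (
    "Beispiel GmbH has been the leading DACH provider of industrial filter "
    "technology since 2005. 120 employees at 3 locations, ISO 9001 and "
    "ISO 14001 certified. HEPA 14 filters with 99.995% filtration efficiency."
)

if __name__ == "__main__":
    print("bad intro: ", count_numeric_facts(BAD_INTRO))
    print("good intro:", count_numeric_facts(GOOD_INTRO))
```

A count of zero in the opening paragraph is a strong hint that the space is filled with phrases an AI answer cannot quote.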

Practice Exercise

Open any company website. Copy the first 100 words after the H1. How many citable facts can you find? Write an optimized version.

📝 Quiz: Module 3

Test your understanding — 5 questions, 70% to pass.

Question 1: What happens when robots.txt returns a 404?

  • AI crawlers crawl the entire website
  • AI crawlers crawl conservatively, so less data is collected
  • The website is deindexed immediately
  • It has no effect
Without a robots.txt, AI crawlers interpret the situation as "unclear" and may avoid entire areas.

Question 2: What are the 4 golden rules for an effective llms.txt?

  • Keywords, backlinks, meta tags, alt texts
  • Short, creative, dynamic, innovative
  • HTML, CSS, JavaScript, PHP
  • Fact density, citable sentences, consistency, max. 4,000 tokens
The 4 rules: fact density over marketing speak, citable sentences, consistency with the website, maximum 4,000 tokens.

Question 3: What is the alt-text formula for Visual GEO?

  • Subject + Context + Material/Property + Brand
  • Keyword + Keyword + Keyword
  • File name + size + format
  • "Image" + number
The Alt-Text Formula: [Subject] + [Context] + [Material/Property] + [Brand], for maximum information density.

Question 4: Which JSON-LD schema types are required?

  • Recipe, Event, HowTo
  • Organization alone is enough
  • Organization, WebSite, BreadcrumbList, FAQPage
  • There are no required schemas
The base set: Organization, WebSite, BreadcrumbList, FAQPage, plus Person and Product/Service as high priority.

Question 5: Why are the first 100 words after the H1 so important?

  • Google only displays the first 100 words
  • AI agents preferentially extract answers from there
  • Browsers only render the first 100 words
  • The reasons are purely aesthetic
Answer Density: AI agents preferentially extract answers from the first 50–100 words. Marketing phrases placed there are wasted citable space.

About the Author

Sascha Deforth, GEO Practitioner and Founder of TrueSource AI. Specialized in AI Visibility Optimization with 200+ audits completed.