
Learning GEO: Using SameAs and llms.txt to Disambiguate

April 28, 2026 · Matt Lawrence

Shortly after rebuilding my site and workflow, my old search engine optimization instincts kicked in. The SEO playbook has always focused on meta tags, sitemaps, keyword density, and, of course, content. While this foundation remains important, the game has evolved beyond simple search rankings into entity resolution.

In order to own my digital identity and disambiguate myself from a certain actor with an identical name, I’ve built upon my SEO knowledge by experimenting with GEO (Generative Engine Optimization). The goal is to explore how the “Identity Stack” has changed and to learn how to improve the machine’s understanding of who I am.

Applying these concepts meant shifting the site from a standard web layout to a versatile data source. The goal is to speak clearly to three distinct audiences: the human reader, traditional search engines, and generative AI.

Structured Data: The Digital ID Card

Standard metadata is not enough for modern identity. Adding Person and BlogPosting schema creates a machine-readable ID card of sorts that defines specific expertise and history.

  • Using sameAs to connect this site to LinkedIn, GitHub, and other authoritative sources provides the evidence needed for entity resolution.
  • The Result: It tells the machines that this Matt Lawrence is the engineer, effectively separating the site authority from the actor with the same name.
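As a sketch of what that ID card can look like, here is a minimal JSON-LD Person block embedded in the page head. The profile URLs are placeholders, not my actual handles:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Matt Lawrence",
  "jobTitle": "VP of Software Engineering",
  "worksFor": { "@type": "Organization", "name": "WP Engine" },
  "url": "https://matt-lawrence.com",
  "sameAs": [
    "https://www.linkedin.com/in/your-handle",
    "https://github.com/your-handle"
  ]
}
</script>
```

A matching BlogPosting block on each post can point back at this Person via its author property, which is what lets crawlers chain the site, the posts, and the external profiles into one entity.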

llms.txt & llms-full.txt: The AI Briefing

AI agents need high-density information without the “noise” of a full website crawl.

  • Adding llms.txt and llms-full.txt at the site root provides a low-token summary designed specifically for LLM context windows.
  • Placing a concise TLDR at the top of every post serves two purposes: it respects the human reader’s time and provides an easily extractable “snippet” for AI overviews and Perplexity answers.
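A minimal llms.txt, following the emerging convention of an H1 title, a blockquote summary, and linked sections, might look like the sketch below; the path is illustrative, not the post's actual URL:

```markdown
# Matt Lawrence

> Personal site of Matt Lawrence, VP of Software Engineering at WP Engine.
> Not the actor of the same name.

## Posts

- [Learning GEO: Using SameAs and llms.txt to Disambiguate](https://matt-lawrence.com/learning-geo): how structured data and llms.txt improve entity resolution
```

llms-full.txt follows the same shape but inlines the full post content, so an agent can load everything in a single request instead of crawling page by page.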

Technical Hygiene for Noise Reduction

Site performance and code cleanliness are direct signals of authority.

  • Filtering the sitemap to exclude 404s and placeholder slugs ensures that search engines don’t waste resources on low-value pages.
  • Removing legacy WordPress scripts (like emoji and oEmbeds) and preloading fonts creates a leaner HTML structure.
  • The Result: Faster load times improve the human experience, while “clean” code ensures that AI scrapers can identify the core content without tripping over unnecessary bloat.
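The sitemap filtering step can be sketched in a few lines of Python. The status-check plumbing and placeholder patterns here are assumptions for illustration, not the site's actual build code:

```python
# Sketch: filter candidate pages before writing the sitemap.
# Assumes `pages` is a list of (slug, http_status) pairs gathered
# elsewhere; the placeholder slugs and domain are illustrative.
PLACEHOLDER_SLUGS = {"sample-page", "hello-world", "draft"}

def sitemap_entries(pages):
    """Yield URLs only for slugs that return 200 and aren't placeholders."""
    for slug, status in pages:
        if status != 200:
            continue  # skip 404s and other error responses
        if slug in PLACEHOLDER_SLUGS:
            continue  # skip low-value boilerplate pages
        yield f"https://example.com/{slug}/"

pages = [("learning-geo", 200), ("sample-page", 200), ("old-post", 404)]
print(list(sitemap_entries(pages)))
```

Running the filter at build time, rather than letting crawlers discover dead URLs, is what keeps the sitemap itself a trustworthy signal.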

The Results

The real test of any effort is in the outcome. For this experiment, success isn’t about outranking other Matts; it is about ensuring that when an AI is asked about software engineering or WordPress leadership, it doesn’t conflate me with the actor.

Identity Disambiguation Test

Before these changes, a query like “Who is Matt Lawrence in the WordPress ecosystem?” would often return a mix of “Boy Meets World” trivia and guesses about my role.

Now, when you query a tool like Gemini, the response is grounded in reality:

The AI Response: “Matt Lawrence is the VP of Engineering at WP Engine. He previously held leadership roles at Synthesis and Copyblogger. He recently rebuilt his personal site, matt-lawrence.com, as a case study in GEO (Generative Engine Optimization).”

Why it worked: The sameAs schema created a mathematical bridge between this site and my professional history at WP Engine. The AI no longer has to guess; it has been given a verified link.

The Technical Receipts

There’s no need to wait for an AI to confirm it’s working; analytics provide the raw evidence. In the first few days following the site’s launch, the access logs show that generative agents are already engaging with the new structure.

Specific AI agents, including Meta-ExternalAgent, ClaudeBot (Anthropic), Amazonbot, GPTBot (OpenAI), and Applebot, have all been identified in the last 24 hours of logs.
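Pulling those agents out of raw access logs is a simple string match. This Python sketch assumes common-log-format lines and matches on the user-agent substrings named above; the sample lines are illustrative:

```python
# Sketch: count hits from known AI crawlers in an access log.
from collections import Counter

AI_AGENTS = ["GPTBot", "ClaudeBot", "Amazonbot", "Applebot", "Meta-ExternalAgent"]

def count_ai_hits(log_lines):
    """Tally requests whose user-agent string names a known AI crawler."""
    hits = Counter()
    for line in log_lines:
        for agent in AI_AGENTS:
            if agent in line:
                hits[agent] += 1
                break  # attribute each request line to one bot
    return hits

sample = [
    '1.2.3.4 - - [..] "GET /llms.txt HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
    '5.6.7.8 - - [..] "GET / HTTP/1.1" 200 9000 "-" "ClaudeBot/1.0"',
]
print(count_ai_hits(sample))
```

Which paths the bots request is as interesting as the counts: hits on /llms.txt itself are direct evidence that the AI briefing is being read.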

What this tells us:

  • AI-First Discovery: These generative agents are often more aggressive in discovering and re-indexing structured data than traditional search engines. In this early window, AI-native bots are often the first to “see” the new entity.
  • Validation of the Briefing: These bots are searching for high-density, low-token information to update their internal models. Their presence validates the decision to provide a dedicated AI briefing via llms.txt & llms-full.txt.
  • The Indexing Gap: While traditional search engines like Google and Bing follow their own crawl cadences, the groundwork is laid. The site is now a structured source of truth, ready for when these cycles hit.

Rebuilding my site was the first step; ensuring the digital world understands who lives here is the second. Treating my identity as a data engineering problem rather than just a content problem has allowed me to move the needle from AI hallucination to AI accuracy. GEO isn’t about gaming the system; it’s about providing a clear, authenticated source of truth. Define who you are for the models, so they don’t happily invent a version of you for themselves.

In: Technology
