GEO & AI Bots: What Your Logs Reveal and Why It’s Game-Changing


There’s increasing talk of GEO, AI Search, and AEO. Of visibility in ChatGPT, presence in Perplexity, and recommendations in Google AI Mode.

The field is evolving rapidly. Best practices, not so much. In many teams, people continue to analyze these new environments using tools designed for the search landscape of the past. These tools are useful, but they fail to capture part of what’s actually happening.

In SEO, we’ve long relied on impressions, clicks, and sessions. This model remains solid. But it’s no longer enough to explain how content circulates - or doesn’t circulate - in generative environments.

We’re talking less about rankings. More about understanding. About the ability to be rephrased, reformulated, or integrated into an answer.

GA4 depends on the browser. Search Console remains focused on Google Search. Part of the activity - particularly on the AI crawler side - escapes these tools. Server logs naturally come back into the discussion. Not as something new. Rather as something we’re rediscovering with a fresh perspective.

A simple read, but one that sets the record straight

Logs don’t tell the whole story.

They don’t reveal whether a page will be cited in an AI response. They don’t directly measure a piece of content’s influence. And they don’t correspond to the concept of a “session” as tracked in analytics tools.

However, they do show what’s happening on the crawling side: who’s crawling, where, how often, and with what type of response. It’s a fairly raw read, but one that’s hard to replace.
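To make this concrete, here is a minimal sketch of that kind of raw read in Python, assuming access logs in the standard combined format (the nginx/Apache default). The bot list and the access.log path are illustrative placeholders, not an exhaustive reference.

```python
import re

# Combined log format: ip - user [time] "METHOD url proto" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Known AI-related crawlers (partial, illustrative list).
AI_BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "PerplexityBot",
           "ClaudeBot", "Google-Extended", "CCBot", "Bytespider"]

with open("access.log", encoding="utf-8", errors="replace") as f:  # placeholder path
    for line in f:
        m = LOG_PATTERN.match(line)
        if not m:
            continue
        agent = m.group("agent")
        bot = next((b for b in AI_BOTS if b in agent), None)
        if bot:
            # Who is crawling, what they requested, and the response they got.
            print(f"{m.group('time')}  {bot:15}  {m.group('status')}  {m.group('url')}")
```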

What we see when we look

When you look at your logs, there’s often a slight disconnect. The site doesn’t just live to the rhythm of its users. It’s constantly being explored by a whole range of agents: search engines, social bots, external tools, specialized crawlers, and AI agents with varying degrees of identification.

These visits don’t follow the same patterns as human navigation. Some areas are heavily trafficked. Others are hardly ever visited. Some pages are revisited regularly. Others remain on the periphery.

This allows SEO/GEO teams and content managers to move beyond a purely theoretical view of the site’s structure. You can see what is actually being browsed.

On an e-commerce site analyzed recently, a single day’s log extract already shows this discrepancy. On one side, there is an overview of the crawl; on the other, a breakdown between users and bots, the volume of unique URLs explored, and the number of bots detected. This type of view quickly sets the record straight: a significant portion of the site’s activity completely escapes traditional audience tracking tools.
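A first-pass version of that daily summary can be scripted in a few lines. The sketch below assumes the same combined log format and separates bots from humans with a crude user-agent keyword heuristic; a real setup would verify crawlers more rigorously (published IP ranges, reverse DNS).

```python
import re
from collections import defaultdict

# Request line, status, and the final quoted user-agent from a combined-format log line.
LINE_RE = re.compile(r'"\S+ (?P<url>\S+) [^"]*" \d{3} .*"(?P<agent>[^"]*)"$')

# Crude heuristic: anything matching these hints is counted as a bot.
BOT_HINTS = ("bot", "crawler", "spider", "gptbot", "perplexity", "claudebot")

hits = {"bots": 0, "humans": 0}
urls_by_bot = defaultdict(set)   # keyed by the full user-agent string

with open("access.log", encoding="utf-8", errors="replace") as f:  # placeholder path
    for line in f:
        m = LINE_RE.search(line)
        if not m:
            continue
        url, agent = m.group("url"), m.group("agent")
        if any(hint in agent.lower() for hint in BOT_HINTS):
            hits["bots"] += 1
            urls_by_bot[agent].add(url)
        else:
            hits["humans"] += 1

total = sum(hits.values()) or 1
print(f"Bot share of all hits:        {hits['bots'] / total:.0%}")
print(f"Distinct bot user-agents:     {len(urls_by_bot)}")
print(f"Unique URLs explored by bots: {len(set().union(*urls_by_bot.values()))}")
```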

From theory to practice

Simply saying that “bots crawl the site” doesn’t tell us much. Logs allow us to dig a little deeper.

We can observe volumes, spikes, and distributions. Some areas attract the most attention, while others fly under the radar.

Certain patterns become clearer: foundational content designed for SEO or AEO that receives few visits, alongside secondary areas that capture a significant portion of the crawl. We also see behaviors that are highly concentrated over time.

Looking a bit more closely, these behaviors become clear. Some bots generate several hundred requests on distinct URLs, while others act more sporadically. Activity can also be concentrated within a very short time window, corresponding to intensive crawl phases that are very different from user traffic. Here, we move beyond a general analysis to something much more precise: which bots are actually visiting, at what frequency, and on which parts of the site.
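As an illustration, the sketch below breaks the crawl down per bot: request count, distinct URLs, and the busiest hour for each agent. It assumes combined-format logs with the default timestamp, and matches against a short, non-exhaustive list of known AI crawlers.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "\S+ (?P<url>\S+) [^"]*" \d{3} .*"(?P<agent>[^"]*)"$'
)

# Partial, illustrative list of AI crawlers to look for in the user-agent.
AI_BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "PerplexityBot",
           "ClaudeBot", "Google-Extended", "CCBot", "Bytespider"]

requests = Counter()
urls = defaultdict(set)
hourly = defaultdict(Counter)

with open("access.log", encoding="utf-8", errors="replace") as f:  # placeholder path
    for line in f:
        m = LINE_RE.search(line)
        if not m:
            continue
        bot = next((b for b in AI_BOTS if b in m.group("agent")), None)
        if not bot:
            continue
        # Default nginx/Apache timestamp, e.g. 10/Jan/2025:14:03:27 +0000
        ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        requests[bot] += 1
        urls[bot].add(m.group("url"))
        hourly[bot][ts.strftime("%Y-%m-%d %Hh")] += 1

for bot, count in requests.most_common():
    peak_hour, peak_hits = hourly[bot].most_common(1)[0]
    print(f"{bot}: {count} hits on {len(urls[bot])} unique URLs, "
          f"peak of {peak_hits} hits around {peak_hour}")
```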

Crawl and usage don't tell the same story

The fact that a bot visits a page does not guarantee that it will be used in a response. And the absence of a visible visit does not mean that content is absent from the generative ecosystem.

Between the two, there are several layers: indexes, pre-integrated data, retrieval systems, and cross-referencing with other sources.

This requires working with two perspectives: what is explored, and what is actually utilized. Logs provide good visibility into the former. The latter requires other approaches: observing responses, analyzing citations, and testing prompts. The value lies in the combination.

The crawl budget becomes clear

One observation comes up quite often : bots spend a lot of time where things are simple.

Filters, parameters, archives, variations… certain areas account for a significant portion of the crawl. Meanwhile, more strategic content remains largely unexplored.

When we group the data by page categories, another type of imbalance emerges. Certain sections account for a significant portion of the hits, simply because they are easy to crawl. Others, though more strategic, remain overlooked. This analysis allows us to quickly identify where the crawl is concentrated… and where it spends less time than we might think.
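One way to approximate this grouping is to bucket bot hits by the first URL path segment, as in the sketch below. Treating that segment as a “page category” is an assumption for illustration; a real analysis would map URLs to the site’s own taxonomy.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

LINE_RE = re.compile(r'"\S+ (?P<url>\S+) [^"]*" \d{3} .*"(?P<agent>[^"]*)"$')

hits_by_section = Counter()

with open("access.log", encoding="utf-8", errors="replace") as f:  # placeholder path
    for line in f:
        m = LINE_RE.search(line)
        if not m or "bot" not in m.group("agent").lower():
            continue
        url = m.group("url")
        section = "/" + urlsplit(url).path.strip("/").split("/", 1)[0]  # first path segment
        # Parameterized URLs (filters, sorting, pagination) often inflate a section.
        if "?" in url:
            section += " (parameterized)"
        hits_by_section[section] += 1

total = sum(hits_by_section.values()) or 1
for section, count in hits_by_section.most_common(10):
    print(f"{section:35} {count:6} hits  ({count / total:.0%} of bot crawl)")
```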

The issue is no longer limited to producing content. It’s also about ensuring that it’s accessible, integrated into the structure, and present in the paths that bots follow.

Friction manifests itself in different ways

Logs reveal fairly basic issues that often go unnoticed elsewhere.

Redirect chains. Recurring errors. Irregular response times. Active pages with low traffic.

Logs also allow you to drill down to the level of each individual request. You can see exactly which URL was accessed, by which bot, with what response code, and how long the response took. We often find pages visited only once, with a single hit and no recrawls during the period. Conversely, some URLs go through redirects or have longer response times, which slows down or complicates crawling.
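Here is a rough sketch of that request-level drill-down. It assumes the log format was extended with the response time as a final field (for example nginx’s $request_time variable), and the 0.5-second threshold is purely illustrative.

```python
import re
from collections import Counter

# Expects the response time as a trailing field after the user-agent
# (e.g. log_format ... '"$http_user_agent" $request_time' in nginx).
LINE_RE = re.compile(
    r'"\S+ (?P<url>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)" (?P<rt>[\d.]+)$'
)

hits_per_url = Counter()
redirected, slow = [], []

with open("access.log", encoding="utf-8", errors="replace") as f:  # placeholder path
    for line in f:
        m = LINE_RE.search(line)
        if not m or "bot" not in m.group("agent").lower():
            continue
        url, status, rt = m.group("url"), m.group("status"), float(m.group("rt"))
        hits_per_url[url] += 1
        if status.startswith("3"):
            redirected.append((url, status))
        if rt > 0.5:                       # arbitrary illustrative threshold
            slow.append((url, rt))

single_hit = [u for u, c in hits_per_url.items() if c == 1]
print(f"URLs crawled exactly once during the period: {len(single_hit)}")
print(f"Bot requests answered with a redirect (3xx): {len(redirected)}")
print(f"Bot requests slower than 0.5 s:              {len(slow)}")
```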

This level of detail changes the nature of decision-making. We no longer fix “the site” in a general way. We intervene on very specific pages and behaviors.

Another way to interpret performance

Traffic metrics remain important. They aren’t going away.

But with AI Search, part of visibility is built outside of clicks - in the way content is reposted, rephrased, and mentioned.

This adds another layer. We continue to track business results. But we also look at presence in search results, the topics where the brand appears, and how content circulates. Logs don’t directly measure this visibility. They let us see whether the site is part of these agents’ exploration paths.

Putting logs back where they belong

Logs alone are not enough to drive a GEO strategy. They do not, on their own, allow us to understand what is happening in the generative responses.

But they provide a concrete foundation. They show what is being explored and what is not. The over-targeted areas. The under-targeted pages. The friction points.

In an environment that is still partially opaque, this type of analysis becomes hard to ignore.

The approach remains fairly simple: observe the crawl as it is, identify imbalances, adjust the structure, track changes, and then cross-reference with what’s happening on the responses and business sides.

AI bots are already here. The question is whether we’re really looking at what they’re doing.



Céline Naveau

Céline is the co-founder of Semactic, Europe’s leading SEO activation platform and a pioneer in Generative Engine Optimization (GEO). With over 10 years of experience in SEO, she combines deep expertise as a consultant — particularly for e-commerce and news websites — with a forward-thinking approach to the future of search. Prior to founding Semactic, Céline led a team of specialists in search marketing, social ads, and analytics at a top Belgian digital agency. She also held key marketing and project management roles in both national and international companies. Today, she is shaping the next generation of organic visibility strategies, where SEO and GEO converge to give digital teams strategic control and measurable impact.