Sacha Lefebvre, Founder, Paid Ai Search

How AI Engines Choose Which Brands to Recommend

A buyer types “best running shoe for flat feet under $150” into ChatGPT. The model returns two brands by name. Not a list of ten. The names aren’t random and they aren’t paid. They’re the output of a pipeline the model runs every time a buyer asks a question with buying intent attached.

This post is about that pipeline. Understand the mechanism, and the operational decisions about what to fix on your store this week stop looking like guesses.

The pipeline

When ChatGPT, AI Mode, AI Overviews, or Copilot receives a product-discovery query, the response comes out of three steps. Retrieval pulls candidate sources the model can read. Ranking weights those sources against the query. Synthesis assembles the answer with citations attached.

Different engines run the same three steps with different weights and different retrieval corpora. ChatGPT and Copilot share OpenAI training data but diverge in retrieval (ChatGPT Search vs Bing). AI Mode and AI Overviews are Google products. Perplexity weights primary publication citations more heavily than first-party copy.

A site not in an engine’s retrieval corpus is invisible to that engine. A site that blocks OAI-SearchBot cannot be surfaced by ChatGPT Search (GPTBot, by contrast, is OpenAI’s training crawler). A site blocked from Googlebot is invisible to AI Mode and AI Overviews. Retrieval is the only step with a binary fix.
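The three steps can be sketched as a toy pipeline. This is an illustration only, not how any engine is implemented: the `Source` class, the `crawlable` flag, the word-overlap score, and the example URLs are all assumptions made for the sketch. The point it shows is that retrieval is a gate. A blocked source never reaches ranking, however well its text matches.

```python
from dataclasses import dataclass


@dataclass
class Source:
    url: str
    text: str
    crawlable: bool  # hypothetical flag: could the engine's crawler fetch it?


def retrieve(sources):
    """Retrieval: keep only the sources the engine could fetch (a binary gate)."""
    return [s for s in sources if s.crawlable]


def rank(candidates, query):
    """Ranking: order candidates by a toy score (shared words with the query)."""
    query_words = set(query.lower().split())

    def score(source):
        return len(query_words & set(source.text.lower().split()))

    return sorted(candidates, key=score, reverse=True)


def synthesize(ranked, top_n=2):
    """Synthesis: name the top sources, returned here as a list of citations."""
    return [s.url for s in ranked[:top_n]]


def answer(sources, query, top_n=2):
    return synthesize(rank(retrieve(sources), query), top_n)


sources = [
    Source("https://a.example/shoe", "running shoe for flat feet with arch support", True),
    Source("https://b.example/shoe", "best running shoe for flat feet under $150", False),
    Source("https://c.example/shoe", "welcome to our running collection", True),
]
cited = answer(sources, "best running shoe for flat feet")
print(cited)  # b.example matches the query best but is never cited
```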

The six signals

In rough weighting order.

Crawl access. Can the model fetch your PDPs at all? robots.txt is the first check. Allow GPTBot, OAI-SearchBot, ClaudeBot, Google-Extended, PerplexityBot in robots.txt.liquid. Common-bot allowlists from before the AI wave often omit two or three.
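A quick way to audit crawl access is to parse the robots.txt rules and ask which of the crawlers named above are refused for a given PDP. The sketch below uses Python's standard `urllib.robotparser`. The sample rules and the `store.example` URL are made up for illustration, and the parser's matching may differ in edge cases from how each crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The crawler tokens listed in the text above.
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "Google-Extended", "PerplexityBot"]


def blocked_crawlers(robots_txt, url, agents=AI_CRAWLERS):
    """Return the user agents that these robots.txt rules refuse for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical rules: one AI crawler blocked outright, everyone else
# only kept out of the cart.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart
"""

blocked = blocked_crawlers(SAMPLE_ROBOTS, "https://store.example/products/trail-runner")
print(blocked)  # ['GPTBot']
```

Against a live store, calling `set_url()` with the robots.txt address and then `read()` on the parser would replace the inline string.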

Structured data. Product + Offer + AggregateRating schema on every advertised PDP. Shopify ships partial Product schema by default. AggregateRating is frequently injected client-side by review apps, and most AI crawlers do not run JavaScript. If the rating renders only after page load, the crawler never sees it.
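One way to test this is to look at the raw HTML, before any JavaScript runs, and list which schema types are missing. The sketch below uses only the standard library. The sample page is invented, and the required set simply mirrors the three types named above. It reads JSON-LD blocks only, so microdata or RDFa markup would not be detected.

```python
import json
from html.parser import HTMLParser

REQUIRED_TYPES = {"Product", "Offer", "AggregateRating"}


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD rather than fail the audit


def collect_types(node, found=None):
    """Walk nested JSON-LD and gather every @type value."""
    found = set() if found is None else found
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


def missing_schema_types(raw_html):
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    return REQUIRED_TYPES - collect_types(parser.blocks)


# Hypothetical server-rendered PDP: the review widget is an empty div
# that a script would fill in later.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Trail Runner",
 "offers": {"@type": "Offer", "price": "129.00", "priceCurrency": "USD"}}
</script>
</head><body><div id="reviews"></div></body></html>
"""

missing = missing_schema_types(SAMPLE_HTML)
print(sorted(missing))  # ['AggregateRating']
```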

Third-party citation density. Mentions of your brand in publications the model trusts. The brands ChatGPT names in DTC apparel today were profiled in Modern Retail, Digiday, Adweek, Reuters product coverage, or founder interviews between 2022 and 2025, at three to five mentions per year. Publication quality matters more than count. One Modern Retail feature outweighs 50 mentions in SEO content mills.

Freshness. The model checks the Last-Modified header, the content hash, and the citation network around a page. A footnote that says “Updated May 2026” without changing the page doesn’t move the signal. Refreshing the actual content (new specs, new pricing, new copy) does.
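A rough way to tell a real refresh from a date-stamp edit is to fingerprint the page with the stamp removed. This is a sketch under a loud assumption: nothing here is known about how any engine computes its content hash, and the `STAMP` pattern and sample pages are made up. It is a self-check for your own publishing process, not a model of the engine.

```python
import hashlib
import re

# Hypothetical pattern for stamps such as "Updated May 2026".
STAMP = re.compile(r"(last\s+)?updated:?\s+[A-Za-z]+\s+\d{4}", re.IGNORECASE)


def content_fingerprint(page_text):
    """Hash the page text after removing date stamps and collapsing whitespace."""
    without_stamp = STAMP.sub("", page_text)
    normalized = " ".join(without_stamp.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


old_page = "Trail Runner. 10mm drop, $129. Updated March 2026"
stamp_only = "Trail Runner. 10mm drop, $129. Updated May 2026"
real_refresh = "Trail Runner. 8mm drop, $119. Updated May 2026"

# Same fingerprint: only the stamp changed.
print(content_fingerprint(old_page) == content_fingerprint(stamp_only))
# Different fingerprint: specs and price changed.
print(content_fingerprint(old_page) == content_fingerprint(real_refresh))
```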

Brand-name consistency. “Allbirds” on the homepage, “Allbirds, Inc.” in the schema, “Allbirds Footwear” in Modern Retail, “All Birds” in two old Reddit threads. The model dedupes across sources; each variant lowers its confidence. One canonical name, same spelling and casing, across every surface.
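The cleanup starts with an inventory: where does each spelling appear? The sketch below takes a hand-collected mapping of surface to brand name and flags anything that differs from the canonical form. The surface labels are hypothetical, and the names are the variants from the example above.

```python
from collections import Counter


def audit_brand_names(mentions, canonical):
    """Count each spelling and list the surfaces that stray from `canonical`."""
    counts = Counter(mentions.values())
    off_canonical = {
        surface: name for surface, name in mentions.items() if name != canonical
    }
    return counts, off_canonical


# Hypothetical inventory, collected by hand from each surface.
mentions = {
    "homepage": "Allbirds",
    "product schema": "Allbirds, Inc.",
    "Modern Retail feature": "Allbirds Footwear",
    "old Reddit thread": "All Birds",
}

counts, to_fix = audit_brand_names(mentions, canonical="Allbirds")
print(len(counts))     # 4 distinct spellings
print(sorted(to_fix))  # the surfaces that need a cleanup or a correction request
```

The comparison is deliberately exact, including case and punctuation, since the goal stated above is one spelling and casing everywhere.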

Literal intent match. The model compares the buyer’s query to the first paragraph of the candidate page. “Welcome to our running collection, hand-crafted in Portland since 2015” doesn’t match “running shoe with 10mm drop for flat feet.” Answer-first prose isn’t a copywriting style choice. It’s a ranking signal.
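To see the gap in numbers, a crude proxy is the share of query terms that appear in a page's first paragraph. The engines' actual matching is not public, so the function below is an assumption-laden stand-in: the stopword list is arbitrary, and the answer-first paragraph is a hypothetical rewrite. It is useful for comparing two drafts of the same PDP, not for predicting rankings.

```python
import re

# Small, arbitrary stopword list for the sketch.
STOPWORDS = {"a", "an", "the", "for", "with", "in", "to", "of", "our", "and", "under"}


def terms(text):
    """Lowercase alphanumeric tokens, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def query_coverage(query, first_paragraph):
    """Fraction of the query's terms that appear in the first paragraph."""
    query_terms = terms(query)
    if not query_terms:
        return 0.0
    return len(query_terms & terms(first_paragraph)) / len(query_terms)


query = "running shoe with 10mm drop for flat feet"
brand_story = "Welcome to our running collection, hand-crafted in Portland since 2015"
answer_first = "A running shoe with a 10mm drop and firm arch support, built for flat feet."

print(round(query_coverage(query, brand_story), 2))   # 0.17
print(round(query_coverage(query, answer_first), 2))  # 1.0
```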

What you control, what you don’t

Five of the six signals are directly controllable. Crawl access is a robots.txt edit. Structured data is an engineering ticket per template. Freshness is a publication cadence. Brand-name consistency is a copy and metadata cleanup. Literal intent match is a PDP copywriting rewrite.

Third-party citation density is the slow one. You don’t control whether Modern Retail covers you. You control whether you give them something worth covering. One quarterly cycle: pitch, deliver, repeat.

What you do not control: the ranking weights themselves. OpenAI, Google, Anthropic, Microsoft, and Perplexity adjust ranking quietly and frequently. The defensive move is the same as it was for traditional search: diversify across signals likely to keep mattering. Brands leaning on one signal (citation density only, or schema only) are more exposed to any weight shift.

Correction policy: if anything in this post is wrong, we’ll fix it publicly with a date-stamped note. Email corrections to support@paidaisearch.com.

Frequently asked questions

How does ChatGPT decide which brands to recommend?

A three-step pipeline. Retrieval pulls candidate sources the model can read. Ranking weights those sources against six signals: crawl access, structured data, third-party citation density, freshness, brand-name consistency, and literal intent match between the buyer's question and the page's first paragraph. Synthesis assembles the answer with citations. The brand that scores best on the stack gets named in the answer.

Why does the same query return different brands in ChatGPT, AI Mode, and Copilot?

Different ranking weights, different retrieval corpora. ChatGPT and Copilot share OpenAI training data but diverge on retrieval (ChatGPT Search vs Bing). AI Mode and AI Overviews are Google products. Perplexity weights primary publication citations more heavily than first-party copy. The practical takeaway: measure across multiple engines, not just ChatGPT.

Can I influence which brands an AI engine recommends?

Five of the six signals are directly controllable. Crawl access is a robots.txt edit. Schema is engineering work on PDP templates. Freshness is a publication cadence decision. Brand-name consistency is a copy and metadata cleanup. Intent match is a PDP copywriting rewrite. The sixth, third-party citation density, is the slow one: you control whether you give publications something worth covering, one quarterly cycle at a time.

Go deeper

The CRS Encyclopedia covers the full operational framework behind these signals, 28 chapters, free.

Published May 13, 2026