Why Doubling Your SEO Will Not Get AI to Recommend You

The mechanism that decided rankings stopped operating once buyers started asking AI who to choose. Recommendation Design is the discipline that closes the gap.

Flemming Rubak · May 25, 2026 · 14 min read

Key Takeaways

Buyers ask AI who to choose. The SEO playbook that won search does not get you recommended.

AI returns a recommendation, not a list of links. A handful of providers, named, with reasoning. Your brand is either in the answer, in it with caveats, or absent from the conversation that shapes the deal.
Profound’s 1,311-page analysis shows SEO explains only 4 to 7 percent of AI citation variance. The other 93 to 96 percent is content quality and structural signals AI uses to decide who to recommend. SEO is the qualifying round, not the championship.
The unit of work has changed. Search engines ranked documents against queries. AI models construct recommendations against decisions. Different mechanism, different playbook.
Recommendation Design is the new discipline. Observe what AI is recommending in your category. Decode the criteria it uses. Seed the specific evidence AI needs. The first two weeks produce a defensible baseline you can build a quarterly cadence on.

This piece sets out the data, names the discipline, and walks through the first two weeks of practice.

The argument in brief

A B2B SaaS company spends a year doubling its SEO investment after watching organic traffic drop. Rankings improve. Domain authority climbs, but lead volume does not recover. The CRO finally asks the question marketing has been avoiding: are buyers even using Google for this decision anymore?

Their own customers, when surveyed, say they asked ChatGPT first.

This is the new shape of the problem. The buyer used to do their own research, scan a page of links, click three or four, and form a shortlist themselves. Now they ask an AI which providers to consider, and the AI builds the shortlist in a single response. The provider that does not appear in that response does not enter the journey at all.

The mechanism that decides who appears in the AI answer is structurally different from the mechanism that decided who ranked on Google. SEO transfers partially, but the ceiling on what SEO can do is lower than most marketing teams realise. The discipline that operates above that ceiling has not been named yet. This piece names it: Recommendation Design. The work is to observe what AI is recommending in a category, decode the criteria AI is applying, and seed the specific evidence AI needs to recommend a brand.

The case follows. Part 1 sets out what is actually changing in buyer behaviour, with the data. Part 2 explains why SEO playbooks fall short, with the mechanism. Part 3 names the gap, defines it as a measurable quantity, and shows why it has been invisible. Part 4 lays out the discipline that closes it. Part 5 describes the first two weeks of practice.

Part 1: The buying motion has already changed

Three external data points, none of them produced by a vendor with a stake in inflating the AI narrative, describe what is happening.

One. Per the SparkToro and Datos joint research on 41 search platforms, Google searches per US desktop user fell roughly 20 percent year over year from 2024 to 2025. The number of searches per searcher in the EU and UK fell 2 to 3 percent over the same period. Rand Fishkin attributes the US decline to AI answers and AI Overviews satisfying questions before users click into organic results or perform follow-up searches.

Two. The same research shows AI tools as a category make up 3.19 percent of platform searches in the US, and ChatGPT alone reaches 32 to 37 percent of US desktop users in any given month. AI is small in absolute share. It is large and growing in user reach.

Three. The AI traffic that does reach websites converts at very high rates relative to organic search. In a single-client case study published by Seer Interactive, ChatGPT-referred traffic converted at 15.9 percent, against Google Organic at 1.76 percent. That is approximately a 9x advantage on conversion rate. The same study found Perplexity at 10.5 percent, Claude at 5 percent, and Gemini at 3 percent. AI traffic in this case was 0.07 percent of organic volume, and still produced 1,370 conversions, which was 100 percent more than the prior year.
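The "approximately 9x" figure is simple division, and the same arithmetic applies to the other engines in the case study. A minimal sketch, using only the rates quoted above; the variable and function names are illustrative, and the output describes one client, not a benchmark:

```python
# Conversion rates (percent) as quoted from the single-client Seer Interactive case study.
conversion_rates = {
    "ChatGPT": 15.9,
    "Perplexity": 10.5,
    "Claude": 5.0,
    "Gemini": 3.0,
}
google_organic = 1.76  # Google Organic conversion rate (percent) from the same study


def multiple_of_organic(rate: float, baseline: float = google_organic) -> float:
    """Return how many times higher `rate` is than the organic baseline."""
    return rate / baseline


# Round to one decimal place; the inputs do not justify more precision.
multiples = {
    source: round(multiple_of_organic(rate), 1)
    for source, rate in conversion_rates.items()
}

for source, multiple in multiples.items():
    print(f"{source}: {multiple}x Google Organic")
```

On these numbers every AI source in the study converts above the organic baseline, with ChatGPT furthest ahead.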

Three statements follow from this data, and all three have to be held at once.

The volume story is honest. AI is a small share of total search activity. Anyone claiming AI has taken over search is wrong by an order of magnitude.
The trajectory story is honest. Per-user search volume on Google is falling. The audience for AI is rising. The two together describe a channel that is moving faster than the headline-share number suggests.
The value-per-visit story is the one most B2B teams are missing. When an AI-referred buyer arrives at your site, they have already had a research conversation with a model that named you, framed your category, and surfaced specific evaluation criteria. They are not browsing. They are validating.

The honest framing of the opportunity is not “AI is replacing search.” It is “AI is the leading edge of a discovery shift, and the buyers who arrive through it convert at multiples of the ones who arrive through Google.”

A marketing team that builds for that frame asks a different question than the team that doubles its SEO budget. The next part is about why.

Part 2: Why SEO playbooks do not transfer to AI

The strongest empirical case against treating answer engine optimisation (AEO) as “SEO 2.0” is the Profound finding. Across 1,311 pages analysed against citation behaviour in major AI engines, traditional SEO metrics explained only 4 to 7 percent of citation variance. The relationship is real and statistically significant (p < 0.001), but it is weak. Content quality and structural signals AI uses for evaluation drive the other 93 to 96 percent.

This is not a small correction to SEO practice. It is a description of a different mechanism. Three structural differences explain why the playbook stops working at the ceiling the 4 to 7 percent number names.

The first difference is the unit of work. Search engines rank documents against queries. They take a string of words from a user, score every document in their index for relevance to that string, and return an ordered list. The unit of work is the document. The job is to make a document score well.

AI models do not rank documents. They construct recommendations against decisions. A user asks “who should I use for X,” and the model produces a synthesised answer that has already performed the decision work on the user’s behalf. It has named providers, framed criteria, surfaced risks, and shaped the consideration set. The unit of work is the recommendation. The job is to make the recommendation include you, accurately, with the right reasoning.

The second difference is what the engine reads. Per Profound, AI engines read approximately 100 characters from each candidate page before deciding whether to cite it. That snippet is typically derived from the title, meta description, URL, and visible content around the query term. It is the entire pitch. Pages that require deep scrolling or that bury the answer earn nothing.

Search engines have always rewarded thorough content over thin snippets. AI engines, structurally, do the opposite. The first 100 characters are the gate; the other 10,000 words are decoration unless the model decides to fetch deeper.

The third difference is the source surface. Per the same Profound data, ChatGPT and Google overlap on cited domains only 39 percent of the time. Claude 38 percent. Meta AI 36 percent. The implication is that optimising for Google is optimising for less than half of where AI Search happens. A brand that ranks #1 in Google but has no presence in the editorial sources AI engines actually trust (Reddit threads, YouTube transcripts, specialist publications, third-party blogs) is winning in one place and losing in the place the buyer is now spending time.

One more piece of the source-surface story matters. Per Profound, 95.1 percent of AI citations are earned, not owned. Earned means the citation comes from a source the brand does not control. Owned means the citation comes from the brand’s own website. The ratio holds across major engines. A discipline built around owned-content optimisation, which is what SEO largely is, addresses roughly 5 percent of where the recommendation is built.

None of this means SEO is dead. SEO is the qualifying round. If your domain is not indexed by Google, your AI visibility is also dead, because most AI engines crawl the web through Google’s index. SEO buys you eligibility. It does not buy you the recommendation.

The implication for marketing strategy is direct. The 4 to 7 percent number does not say SEO is unimportant. It says the ceiling on what SEO can do for AI visibility is roughly that height. Any incremental SEO investment, after a baseline, buys diminishing returns against the recommendation outcome the buyer actually experiences. The structural reason this ceiling exists, and what happens when an entire category runs the same optimisation playbook against it, is the subject of a companion piece on the AEO paradox. The next dollar should be spent on something else.

Part 3: The Recommendation Gap, and why it has been invisible

The thing that the next dollar should be spent on closing has a name. We call it the Recommendation Gap.

The Recommendation Gap is the distance between what AI is actually recommending in a category and what a brand assumes it is recommending. It is structural. It is now measurable for the first time. And, until very recently, it has been invisible to every tool in a marketer’s stack.

The reason it has been invisible is that it lives in a layer no existing tool instrumented. SEO tools measure rank. Analytics tools measure clicks and conversions. Brand monitoring tools measure mentions in social and press. None of these tools observe what an AI model says when a buyer in your category asks it to recommend a provider. That conversation happens in a system the marketing team has no access to, cannot read, and cannot affect through the channels they already operate.

Three properties of the Recommendation Gap matter for strategy.

It compounds. Per Profound’s analysis, 50 percent of top-cited AI Search content is less than 13 weeks old. AI engines reward recency structurally, not incidentally. A brand that has not been seeding evidence into the recommendation layer for two quarters has fallen further behind than the dashboard suggests, because the recency window keeps moving. Closing the gap a quarter from now is more expensive than closing it today.

It can collapse fast. A May 2026 seoClarity analysis tracked ChatGPT’s citation volume across five markets between February and April 2026. Citations fell 86 to 94 percent in ten weeks. Zero-citation rates roughly tripled. Even when ChatGPT did cite, it cited fewer sources per response. This is a single-engine, single-window finding, and the magnitude almost certainly will not hold as a steady state. The directional read is what matters: the channel that publishers and SEO professionals have been told to optimise for is itself becoming less reliable as a discovery signal over time. The recommendation that AI produces, with or without a citation, is becoming the only thing the buyer experiences.

It is asymmetric. A brand that is in the AI answer wins disproportionately because the consideration set is preformed. The buyer arrives with one of two or three providers already named for them. The brand that is absent does not lose a deal; it loses access to the conversation that produces the deal. The gap between “in the answer” and “absent from the answer” is not a gradient. It is binary.

For the first time, the gap is measurable. The instrument is the same prompt your buyer would ask, run at scale across the AI tools your buyer uses, with the recommendation captured, parsed, and tracked over time. The recommendation is the artifact. What AI describes about you, what it cites for you, what it weighs against you, what false beliefs are operating inside the recommendation, all of these are observable now in a way they were not eighteen months ago. The next question is what to do once you can see them.
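The instrument described above can be sketched as a small capture loop. This is a minimal illustration, not a reference implementation: `Observation`, `run_observation_pass`, and `fake_engine` are hypothetical names, and each engine is assumed to be wrapped in a plain function that takes a prompt and returns the answer text. How that function talks to a given AI tool is deliberately left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass(frozen=True)
class Observation:
    """One captured recommendation: which engine answered which prompt, and when."""
    captured_on: date
    engine: str
    prompt: str
    response_text: str


def run_observation_pass(
    prompts: Iterable[str],
    engines: dict[str, Callable[[str], str]],
    captured_on: date,
) -> list[Observation]:
    """Ask every engine every prompt and keep the raw answer as text."""
    observations = []
    for prompt in prompts:
        for engine_name, ask in engines.items():
            observations.append(
                Observation(captured_on, engine_name, prompt, ask(prompt))
            )
    return observations


def fake_engine(prompt: str) -> str:
    """Stand-in for a real engine wrapper (assumption: yours returns plain text)."""
    return f"For '{prompt}', consider Acme and Globex."


corpus = run_observation_pass(
    prompts=["Which vendors should I shortlist for X?"],
    engines={"engine_a": fake_engine, "engine_b": fake_engine},
    captured_on=date(2026, 5, 25),
)
print(len(corpus), "observations captured")
```

Storing the date with every record is the design choice that matters here: it is what makes month-over-month comparison possible later.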

Part 4: Recommendation Design, the discipline that closes the gap

Recommendation Design is the discipline of being chosen by AI, not just seen by it. It is to AI what SEO was to search. The work is to observe what AI says about your brand in your category, decode the criteria AI is using to construct its recommendation, and seed the specific evidence AI needs to recommend you the way you would want to be recommended.

The loop has three steps and runs continuously.

Observe. Run the prompts your buyers actually use, at scale, across the AI tools they actually use. ChatGPT, Claude, Gemini, and Perplexity cover the customer-facing surface for most B2B categories. Capture how each model describes you, what it cites for you, where you appear, where you are conspicuously missing, what it says about your competitors, what it warns buyers about. Observe is not passive listening. It is deliberate watching, on a cadence, with the recommendation captured as text and stored over time so that month-over-month changes are visible.

Decode. Extract the criteria AI is applying in your category. A useful recommendation is not a list of names. It is a list of names plus the reasoning the model used to produce them. Decode the reasoning. Identify what evidence AI is missing about your brand, what it is weighting against you, what false beliefs are operating inside the recommendation, and at which stage of the buyer journey those beliefs surface. Recommendations made at the Consideration stage are framed differently than those made at the Evaluation, Decision, Retention, and Advocacy stages. Each stage rewards different evidence.

Seed. Plant the specific content that gives AI the evidence it needs to recommend you. The word that matters here is specific. Not generic thought leadership. Specific evidence tied to specific decoded gaps. If the recommendation consistently warns buyers about an objection that does not apply to your offering, the content that closes the gap is a single page that addresses that objection on its own terms, with concrete cases and named outcomes. If the recommendation cites a peer blog favourably and omits you, the content that closes the gap is a piece in the same editorial register, published in a place the model already trusts, that earns a citation. Seeding is creative work, not engineering. It is the work the buyer is hiring you to direct.

The loop is continuous, not linear. The next Observe verifies whether the previous Seed changed the recommendation. The parallel structure is Lean Startup’s Build, Measure, Learn, with the verification step embedded in the next iteration’s Observe rather than named as a separate phase.

Two more properties of the discipline are worth naming.

The five stages matter. A buyer’s journey through an AI-mediated decision compresses but does not collapse. Across the Consideration, Evaluation, Decision, Retention, and Advocacy stages, AI applies different criteria, different evidence requirements, and different recommendation behaviour. Recommendation Design is not a single intervention against a single prompt. It is a five-stage discipline, and the content investments that close the gap differ stage by stage.

Earned beats owned. Per the Profound 95.1 percent finding, the vast majority of AI citations come from sources the brand does not control. A serious Recommendation Design practice spends more time on what gets said about the brand in Reddit threads, YouTube transcripts, niche-publication coverage, and analyst notes than it spends on what gets published on the brand’s own blog. The owned site is the qualifying round. The earned surface is where the recommendation is built.

Part 5: The first two weeks

A team that wants to start Recommendation Design from scratch can produce a defensible baseline and a first action plan in two weeks. The work is sequential, not parallel; each step depends on the one before it. The first week is harder to compress because it requires coordinating with customers. The second week is desk research and moves fast. For teams whose pipeline is softening right now and who need a tactical version of this work that ships in days rather than two weeks, the Emergency Field Guide covers the five rapid moves that raise citation probability immediately, while the baseline below is being built.

Week 1: prompt inventory. Write down 20 to 40 prompts your buyers actually ask AI in your category. Not the prompts your team thinks they should ask. The prompts they do ask. The fastest way to produce this list is to talk to ten recent customers and ask a forward-looking question: “If you were to find a brand in this category today, which questions would you use?” Recall about past prompts is unreliable; the forward-looking framing gives you cleaner data. Record the questions verbatim. If the answer is “I would not ask AI,” record that too. The honest baseline is the share of buyers in your segment who would use AI at all, before any recommendation analysis.

Week 2: observe, decode, seed. The second week is desk research and runs in three steps.

Step 1: observation pass. Run each prompt across ChatGPT, Claude, Gemini, and Perplexity. Capture each recommendation as text. Note who is named, in what order, with what reasoning, at what stage of the journey, with what risk callouts. The output is not a score. It is a corpus of recommendations. Most B2B teams discover three things in this pass that they did not know before. First, their brand is mentioned in fewer prompts than they expected. Second, in the prompts where they are mentioned, the reasoning is not the reasoning they would have chosen. Third, there are competitors named in the recommendation set who do not show up on their internal competitive map at all.
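Two of the notes from this pass, who is named and in what order, can be pulled from the captured text mechanically. The sketch below assumes you supply your own list of brand names to look for. The function names and sample responses are invented for illustration, and word-boundary matching will miss brands whose names end in punctuation. Reasoning, stage, and risk callouts still need a human reader.

```python
import re


def brands_in_order(response_text: str, brands: list[str]) -> list[str]:
    """Return the brands that appear in the text, ordered by first mention."""
    first_positions = {}
    for brand in brands:
        # Whole-word, case-insensitive match; re.escape guards special characters.
        match = re.search(rf"\b{re.escape(brand)}\b", response_text, flags=re.IGNORECASE)
        if match:
            first_positions[brand] = match.start()
    return sorted(first_positions, key=first_positions.get)


def mention_rate(responses: list[str], brand: str) -> float:
    """Share of captured responses that name the brand at least once."""
    if not responses:
        return 0.0
    hits = sum(1 for text in responses if brands_in_order(text, [brand]))
    return hits / len(responses)


# Invented sample responses, standing in for a real corpus.
responses = [
    "Shortlist Globex first for enterprise needs, then Acme for smaller teams.",
    "Most buyers compare Initech and Globex.",
]
order = brands_in_order(responses[0], ["Acme", "Globex", "Initech"])
rate = mention_rate(responses, "Acme")
print(order, rate)
```

A mention rate like this is a presence number, not a verdict. It tells you where to read closely, and unexpected names in the corpus will only surface if you read the responses rather than search them.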

Step 2: decoding pass. Across the corpus, identify the criteria AI is using, the risks it is surfacing, the false beliefs it is repeating, and the gaps in evidence it is operating with. Group criteria by buyer-decision stage, because the criteria at Consideration are not the criteria at Decision. The output is a written diagnosis: here is what AI thinks is true in our category, here is what we wish it knew, here is the specific delta between the two.
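The grouping in this step is easier to keep honest if every finding is recorded in the same shape. A minimal sketch, assuming findings are tagged by hand while reading the corpus; the `Finding` record, the `kind` labels, and the sample notes are all hypothetical:

```python
from collections import Counter
from dataclasses import dataclass

# The five stages named in this piece.
STAGES = ("Consideration", "Evaluation", "Decision", "Retention", "Advocacy")


@dataclass(frozen=True)
class Finding:
    stage: str  # one of STAGES
    kind: str   # illustrative labels: "criterion", "risk", "false_belief", "evidence_gap"
    note: str   # what the reader observed in the recommendation text


def group_by_stage(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Bucket hand-tagged findings by buyer-decision stage."""
    grouped = {stage: [] for stage in STAGES}
    for finding in findings:
        if finding.stage not in grouped:
            raise ValueError(f"Unknown stage: {finding.stage}")
        grouped[finding.stage].append(finding)
    return grouped


# Invented example findings.
findings = [
    Finding("Consideration", "evidence_gap", "No third-party comparison mentions the brand"),
    Finding("Decision", "false_belief", "Model states the product lacks a feature it has"),
    Finding("Decision", "risk", "Model warns about long onboarding"),
]
grouped = group_by_stage(findings)
kinds_at_decision = Counter(f.kind for f in grouped["Decision"])
print(dict(kinds_at_decision))
```

Empty stages stay in the output on purpose: a stage with no findings is either a stage you have not prompted for or one where AI has nothing to say about you, and both are worth knowing.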

Step 3: first seed. Pick the single highest-leverage gap and produce one content asset against it. Not a content series. One asset. The discipline is to verify that a specific seed produces a specific change in the next observation cycle, not to flood the field with content that nobody can attribute to a recommendation change. The asset should match the editorial register of the sources AI already trusts in your category. If AI trusts long-form analytical posts in your space, write one. If AI trusts product comparison pages with named tradeoffs, write one. If AI trusts third-party publications in your vertical, the asset is a pitch to that publication, not a post on your own site.

In practice, the asset is drafted in week 2 and published at the start of week 3. The desk work compresses into two weeks; the re-observation follows once the seed has had time to be discovered by AI, typically days for owned-content seeds and weeks for earned-citation seeds. You can do the work in a week, but you cannot make AI find it faster.

When the seed has had time to be found, observe again. Did the recommendation move on the prompt you seeded against? If yes, the loop is working and you have evidence to commit to a quarterly cadence. If no, the seed was wrong, or it was right but underpowered, and the next iteration sharpens against what the second observation revealed.
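The simplest version of "did the recommendation move" is a before-and-after comparison of presence on the seeded prompt, engine by engine. The sketch below assumes you have already reduced each observation to a yes or no for whether the brand was named. The function name and the sample values are invented, and a presence check says nothing about whether the reasoning improved.

```python
def presence_changes(before: dict[str, bool], after: dict[str, bool]) -> dict[str, str]:
    """Compare brand presence per engine for one prompt across two observation cycles."""
    changes = {}
    for engine in sorted(before.keys() | after.keys()):
        was_named = before.get(engine, False)  # missing engine counts as not named
        is_named = after.get(engine, False)
        if was_named == is_named:
            changes[engine] = "unchanged"
        elif is_named:
            changes[engine] = "gained"
        else:
            changes[engine] = "lost"
    return changes


# Invented values for one seeded prompt: was the brand named in each engine's answer?
baseline = {"ChatGPT": False, "Claude": False, "Gemini": True, "Perplexity": False}
follow_up = {"ChatGPT": True, "Claude": False, "Gemini": True, "Perplexity": True}

changes = presence_changes(baseline, follow_up)
moved = any(status == "gained" for status in changes.values())
print(changes, moved)
```

Because AI answers vary from run to run, a single gained or lost result is weak evidence. Repeating the prompt several times per cycle before comparing would make the signal more trustworthy.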

None of this requires a new technology stack. It requires the discipline to run the loop, on a cadence, with the recommendation, not the ranking, as the artifact you manage to.

What this means for the marketing leader reading this

The data above does not say that SEO is over. It says that the next dollar of marketing investment, after a baseline of SEO is established, buys more recommendation outcome than ranking outcome. That is a strategic redirection, not a budget cut.

The strategic redirection comes with a measurement redirection. If you are measuring your AI work in rankings, mentions, or share-of-voice numbers, you are measuring presence, not chosen-ness. Presence is necessary and insufficient. The argument for why visibility metrics specifically mislead AI marketing teams is laid out in a piece on why AI visibility is the new vanity metric. The metric that corresponds to the actual buyer outcome is whether the recommendation moves, in your favour, on the prompts your buyers actually ask.

The discipline that produces that movement has a name now. Recommendation Design. Observe, Decode, Seed, across the five stages of the AI-mediated buyer decision. The first two weeks, sequenced, produce a defensible baseline and a first seed. After that, it is a quarterly cadence, with the recommendation as the artifact and the gap as the metric.

The brands that learn this discipline early become the category winners of their markets. The brands that do not become footnotes in someone else’s category.

See what AI is recommending in your category

The first move in Recommendation Design is a defensible baseline. Seedli runs the prompts your buyers actually ask, captures the recommendations, and produces the diagnosis. The two-week baseline runs in hours with the right instrument.
