Schema as an AI context layer: the types Google ignores that AI models parse

Most schema advice optimises for rich results. This technique optimises for the structured context layer that AI models read on top of your content.

Flemming Rubak · April 19, 2026 · 11 min read

Executive summary

The schema markup on most websites exists for one reason: Google rich results. Article schema for authorship cards. FAQ schema for expandable Q&A. HowTo schema for step-by-step panels. Breadcrumb schema for navigation trails. Every implementation decision is filtered through the question “will Google reward this with a rich result?”

That filter misses the point for AI-mediated discovery. AI models do not generate rich results. They construct recommendations, comparisons, and buying advice by parsing your content and its metadata. Schema markup gives them a structured context layer on top of the body text: explicit definitions, entity tags, extractable summaries, and verifiable assertions. The schema types that serve this purpose are not the ones Google rewards with visual treatments in search results. They are the ones Google mostly ignores.

This technique covers the four schema types worth implementing for AI models, what the evidence actually supports, and how we implemented them across this site.

The rich results trap

Rich results are visual treatments Google applies to certain search listings. A recipe gets star ratings and cook time. A FAQ gets expandable question boxes. A product gets price and availability. These treatments increase click-through rates in traditional search, so the entire schema implementation industry has oriented around them.

The problem is that this orientation creates a binary: if Google does not reward a schema type with a rich result, most sites do not implement it. That leaves a large portion of the schema.org vocabulary unused, including the types that are most useful for AI models.

AI models process your page differently from Googlebot. Google crawls the page, extracts structured data, and decides whether to render a rich result. An AI model ingests the page during training or retrieves it during search-augmented generation, then uses everything it finds (body text, headings, metadata, and structured data) to understand what the page claims, what it defines, and how it relates to other entities. The difference between visibility and positioning is partly a question of whether the model has structured context to work with or is guessing from body text alone.

Schema markup that an AI model can parse as structured context reduces ambiguity. Instead of the model inferring “this paragraph seems to define a term,” the markup states “this is a formal definition of this term, created by this organisation, part of this terminology set.” Instead of the model scanning the entire page for an extractable summary, the markup says “this paragraph is the one the author intends as the citable summary.”

That explicitness is the technique. Not schema for rich results. Schema as a context layer for AI comprehension.


Four schema types do this work. None of them generate Google rich results on their own. All of them give AI models structured context that body text alone does not provide.

Four schema types AI models parse

1. DefinedTerm and DefinedTermSet

Status: stable in schema.org. No Google rich result.

DefinedTerm marks a word, name, or phrase with a formal definition. DefinedTermSet groups related terms into a vocabulary. When an AI model encounters a DefinedTerm in your JSON-LD, it gets a clean, citable definition rather than having to extract one from surrounding prose.

Where this matters: any page that introduces terminology specific to your domain. In Seedli’s case, terms like “Market Gravity,” “Decision Clarity,” “Criteria Flip,” and “Elimination Risk” are concepts we coined. Without DefinedTerm markup, the model infers the definitions from context. With the markup, it has the canonical definition and knows which organisation created the term.

{
  "@type": "DefinedTerm",
  "name": "Criteria Flip",
  "description": "A content format that redefines the evaluation criteria in a market by elevating a low-priority criterion into a primary differentiator.",
  "inDefinedTermSet": {
    "@type": "DefinedTermSet",
    "name": "Seedli AI Visibility Terminology"
  }
}

2. speakable

Status: beta in Google (voice assistants). No standard rich result.

The speakable property was designed for Google Assistant: it tells the system which sections of a page are suitable for text-to-speech playback. Google never expanded it beyond a limited beta for news publishers. Most sites ignored it.

The relevance for AI models is different. speakable marks the paragraphs the author considers the most concise, self-contained summary of the page. That is exactly what an AI model needs when constructing a direct answer: not the full article, but the 100 to 150 words that capture the core assertion. By marking your executive summary as speakable, you are telling the model “if you need one paragraph from this page, use this one.”
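As a starting point, the markup can target just that one paragraph. This minimal sketch assumes the summary container carries a data-speakable="summary" attribute in the HTML (the attribute name is a convention used on this site; SpeakableSpecification itself only requires a cssSelector or xpath value):

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Schema as an AI context layer",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": ["[data-speakable='summary']"]
  }
}
```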

Deploying speakable across a content library

The executive summary is the starting point, not the full deployment. A long-form article has multiple sections, each of which could appear in a different AI query context. When someone asks about your overall thesis, the executive summary is the right extraction target. When someone asks about a specific subtopic, the best paragraph might be buried in section four.

The deployment pattern for a content library with section-based articles:

1. Executive summary. Mark the executive summary or opening definition paragraph. This is the page-level extraction target. One per article, always.

2. One paragraph per section. In each major section, identify the single paragraph that captures the section’s core assertion in a way that stands alone without context. That paragraph gets a speakable data attribute. The selection criterion: could a model serve this paragraph as a complete answer to a question about this subtopic? If the paragraph requires the previous paragraph to make sense, it is not self-contained enough.

3. Reference all selectors in the schema. The SpeakableSpecification accepts an array of CSS selectors. Each speakable paragraph gets its own data attribute value (e.g., data-speakable="summary", data-speakable="implementation", data-speakable="evidence"), and the cssSelector array lists them all.

The schema for a fully deployed speakable property looks like this:

{
  "@type": "Article",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [
      "[data-speakable='summary']",
      "[data-speakable='implementation']",
      "[data-speakable='evidence']"
    ]
  }
}

Selection criteria: what makes a paragraph speakable

Not every paragraph qualifies. A speakable paragraph meets three conditions: it states a position or finding (not a transition or setup), it is self-contained (a reader encountering it in isolation understands the claim), and it is between 40 and 150 words (short enough to extract, long enough to carry substance). If a section has no paragraph meeting all three, it does not get a speakable marker. Forcing a mark on a weak paragraph dilutes the signal for every other mark on the page.

Note: whether current AI models explicitly parse the speakable property is not publicly confirmed by any major provider. The logic is sound (structured extraction hints reduce ambiguity), but the mechanism is inferred, not verified. [Unverified]

3. about and mentions

Status: stable in schema.org. No rich result.

The about property declares the primary subjects of a page. The mentions property tags secondary entities that appear in the content. Together, they give AI models an explicit entity graph for the page instead of requiring Named Entity Recognition (NER) to extract entities from prose.

NER is imperfect. It can mistake a brand name for a common noun, confuse a product name with a company name, or miss domain-specific terms entirely. When you declare entities in schema, you remove that ambiguity. The model knows that “Darktrace” on this page refers to a cybersecurity company, not a sci-fi concept, because the schema says so.

The practical value: when an AI model is constructing a recommendation and needs to understand which entities your page is about, the about array gives it the answer without guessing. This is particularly valuable for pages that discuss multiple entities (comparison content, market reports, competitor acknowledgment pages) where the model needs to understand which entity is the primary subject and which are supporting references.

{
  "@type": "Article",
  "about": [
    {
      "@type": "DefinedTerm",
      "name": "AI Visibility",
      "description": "The measurable presence and positioning of a brand across AI model responses."
    }
  ],
  "mentions": [
    { "@type": "Product", "name": "ChatGPT" },
    { "@type": "Product", "name": "Gemini" },
    { "@type": "Product", "name": "Perplexity" }
  ]
}

4. Claim

Status: pending in schema.org since v3.4. No rich result. Forward-looking.

Claim is the schema.org type for a specific, factually oriented assertion. Google uses the related ClaimReview type for fact-checking rich results, but bare Claim markup has no Google application. Its value is for AI models that need to distinguish between opinions and verifiable statements.

A data-backed assertion like “Decision Clarity for AI-native cybersecurity evaluated for CISOs is 62 out of 100” is a Claim. It has a specific value, a source, and a date. Marking it as a Claim tells the model that this is something that can be cited as a fact, not paraphrased as an opinion.

We are not implementing Claim on this site yet. The type is still pending in schema.org, which means its properties may change. We include it here because the direction is clear: as AI models become better at distinguishing claims from commentary, having your assertions pre-tagged will be an advantage. Watch this type. Here is what it looks like:

{
  "@type": "Claim",
  "text": "Decision Clarity for AI-native cybersecurity evaluated for CISOs is 62 out of 100.",
  "appearance": {
    "@type": "CreativeWork",
    "url": "https://www.seedli.ai/examples/market-reality-report-cybersecurity"
  },
  "firstAppearance": {
    "@type": "CreativeWork",
    "datePublished": "2026-04-10"
  }
}

The logic for each type is sound. The question is whether AI models actually parse them. Here is what the evidence supports and where it stops.

What the evidence actually shows

This is where the technique requires honesty. The schema-for-AI space is full of confident claims from marketing sources, and most of them are not independently verified. Here is what we can say with different levels of confidence.

Solid ground

AI models ingest JSON-LD as part of the page content. This is not debatable: the structured data is in the HTML, and any system that processes the page processes the schema. The question is not whether models see it, but how much weight they give it relative to body text. Multiple sources (BrightEdge, Schema App, Search Engine Land) confirm that structured data helps AI systems resolve entity ambiguity and understand page intent.

Reasonable inference

DefinedTerm provides cleaner definitions than NER extraction. about and mentions provide explicit entity tagging. These are structural advantages that any system processing the data would benefit from. Whether the benefit is marginal or significant depends on the model and the query, but the direction is consistent: less ambiguity produces better comprehension.

Unverified

Specific claims like “structured data leads to 2.5x more AI citations” or “speakable markup increases voice citations by 3.1x” come from individual marketing sources without published methodology. We cannot verify these numbers. The speakable property’s effect on AI models specifically is inferred from its design (it marks extractable content), not confirmed by any major AI provider.

Our position: the implementation cost is low (adding a few JSON-LD properties to pages you already maintain), the downside risk is zero (unused schema does not harm ranking or indexation), and the upside is structurally sound even if the magnitude is uncertain. When the cost-benefit ratio looks like that, you implement and measure rather than wait for proof.


With the types understood and the evidence weighed, here is how to add these schema types to pages you already publish.

How to implement

The implementation adds to your existing JSON-LD. You do not replace the Article schema; you extend it with additional properties and add new items to the @graph array. All markup stays in a single <script type="application/ld+json"> block.

1. Add about and mentions to every Article schema. The about array should contain the primary subjects of the page (the concepts it teaches or defines). The mentions array should contain secondary entities referenced in the content (products, companies, people, standards). Use specific types: DefinedTerm for concepts you define, Product for software, Organization for companies, Person for named individuals.

2. Add speakable to your executive summary and section-level paragraphs. Add a data-speakable="summary" attribute to your summary container. Then walk each section and identify the single paragraph that states the section’s core claim in a self-contained way. Tag each one with a unique data-speakable value (e.g., data-speakable="evidence"). Add a SpeakableSpecification to the Article schema with a cssSelector array listing all tagged elements. This works regardless of whether the component is server-rendered or client-rendered, because the attribute is in the HTML output.

3. Add DefinedTerm for every concept you coin. If your page introduces terminology that does not exist in the standard vocabulary of your industry, mark it with DefinedTerm. Include the name, a one to three sentence description, and a reference to the DefinedTermSet (your terminology vocabulary). The description should be the text you want AI models to cite when asked “what is [your term]?”

4. Group DefinedTerms into a DefinedTermSet per domain. If you have multiple proprietary terms across multiple pages, create a consistent DefinedTermSet name and reference it from every DefinedTerm. This signals to AI models that these terms form a coherent vocabulary from a single authority source. It is the schema equivalent of the mesh link topology described in the internal linking technique: connected concepts from a single source signal depth.


We followed this method on the site you are reading right now. Here is what we added and why.

What we did on this site

The Seedli content library spans insights, playbooks, examples, and techniques. Each article already had Article schema in JSON-LD. We extended the schema in three ways.

speakable on every executive summary

Every article has an ExecutiveSummary component at the top. We added a data-speakable attribute to the component and a SpeakableSpecification to the Article schema pointing to it. If an AI model or voice assistant looks for the extractable paragraph, it now has a machine-readable signal for which content the author intended as the citable summary.

about and mentions on every article

Each article’s JSON-LD now includes an about array with the primary concepts and a mentions array with referenced entities. The content types insight, for example, lists “AI Visibility,” “Decision-Stage Content,” and “Buyer Decision Journey” as about entities, and ChatGPT, Gemini, Claude, Perplexity, and Copilot as mentions. This removes entity ambiguity for models processing the page.

DefinedTerm for Seedli-coined terminology

Pages that introduce Seedli-specific concepts now include DefinedTerm markup in the @graph. The criteria-flip playbook defines “Criteria Flip” as a DefinedTerm. The elimination-defence playbook defines “Elimination Risk.” The market-reality-report playbook defines “Market Gravity.” Each references the same DefinedTermSet (“Seedli AI Visibility Terminology”), creating a linked vocabulary across the site.
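One way to make that shared vocabulary explicit across pages (a sketch using JSON-LD's standard @id mechanism; the identifier URL is hypothetical) is to give the DefinedTermSet a stable absolute @id and reference it from every page's DefinedTerm nodes, so each term resolves to the same vocabulary node regardless of which page declares it:

```json
{
  "@type": "DefinedTerm",
  "name": "Elimination Risk",
  "inDefinedTermSet": {
    "@type": "DefinedTermSet",
    "@id": "https://www.seedli.ai/#ai-visibility-terminology",
    "name": "Seedli AI Visibility Terminology"
  }
}
```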

We did not implement Claim schema. The type is still pending in schema.org and its properties may change. We are watching it and will add it when it stabilises, starting with data-backed assertions in worked examples (e.g., “Decision Clarity: 62/100”).


Schema for AI is a discipline of precision. Over-marking is as counterproductive as under-marking.

What not to do

Do not mark everything as speakable

If every paragraph is speakable, none of them are. The property works because it is selective: the executive summary plus at most one paragraph per major section, chosen because it states a self-contained claim in 40 to 150 words. A ten-section article should therefore have at most eleven speakable markers: the summary plus one per section. An article where every second paragraph is tagged has no extractable signal left.

Do not create DefinedTerm markup for standard industry terms

If the term already has a widely accepted definition (e.g., “SEO,” “CRM,” “API”), marking it as your DefinedTerm implies you are defining it differently. Reserve DefinedTerm for terminology your organisation coined or for terms where your definition adds genuine specificity beyond the standard usage.

Do not use about and mentions interchangeably

about declares what the page is about. mentions lists entities referenced in passing. If your article is about decision frameworks and mentions ChatGPT as one of several AI models, ChatGPT goes in mentions, not about. Misusing about dilutes the primary subject signal and makes entity resolution harder, not easier.

Do not add schema you cannot maintain

A DefinedTerm with a description that contradicts the body text is worse than no markup at all. If you update the article and change how you define a concept, the JSON-LD must change too. Treat the schema as part of the editorial process, not a one-time technical implementation. This pairs with the quarterly review cadence described in the link topology technique: when you review links, review schema. The same cadence applies to temporal authority signals like dateModified, which must reflect actual revisions.

The underlying principle is the same one that governs meta descriptions for AI: every piece of metadata is a statement the author makes about the page. The metadata should be accurate, selective, and maintained. If it is, it reduces ambiguity. If it is not, it creates contradiction. AI models process both.

See which content types serve each stage of your buyer journey

Seedli maps the decision structure AI models build around your market. The content types, the criteria, the buyer language. The schema layer is the structural optimisation on top.

Get started

This is part of the Seedli technique series on structural optimisation for AI-mediated discovery. See all techniques and playbooks.
