Structured Data for AI Search: Schema Markup Guide 2026

By Nadia Mohamed · 12 March 2026 · 16 min read · Updated 3 June 2026

What is structured data for AI search?

Structured data for AI search is machine-readable markup, typically JSON-LD schema, that tells AI systems what your content means, who created it, and how it relates to other entities on the web. In traditional SEO, schema primarily enables rich results; AI systems instead use structured data for content attribution, entity verification, and citation confidence scoring. The three schema types that matter most for AI visibility are FAQPage, Article, and Person, and the way they are nested together is as important as the individual implementations.

Schema markup has existed for over a decade as a Google optimisation tool. Its role in generative engine optimization (GEO) is structurally different, and understanding the distinction determines whether an implementation produces results or simply adds code with no measurable effect.

TL;DR — Key takeaways

AI systems use schema markup for a fundamentally different purpose than Google — not for rich results, but for entity attribution and citation confidence scoring.
FAQPage, Article, and Person schema are the three types with the most direct influence on whether an AI system cites your content.
Quantitative backing: ConvertMate’s analysis of 80M+ AI citations found that comprehensive structured data implementation — including Article, FAQPage, HowTo, and Product schemas — improved LLM discoverability by 67%.

Schema does not guarantee AI citations — it removes the ambiguity that prevents them. Content quality and entity clarity remain the primary factors.
Google deprecated the FAQ rich result on 7 May 2026, but FAQPage schema retains its value for AI citation eligibility — the rich result is gone, the parsing utility for LLMs is not.
Nested schema relationships — Article referencing Person referencing Organization — create the entity graph AI systems use to verify source authority.
Malformed or mismatched schema is worse than no schema — AI systems ignore content where markup contradicts visible page content.
JSON-LD, placed in the document head or immediately after the opening body tag, is the correct implementation format.

How AI systems use schema differently from Google

Google uses structured data to generate rich results — star ratings, FAQ dropdowns, breadcrumb trails in the SERP. The relationship is transactional: implement the correct markup, pass validation, earn the enhanced display. Google has been steadily deprecating these rich result eligibilities — HowTo was deprecated in September 2023, sitelinks search box in November 2024, and FAQ on 7 May 2026 — yet the underlying schemas remain valuable. The shift is from rich result enablement to AI citation enablement.

AI systems use the same markup for a different purpose entirely. When ChatGPT, Perplexity, or Google AI Overviews encounter a webpage, they are not evaluating it for display enhancement. They are evaluating it for citation worthiness — deciding whether the content is attributable to a specific, verifiable entity with documented expertise on the topic. As Fabrice Canel, Principal Product Manager at Microsoft Bing, confirmed at SMX Munich 2025, schema markup helps Microsoft’s LLMs understand your content — one of the first on-record acknowledgements from a major AI search platform that structured data shapes how its models interpret a page.

Schema markup answers three questions AI systems need to resolve before citing any source.

Who created this content? Person schema with credentials, a named byline, and sameAs links to external profiles provides the author confidence signal. Without it, AI systems encounter text with no attributable source — which is not citable.

What organisation published it? Organization schema with a consistent name, URL, and contact information establishes publisher identity. This matters particularly for AI systems that cross-reference entity information across the web to verify authority claims.

What type of content is this? Article and BlogPosting schema with datePublished and dateModified tells AI systems whether the content is current. Perplexity in particular weights recency heavily — stale content without a modification signal loses citation priority for queries where currency matters.

The practical implication: content without schema is not penalised by AI systems the way malformed markup is, but it forces AI systems to infer all three answers from context alone. Most of the time, they cannot infer with enough confidence to cite.

The three schema types that matter most for AI visibility

1. FAQPage schema

FAQPage schema is the single highest-leverage schema type for AI citation. The reason is structural: AI systems generate responses in question-answer format. Content marked up as question-answer pairs maps directly to the output format the system needs to produce, removing the extraction step entirely.

Google deprecated the FAQ rich result on 7 May 2026 — meaning the visible SERP dropdowns are gone. But the schema itself remains valuable for two reasons. First, AI assistants (Perplexity, ChatGPT, Claude, Gemini) still parse FAQPage markup for citation extraction. Second, Microsoft and other search engines continue to use it. Treat FAQ schema as an AI signal, not a Google rich result signal — the strategic value has shifted, not disappeared.

Google’s structured data documentation still specifies the correct implementation format even after deprecation. The questions in the schema must match the questions visible on the page exactly — AI systems cross-reference markup against visible content, and discrepancies reduce citation confidence rather than improving it.

Questions should mirror the natural language prompts your ideal customers (your ICP) would type into an AI interface, not the branded language your organisation uses to describe its own services. “How does structured data affect AI search visibility?” outperforms “What are the benefits of schema markup for our platform?” because the first matches informational intent and the second matches promotional framing that AI systems are trained to discount.

Each answer must open with a direct response in the first sentence. Answers of 50–150 words give AI systems enough substance to cite while remaining short enough to extract cleanly.
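As a minimal sketch of the guidance above, the following Python builds FAQPage JSON-LD from question-answer pairs and flags answers that fall outside the 50–150 word range. The helper names build_faq_schema and answers_out_of_range are hypothetical, not part of any library; the FAQPage, Question, Answer, mainEntity, and acceptedAnswer names come from schema.org. It assumes the question strings are copied from the visible page, so the markup and the content stay identical.

```python
import json


def build_faq_schema(pairs):
    """Build a FAQPage JSON-LD dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def answers_out_of_range(pairs, low=50, high=150):
    """Return the questions whose answers fall outside low..high words."""
    return [q for q, a in pairs if not low <= len(a.split()) <= high]


if __name__ == "__main__":
    # Illustrative pair only; the answer here is deliberately too short.
    pairs = [
        (
            "How does structured data affect AI search visibility?",
            "Structured data tells AI systems what a page means and who wrote it.",
        )
    ]
    print(json.dumps(build_faq_schema(pairs), indent=2))
    print("Needs review:", answers_out_of_range(pairs))
```

The printed JSON can be pasted into a script tag of type application/ld+json. The word-count check is a rough whitespace split, which is enough to catch answers that are far too short or too long.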

2. Article and BlogPosting schema

Article schema provides the content classification and publication context AI systems need for source evaluation. The required properties for AI citation purposes go beyond the minimum Google Search Central requires:

headline — must match the H1 exactly, not the meta title
author — must reference a Person schema entity, not a plain string
datePublished — required for content freshness evaluation
dateModified — update this every time the content is meaningfully revised
publisher — must reference an Organization schema entity (or Person, if the site is a single-person consulting brand)

The dateModified field is the most consistently overlooked. An article published in 2024 with no dateModified signal reads as a 2024 source to AI systems, regardless of whether it has been updated since. For topics where information changes — GEO strategies, AI search statistics, technical implementation guidance — an outdated dateModified actively reduces citation probability.
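The property checklist above can be sketched as a small pre-publish check. This is an illustration under stated assumptions, not a substitute for a validator: check_article_schema is a hypothetical helper, it assumes a single author object (schema.org also permits a list of authors), and the h1 argument is assumed to be the page's visible H1 text, extracted separately.

```python
def check_article_schema(schema, h1):
    """Return a list of problems found in an Article JSON-LD dict."""
    problems = []

    # The headline should match the visible H1, not the meta title.
    if schema.get("headline") != h1:
        problems.append("headline does not match the H1")

    # The author should be a nested Person entity, not a plain string.
    author = schema.get("author")
    if not isinstance(author, dict):
        problems.append("author should be a Person object, not a plain string")
    elif author.get("@type") != "Person":
        problems.append("author @type should be Person")

    # Both dates are needed for freshness evaluation.
    for field in ("datePublished", "dateModified"):
        if not schema.get(field):
            problems.append(f"{field} is missing")

    # The publisher should be an Organization (or Person for a solo brand).
    publisher = schema.get("publisher")
    if not isinstance(publisher, dict) or publisher.get("@type") not in (
        "Organization",
        "Person",
    ):
        problems.append("publisher should be an Organization or Person object")

    return problems
```

An empty list means the five properties are present in the expected shape. It says nothing about whether the values themselves are accurate.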

3. Person schema

Person schema is the author attribution layer. It connects the named author on the page to a verifiable entity with documented expertise. The minimum viable implementation for AI citation includes name, url pointing to the author’s About or profile page, jobTitle, and at least one sameAs link to an external profile such as LinkedIn.

The sameAs property is what makes Person schema useful for AI citation specifically. It gives AI systems a verification pathway — rather than evaluating the author claim solely from your own site, they can cross-reference against the external entities listed. An author with consistent name, title, and expertise signals across LinkedIn, external publications, and their own site receives higher citation confidence than an author who exists only on their own domain.
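A rough way to test a Person entity against the minimum viable implementation described above is sketched below. The helper name missing_person_fields is hypothetical. The check only confirms that the properties are present, not that the sameAs links resolve or corroborate the author's credentials.

```python
def missing_person_fields(person):
    """List the minimum-viable Person properties that are absent or empty."""
    missing = [f for f in ("name", "url", "jobTitle") if not person.get(f)]

    same_as = person.get("sameAs") or []
    if isinstance(same_as, str):  # a single URL string is also valid
        same_as = [same_as]
    if not same_as:
        missing.append("sameAs")

    return missing
```

For example, a Person marked up with only a name returns ["url", "jobTitle", "sameAs"], which is the incomplete-author case that offers almost no verification pathway.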

Nested schema relationships create entity graphs

Individual schema types provide individual signals. Nested schema relationships create entity graphs — and entity graphs are what AI systems use to evaluate the full citation confidence of a source.

The correct nesting structure for a blog article on a consulting site:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article title matching H1",
  "datePublished": "2026-03-17",
  "dateModified": "2026-06-01",
  "author": {
    "@type": "Person",
    "name": "Nadia Mohamed",
    "url": "https://nadiamohamed.me/about/",
    "jobTitle": "SEO and GEO Consultant for SaaS & tech",
    "sameAs": [
      "https://www.linkedin.com/in/nadia-mohamed/"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Nadia Mohamed",
    "url": "https://nadiamohamed.me",
    "logo": {
      "@type": "ImageObject",
      "url": "https://nadiamohamed.me/wp-content/uploads/logo.png"
    }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://nadiamohamed.me/insights/[slug]/"
  }
}
</script>

This structure allows an AI system to resolve: the article was written by Nadia Mohamed, who is an SEO and GEO Consultant, whose LinkedIn profile corroborates this, published on nadiamohamed.me. That is a citable chain. A flat Article schema with "author": "Nadia Mohamed" as a plain string provides none of the verification pathways.

BreadcrumbList schema adds a fourth layer to this chain — it tells AI systems where the article sits within the site’s topical structure. An article explicitly nested within a GEO consulting site’s insights cluster carries more topical authority signal than the same article on an isolated page with no site hierarchy context.
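To illustrate the breadcrumb layer, here is a minimal sketch that builds BreadcrumbList JSON-LD from an ordered trail. The helper name build_breadcrumb_schema and the example.com URLs and page names are placeholders for illustration; BreadcrumbList, ListItem, itemListElement, position, name, and item are the schema.org names.

```python
import json


def build_breadcrumb_schema(trail):
    """Build BreadcrumbList JSON-LD from an ordered list of (name, url)."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            # Positions start at 1 and follow the order of the trail.
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }


if __name__ == "__main__":
    # Placeholder trail: home page, topic cluster, then the article itself.
    trail = [
        ("Home", "https://example.com/"),
        ("Insights", "https://example.com/insights/"),
        ("Structured Data for AI Search", "https://example.com/insights/structured-data/"),
    ]
    print(json.dumps(build_breadcrumb_schema(trail), indent=2))
```

Generating the trail from the same source as the visible breadcrumb navigation keeps the markup and the page in agreement.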

Implementation: format, placement, validation

Format

JSON-LD is the correct format for all schema implementation targeting AI visibility. Google Search Central’s structured data documentation recommends JSON-LD, which is not interleaved with the visible HTML and is therefore easier to implement and maintain at scale. This clean separation allows AI systems to parse schema information without processing visual formatting or template code.

Microdata and RDFa embed markup within HTML elements. They produce the same structured data signals for Google’s validation tools, but they create parsing complexity for AI systems that are reading page content rather than crawling for rich result eligibility. JSON-LD schema sits alongside other machine-readable signals for AI crawlers, including the llms.txt file for surfacing key content — both give AI systems a clean, structured view of the site independent of visual formatting.

Placement

Place JSON-LD scripts in the <head> of the document or immediately after the opening <body> tag. Scripts embedded deep in the page body — particularly below the fold or inside template footer blocks — are processed less reliably by AI crawlers that prioritise above-fold content. If you’re using a tag manager (Google Tag Manager or similar) to inject schema, verify it fires before the AI crawler timeout — late-injected schema can be missed entirely by retrieval-augmented systems like Perplexity that have hard timeout windows.

Validation

Validate all schema implementation using Google’s Rich Results Test and the Schema.org validator. Run validation after every meaningful content update — not just after initial implementation. A dateModified update that accidentally introduces a formatting error will silently break the schema until the next manual validation check.

Figure: Google Search Console structured-data (Review snippets) report showing 16 valid items and 0 invalid. Search Console's structured-data report confirms the markup validates site-wide. Test each change in the Rich Results Test before it ships; one malformed property can drop the whole block.
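Between manual validation runs, a quick syntax check can catch the formatting errors described above. The sketch below uses only the Python standard library to pull every JSON-LD script block out of a page's HTML and report the blocks that fail to parse. The names JsonLdExtractor and parse_json_ld are hypothetical. It checks JSON syntax only, not schema.org vocabulary or rich result eligibility, so it complements the Rich Results Test and the Schema.org validator rather than replacing them.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # None while outside a JSON-LD script tag

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def parse_json_ld(html):
    """Return (parsed_blocks, errors) for all JSON-LD found in the HTML."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()

    parsed, errors = [], []
    for raw in extractor.blocks:
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            errors.append(f"line {exc.lineno}: {exc.msg}")
    return parsed, errors
```

Running this against the rendered HTML of each priority page after a content update gives an early warning; any entry in errors means that block would be unreadable to a parser.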

When to update your schema

Three triggers should prompt a schema update, each at a different cadence.

Every content revision. Update dateModified any time the page’s content is meaningfully changed — new statistics, expanded sections, corrected claims, new sources. The threshold is “would a reader notice the difference?” — if yes, the schema’s dateModified needs to move forward to match. A typo correction doesn’t qualify. A new section with verified primary-source statistics does.

Every quarter, as a coverage audit. New pages are frequently published without schema attached, and existing schema becomes stale as sameAs URLs change, author credentials evolve, or the publisher Organization details shift. Run a full schema coverage sweep across priority pages quarterly to catch these silent gaps — it slots naturally into the recurring technical SEO audit cadence.

After any site-wide change. CMS migrations, URL structure changes, theme updates, and template revisions can silently break mainEntityOfPage values, BreadcrumbList paths, and image URLs in schema. Run a full validation pass after any change that touches the templates schemas live in.
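The quarterly coverage audit can be approximated with a short script. The sketch below assumes you already have each priority page's parsed JSON-LD blocks (for example, from a crawl) as a mapping of URL to a list of dicts. The helper names collect_types and coverage_gaps, and the default list of required types, are assumptions chosen for illustration.

```python
def collect_types(node):
    """Recursively gather every @type value in a parsed JSON-LD structure."""
    types = set()
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            types.add(declared)
        elif isinstance(declared, list):
            types.update(declared)
        # Descend into nested entities such as author and publisher.
        for value in node.values():
            types |= collect_types(value)
    elif isinstance(node, list):
        for item in node:
            types |= collect_types(item)
    return types


def coverage_gaps(
    pages, required=("Article", "Person", "Organization", "BreadcrumbList")
):
    """Map each URL to the required schema types its JSON-LD lacks."""
    gaps = {}
    for url, blocks in pages.items():
        missing = sorted(set(required) - collect_types(blocks))
        if missing:
            gaps[url] = missing
    return gaps
```

Pages absent from the result carry all the required types. Pages listed with every type missing are the ones that were published without schema at all.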

Common mistakes that block AI citation

Schema that contradicts visible content. AI systems cross-reference markup claims against what they read on the page. If the headline in your Article schema differs from the H1, if the author name in the schema does not match the byline, or if dateModified shows a date more recent than the visible content warrants, the system flags the inconsistency and reduces citation confidence. This is more damaging than having no schema at all.

Incomplete Person schema. An author marked up only with a name and no url, jobTitle, or sameAs provides almost no verification pathway. AI systems cannot confirm that “Nadia Mohamed” in the schema refers to a specific person with documented expertise in the topic being cited. The schema exists but does not do the work.

Missing dateModified updates. Publishing strong schema at launch and never updating dateModified means that every revision you make to the content — adding new data, updating stats, expanding sections — is invisible to AI systems evaluating content freshness.

FAQPage questions that don’t match user intent. FAQ schema populated with internal questions (“What does our service include?”) rather than search-intent questions (“How does schema markup affect AI search visibility?”) produces markup that is technically valid but useless for AI citation. AI systems match FAQ schema against user queries — if the questions in the schema don’t resemble the queries, the schema provides no citation advantage.

Breadcrumbs missing on interior pages. BreadcrumbList schema is frequently implemented only on the homepage or not at all. Every interior page — every article, every service page, every tool page — benefits from explicit breadcrumb markup that signals its position in the site hierarchy.

Measuring schema impact on AI visibility

The impact of schema implementation cannot be measured through a single metric. The effects appear across three data sources used together.

AI citation monitoring. Run your 10–15 priority queries directly in ChatGPT, Perplexity, and Google AI Overviews before and after schema implementation. Document whether your domain appears as a source, which pages are cited, and which competitors appear instead. This is the most direct measurement available and should be done monthly regardless of schema changes. For a deeper Perplexity-specific monitoring workflow, see How to Get Cited by Perplexity.

Google Search Console structured data reports. GSC’s Enhancements section flags schema errors and warnings at the page level. Errors here mean the markup is failing Google’s validation — and by extension, creating the inconsistency signals that reduce AI citation confidence. Resolve all errors before measuring AI impact.

GA4 referral traffic by landing page. Once you have a custom AI channel group in GA4 isolating Perplexity, ChatGPT, Claude, and Gemini traffic, the landing page dimension shows which pages are receiving AI-referred sessions. For the full GA4 setup walkthrough including the regex pattern for the channel group, see How to Track AI Referral Traffic in GA4. Schema implementation should correlate with specific pages beginning to receive AI traffic that was absent before. Track this per-page, not at the aggregate site level, to isolate the signal.
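For exported referral data, the per-page view described above can be sketched outside GA4 as well. The hostname pattern below is an assumption covering the four assistants named in this section; it is not the regex from the GA4 walkthrough and should be adjusted to the referrers that appear in your own data. The helper names is_ai_referrer and ai_sessions_by_landing_page are hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumed hostnames for Perplexity, ChatGPT, Claude, and Gemini referrals.
AI_REFERRER_PATTERN = re.compile(
    r"(^|\.)(perplexity\.ai|chatgpt\.com|chat\.openai\.com"
    r"|claude\.ai|gemini\.google\.com)$"
)


def is_ai_referrer(referrer_url):
    """Return True when the referrer's hostname matches an AI assistant."""
    host = urlparse(referrer_url).hostname or ""
    return bool(AI_REFERRER_PATTERN.search(host))


def ai_sessions_by_landing_page(rows):
    """Count AI-referred sessions per landing page.

    rows is an iterable of (referrer_url, landing_page) tuples.
    """
    counts = {}
    for referrer, landing_page in rows:
        if is_ai_referrer(referrer):
            counts[landing_page] = counts.get(landing_page, 0) + 1
    return counts
```

Referrer values need a scheme (https://) for the hostname to be extracted. Comparing the per-page counts before and after a schema rollout keeps the signal isolated from site-wide traffic changes.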

If you want a complete review of your schema coverage and which pages have the highest-priority implementation gaps, the AEO Article Analyzer scores any article against the 10 criteria AI engines use to decide what to cite, including the schema-related signals — 0–100 readiness score, top-3 highest-impact fixes, in under 30 seconds.

FAQ

What types of structured data are most important for AI search visibility?

FAQPage, Article, and Person schema are the three types with the most direct influence on AI citation probability. FAQPage schema aligns content with the question-answer format AI systems generate responses in. Article schema provides publication context, datePublished, and dateModified for freshness evaluation. Person schema establishes the author as a verifiable entity with documented expertise via sameAs links to LinkedIn and other external profiles. BreadcrumbList schema adds topical hierarchy context. Organization schema matters for business sites where publisher identity is distinct from individual authorship. Implement all five across priority pages before adding any industry-specific schema types.

Can structured data guarantee citations in AI search results?

No. Schema markup removes ambiguity that would otherwise prevent citations — it does not create citations where content quality or topical authority is insufficient. A page with perfect schema and thin, unoriginal content will not be cited. A page with strong, original, well-structured content and proper schema will be cited more consistently than the same page without schema. ConvertMate’s analysis of 80M+ AI citations found that comprehensive structured data implementation improved LLM discoverability by 67% — meaningful, but the multiplier rests on the underlying content being citation-worthy in the first place.

Does structured data replace traditional SEO for AI search optimization?

Schema markup is one component of a GEO strategy, not a replacement for any part of traditional SEO. Ahrefs analysis of 15,000 prompts found that across all AI assistants, only ~12% of AI-cited URLs rank in Google’s top 10 on average — but Perplexity is the outlier at 28.6% overlap, making it the most Google-aligned AI assistant. Schema implementation supports both objectives: it improves AI citation eligibility while also maintaining the Article, BreadcrumbList, and Person signals that drive Google’s understanding of your content and authors.

Is FAQ schema still worth implementing now that Google deprecated the rich result?

Yes. Google deprecated the FAQ rich result on 7 May 2026, which removed the visible SERP dropdowns. But the schema itself retains value for AI citation eligibility — Perplexity, ChatGPT, Claude, Gemini, and Microsoft Copilot still parse FAQPage markup to extract question-answer pairs for inclusion in generated responses. The deprecation changes where the value comes from (no more Google rich results), not whether the value exists. Treat FAQ schema as an AI signal rather than a Google rich result signal — the strategic role has shifted, not disappeared.

What happens if my structured data contains errors?

Malformed markup has two failure modes. First, AI systems may ignore the schema entirely and revert to inferring entity information from page content alone — removing the citation confidence signals the markup was intended to provide. Second, and more damaging, markup that is technically parseable but internally inconsistent — a schema headline that does not match the H1, or an author whose sameAs link goes to a broken profile — creates active distrust signals. Validate all schema on a staging environment before deployment and re-validate after any meaningful content update using Google’s Rich Results Test and the Schema.org validator.

How do I validate structured data for AI search compatibility?

Use Google’s Rich Results Test and the Schema.org validator as the primary validation tools. Both surface errors that reduce AI citation confidence. Additionally, test your key pages manually in Perplexity and Google AI Overviews by asking questions that your content directly answers — if your schema is clean and your content is well-structured, citations should begin appearing for queries where your content has genuine topical authority. Cross-reference your GSC Enhancements report for any schema errors flagged at the page level.

When should I update my schema implementation?

Three triggers: (1) Update dateModified every time the content is meaningfully revised — new stats, expanded sections, new sources. Typo fixes don’t qualify; substantive updates do. (2) Run a quarterly coverage audit across priority pages — new pages get published without schema, and existing schema goes stale as sameAs URLs change or author credentials evolve. (3) Re-validate after any site-wide change — CMS migrations, URL structure changes, theme updates, or template revisions can silently break mainEntityOfPage, BreadcrumbList, or image URLs in schema.

Should I use JSON-LD, Microdata, or RDFa for AI search optimization?

JSON-LD. Google Search Central explicitly recommends JSON-LD, which is kept separate from the HTML content and so lets AI systems parse the schema independently of visual formatting and template code. Microdata and RDFa embed markup within HTML elements and produce the same Google validation signals, but they create parsing complexity for AI systems that are reading page content for citation rather than crawling for rich result eligibility. Place JSON-LD in the document <head> or immediately after the opening <body> tag for highest extraction reliability.
