From Query to Citation: How Snippet Signals Influence AI Search

TL;DR

AI search engines reward content that matches both user intent and search language. Precision matters more for informational queries, while semantic breadth and synonym coverage win for commercial queries.

Citation selection for commercial queries showed greater tolerance for vocabulary variation, with 25% of cited titles and 19% of cited slugs using a synonym in place of the main query term.
Informational queries demonstrated more consistent language patterns and were nearly twice as likely to surface sources with exact-query terms in snippets.
For informational queries, more than 60% of titles and slugs matched search language. Clear, direct phrasing—like “how to,” “what is,” or “best practices”—made pages much easier for LLMs to interpret and cite.
Refresh content to align with search intent and optimize for your users' search phrases to gain visibility. Prioritize both clear structure and the actual words your target audience uses.

Getting cited by AI search isn't just about having good content; it's about having content that aligns with what users are actually searching for.

But here's the challenge: there are different ways to measure if your page "aligns" with a search query, and each method reveals different insights about what makes content citation-worthy.

For this research, we examined how closely a page's title and URL slug match with the original search query—both in words and semantic similarity—and what that means for earning citations in AI-generated answers.

By combining these measurements, we wanted to understand:

Do titles or slugs help signal to AI search that a page is relevant to the user's search intent?
Which of these on-page attributes provides the strongest or weakest signal to LLMs?
What practical steps can content teams take to improve their visibility in AI answers?

What We Analyzed and How We Measured It

For this report, we examined over 300 pages cited in AI search results, focusing on two types of intents:

Commercial searches — queries where users are comparing or evaluating solutions (e.g., “best CRM software,” “email marketing tools”).
Informational searches — queries aimed at learning or problem-solving (e.g., “what is cloud computing,” “how to set up email automation”).

For each cited page, we evaluated how closely the search query matched two core on-page attributes:

Page title – the main headline shown in search results
URL slug – the last segment of the URL (e.g., /best-crm-software)
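
As a rough illustration of the slug definition above, the sketch below pulls the last path segment out of a URL and splits it into words using only the Python standard library. The function names and the example URL are assumptions made for this illustration, not part of the study's tooling.

```python
import re
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Return the last non-empty path segment of a URL, or "" if there is none."""
    path = urlparse(url).path
    segments = [segment for segment in path.split("/") if segment]
    return segments[-1] if segments else ""


def slug_to_words(slug: str) -> list[str]:
    """Lowercase a slug and split it on hyphens and underscores."""
    return [word for word in re.split(r"[-_]+", slug.lower()) if word]


# Hypothetical example URL
slug = slug_from_url("https://example.com/blog/best-crm-software/")
print(slug)                 # best-crm-software
print(slug_to_words(slug))  # ['best', 'crm', 'software']
```

Splitting the slug into words first lets it be compared with the query in the same way as a title.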

We measured alignment using:

Root-word overlap (lemmas + Jaccard index): Literal vocabulary match
Intent matching (cosine similarity): Does the content answer the same underlying need, even if phrased differently?

Method	How It Works	What It Measures	Output Format
Lemma Overlap	Reduces each word to its root form (lemma) and counts the lemmas shared by both texts.	Raw count of shared root words.	A whole number (e.g., 3 shared lemmas).
Jaccard Index	Divides the number of shared lemmas by the number of unique lemmas across both texts.	Relative strength of overlap.	A percentage score (e.g., 43% similarity).
Cosine Similarity	Compares embeddings of the two texts to capture semantic meaning, even if exact words differ.	Semantic closeness of the two texts.	A decimal value between 0 and 1 (e.g., 0.60, or 60% similarity).
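
The three measures in the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used for the report: `TOY_LEMMAS` is a hypothetical lookup table standing in for a real lemmatizer, and the vectors passed to `cosine_similarity` are toy values standing in for embeddings from a model the report does not name.

```python
import math
import re

# Assumption: a tiny hand-written lookup table stands in for a real lemmatizer.
TOY_LEMMAS = {"tools": "tool", "automations": "automation", "setting": "set"}


def lemma_set(text: str) -> set[str]:
    """Lowercase the text, split it into words, and map each word to its (toy) lemma."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {TOY_LEMMAS.get(word, word) for word in words}


def lemma_overlap(a: str, b: str) -> int:
    """Raw count of lemmas shared by both texts."""
    return len(lemma_set(a) & lemma_set(b))


def jaccard(a: str, b: str) -> float:
    """Shared lemmas divided by all unique lemmas across both texts."""
    set_a, set_b = lemma_set(a), lemma_set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


query = "how to set up email automation"
title = "Setting Up Email Automations: A Guide"  # hypothetical title
print(lemma_overlap(query, title))  # 4
print(jaccard(query, title))        # 0.5

# Toy 2-dimensional vectors; real embeddings have hundreds of dimensions.
print(round(cosine_similarity([1.0, 0.0], [1.0, 1.0]), 2))  # 0.71
```

In the example, the query and title share four lemmas (set, up, email, automation) out of eight unique lemmas, which gives a Jaccard index of 0.5.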

Measurement Criteria

We considered a page attribute (title or slug) optimized, or a strong match, if it met either of the following criteria:

30% or more word overlap with the search query (Jaccard index)
A cosine similarity of 0.60 (60%) or higher with the search query
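
Expressed as code, the rule above is a simple either/or check. The constant and function names below are assumptions chosen for this sketch; only the two thresholds come from the criteria listed.

```python
JACCARD_THRESHOLD = 0.30  # 30% or more word overlap
COSINE_THRESHOLD = 0.60   # cosine similarity of 0.60 or higher


def is_strong_match(jaccard_score: float, cosine_score: float) -> bool:
    """An attribute is a strong match if it clears either threshold."""
    return jaccard_score >= JACCARD_THRESHOLD or cosine_score >= COSINE_THRESHOLD


print(is_strong_match(0.16, 0.65))  # True: low word overlap, high semantic similarity
print(is_strong_match(0.35, 0.40))  # True: high word overlap alone is enough
print(is_strong_match(0.16, 0.55))  # False: neither threshold is met
```

Because the rule uses "or", a title can count as a strong match through meaning alone, even when it shares few literal words with the query.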

Our Findings

Our analysis uncovered clear patterns in how content is cited by AI search, depending on both the query type and the on-page attributes. We found that commercial queries showed greater tolerance for vocabulary variations in citation selection, while informational queries demonstrated more consistent language matching patterns across all attributes measured.

Informational Queries: Precision Wins

When users search for knowledge or how-to guidance, AI search engines prioritize content that directly matches both the intent and language of the query. Our analysis shows that titles and slugs closely aligned with search intent are much more likely to earn citations.

Cosine Similarity: Match Search Intent, Get Cited

When it comes to informational queries, content that closely matches the intent behind the search is most likely to earn citations from AI. Cosine similarity measures whether your titles and slugs are truly answering the same need as the search query—even if the words aren’t an exact match.

*Cosine similarity of on-page attributes from cited pages*

Key Insights:

Most cited informational pages had a cosine similarity of 60% or higher between the query and their on-page signals.
Across informational citations, the average similarity to the search term was 65% for titles, and 67% for slugs.
Titles and slugs with direct, clear phrasing—like “how to,” “what is,” or “best practices”—were far more likely to be cited.
Content that followed predictable, intent-aligned patterns in titles and slugs was significantly easier for LLMs to extract and recognize.
URL slugs stood out as the strongest signal across this search intent group, with 65.5% of citations meeting the alignment threshold and an average cosine similarity of 80%.

Jaccard Index: The Power of Word-Matching

In addition to aligning strongly with the user's search intent, informational pages often used the exact words their audience searched for.

On average, the titles and slugs of these citations contained 27–30% of the same words as the search query (Jaccard index). Align your content with user intent by using the exact terms they search for to gain a clear advantage in being surfaced.

What This Means for Content Teams:

To maximize your chances of earning citations in AI search, craft your titles and slugs with clear, direct language that closely matches both the intent and the words your audience actually uses. Here’s what to do:

Use explicit, direct phrasing in titles and slugs—such as “how to,” “what is,” or “best practices”—to align with user queries.
Include at least 30% of query terms in your titles or slugs; pages meeting this threshold were twice as likely to earn citations.
Mirror your audience’s language and structure so both LLMs and users instantly recognize your content’s relevance.

*Average cosine similarity score of attributes for all informational citations*

On average, 70% of all informational citations had a title or slug closely aligned (>60% cosine similarity) with the search query, demonstrating that precision and user alignment are critical to earning visibility in AI search.
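
The report quotes two different kinds of figures for each attribute: an average similarity score and the share of citations at or above the 60% threshold. The sketch below shows how both can be derived from one list of scores. The scores are made-up illustrative values, not data from the study, and `summarize` is a hypothetical helper name.

```python
from statistics import mean


def summarize(scores: list[float], threshold: float = 0.60) -> dict[str, float]:
    """Return the mean score and the share of scores at or above the threshold."""
    if not scores:
        return {"mean": 0.0, "share_at_or_above": 0.0}
    share = sum(score >= threshold for score in scores) / len(scores)
    return {"mean": mean(scores), "share_at_or_above": share}


# Illustrative cosine similarity scores for five cited titles (not real data)
title_scores = [0.72, 0.58, 0.81, 0.64, 0.49]
summary = summarize(title_scores)
print(summary["share_at_or_above"])  # 0.6, since three of the five scores are >= 0.60
print(round(summary["mean"], 3))     # 0.648
```

The two numbers answer different questions: the mean describes typical alignment, while the share describes how many citations cleared the bar, so they should not be expected to match.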

Commercial Queries: Breadth and Flexibility Matter

As we analyzed the on-page attributes of commercial citations, we found a clear difference: answer engines surface pages that cover a wide range of related terms, solutions, and synonyms, often prioritizing relevant variations of the search topic rather than strict repetition of the query.

When users are evaluating tools, solutions, or making purchase decisions, AI search engines show much more flexibility in how content matches the query. Instead of exact phrasing, they reward semantic breadth, synonym coverage, and comprehensive topic framing.

Cosine Similarity: Flexible Relevance

When we analyzed commercial citations, we found that titles and slugs did not need to closely match the exact wording of the query to be interpreted as relevant and earn a citation.

In fact, roughly 55% of cited titles and 62% of cited slugs met the 60% similarity threshold with their search query—showing that a majority of commercial content reached moderate alignment, but not through exact phrasing.

Within these query responses, AI surfaced content that balanced semantic overlap with broader coverage—incorporating synonyms, related solution categories, and alternative terminology.

This shows that, for commercial queries, AI search engines allow for greater flexibility in how content aligns with user intent—favoring broader topic coverage and synonym use over strict semantic alignment.

Key Insights:

Commercial citations showed weaker alignment overall, with average cosine similarity scores about 8% lower for titles and 10% lower for slugs compared to informational queries—highlighting that earning citations in commercial search depends less on exact matches and more on synonym use and broader coverage.
Titles and slugs that incorporated a range of related terms, synonyms, or broader solution categories—rather than repeating the exact query—were often cited.

Jaccard Index: How Synonyms Influence Citations

Our research found that literal word overlap plays a smaller role in what earns visibility for commercial intent phrases. This signals that AI models often interpret related terms and synonyms as equally relevant.

Across commercial searches, LLMs prioritize meaning over exact phrasing. On average, 22% of cited pages used synonyms instead of the original query terms, indicating that clear, diverse, intent-aligned language, not just exact matches, helps your commercial content get cited.

Key Insights:

On average, titles and slugs in commercial citations shared only 16% of their words with the search query.
25% of cited titles and 19% of cited slugs replaced the main query term with a synonym, demonstrating that AI search rewards content that captures intent through flexible, varied language rather than exact keyword repetition.
Nearly 90% of cited commercial pages had less than 30% word overlap with their query, showing that semantic variety creates far more citation opportunities.

Going beyond simple synonym swaps, we observed that many cited commercial pages also used broader category terms or related concepts (like “platforms” or “solutions”) to capture a wider range of relevant searches—demonstrating that semantic variety, not just direct keyword replacement, has an impact on earning citation share.
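
One way to picture the synonym and category-term substitution described above is a check for titles or slugs that omit the main query term but contain a related term. The report does not say how substitutions were detected, so this is only a sketch: `SYNONYMS` is a hypothetical hand-built map and `uses_synonym_instead` is an illustrative helper.

```python
import re

# Assumption: a hand-built map of related terms for one main query term.
SYNONYMS = {
    "software": {"platform", "platforms", "tool", "tools", "solution", "solutions"},
}


def words(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def uses_synonym_instead(main_term: str, attribute_text: str) -> bool:
    """True if the main query term is absent but one of its related terms is present."""
    tokens = words(attribute_text)
    term = main_term.lower()
    if term in tokens:
        return False
    return bool(tokens & SYNONYMS.get(term, set()))


# Query: "best CRM software" (main term: "software"); titles are hypothetical
print(uses_synonym_instead("software", "Top CRM Platforms for Small Teams"))  # True
print(uses_synonym_instead("software", "Best CRM Software Compared"))         # False
print(uses_synonym_instead("software", "CRM Pricing Guide"))                  # False
```

A title flagged by this check scores low on literal word overlap for the swapped term, yet it can still be close to the query in meaning.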

What This Means for Content Teams:

For commercial queries, AI search engines are more flexible (for now). Your content can still be cited even if it doesn’t use the exact search phrase. Here’s how to approach it:

Leverage synonyms and variations thoughtfully: AI engines show greater tolerance for related terms and synonyms in commercial queries, so a single page can rank for multiple phrasings of a topic. Don’t abandon exact matches, but know you have more room to capture related searches.
Design pages to serve multiple query angles: Unlike informational queries (where precision matters most), commercial queries reward breadth and variety. Cover related categories and synonyms to capture more citation opportunities while the window for looser matches is still open.
Balance breadth with precision where it matters: You can go broad (cover multiple variations) or go precise (optimize tightly for similarity). Both paths can work, but with rising competition, favoring higher similarity now gives you an edge.

Use On-Page Attributes to Scale Content Reach

As AI search evolves, brands that move quickly and align every piece of content to user intent will earn the greatest visibility.

Winning teams structure their content with unmistakable clarity—using precise, relevant language in titles, slugs, and on-page attributes—so both users and answer engines immediately understand what’s on the page and why it matters. Prioritize E-E-A-T principles to make your content more helpful, credible, and discoverable.

For content teams:

Structure content and on-page attributes clearly, optimize for user intent, and maximize your chances of being cited by balancing precision with broad topical coverage.
Refresh content regularly—70% of ChatGPT-cited pages were updated within the past year.

Brands that consistently prioritize clarity, extractability, and relevance will stand out and win visibility in both traditional and AI-powered search.

Ready to future-proof your content for AI search?

Book a call to learn how AirOps helps brands earn, monitor and maintain visibility across both traditional and AI search.
