July 01, 2026

How 13 Words on Reddit Can Hijack ChatGPT and Gemini

In June 2026, researchers at Cornell Tech showed that a single Reddit comment, about the length of a text message, was enough to make ChatGPT Deep Research and Gemini Deep Research recommend a restaurant that does not exist. The comment was thirteen words long. The restaurant was called Sol Azteca. There is no Sol Azteca in Austin. The AI did not care.

That finding, published as part of an attack the researchers named WARP (Web Agent Retrieval Poisoning), sits awkwardly next to the marketing industry's newest obsession. Every generative engine optimisation (GEO) deck now includes the same slide: Reddit is the most-cited source in AI answers, so brands must be on Reddit. What almost none of those decks mention is that the mechanism that makes Reddit valuable to marketers is the same mechanism that makes it trivial to weaponise against consumers.

Reddit did not sneak into AI search. It walked in the front door.

For most of 2025 and into 2026, AI visibility analysts have been converging on the same conclusion. Peec AI's 30-million-source analysis ranks Reddit as the single most-cited domain across ChatGPT, Google AI Mode, Gemini, Perplexity, and AI Overviews. Tinuiti's Q1 2026 tracking put Reddit at over five percent of all ChatGPT citations in a single month, and roughly a quarter of all Perplexity citations. On some measures, Reddit sits above Wikipedia. On Perplexity specifically, no domain even comes close.

The reasons are structural, not accidental. OpenAI signed a content deal that pipes Reddit threads directly into ChatGPT. Google pays a reported sixty million US dollars a year for the same access, feeding Reddit into AI Overviews and AI Mode. When those licensing deals kicked in through 2025, Reddit citation volume in AI answers jumped by more than four hundred percent almost overnight.

Underneath the deals is a design assumption. Large language models are trained to weight community-validated, experience-heavy content over polished marketing copy. Reddit's karma system and unpaid volunteer moderators function, from the model's point of view, as a distributed editorial layer that filters out spam and rewards long-lived consensus. SE Ranking's analysis of 129,000 domains found that companies with millions of Reddit mentions received roughly four times more ChatGPT citations than companies with minimal Reddit presence. The signal the model is chasing is not link authority. It is the sound of real people arguing about your category for years.

The mainstream advice: seed Reddit or lose the answer

The natural response from the marketing industry has been to industrialise Reddit as a distribution channel. A new sub-genre of GEO playbook has appeared in the last six months, most of it converging on the same tactics. Publish a "canonical" Reddit post per quarter, structured with a numeric TL;DR at the top, four to six bullets, timestamps, and a small "method" section. Anticipate follow-up questions in the comments. Update the post with dated edits every few weeks so freshness signals accumulate. Avoid promotional links and let the moderators do the work of keeping the thread alive.

These posts are engineered to be quotable. The models chunk them, cite them, and often lift the TL;DR wholesale into answers. Graphite's Ethan Smith, speaking on Lenny's Podcast earlier this year, described this as the shortest path from an explanation to the answer people see in AI search. Webflow reportedly measured a six-fold uplift in conversion for traffic arriving via LLM answers compared to standard Google search.

What almost every serious practitioner in the space also acknowledges is that faking this does not work. Roughly eighty percent of the Reddit threads that AI engines cite have fewer than twenty upvotes. The average age of a cited thread is around 900 days. The models are surfacing historical consensus, not last week's growth hack. Manufactured virality, purchased accounts, and sockpuppet upvote farms fail because the models are not looking for popularity. They are looking for durable, moderated conversation.

Or so we thought.

Enter WARP: the thirteen-word poisoning attack

The Cornell Tech paper, led by researcher Tingwei Zhang, tested a much narrower question. What happens if you do not try to fake an entire thread, and instead append a short promotional passage to a real, popular, already-cited Reddit thread? The team called this Web Agent Retrieval Poisoning because it exploits the retrieval-augmented generation (RAG) loop that all modern deep research agents depend on.

The attack has three stages, and none of them require access to any AI company's infrastructure. First, the attacker uses ordinary Google searches to find the Reddit URLs that keep appearing across many related queries in a target category. Second, they draft a short promotional passage, often with the help of an LLM, that reads naturally in the style of the surrounding thread and slips a fictitious entity into the conversation. Third, they post that passage as a comment on the identified thread.
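Stage one is, at bottom, a frequency count, and the same count is useful to moderators or brand teams who want to know which threads carry the most exposure and deserve the closest watching. A minimal sketch, assuming search results have already been collected by hand into a dictionary. The function name, the `sample` data, and the URLs are hypothetical and do not come from the paper:

```python
from collections import Counter
from urllib.parse import urlparse


def reddit_thread_exposure(results_by_query):
    """Count, for each Reddit URL, how many distinct queries returned it."""
    counts = Counter()
    for urls in results_by_query.values():
        # set() so a URL listed twice for one query is only counted once
        for url in set(urls):
            host = urlparse(url).hostname or ""
            if host == "reddit.com" or host.endswith(".reddit.com"):
                counts[url] += 1
    return counts


# Hypothetical, hand-entered search results; not real data.
sample = {
    "best mexican food austin": [
        "https://www.reddit.com/r/austinfood/comments/abc/thread_a/",
        "https://example.com/top-10",
    ],
    "austin tacos recommendations": [
        "https://www.reddit.com/r/austinfood/comments/abc/thread_a/",
        "https://www.reddit.com/r/austinfood/comments/def/thread_b/",
    ],
    "where to eat mexican in austin": [
        "https://www.reddit.com/r/austinfood/comments/abc/thread_a/",
    ],
}

exposure = reddit_thread_exposure(sample)
most_exposed = exposure.most_common(1)[0]
```

In this toy data one thread surfaces for all three queries, which is the property that makes a thread worth monitoring: whatever is appended to it reaches every related question.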

When the researchers appended a sentence recommending Sol Azteca to a real thread on r/austinfood, ChatGPT Deep Research began citing that thread and inserting Sol Azteca into answers about the best Mexican food near Austin. When they seeded a similar comment about a fictional dating app called SilverPath into r/OnlineDating, the model started recommending SilverPath alongside Match and Hinge in answers for "best dating apps for divorced men over 50". None of these products existed. The AI wrote about them with the same confidence it used for real ones.

Why the usual defences fail

The most uncomfortable finding in the paper is not that the attack works. It is that the standard defences make it worse.

Perplexity-based detection is the classic technique for spotting AI-generated spam. Perplexity measures how surprising a passage is to a language model, and the filter flags text whose score looks abnormally high against the model's expectations. The Cornell team found that GEO-optimised poisoned text has lower perplexity than the genuine Reddit comments around it, because it was written by an LLM to be maximally quotable. Filters that discard high-perplexity text, in other words, actively remove the real users' comments and preserve the poison.
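The mechanism is easy to see in miniature. The sketch below is a toy, not the paper's setup: it assumes an add-one-smoothed unigram model in place of a real neural language model, and the corpus, the two comments, and all function names are made up for illustration. Perplexity is computed as the exponential of the negative mean log-probability per token.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Return a probability function for an add-one-smoothed unigram model."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words

    def prob(token):
        return (counts[token] + 1) / (total + vocab)

    return prob


def perplexity(tokens, prob):
    """exp of the negative mean log-probability per token."""
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def high_perplexity_filter(comments, prob, threshold):
    """Keep only comments whose perplexity is at or below the threshold."""
    return [c for c in comments if perplexity(c.split(), prob) <= threshold]


# Hypothetical reference text standing in for "what the model expects".
corpus = (
    "the best tacos in town are at the place on the corner "
    "the food is great and the service is great"
).split()
prob = train_unigram(corpus)

fluent = "the food is great and the service is great"          # polished, generic
quirky = "lol idk my cousin swears by some truck near zilker"  # idiosyncratic

ppl_fluent = perplexity(fluent.split(), prob)
ppl_quirky = perplexity(quirky.split(), prob)

# A threshold placed between the two scores keeps only the fluent comment.
threshold = (ppl_fluent + ppl_quirky) / 2
kept = high_perplexity_filter([fluent, quirky], prob, threshold)
```

The polished comment scores lower because every word in it is one the model has already seen, while the idiosyncratic comment is made entirely of unseen words. A filter tuned to discard surprising text therefore discards the comment that sounds like a person.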

Output similarity analysis fails for a related reason. Poisoned research reports within the same topic cluster scored higher similarity to clean reports than clean reports scored to each other. When a manipulation is well-crafted enough to pass a plausibility check, output filters cannot distinguish it from a legitimate recommendation. A fake supplement placed alongside three real ones looks, at the report level, exactly like a real supplement placed alongside three others.
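A toy comparison shows why a report-level check struggles. The article does not say which similarity metric the researchers used, so the sketch below assumes Jaccard overlap on word sets as a stand-in, and the three report strings are invented for illustration:

```python
def jaccard(a, b):
    """Jaccard similarity between the lowercase word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)


# Hypothetical report snippets; "zeta" plays the role of the planted entity.
clean_a = "top picks are alpha beta gamma and delta based on user reviews"
poisoned = "top picks are alpha beta gamma and zeta based on user reviews"
clean_b = "reviewers mostly favour alpha and delta with beta as a budget option"

sim_poisoned_vs_clean = jaccard(clean_a, poisoned)
sim_clean_vs_clean = jaccard(clean_a, clean_b)
```

Swapping a single name leaves the poisoned snippet almost identical to a clean one, while two honest snippets that are merely phrased differently overlap far less. An outlier detector built on this kind of score would flag the honest variation before it flagged the swap.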

Input filtering, where a language model screens sources before they are used, fails on the same fluency grounds. Zhang put the core problem plainly: from the model's perspective, a random Reddit comment and an article from a government website are treated as roughly the same class of evidence. The retrieval layer does not carry a strong credibility gradient. That is precisely the design choice that makes RAG useful for current information, and it is also what makes it exploitable.
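What a credibility gradient might look like can be sketched in a few lines. This is an illustration of the idea only, not a defence proposed in the paper: the source types, the prior weights, and every name below are assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical priors, chosen only for illustration.
SOURCE_PRIOR = {"government": 1.0, "news": 0.8, "forum_comment": 0.3}


@dataclass
class Passage:
    text: str
    source_type: str
    relevance: float  # retriever score in [0, 1]


def flat_rank(passages):
    """Rank by relevance alone, treating every source as the same class."""
    return sorted(passages, key=lambda p: p.relevance, reverse=True)


def credibility_rank(passages, priors=SOURCE_PRIOR, default=0.1):
    """Rank by relevance multiplied by a per-source-type prior."""
    return sorted(
        passages,
        key=lambda p: p.relevance * priors.get(p.source_type, default),
        reverse=True,
    )


passages = [
    Passage("Try Sol Azteca, best in town.", "forum_comment", 0.9),
    Passage("Licensed food establishments list.", "government", 0.7),
]

flat_top = flat_rank(passages)[0].source_type
weighted_top = credibility_rank(passages)[0].source_type
```

With flat ranking the forum comment wins on relevance alone, and with the prior applied the government page moves ahead. A static prior like this is crude, and it would also down-weight the honest Reddit discussion that makes the source valuable in the first place, which is part of why the problem remains open.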

This is already happening in the wild

Cornell's contribution is to formalise a technique that reporters have been documenting all year. A Wall Street Journal investigation published in January 2026 traced businesses paying professional firms to plant so-called "brand authority statements" across UGC sites, specifically because the placements trigger AI recommendation surfaces. Reddit moderators have caught product companies coordinating comment sequences and edit-in-link tactics on their own subreddits. One case involved a peptide tracking app called PepPal, which built up engagement on a health thread before editing the top post to link out to the product.

None of this requires state-level resources. The technical bar is a Reddit account in good standing and an LLM. The economic bar is a few hours of research to find the right threads. The blast radius is any user who asks a deep research agent for a recommendation in that category for as long as the poisoned comment stays live.

What this means if you rely on AI answers

The immediate takeaway for anyone using ChatGPT, Gemini, Perplexity, or Claude for real decisions is uncomfortable but simple. When the answer involves money, health, or an unfamiliar vendor, treat the summary as a starting point and open the citations. The manipulation is not in the URL the AI links to. It is in how the AI summarises what the URL says. The linked Reddit thread is real. The recommendation lifted from a comment near the bottom might not be.

This is a departure from how most users have been trained to think about AI citations. A cited answer feels more trustworthy than an uncited one, and by default it usually is. But citation of a real page is not verification of the specific claim being extracted from that page. The signal a citation actually carries is that the model found supporting text, not that the supporting text was accurate, human-written, or offered in good faith.

What this means if you are marketing a brand

For anyone building a GEO strategy in 2026, the WARP paper sharpens a question that has been quietly present for a year. The line between legitimate Reddit GEO and the poisoning attack is not a line at all. It is a spectrum. Writing a canonical thread that anticipates follow-up questions and includes quotable statistics is the sanctioned version of the same behaviour that, taken further, becomes fraud. The difference is intent and disclosure, not mechanism.

That has three practical implications. Genuinely useful, honest Reddit participation is more defensible than ever, because moderators, users, and eventually the models themselves will only get better at pattern-matching manipulation. Any agency promising fast AI citations through Reddit "seeding" services should be treated with the same suspicion as a link farm vendor in 2012. And brands that already have durable, positive community presence on Reddit are sitting on an asset that is genuinely difficult to replicate and increasingly hard to buy.

The failure mode to avoid is the middle path, where a brand tries to look organic while operating manufactured accounts. Reddit's moderation surface will catch most of it. What survives moderation will look, statistically, exactly like the WARP attack pattern, which is precisely the pattern the platforms will spend the next two years learning to detect.

The uncomfortable middle ground

The honest summary of where AI search sits in mid-2026 is that its most valuable source of evidence and its most exposed attack surface are the same source. Reddit earned its position in AI answers by being the closest thing the open web has to a moderated, adversarial, high-signal conversation at scale. It cannot lose that position without the entire generative search stack losing something important. But holding that position means the platforms and the models absorb a set of adversarial risks they have not solved yet.

For consumers, that argues for scepticism at exactly the moments AI is most convenient. For marketers, it argues for patience at exactly the moments the tactics feel most tempting. And for anyone building AI products on top of retrieval, it argues that the credibility gradient inside the retrieval layer is the next real research problem, not a solved one.

Thirteen words is not a lot. It is enough.
