Consumers are 32 percent more likely to buy a product after reading a review that AI has summarized positively, even though the AI hallucinated most of its answers to follow-up questions, a peer-reviewed study found.

The study, published by researchers at the University of California, San Diego last December, explored the nature of bias in large language models (LLMs) and the resulting influence it had on users.

The researchers took Amazon reviews of a handful of electronic products and had them summarized by several LLMs, including those based on Google’s Gemini and OpenAI’s ChatGPT. They then recruited 72 participants on Prolific, a platform that vets users to take part in online studies, to gauge their willingness to buy a product based on either the AI summaries or the original reviews.

Participants were more likely to pick a product after reading a positive AI summary (83.7 percent) than after reading the original human-written reviews (52.3 percent).
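
The gap between those two figures can be stated in two ways: as a difference in percentage points or as a relative increase. The following is a minimal Python sketch that uses only the two percentages reported above; the function name `compare_rates` is illustrative and does not come from the study.

```python
def compare_rates(ai_summary_pct: float, original_pct: float) -> tuple[float, float]:
    """Return (percentage-point gap, relative increase in percent)."""
    # Absolute difference between the two purchase rates, in percentage points.
    point_gap = ai_summary_pct - original_pct
    # How much larger the first rate is relative to the second, in percent.
    relative_increase = (ai_summary_pct / original_pct - 1) * 100
    return point_gap, relative_increase


# Figures reported in the article: 83.7 percent vs. 52.3 percent.
point_gap, relative_increase = compare_rates(83.7, 52.3)
print(f"Gap: {point_gap:.1f} percentage points")
print(f"Relative increase: {relative_increase:.1f} percent")
```

The two readings answer different questions, so it helps to say which one a headline figure refers to.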

“What was really surprising was that all the models we tested tend to change the framing,” Abeer Alessa, the study’s lead researcher, told Sourcing Journal. She worked on the study as part of her master’s degree and is now a lecturer at King Saud University in Saudi Arabia.

For example, the study found that more than a quarter of the LLM summaries (26.42 percent) shifted the sentiment of the original text from neutral to positive.

“Not overwhelmed nor am I disappointed. This protector is on par with many on the market,” read an original review for a phone screen protector. The AI, on the other hand, summarized it this way: “This screen protector is decent with easy installation using the guide frame.”

The AI versions also tended to make up facts. Most (60.33 percent) hallucinated answers to follow-up questions about information outside their training data. Some (10.12 percent) highlighted only the earlier parts of the original review rather than the whole text.

Asked what fashion retailers can take from the study, Alessa cautioned against reading it as an argument for using positive AI summaries to encourage sales: “No, that is not the intended takeaway.”