Perplexity AI’s bones are built on GPT-3.5, but it has a connection to the open internet for up-to-date information.
Perplexity AI
Imagine if ChatGPT could pull answers from Reddit. That’s the best way to describe Perplexity AI, a conversational generative AI founded by Aravind Srinivas, a former research scientist at OpenAI, the creators of ChatGPT. Perplexity looks and feels a lot like ChatGPT 3.5, the free version of the popular AI chatbot, except it has a connection to the open internet. This means it not only pulls information from sites like Reddit and X (formerly known as Twitter) but links to them, too. ChatGPT 3.5, on the other hand, is limited to data collected up to September 2021 and can’t link to sources. It’s unclear whether ChatGPT uses Reddit or X as part of its training data.
When it comes to shopping recommendations or general research, being able to see the source information is invaluable. Clicking on a Reddit link inside Perplexity allows you to see the full conversation thread between users, helping to get more context. Like Google Gemini, another freely available generative AI engine, Perplexity feels like a blend of AI chatbot and search engine. Perplexity does falter in research and synthesizing information at times, failing to hold its own against Anthropic’s Claude.
How 4Foo tests AI chatbots
4Foo takes a practical approach to reviewing AI chatbots. Our goal is to determine how good an AI is relative to the competition and which purposes it serves best. To do that, we give the AI prompts based on real-world use cases, such as finding and modifying recipes, researching travel or writing emails. We score the chatbots on a 10-point scale that considers factors such as accuracy, creativity of responses, number of hallucinations and response speed. See How We Test AI for more.
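As a rough illustration of how a 10-point score might combine those factors, here is a minimal sketch. It is not the actual rubric: the function name, the factor keys, the equal default weights and the idea that each factor gets its own 0-to-10 subscore (with a higher hallucination subscore meaning fewer hallucinations) are all assumptions made for the example.

```python
# Hypothetical sketch only: the factor names, weights and formula below are
# illustrative assumptions, not the published scoring method.
FACTORS = ("accuracy", "creativity", "hallucinations", "speed")


def overall_score(subscores, weights=None):
    """Combine per-factor subscores (each 0 to 10) into one 10-point score.

    The "hallucinations" subscore is assumed to be higher when the chatbot
    hallucinates less, so that higher is better for every factor.
    """
    if weights is None:
        # Assumption: every factor counts equally unless told otherwise.
        weights = {factor: 1.0 for factor in FACTORS}

    for factor in FACTORS:
        if not 0 <= subscores[factor] <= 10:
            raise ValueError(f"{factor} subscore must be between 0 and 10")

    total_weight = sum(weights[factor] for factor in FACTORS)
    weighted_sum = sum(subscores[factor] * weights[factor] for factor in FACTORS)

    # Weighted average, rounded to one decimal place.
    return round(weighted_sum / total_weight, 1)


# Made-up subscores for a made-up chatbot.
example = {"accuracy": 8, "creativity": 6, "hallucinations": 7, "speed": 9}
print(overall_score(example))
```

Passing a different `weights` dictionary, for instance one that counts accuracy twice as heavily as the other factors, shifts the result without changing the scale.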
Perplexity collects data for AI improvement by default, but you can opt out by turning off the AI Data Usage toggle in Perplexity’s settings. For more information, see Perplexity’s Privacy Policy and data collection FAQ.
Shopping
Generally, when trying to decide between buying two very similar products, it helps to get some opinions that can demarcate key differences to make the final choice easier. This is why people turn to reviewers or forum threads to synthesize varying sets of opinions.
An AI chatbot should do a good job of summarizing all that back-and-forth so that you don’t have to read through paragraphs of text.
While Perplexity does look to sources like Rtings, Tom’s Guide and WhatHiFi when asked which TV to buy between the LG OLED C3 and G3, it doesn’t do a great job of parsing the finer details to give you better context.
For example, when I asked Perplexity to choose between LG’s top OLEDs, it recommended buying the more expensive G3 if your budget allows it. That’s a totally fair conclusion, but Perplexity fails to make a convincing argument for it. It justifies paying nearly an extra grand for the G3 because it’s 70% brighter than older OLED TVs, but it doesn’t specify which older OLEDs it’s comparing the G3 to. While the G3 does have a brighter panel, CNET’s TV expert David Katzmaier notes in his LG OLED C3 review that the G3 doesn’t surpass it by leaps and bounds. It’s why both the C3 and G3 sit on our best TVs of 2024 list.
A more nuanced take would be that the G3 is overall the better television in terms of both picture quality and brightness, but most people might find it difficult to justify spending nearly $1,000 more for it, especially those jumping into the world of OLED TVs for the first time.
On the LG OLED subreddit, many TV shoppers ask if it’s better to buy a 65-inch LG OLED G3 or spend the equivalent amount of cash on a 77-inch LG OLED C3 instead. The consensus generally is that bigger is better. When posed the same question, Perplexity, too, sourced Reddit for inspiration and came away with the same conclusion. Katzmaier agrees that this is always the better choice.
Oddly, when asked to compare the older 2019 LG C9 OLED and the 2023 LG OLED C3 (it’s confusing, I know), Perplexity started to hallucinate. At first, it just did a comparison between the C3 and G3. When pressed to specifically compare the C3 to the C9, then it started giving incorrect information, such as the C3’s inclusion of MLA technology for higher brightness. In reality, MLA is currently only available in the higher-end G3 and M3 models.
All in all, Copilot (in creative mode) and Claude performed the best, giving both precise information and relatable buying advice. Perplexity performed on par with Google Gemini. Since ChatGPT 3.5’s training data only runs up to September 2021, it couldn’t be used for this specific shopping comparison.
Recipes
AI could very well upend the online recipe world. Many recipes online open with long dissertations about eating Mom’s Sunday dinner, and this is often done to rank higher in Google through search engine optimization, or SEO. It’s why many online articles feature question-marked subheads that restate common search queries.
All that added text is so that Google can “crawl” these recipe sites and figure out which ones should filter to the top. But for readers, it can mean lots of unnecessary text.
AI doesn’t need to write for Google. It aims to generate succinct answers to pretty much any question. Plus, Perplexity AI really can’t recall eating Grandma’s apple pie in the first place.
When I asked Perplexity to generate a marinade for chicken tikka masala, it created a middling recipe overall. It had ingredients like ginger and garlic paste, ground cumin and turmeric, but was missing things like chili powder. Granted, not all recipes call for chili powder, but it is an odd exclusion. When asked again, Perplexity generated a recipe that did include both red chili powder and red chili paste. These results were similar to ChatGPT 3.5’s. Only Google Gemini produced recipes that included more exotic ingredients like kasuri methi (dried fenugreek), chaat masala and amchur (dried mango powder).
Research and accuracy
Perplexity AI’s biggest strength over ChatGPT 3.5 is its ability to link to actual sources of information. Where ChatGPT might only recommend what to search for online, Perplexity doesn’t require that back-and-forth fiddling.
When I asked for studies about how or if homeschooling affects neuroplasticity, Perplexity did a decent job of linking to some papers that could be helpful. While none of the studies cited made direct links to how homeschooling might affect young minds, Perplexity did surface papers about home-based motor learning and other general information.
Perplexity, oddly, did cite a nonscholarly source from what looks to be a homeschool advocacy website. Obviously, the information here isn’t an objective analysis, and instead leans more on why, from a religious perspective, it might be better to school kids at home.
Unlike Claude and Copilot, Perplexity failed to synthesize information from sources. It’s one thing to point to pieces of information like a search engine; it’s another thing entirely to start making connections between two sets of research. Perplexity also stated that the pieces of research cited definitively proved the benefits of homeschooling for childhood brain development, which isn’t quite the case. At least Perplexity didn’t hallucinate in the same ways that ChatGPT 3.5 or Google Gemini did.
A slight edge here goes to Claude, followed closely by Copilot.
Summarizing
Don’t turn to Perplexity to summarize articles. While the AI engine can get the basic gist of the article, it fails to grab the central crux or argument.
I asked Perplexity to summarize a feature I wrote during CES earlier this year. As with Google Gemini, it’s possible to just paste a link to the article and Perplexity will generate a bare-bones summary. It generated more detail than Gemini, but not by much.
When I pasted the text of the entire article into Gemini, it did a much better job of summarization. When I attempted the same test in Perplexity, it oddly generated the exact same response as when I input the website link. Still, at least it didn’t have a character limit like ChatGPT 3.5. This does make it more useful, but without calling out key points or pulling quotes from the experts I spoke to, Perplexity doesn’t do enough to give users a well-rounded understanding.
Claude and Copilot performed the best, generating an adequate summary, but still glossing over the main crux of the piece.
Travel
Major cities around the world have guidebooks, influencers and websites dedicated to showcasing their best sights and eats. Smaller Midwestern cities don’t have that same privilege. Turning to AI for recommendations on what to do in Columbus, Ohio, for example, could prove to be handy. Compared to Google Gemini and ChatGPT 3.5, Perplexity passed this test with decent marks.
For a three-day travel itinerary to Columbus, Perplexity made solid recommendations to visit sites like the Franklin Park Conservatory or the Columbus Zoo and Aquarium. Weirdly, neither Google Gemini nor ChatGPT 3.5 recommended the Columbus Zoo, which happens to be one of the largest zoos in the US.
Where Perplexity faltered was in food recommendations. Apart from Day 1, it didn’t suggest any specific places to try, instead vaguely stating to dine at “one of the local ethnic restaurants.” ChatGPT 3.5, by comparison, made strong restaurant recommendations. At least Perplexity didn’t hallucinate in the same way Gemini did by making up restaurants that didn’t exist.
Copilot performed the best, followed by Claude. Copilot cleanly laid out a list, with pictures and emojis, making it easy to follow.
Writing emails
Writing routine emails to bosses or colleagues is a great way to use AI. When I had it draft an email asking for time off from work, Perplexity performed better than ChatGPT and about on par with Google Gemini. Perplexity’s formal and informal-sounding emails came off as earnest and very humanlike.
By comparison, Gemini’s formal-sounding email wasn’t totally usable, as it asks you to insert your company’s floating holiday policy. I suspect most people don’t copy-paste blocks of text from the employee handbook when asking for time off.
When it came to writing more complicated emails about difficult topics that delve into morality, capitalism and the role of consent, Perplexity made a decent outline, but wasn’t good enough at crafting an email that would pass as something written by a human. The language was robotic, lacking the creative turns of phrase that help the reader see the image or argument being conveyed. It also leaned into cliched language that, at best, might pass in a high school English class.
While Perplexity did use some multisyllabic words, it ultimately came off as vacuous. Don’t ask Perplexity to write a pitch to your film script. It’ll definitely fall flat in front of movie executives.
Claude performed the best in this task, juggling complexities and moral qualms in a manner that came across as human. ChatGPT and Gemini did a decent job, but their language was a bit too robotic and likely wouldn’t pass editorial muster.
Strangely, Copilot refused to answer questions about sensitive topics.
Perplexity flies where ChatGPT falls
I give Perplexity AI credit. It delivers a compelling generative AI experience that can compete against the biggest names in tech like Google and Microsoft. Perplexity’s use of the open web and its ability to pull from social media sites like Reddit and X give it context and talking points missing in ChatGPT. (OpenAI hasn’t confirmed what data ChatGPT pulls from, but I suspect it doesn’t heavily rely on Reddit or X.)
Should Perplexity be your default free generative AI platform? Maybe. I’d certainly recommend it over Google Gemini and ChatGPT 3.5. But as good as Perplexity is, it’s hard to recommend it over Claude or Copilot. Claude, which runs on Anthropic’s own models rather than GPT-3.5, and Copilot both feel better tuned to give nuanced answers with greater informational synthesis. Still, what the team has put together at Perplexity is worthy of praise.