Perplexity AI
Tested iOS App version 2.13.0 in February 2024
Answer to Question 1
Screenshot of Perplexity AI answer to Question 1 – When was the Battle of Hastings? – showing the way Perplexity AI displays its results, with sources, image gallery, etc.
Accuracy Issues
Screenshot of Perplexity AI answer to Question 6 – Who has won the Best Director Oscar the most times?
This shows one of the accuracy issues that Perplexity AI got marked down on. It got the actual answer correct – John Ford, with 4 Best Director Oscars – but then incorrectly states ‘He is followed by Francis Ford Coppola, who has won the award five times’. Francis Ford Coppola has won 1 Best Director Oscar but 5 Oscars overall, so the issue is really the wording: it says ‘the award’ when the last reference to the award was ‘Best Director Oscar’, not all Oscars.
Also, the follow-up question – How many times has Francis Ford Coppola won the Best Director Oscar? – leads to inaccuracies. In the original test in February it stated he has won it 3 times, and in a retry on 9 March it said he has won it 2 times – whereas he has won it once only, for The Godfather Part II.
I assume that over time, with updated versions of the software and models, these kinds of issues will gradually disappear, but it’s an interesting test and result because it’s a question you can check yourself and verify the answer – in other cases, you might be relying on the answer provided.
Perplexity AI scores were as follows:
- Ease of Use / Usability / UX / UI – Score = 9
- Very good all the way through. Easy to use and navigate. Enjoyed working with the answers.
- Accuracy – Score = 8
- 2 points off for errors in some of the answers – see the explanation and screenshots above for Question 6 and its follow-up question – but otherwise very good. The errors were for questions in fairly specific areas, but it should still have supplied correct answers.
- Response time – Score = 10
- Very quick all the way through.
- Error handling – Score = 10
- Worked well with garbled questions and nonsense questions and garbage data.
- Source quotation/Provenance – Score = 10
- Gave 4 or 5 sources every time, and the ones I tapped were relevant.
- Working with the response e.g. further options – Score = 10
- Provides text, images, sources and the chance to share the answer, so you can view it on a desktop browser, for instance.
- Overall rating and score – Overall Score = 9.5
- Very good all round; the only negative points are a couple of errors in the answer text for Question 6 and its follow-up question. Answered general questions and more specific questions, and handled nonsense questions and garbage data.
- Ranking = 1
- Overview – A much fuller experience, with answers offering more info, images, sources and links than the others. A couple of errors in the answer text prevented a 10/10.
More info on Perplexity AI – main website
According to this Wired article, Perplexity AI uses a combination of OpenAI’s GPT, Meta’s open-source Llama AI model, a model from French startup Mistral AI, and Anthropic’s Claude – all wrapped up in Perplexity’s own ‘answer engine’.
All the posts
I have created a post for each AI chatbot test in order to fully explore the test results, along with a Results post where I crown the winning AI chatbot. You can jump straight there, or read each test post via the links below.
– Testing the AI Chatbots: Questions
– Testing the AI Chatbots: Perplexity AI
– Testing the AI Chatbots: OpenAI ChatGPT
– Testing the AI Chatbots: Microsoft Copilot
– Testing the AI Chatbots: Google Gemini
– Testing the AI Chatbots: Results
Note: The heading images used in these posts were created via Bing Image Creator
About my testing services: iOS App Testing / Android App Testing / Website Testing / AI Chatbot Testing