OpenAI ChatGPT
Tested iOS App version 1.2024.052 (22222) – Says ChatGPT 3.5 in header – March 2024
Answer to Question 1
Screenshot of OpenAI ChatGPT’s answer to Question 1 – When was the Battle of Hastings? – showing how sparsely it answers questions: just a short piece of text, with no source information or sharing options. I found this a bit underwhelming, having seen the answer Perplexity AI provided.
Accuracy Issues
OpenAI ChatGPT struggled with accuracy on the follow-up question for Question 6 – How many times has Francis Ford Coppola won the Best Director Oscar? – answering twice, as can be seen in the screenshot. Strangely enough, all the AI Chatbots I tested struggled with this question, saying twice when the correct answer is once.
There was also an accuracy issue for Question 7 – Name an actor who has been nominated for most Oscars without winning one? – as it correctly answered with Peter O’Toole, with 8 nominations, but didn’t mention Glenn Close at all, and she has also had 8 nominations. The other AI Chatbots correctly mentioned both Peter O’Toole and Glenn Close in their answers.
And there was also an accuracy issue for the follow-up question for Question 9 – Who was the first manager of Tottenham Hotspur FC? – as it answered John Cameron, whereas it was actually Frank Brettell.
OpenAI ChatGPT scores were as follows:
- Ease of Use / Usability / UX / UI – Score = 8
- Easy to use: just type in questions and view the answers. A bit limited, as it is text only and the answers often contain little text. I’d class them as sparse.
- Accuracy – Score = 7
- 3 points off for errors in the answers to 3 of the questions.
- Response time – Score = 10
- Very quick all the way through
- Error handling – Score = 10
- Worked well with garbled questions, nonsense questions and garbage data
- Source quotation/Provenance – Score = 0
- No sources given at all, so have to score it 0
- Working with the response e.g. further options – Score = 4
- Doesn’t provide anything further to work with beyond the usually sparse answer, so again a low mark here
- Overall rating and score – Overall Score = 6.5
- Quick responses but often sparse answers. Provides only text answers, with no sources, media, links or ways to share. It got 3 of the questions wrong, so accuracy is also a bit worrying.
- Ranking = 3
- Overview – Fairly average performance, with sparse answers and no sources, accompanying media or links. It got 3 of the questions wrong, which is also poor, and it scraped into 3rd place with a 6.5 rating.
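As a quick check, the overall score of 6.5 is consistent with a plain, unweighted average of the six category scores listed above. The sketch below assumes that is how the overall score was worked out, which the scoring list itself does not state; the dictionary name and its keys are shortened labels made up for this illustration.

```python
# Category scores for OpenAI ChatGPT, copied from the list above.
# The key names are illustrative shorthand, not official category labels.
chatgpt_scores = {
    "ease_of_use": 8,
    "accuracy": 7,
    "response_time": 10,
    "error_handling": 10,
    "source_quotation": 0,
    "working_with_response": 4,
}

# Assumption: the overall score is the unweighted mean of the six categories.
values = list(chatgpt_scores.values())
overall = sum(values) / len(values)

print(f"Total points: {sum(values)} across {len(values)} categories")
print(f"Overall score: {overall}")
```

On this reading, the 0 for sources and the 4 for further options are what pull the overall score down, despite the two perfect 10s.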
More info on OpenAI ChatGPT – main website
Different versions of ChatGPT are available, including the free GPT-3.5 on web, iOS and Android, and paid versions such as Plus, which features GPT-4 for a monthly subscription.
All the posts
I have created a post for each AI Chatbot test in order to fully explore the test results, along with a Results post, where I crown the winning AI Chatbot. You can jump straight there or read each test post via the links below.
– Testing the AI Chatbots: Questions
– Testing the AI Chatbots: Perplexity AI
– Testing the AI Chatbots: OpenAI ChatGPT
– Testing the AI Chatbots: Microsoft Copilot
– Testing the AI Chatbots: Google Gemini
– Testing the AI Chatbots: Results
Note: The heading images used in these posts were created via Bing Image Creator