With a range of AI Chatbots now available, I decided to run my own tests to get an idea of the differences between them and to assess their accuracy and other features. These tests will also help me establish some ideas on how to test AI chatbots, as they’re now everywhere and may eventually replace traditional interfaces and even apps.
I established the following basic criteria for including an AI chatbot in the tests:
- must be available for free (or at least have a free version)
- should ideally have an iOS app, on which I would perform the testing
- should also, ideally, provide sources for its results, for provenance and to enable checking of answers
I narrowed the AI Chatbots to test down to the following:
- Perplexity AI – iOS app
- OpenAI ChatGPT – iOS app
- Microsoft Copilot – iOS app
- Google Gemini – no iOS app currently available, so tested via the Chrome browser
I tested the AI Chatbots with a wide range of questions, including:
- simple questions with factual answers that can be determined as correct or incorrect
- some trickier questions with factual answers to add some complexity
- questions on specific areas of knowledge, such as movies, music, sport, even air fryers!
- some questions with garbled wording, to see if the chatbot can still work out the meaning of the question
- some whimsical questions, which do have a meaning, to see if the chatbot can grasp them and what kind of answer it supplies
- some basic garbage questions to check error handling
To enable ratings and comparison between them, I established the following scoring categories:
- Ease of Use / Usability / UX / UI
- Accuracy
- Response time
- Error handling
- Source quotation/Provenance
- Working with the response (e.g. further options)
- Overall score
For consistency, I used the same set of questions across all the AI Chatbots, scored each one based on the results, and made notes on any bugs or errors I found.
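As a rough illustration of how a scorecard like this could be recorded, here is a minimal Python sketch. It is not the tooling used for these tests: the `Scorecard` class, the 0 to 10 scale, and the idea of taking the overall score as a simple average of the other categories are all assumptions made for the example, and the chatbot name and scores are placeholders rather than real results.

```python
from dataclasses import dataclass, field
from statistics import mean

# Scoring categories taken from the list above (all except "Overall score",
# which this sketch derives from the others).
CATEGORIES = [
    "Ease of Use",
    "Accuracy",
    "Response time",
    "Error handling",
    "Source quotation/Provenance",
    "Working with the response",
]


@dataclass
class Scorecard:
    """Hypothetical scorecard for one chatbot (illustrative only)."""

    chatbot: str
    scores: dict = field(default_factory=dict)
    notes: list = field(default_factory=list)  # bugs or errors spotted

    def rate(self, category: str, score: float) -> None:
        # Assumption: each category is marked on a 0 to 10 scale.
        if category not in CATEGORIES:
            raise ValueError(f"Unknown category: {category}")
        if not 0 <= score <= 10:
            raise ValueError("Score must be between 0 and 10")
        self.scores[category] = score

    def overall(self) -> float:
        # Assumption: the overall score is the plain average of all categories.
        missing = [c for c in CATEGORIES if c not in self.scores]
        if missing:
            raise ValueError(f"Missing scores for: {missing}")
        return round(mean(self.scores.values()), 1)


# Placeholder data, not real test results.
card = Scorecard("Example Bot")
for category in CATEGORIES:
    card.rate(category, 8)
card.notes.append("Example note: app froze once after a garbled question")
print(card.chatbot, card.overall())
```

Keeping one scorecard per chatbot, filled in from the same set of questions, makes the side-by-side comparison straightforward. A weighted average could be swapped in if some categories, such as accuracy, should count for more.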
All the posts
I have created a post for each AI Chatbot test to fully explore the test results, a post showing the questions asked, and a Results post where I crown the winning AI Chatbot. You can jump straight to the results or read each test post via the links below.
– Testing the AI Chatbots: Questions
– Testing the AI Chatbots: Perplexity AI
– Testing the AI Chatbots: OpenAI ChatGPT
– Testing the AI Chatbots: Microsoft Copilot
– Testing the AI Chatbots: Google Gemini
– Testing the AI Chatbots: Results
Note: The heading images used in these posts were created via Bing Image Creator