These days, a growing number of people worldwide are consulting AI chatbots for medical advice. The question is, can we rely on them to give us accurate advice?

A recent study conducted at Oxford University set out to establish whether AI systems can effectively help people diagnose a condition and choose an appropriate course of action.

The study included nearly 1,300 participants, who were asked to identify potential health conditions and recommend courses of action based on personal medical scenarios developed by doctors.

One group used AI chatbots to assist in their decision-making, while a control group used traditional sources of information. The researchers then evaluated how accurately participants identified the likely medical issues and the most appropriate next step, such as visiting a GP or going to the ER.

Results showed that those who relied on the chatbots made the right choice less than half of the time and that AI correctly identified the problem only about a third of the time.

Although AI platforms excel at standardized tests of medical knowledge, they fall short when it comes to accurate diagnosis because those tests don’t reflect the complexity of interacting with human users.

Researchers found evidence of three specific areas that posed a challenge:

  1. Users often didn’t know what information they should provide
  2. AI provided very different answers based on slight variations in the questions asked
  3. AI often provided a mix of good and bad information, which made it difficult for users to identify the best course of action

Ultimately, AI’s advice was not any better than a simple Google search.

Rebecca Payne, one of the study’s authors, said it could be dangerous to rely solely on this new technology, which might fail to recognize when a person needs urgent medical attention.

Senior author of the study, Associate Professor Adam Mahdi (Oxford Internet Institute) said: ‘The disconnect between benchmark scores and real-world performance should be a wake-up call for AI developers and regulators. We cannot rely on standardised tests alone to determine if these systems are safe for public use. Just as we require clinical trials for new medications, AI systems need rigorous testing with diverse, real users to understand their true capabilities in high-stakes settings like healthcare.’

Clearly, AI isn’t qualified to provide clear medical guidance… yet. On the plus side, although the technology is still in its infancy, some doctors find that incorporating information provided by AI into their own assessments can help them spot illnesses in patients, which could lead to new discoveries.

As for the rest of us, who are still largely flying by the seat of our pants, it’s probably best to continue relying on traditional doctor–patient interaction, at least for the time being.