Yudai Kaneda, a medical student, tested the performance of ChatGPT, an artificial intelligence program, on the national licensing examination for doctors administered in February.
To his surprise, the conversational AI chatbot answered 55 percent of the questions correctly, and it did so for free. The aspiring doctor could not help but be amazed.
ChatGPT's score on Japan's National Examination for Medical Practitioners was not a passing grade, but it was higher than Kaneda's own.
Kaneda says:
“In the future, we might be able to casually ask an AI about the exam’s questions,”
“Our way of studying medicine could change.”
Putting ChatGPT To The Test
ChatGPT is an AI chatbot developed by OpenAI, a U.S.-based startup. Although it is not tailored to specific fields, media reports say it has passed an MBA exam and a medical licensing exam.
The conversational agent is expected to be useful for diagnosing patients, particularly for doctors on the front lines.
Kaneda and his team set out to assess ChatGPT's performance by posing questions from the national exam, which also served to evaluate the chatbot's understanding of Japanese.
Applicants for the national licensing test for doctors must have completed six years of medical studies at a university. Exam takers are also allowed to take the question sheets home after the test.
The Japanese health ministry posts the questions and answer choices from each year's examination on its website. Kaneda manually input all 400 questions from February's exam into ChatGPT.
He obtained the question sheets from a senior student who had taken the exam and brought them home.
He then checked ChatGPT's answers against sample answers provided by a prep school specializing in medical licensing exams.
One question, for example, described a 62-year-old female patient who came to a hospital with fever and a rash as her primary symptoms. The answer choices included fixed drug eruption, which generally occurs after taking certain prescription medications and appears as an itchy rash, and drug-induced hypersensitivity syndrome, which can result from taking over-the-counter cold medications.
ChatGPT provided reasons for its responses, enabling users to understand why a particular choice was selected over the other options.
ChatGPT often produced seemingly reliable information, but in some cases it gave wrong answers. Like humans, it occasionally makes mistakes.
Its answer to the question about the 62-year-old patient, for example, was incorrect.
Overall, ChatGPT gave correct answers to 55 percent of the 389 questions that did not require viewing images.
ChatGPT scored 135 out of 197 points on the mandatory questions, a correct answer rate of 69 percent, below the 80 percent required to pass.
On the general and clinical questions, it scored 149 out of 292 points, a rate of 51 percent, falling short of the roughly 70 percent needed to pass.
Although most of the questions were multiple choice with five options, ChatGPT was unsuccessful in both parts of the exam.
Still, ChatGPT's correct answer rate far exceeded the 20 percent expected from random guessing, making its performance remarkable.
Kaneda says:
“I was honestly surprised at how AI correctly answered more than half of the questions, even though it isn’t designed to answer the questions of the national exam for doctors and is available to everyone,”
“I believe that ChatGPT is as knowledgeable as medical students who are in their first months of the sixth year at universities, the period they start seriously studying for the exam.”
GPT-4, the latest model in the ChatGPT series, has even greater language capabilities than its predecessors. According to Kaneda, GPT-4 correctly answered 16 of 20 questions that earlier versions had gotten wrong.
What Does The Future Hold For Medical Studies?
Tetsuya Tanimoto, a doctor at the Medical Governance Research Institute in Tokyo who compiled the paper with Kaneda, called the AI program's results significant.
Tetsuya Tanimoto says:
“GPT-4 has an incredible level of language ability,”
“It can even write a tanka poem in Japanese, for example.
“If a conversational AI program is developed based on medically credible literature, not dubious blogs or something similar, it could be used for front-line medical services in the not-so-distant future.”
Kaneda himself tried this year's exam questions, which he had obtained from the senior student, but his correct answer rate was only 29 percent, showing he still has much ground to cover.
Kaneda says:
“When I take the exam in two years’ time, I might be able to casually ask an AI program like ChatGPT, ‘Why is this treatment the wrong answer for this question?’ or ‘How should I think about that question?’ I believe that (AI) will change the way of studying medicine.”
Kaneda and his fellow researchers have submitted their paper to an academic journal, where it is under peer review.
However, it is important to note that AI should not be seen as a replacement for human doctors. Rather, it should be viewed as a complementary tool that can help improve patient outcomes and enhance the quality of care. As we move forward, it will be crucial to strike a balance between leveraging the benefits of AI and maintaining the essential role that human doctors play in the healthcare system.
Source: The Asahi Shimbun