Clinical science
Performance of ChatGPT and Bard on the official part 1 FRCOphth practice questions
Thomas Fowler¹, Simon Pullen², Liam Birkett³

  ¹ Department of Medicine, Barking Havering and Redbridge University Hospitals NHS Trust, London, UK
  ² Department of Anaesthetics, Princess Alexandra Hospital, Harlow, UK
  ³ Emergency Medicine, Royal Free Hospital, London, UK

Correspondence to Dr Thomas Fowler; thomas.fowler6{at}nhs.net

Abstract

Background Chat Generative Pre-trained Transformer (ChatGPT), a large language model by OpenAI, and Bard, Google’s artificial intelligence (AI) chatbot, have been evaluated in various contexts. This study aims to assess these models’ proficiency in the part 1 Fellowship of the Royal College of Ophthalmologists (FRCOphth) Multiple Choice Question (MCQ) examination, highlighting their potential in medical education.

Methods Both models were tested on a sample question bank for the part 1 FRCOphth MCQ exam. Their performance was compared with historical human performance on the exam, focusing on the ability to comprehend, retain and apply information related to ophthalmology. We also tested it on the book ‘MCQs for FRCOphth part 1’ and assessed its performance across subjects.

Results ChatGPT demonstrated a strong performance, surpassing historical human pass marks and examination performance, while Bard underperformed. The comparison indicates the potential of certain AI models to match, and even exceed, human standards in such tasks.

Conclusion The results demonstrate the potential of AI models, such as ChatGPT, in processing and applying medical knowledge at a postgraduate level. However, performance varied among different models, highlighting the importance of appropriate AI selection. The study underlines the potential for AI applications in medical education and the necessity for further investigation into their strengths and limitations.

  • Medical Education

Data availability statement

No data are available.

Footnotes

  • Contributors We have all contributed to all parts of this manuscript.

    TF is guarantor.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
