PT - JOURNAL ARTICLE
AU - Zheng, Ce
AU - Ye, Hongfei
AU - Guo, Jinming
AU - Yang, Junrui
AU - Fei, Ping
AU - Yuan, Yuanzhi
AU - Huang, Danqing
AU - Huang, Yuqiang
AU - Peng, Jie
AU - Xie, Xiaoling
AU - Xie, Meng
AU - Zhao, Peiquan
AU - Chen, Li
AU - Zhang, Mingzhi
TI - Development and evaluation of a large language model of ophthalmology in Chinese
AID - 10.1136/bjo-2023-324526
DP - 2024 Jul 17
TA - British Journal of Ophthalmology
PG - bjo-2023-324526
4099 - http://bjo.bmj.com/content/early/2024/07/17/bjo-2023-324526.short
4100 - http://bjo.bmj.com/content/early/2024/07/17/bjo-2023-324526.full
AB - Background Large language models (LLMs), such as ChatGPT, have considerable implications for various medical applications. However, ChatGPT’s training primarily draws from English-centric internet data and is not tailored explicitly to the medical domain. Thus, an ophthalmic LLM in Chinese is clinically essential for both healthcare providers and patients in mainland China. Methods We developed an LLM of ophthalmology (MOPH) using Chinese corpora and evaluated its performance in three clinical scenarios: ophthalmic board exams in Chinese, answering evidence-based medicine-oriented ophthalmic questions and diagnostic accuracy for clinical vignettes. Additionally, we compared MOPH’s performance to that of human doctors. Results In the ophthalmic exam, MOPH’s average score closely aligned with the mean score of trainees (64.7 (range 62–68) vs 66.2 (range 50–92), p=0.817), and MOPH scored above 60 in all seven mock exams. In answering ophthalmic questions, 83.3% (25/30) of MOPH’s responses adhered to Chinese guidelines (Likert scale 4–5). Only 6.7% (2/30, Likert scale 1–2) and 10% (3/30, Likert scale 3) of responses were rated by reviewers as ‘poor or very poor’ and as containing ‘potentially misinterpretable inaccuracies’, respectively.
For diagnostic accuracy, the rate of correct diagnosis by ophthalmologists was higher than that by MOPH (96.1% vs 81.1%), but the difference was not statistically significant (p>0.05). Conclusion This study demonstrated the promising performance of MOPH, a Chinese-specific ophthalmic LLM, in diverse clinical scenarios. MOPH has potential real-world applications in Chinese-language ophthalmology settings. Data are available upon reasonable request.