Leveraging Large Language Models for High-Quality Lay Summaries: Efficacy of ChatGPT-4 with Custom Prompts in a Consecutive Series of Prostate Cancer Manuscripts

Rinderknecht, Emily and Schmelzer, Anna and Kravchuk, Anton and Gossler, Christopher and Breyer, Johannes and Gilfrich, Christian and Burger, Maximilian and Engelmann, Simon Udo and Saberi, Veronika and Kirschner, Clemens and von Winning, Dominik and Mayr, Roman and Wuelfing, Christian and Borgmann, Hendrik and Buse, Stephan and Haas, Maximilian and May, Matthias (2025) Leveraging Large Language Models for High-Quality Lay Summaries: Efficacy of ChatGPT-4 with Custom Prompts in a Consecutive Series of Prostate Cancer Manuscripts. CURRENT ONCOLOGY, 32 (2): 102. ISSN 1198-0052, 1718-7729

Full text not available from this repository.

Abstract

Clear and accessible lay summaries are essential for enhancing the public understanding of scientific knowledge. This study aimed to evaluate whether ChatGPT-4 can generate high-quality lay summaries that are both accurate and comprehensible for prostate cancer research in Current Oncology. To achieve this, it systematically assessed ChatGPT-4's ability to summarize 80 prostate cancer articles published in the journal between July 2022 and June 2024 using two distinct prompt designs: a basic "simple" prompt and an enhanced "extended" prompt. Readability was assessed using established metrics, including the Flesch-Kincaid Reading Ease (FKRE), while content quality was evaluated with a 5-point Likert scale for alignment with source material. The extended prompt demonstrated significantly higher readability (median FKRE: 40.9 vs. 29.1, p < 0.001), better alignment with quality thresholds (86.2% vs. 47.5%, p < 0.001), and reduced the required reading level, making content more accessible. Both prompt designs produced content with high comprehensiveness (median Likert score: 5). This study highlights the critical role of tailored prompt engineering in optimizing large language models (LLMs) for medical communication. Limitations include the exclusive focus on prostate cancer, the use of predefined prompts without iterative refinement, and the absence of a direct comparison with human-crafted summaries. These findings underscore the transformative potential of LLMs like ChatGPT-4 to streamline the creation of lay summaries, reduce researchers' workload, and enhance public engagement. Future research should explore prompt variability, incorporate patient feedback, and extend applications across broader medical domains.
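
The readability metric named in the abstract can be illustrated with a minimal sketch. It assumes that FKRE refers to the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words), where higher scores indicate easier text. The function names and the vowel-group syllable heuristic are illustrative assumptions, not the tooling used in the study, which the abstract does not specify.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic, not exact)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a likely silent trailing 'e' (e.g. "make"), but not "-le" or "-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch Reading Ease score; higher means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    simple = "The cat sat on the mat."
    dense = "Prostatectomy outcomes necessitate multidisciplinary evaluation."
    print(round(flesch_reading_ease(simple), 1))
    print(round(flesch_reading_ease(dense), 1))
```

Because syllable counting here is only approximate, scores from this sketch will differ somewhat from those of dedicated readability tools; it is meant to show how sentence length and word length drive the score, not to reproduce the reported medians.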

Item Type:	Article
Uncontrolled Keywords:	patient communication; artificial intelligence in healthcare; language model applications; plain language summaries; lay abstracts; prompt design; medical application; text generation; scientific literacy; readability metrics
Subjects:	600 Technology > 610 Medical sciences Medicine
Divisions:	Medicine > Lehrstuhl für Urologie
Depositing User:	Dr. Gernot Deinzer
Date Deposited:	19 May 2026 06:49
Last Modified:	19 May 2026 06:49
URI:	https://pred.uni-regensburg.de/id/eprint/67447
