May 19, 2024
Generative AI

Generative AI in Patient Messaging Systems: Balancing Opportunities and Challenges, According to a New Study

A recent investigation led by Mass General Brigham researchers has shed light on the potential benefits and limitations of using large language models (LLMs) in patient messaging systems. The study, published in The Lancet Digital Health, reveals that while LLMs can help alleviate physician workload and enhance patient education, they also pose certain risks that necessitate careful oversight.

Amid clinicians' growing administrative and documentation responsibilities, electronic health record (EHR) vendors have begun adopting generative AI algorithms to help draft messages to patients. However, the efficiency, safety, and clinical impact of LLM use in this context remained unclear.

Danielle Bitterman, MD, a faculty member in the Artificial Intelligence in Medicine (AIM) Program at Mass General Brigham and a radiation oncologist at Brigham and Women’s Hospital, explained, “Generative AI offers a promising solution to reduce clinician burden and educate patients effectively. However, our team’s experience working with LLMs has raised concerns about potential risks associated with their integration into messaging systems.”

To identify the benefits and pitfalls of LLM implementation, the researchers used OpenAI's GPT-4, a foundation model, to generate 100 patient messaging scenarios with accompanying questions. Six radiation oncologists first drafted responses to the questions manually; GPT-4 then generated its own responses, which the same radiation oncologists reviewed and edited.

The physicians reported that LLM assistance improved their perceived efficiency. They deemed the LLM-generated responses safe in 82.1% of cases and acceptable to send to a patient without further editing in 58.3% of cases. However, 7.1% of the unedited LLM-generated responses could have harmed patients, and 0.6% posed a risk of death, primarily because they failed to urgently instruct patients to seek immediate medical care.

Interestingly, the physician-edited LLM responses were more similar in length and content to the original LLM drafts than to the physicians' manually drafted responses. Physicians often retained the educational content generated by the LLM, suggesting they saw value in it for promoting patient education.

Bitterman added, “As providers increasingly rely on LLMs, it’s crucial to ensure that systems monitor their quality, clinicians are trained to supervise their output, and both patients and clinicians possess adequate AI literacy. Ultimately, a better understanding of how to address LLM errors is essential.”

The researchers are currently investigating how patients perceive LLM-based communications and the influence of patients’ racial and demographic characteristics on LLM-generated responses, given the known biases in LLMs.
