Might ChatGPT Transform Healthcare? Pioneering Leaders Are Working Forward to Find Out

Perhaps no other phenomenon in U.S. healthcare has been the subject of a greater amount of hype, expectation, and confusion, all mixed up together, than has the emergence of artificial intelligence (AI). When it comes to the famous “Gartner hype cycle,” the development of AI for clinical and clinical-operational uses in patient care organizations might at this moment be at the “Peak of Inflated Expectations,” in the “Trough of Disillusionment,” or on the “Slope of Enlightenment,” depending on one’s perceptions. Further complicating matters has been the emergence in the past several months of ChatGPT, a chatbot built on a large language model, developed by OpenAI and launched late last year. That launch has intensified the complexity of a scenario in which needs, expectations, early advances, and yes, of course, many stalls and outright failures add up to a landscape that could hardly be more layered and complicated.

Yet amid all the hype, the expectations, and the confusion, some organizations are making real headway. One example? Michael Hasselberg, R.N., Ph.D., the chief digital health officer at the University of Rochester Medical Center in Rochester, N.Y., has been leading a highly promising effort to streamline messaging intended for clinicians and other staffers in the health system. Speaking to the origin of the initiative, Hasselberg explains that, “The problem has been MyChart messages [generated inside the patient-facing communications platform inside the Epic Systems Corporation electronic health record system] coming into our clinicians’ in-baskets. We have not had a good system to triage those messages, going to a staff member, nurse, or provider. We’ve pretty much been sending all those patient-generated messages to providers [physicians], and that’s caused chaos.”

Given that situation, Hasselberg reports, “Three or four years ago, we decided to focus on this to build natural language processing models to reliably and accurately triage messages, in order to send them to the right individuals.” Fast-forwarding to just a few months ago, he reports, the emergence of ChatGPT has turbocharged work on that project. “We’re excited because we’re one of the health systems that have access to GPT4 in Azure; we have our own instance of Azure. And because we have access to GPT4 on that instance of Azure, it’s secure and private.” And the exciting development has been that Hasselberg and his colleagues have been able to test GPT4 inside Azure, tuning the large language model to very reliably and accurately triage those messages, and, he reports, “It worked within two days.”

Indeed, he reports, “Once we had tuned the model, and prompted it, we ran it multiple times on our data. We looked at reliability: did it consistently send the same exact message to the same people? We got high 90s-level reliability back. And then we pulled random samples out of each of those buckets and sent them to random PCPs and asked them, should that message have gone to a physician, a nurse, a staff member? And the accuracy rate was somewhere around 85. And the boundary was if it wasn’t clear, send it to the provider. If you’re not sure, that’s the default.”
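The evaluation Hasselberg describes has three parts: a default route when the model's answer is unclear, a reliability check across repeated runs, and an accuracy check against clinician review. The Python sketch below is a minimal illustration of that logic and is not the University of Rochester team's code. The function names, the three role labels, and the toy data are all assumptions made for illustration.

```python
# Illustrative sketch only: names, labels, and data are hypothetical.
ROLES = ("staff", "nurse", "provider")
DEFAULT_ROLE = "provider"  # the fallback described above: if unsure, send to the provider


def resolve_route(label):
    """Map a raw model label to a role, defaulting to the provider when unclear."""
    cleaned = (label or "").strip().lower()
    return cleaned if cleaned in ROLES else DEFAULT_ROLE


def reliability(runs):
    """Fraction of messages that received the same route in every run.

    `runs` is a list of runs; each run is a list of routes, one per message.
    """
    per_message = list(zip(*runs))
    if not per_message:
        return 0.0
    stable = sum(1 for routes in per_message if len(set(routes)) == 1)
    return stable / len(per_message)


def accuracy(predicted, reviewed):
    """Fraction of sampled messages where the model's route matches the reviewer's."""
    if not predicted or len(predicted) != len(reviewed):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(p == r for p, r in zip(predicted, reviewed))
    return matches / len(predicted)


# Toy data: three runs of raw model labels over the same four messages.
raw_runs = [
    ["nurse", "staff", "provider", "unclear"],
    ["nurse", "staff", "provider", "provider"],
    ["nurse", "nurse", "provider", ""],
]
resolved_runs = [[resolve_route(label) for label in run] for run in raw_runs]

# Toy reviewer judgments for the same four messages.
reviewer_labels = ["nurse", "staff", "provider", "nurse"]

reliability_score = reliability(resolved_runs)
accuracy_score = accuracy(resolved_runs[0], reviewer_labels)
print(f"reliability: {reliability_score:.2f}, accuracy: {accuracy_score:.2f}")
```

In this sketch, reliability counts a message as stable only if every run routes it identically, which is a strict reading of "did it consistently send the same exact message to the same people." A real evaluation could just as reasonably use a different agreement measure.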

Hasselberg admits that, “Had you come to me six months ago to a year ago, I would have told you that I was really pessimistic around the applicability in the near term of AI in healthcare, because we have a data problem in healthcare: our data is very dirty and very noisy.” But the success in the past couple of months of this experiment has changed his view.

Experts express caution: it’s still early days

There remains widespread caution among many leaders in healthcare, given the high expectations around AI, and particularly around ChatGPT, and the hype in some quarters. Aaron Neinstein, M.D., who holds multiple professional titles, is now chief medical officer at the San Mateo, Calif.-based Notable Health, while he remains an associate professor of medicine in the Division of Endocrinology in the School of Medicine at the University of California, San Francisco, and also holds a position on the Health Information Technology Advisory Committee (HITAC) at the Department of Health and Human Services/the Office of the National Coordinator for Health IT (ONC).

“My lens and organizing principle or philosophy around this is that, yes, I’m excited about where AI could evolve forward in the clinical realm in the future. I do think that that’s years away,” Neinstein says. “Right now, we’re getting into appropriate concerns around trust, bias, responsibility—basically, an FDA-regulated territory, for good reason. And I keep coming back to Eric Topol’s quote in his Deep Medicine book; paraphrasing what he said there, we don’t need AI to cure cancer, just to help doctors and patients restore their relationship. We’ve overloaded doctors, nurses, clinic staff: everyone’s totally overwhelmed and burnt out, with too much to do.” Indeed, he says, “Someone did a study finding that if primary care physicians really did everything they were supposed to do in a day, it would take 30 hours to do. And there’s some element of that in every job in healthcare. So I really see the primary job of AI to help start lifting those burdens away, in all those different places in different workflows. We need to find all those pieces and start using AI there first.”

As for ChatGPT specifically, Neinstein emphasizes that “you definitely cannot use it out of the box, because of privacy and security issues. Health systems are starting to deploy more secure versions of GPT.” He believes that, over time, ChatGPT will be trained for context, and will be used in a number of different situations. “For example, the context of the patient’s record that you’re looking at. For me, it would be useful in helping me generate summaries of patient instructions or patient educational content or material. To do that well, it would need some training in the content of what I usually talk about. And when you lock it down to preserve privacy and security, it won’t do that. So there will sort of have to be mini spinoffs.”

Inevitably, Neinstein says, leaders in patient care organizations will have to integrate the uses of ChatGPT and other advanced technologies into clinician workflow, and that will take a while. He sees what Michael Hasselberg and his colleagues are doing at the University of Rochester Medical Center as hope-inducing in that regard. The challenge, he says, is that, “Ultimately, if you go down the road of building something within the health system, now you’ve basically built a software product. Do you really have the resources to do that?” In other words, he says, the better-resourced health systems will move far faster to integrate ChatGPT and other advanced technologies into clinical workflows.

Looking at the overall landscape nationwide, Mathaeus Dejori and Ryan Nellis see a natural evolution, with bumps along the way, ahead for patient care organization leaders. Dejori is chief data scientist and AI lead, and Nellis is vice president and general manager, at PINC AI™ Stanson Health, a subsidiary of the Charlotte-based Premier Inc. health alliance. PINC AI™ Stanson Health offers a point-of-care clinical decision support solution.

There’s a lot of work ahead, naturally. “These models are very powerful, but they’re very biased,” Dejori says. “And when you start to measure how good they are—these models are trained on internet data, and aren’t necessarily aligned to your domain. So you have to make sure each large language model is tamed to your domain and works in your domain. You can use ChatGPT, but in terms of guaranteeing accuracy and safety, everyone is struggling now to figure that out. And these models are powerful, and the cost aspect is interesting, because you have powerful models with billions of parameters. But do you really need such powerful, expensive models? We do real-time decision support. And you cannot get a real-time response of hundreds of billions of parameters model. So you need for clinicians to play with these models.”

In other words, Dejori emphasizes, it will take a lot of thought and effort for the leaders inside patient care organizations to take large language models and train them on clinical and operational processes inside hospitals, medical groups, and health systems. “People will need to train these large language models on our domains,” he says. “So whether to recommend whether a patient should get a Chest CT, you need to know the patient’s smoking history, for example. And it’s highly complex and written in notes all over the place. And these models can do surprisingly well-ish if you ask the right question, like what is the patient’s smoking history? But they’re not meeting our high-fidelity needs. It’s a matter of steering them into focus in the domains we need, and scaling them, and developing guarantees about the reliability of a model’s performance.”

“We have a data science team and a clinical team,” Nellis says. “And Mathaeus has taught me, and we’ve been building our own models for many years, and there’s this new breed now, with additional computing horsepower, but we need clinical input and guidance. And ChatGPT, you can say, hey, write me a four-paragraph thank-you letter to Mark, and include these three things. And it will do it, but it’s not ready to go from here to there. And that’s where we are in healthcare: these models are impressive, but you still need humans to get into high-fidelity production. And most folks are getting a little bit overhyped. One of our clients, a big Premier member, a big multi-state health system, organized all their clinical and business-line leaders to get together quarterly in person for a whole day. And they think AI is the future, that it will make healthcare faster, more sustainable, and of higher quality, for our patients, and they want to lead the charge. So they’re asking all the right questions. What are the high-value use cases? Mathaeus flew down to one, I did. And there has to be an active participant, using these things in a controlled, but meaningful way.”

Neinstein notes that “There’s this immensely powerful new capability with ChatGPT, and I think that having a very vibrant, broad ecosystem of trying new ways of using it, actually is of value. People don’t necessarily know where the most value will be. I think of trying to do things in a diamond shape: do testing and experimentation, and then converge around where there’s traction and value. So I think it’s appropriate that we’re in a divergent state around this.” Neinstein says that several things will be required, including strategy, intentionality, and “guardrails for safety and trust,” meaning a very conscious choice of which areas to explore and in which areas to attempt to build scale. “It can’t just be a thousand flowers blooming indefinitely.” In that regard, he predicts, “organizations with structure and strategy will do best; progress will be about rapid-cycle learning and innovation.”

What should CIOs and CMIOs be thinking about right now, and how should they be planning? “For me,” Neinstein says, “a fundamental organizing principle around this is that technology and operations cannot exist separately here. This cannot be looked at solely as a technology problem or opportunity; there has to be tight integration of technology with operations, particularly with workflow.”

What health IT leaders should be thinking about right now

And in that regard, asked what health IT leaders should be doing right now, Hasselberg says that, “If you haven’t already, you need to set up AI governance within your health system. Who sits at that table, and what policies are you applying? There’s a ton of work involved in creating the governance and project prioritization processes that will lead organizations to success in this area.” And he adds that, inevitably, the leaders at many patient care organizations will wait until their electronic health record and analytics vendors develop off-the-shelf systems for use, while others will move to “work with the Microsofts, the Amazons, the Googles of the world, and will look to those big companies to provide those services to them.” But he and his colleagues at the University of Rochester have made the commitment to get out ahead of commercialization and develop use cases and solutions that make sense for them and will move their organization forward sooner rather than later.

Nellis says, “First of all, be leaders. And you want to achieve focus, so that this doesn’t become some side project.” Most of all, he urges, “Match a really important business opportunity and business goals, with the potential technology. Pick things that matter,” such as improving workflow and the work lives of clinicians and others in the organization, improving cost-effectiveness or financial performance, and so on, “and be realistic about them.”

Meanwhile, for all the challenges ahead, those who have plunged in see a world of opportunity ahead. “Six months ago, a year ago, given our own experience,” Hasselberg says, “I was very pessimistic about the speed with which AI was going to transform healthcare. Now, I’m really, really excited. This is going to transform healthcare for the good, and it really, really excites me. I think we’ll move from micro-researchers to opportunities at scale quickly, in the next six months to two years.”