How Health Systems Can Leverage LLMs to Automate Tasks Efficiently
Researchers at the Icahn School of Medicine at Mount Sinai have identified strategies for using large language models (LLMs) in health systems while maintaining cost efficiency and performance.
The findings, published in the Nov. 18 online issue of npj Digital Medicine, provide insights into how health systems can leverage LLMs to automate tasks efficiently, saving time and reducing operational costs while ensuring these models remain reliable even under high task loads.
The researchers note that LLMs, such as OpenAI’s GPT-4, offer encouraging ways to automate and streamline workflows by assisting with various tasks. However, running these AI models continuously is costly, which creates a financial barrier to widespread use.
The study involved testing 10 LLMs with real patient data, examining how each model responded to various types of clinical questions. The team ran more than 300,000 experiments, incrementally increasing task loads to evaluate how the models managed rising demands.
Along with measuring accuracy, the team evaluated the models’ adherence to clinical instructions. An economic analysis followed, revealing that grouping tasks could help hospitals cut AI-related costs while keeping model performance intact.
The study showed that LLMs can handle up to 50 clinical tasks grouped together and processed simultaneously without a significant drop in accuracy. Such tasks include matching patients for clinical trials, structuring research cohorts, extracting data for epidemiological studies, reviewing medication safety, and identifying patients eligible for preventive health screenings. This task-grouping approach suggests that hospitals could optimize workflows and reduce API costs by as much as 17-fold, savings that could amount to millions of dollars per year for larger health systems, making advanced AI tools more financially viable.
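To make the task-grouping idea concrete, here is a minimal Python sketch of how grouping might reduce input tokens. It is not the researchers' method. The function names, the prompt layout, and the token counts are all hypothetical. The sketch assumes a simplified cost model in which every API call resends the same clinical note, so several questions sharing one call pay for the note only once.

```python
from typing import List


def chunk_tasks(tasks: List[str], group_size: int) -> List[List[str]]:
    """Split a task list into consecutive groups of at most group_size."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]


def build_prompt(note: str, questions: List[str]) -> str:
    """Combine one clinical note with several numbered questions (hypothetical layout)."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return f"Clinical note:\n{note}\n\nAnswer each question:\n{numbered}"


def input_tokens(note_tokens: int, question_tokens: int,
                 n_tasks: int, group_size: int) -> int:
    """Estimate total input tokens under the assumed cost model.

    Assumption: each call sends the full note once plus the questions
    in that call; output tokens and pricing tiers are ignored.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    n_calls = -(-n_tasks // group_size)  # ceiling division
    return n_calls * note_tokens + n_tasks * question_tokens


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    ungrouped = input_tokens(note_tokens=1000, question_tokens=20,
                             n_tasks=50, group_size=1)
    grouped = input_tokens(note_tokens=1000, question_tokens=20,
                           n_tasks=50, group_size=50)
    print(f"ungrouped: {ungrouped} tokens, grouped: {grouped} tokens")
    print(f"reduction: {ungrouped / grouped:.1f}x")
```

The ratio printed here depends entirely on the assumed note and question sizes and is not meant to reproduce the study's 17-fold figure. In practice, savings would also depend on output tokens, provider pricing, and whether accuracy holds at a given group size.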
“Our study was motivated by the need to find practical ways to reduce costs while maintaining performance so health systems can confidently use LLMs at scale,” explained first author Eyal Klang, M.D., director of the Generative AI Research Program in the Division of Data-Driven and Digital Medicine (D3M) at Icahn Mount Sinai, in a statement. “We set out to ‘stress test’ these models, assessing how well they handle multiple tasks simultaneously, and to pinpoint strategies that keep both performance high and costs manageable.”
“Our findings provide a road map for healthcare systems to integrate advanced AI tools to automate tasks efficiently, potentially cutting costs for application programming interface (API) calls for LLMs up to 17-fold and ensuring stable performance under heavy workloads,” said co-senior author Girish Nadkarni, M.D., M.P.H., Irene and Dr. Arthur M. Fishberg Professor of Medicine at Icahn Mount Sinai, Director of The Charles Bronfman Institute of Personalized Medicine, and Chief of the Division of Data-Driven and Digital Medicine (D3M) at the Mount Sinai Health System, in a statement.