Synthetic Data Pioneer MDClone Creates Health System Network for Research

July 15, 2020
Intermountain Healthcare, Jefferson Health, Washington University in St. Louis, Regenstrief Institute among inaugural members

MDClone, an Israel-based company that has pioneered the use of synthetic data sets for research, has announced the creation of a Global Network of health systems. Members will use the MDClone platform, installed at each network site, to develop solutions and explore ideas together to improve patient health.

Among the health systems involved are Intermountain Healthcare, Jefferson Health, Washington University in St. Louis, Regenstrief Institute, Jewish General Hospital and the Ottawa Hospital in Canada, and Sheba Medical Center in Israel.

MDClone creates a synthetic copy of healthcare data collected from actual patient populations. While the synthetic data set is virtually identical to the original data, there's no identifying information that can be traced back to individual patients, the company said. MDClone says its self-service platform, which gives access to synthetic data without the need for institutional review board (IRB) approval, can help reduce cycles of discovery from months and years to hours and days. Intermountain Healthcare has used the platform to build a program for managing its chronic kidney disease population, and Washington University School of Medicine in St. Louis has developed a machine-learning model for predicting sepsis. Some of the first projects underway in the Global Network are focused on understanding treatments and outcomes for patients with COVID-19.

In its first year, the Global Network will focus on three pillars of research: health services, clinical medicine and precision medicine. Members will be able to conduct projects across multiple sites, including study design and replication, testing new approaches and partnering with external healthcare organizations. These projects can be conducted without sharing individual patient information, using synthetic data generated by MDClone at each member site from more than 30 million total patients within the network.

“In a short time, we’ve seen dramatic impacts with physicians implementing new quality initiatives and researchers moving from idea to publication,” said Ziv Ofek, MDClone’s founder and CEO, in a statement. “The Global Network will take this to another level by bringing together some of the world’s most innovative healthcare organizations and transforming insights into action. Ultimately, the Global Network will create new technologies and services, built-in collaboration across the membership and benefiting patients worldwide.” 

David Nash, M.D., is founding dean emeritus of the Jefferson College of Population Health, where he remains on the full-time faculty as the Dr. Raymond C. and Doris N. Grandon Professor of Health Policy. He also serves as a special assistant to Jefferson Health CEO Stephen Klasko, M.D., M.B.A.

In that role, he evaluated the potential of MDClone and whether Jefferson should join the network. In a July 14 interview, he described visits in 2019 to Israel and to Salt Lake City with a team of Jefferson executives to see the work being done at Intermountain. “We brought four use cases with us in areas such as quality and research to see if it would be possible to work on them. By the end of the day we had gone through all four use cases and every one of my guys said they were blown away.”

Jefferson decided to get involved, and Nash is working on the launch of the Global Network and creating the governance structure to make it effective, including the establishment of an executive board. “This is a hard job, but I think all the organizations have a commitment to innovation and disruption. We are willing to ask some really interesting self-evaluation questions, and we would like to learn from other organizations,” Nash said. “We are willing to say, Intermountain has this cool chronic kidney disease effort: maybe we can do that, too. Regenstrief has a 30-year history of doing this stuff. We don’t have a school of computer science. We are not an engineering powerhouse. Let’s learn from them.”

Nash stressed that the network would not work without the synthetic data. “If it involved PHI [protected health information], the lawyers would never let us do it,” he said. “The cornerstone is the clone. The take-home message is benchmarking to improve using synthetic data.”

The Global Network will pick its own projects on a case-by-case basis, but Nash said he thinks COVID is going to be the top priority for at least the next year. “After that, we will start looking at things that matter in each geography.”

The NIH’s National COVID Cohort Collaborative (N3C) also recently announced that MDClone’s synthetic data will be piloted as part of its data enclave. The project’s website says that “currently, synthetic data is being validated through a pilot that includes Washington University in St. Louis, University of Indiana and University of Washington. The purpose of the pilot is to establish whether the Limited Data Set data can be fully de-identified and whether the synthetic data derivative is statistically sound and can be used to accurately derive results.”

In 2019, Philip Payne, Ph.D., director of the Institute for Informatics at Washington University in St. Louis, described some of the institute's early work with MDClone. "To both protect patient privacy and be able to analyze health data in a meaningful way, we required a platform that would give us the ability to generate data sets that look and feel like data from real patients," he said. "The synthetic data that was produced by this platform is statistically identical to data from real patients, but it can't be associated with individual patients. This solution also allows us to quickly ask and answer important research questions that can improve the care we provide to patients and the health of the communities we serve."

To validate MDClone's platform, teams at Washington University selected three pilot studies to compare MDClone's synthetic data against the original data. In one project, researchers evaluated factors that influence pediatric admissions to the intensive care unit; another looked at whether a machine learning algorithm can predict sepsis; and a third evaluated the prevalence of sexually transmitted infections by location. The projects were selected because of their interest to the healthcare community and the diversity of data and statistical models required for the analyses. In each pilot study, the statistical analyses showed that the synthetic data is a valid stand-in for the original data, the Institute said.
