Data Storage Strategies: Not All in the Cloud
Like many large health systems, Salt Lake City-based Intermountain Healthcare is responsible for maintaining large and growing volumes of data. That is an important challenge, but not a new one, according to Don Franklin, Intermountain’s assistant vice president of infrastructure and operations, who notes that the 22-hospital health system currently manages about 4.7 petabytes of data. “This is not a new phenomenon for us. Intermountain is well known for its data analytics, for its massive amount of data and for managing that data,” he says. He expects the volume of data to grow by about 25 to 30 percent each year for the foreseeable future, and estimates that the health system will be responsible for about 15 petabytes in another five years.
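As a rough sanity check, a simple compound-growth calculation using the rates Franklin cites lands in the same range as his five-year estimate. The figures below come straight from the article; the calculation itself is only illustrative:

```python
# Rough compound-growth check on the storage projection cited in the article:
# about 4.7 PB today, growing 25 to 30 percent per year, over five years.
current_pb = 4.7
years = 5

for rate in (0.25, 0.30):
    projected = current_pb * (1 + rate) ** years
    print(f"{rate:.0%} annual growth -> {projected:.1f} PB after {years} years")

# Prints roughly 14.3 PB at 25% and 17.5 PB at 30%, bracketing the ~15 PB estimate.
```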
Franklin is optimistic that Intermountain will be able to meet those challenges, citing declining costs of storage disks and technology innovations. While Intermountain has explored the possibility of using the cloud, it has not moved in that direction for storage, he says. “Intermountain is pretty conservative,” he says. “We are focused on the patients and on protecting their data, so we are very conservative about moving data to the cloud.”
That said, the health system has embraced other technologies to help it manage its data storage effectively.
One way the health system has managed data effectively and controlled costs is by using multiple tiers of storage, making data available at the speed its use requires. Tiering is currently done manually: the characteristics of the data are assessed up front and the data is placed on the appropriate tier. The health system is exploring auto-tiering, which automatically places data on the appropriate media according to its availability needs and can save costs by moving data to lower-cost media when appropriate. Whether or not auto-tiering is implemented comes down to a matter of trust: “It’s a bit new, and when you are talking about needing to meet high availability of data, we are conservative,” Franklin says.
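Auto-tiering implementations vary by vendor, but the core idea can be sketched in a few lines. The tier names, age thresholds, and data sets below are illustrative assumptions, not details of any product Intermountain has evaluated:

```python
from dataclasses import dataclass

# Illustrative access-based tiering policy; the tier names and the
# 30/180-day thresholds are hypothetical, not vendor-specific settings.
@dataclass
class DataSet:
    name: str
    days_since_last_access: int

def choose_tier(ds: DataSet) -> str:
    if ds.days_since_last_access <= 30:
        return "performance tier (flash)"
    if ds.days_since_last_access <= 180:
        return "capacity tier (nearline disk)"
    return "archive tier (object or tape)"

for ds in (DataSet("imaging-current", 3),
           DataSet("labs-2019", 90),
           DataSet("claims-archive", 900)):
    print(f"{ds.name}: {choose_tier(ds)}")
```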
Intermountain has used storage virtualization technology for the past several years, which allows data to be moved without downtime. Because the virtualization engine abstracts the server from the specific type of storage, it also eliminates concerns about specific drivers, Franklin adds. The result has been “a lot of efficiency and uptime; and if it serves the right need at the lowest appropriate cost, then that works for you,” he says.
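The abstraction Franklin describes is essentially an indirection layer: servers address a stable logical volume while the backing array can change underneath it. The sketch below is a minimal illustration of that idea, with hypothetical volume and array names:

```python
# Minimal sketch of the indirection a storage virtualization engine provides:
# servers keep addressing a stable logical volume while the data is moved
# between physical arrays underneath it. Names here are hypothetical.
class VirtualVolume:
    def __init__(self, logical_id: str, backend_array: str):
        self.logical_id = logical_id        # what the server sees; never changes
        self.backend_array = backend_array  # physical array currently holding the data

    def migrate(self, new_array: str) -> None:
        # In a real engine the data is copied first, then the mapping flips;
        # the server keeps using logical_id, so there is no downtime.
        self.backend_array = new_array

vol = VirtualVolume("ehr-db-01", backend_array="tier1-array-a")
vol.migrate("tier2-array-b")  # e.g., shifting colder data to cheaper storage
print(vol.logical_id, "is now backed by", vol.backend_array)
```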
In addition, Franklin says that de-duplication technology has proved to be a valuable way to save storage space, providing the intelligence within the storage subsystem to detect duplicate data and store it only once. He says de-duplication has saved Intermountain about 40 percent of its storage space.
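De-duplication in enterprise storage is typically handled inside the storage subsystem at the block level, but the core mechanism (fingerprinting chunks and storing each unique chunk only once) can be sketched simply. The fixed 4 KB chunk size and the use of SHA-256 below are assumptions for illustration:

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size; real systems often chunk differently

def deduplicate(data: bytes) -> dict:
    """Keep each unique chunk once, keyed by its content hash."""
    store = {}
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # a duplicate chunk maps to an existing entry
    return store

sample = b"A" * 16384 + b"B" * 4096      # four identical chunks plus one unique chunk
unique = deduplicate(sample)
print(f"logical chunks: {len(sample) // CHUNK_SIZE}, chunks actually stored: {len(unique)}")
# Prints 5 logical chunks but only 2 stored -- the kind of saving de-duplication provides.
```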
Franklin says Intermountain’s IT department monitors its data use closely and is aware of the characteristics of its users and applications. This has allowed it to apply thin provisioning technology, which allocates less physical capacity in the storage subsystem than users have requested. “It’s a bet, and the intelligence of the subsystem protects us. We’ve been doing this for a while and have never had an issue,” he says.
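The “bet” Franklin describes is the accounting at the heart of thin provisioning: volumes are promised more capacity than the pool physically holds, and physical space is consumed only as data is actually written. The pool size and volume requests in this sketch are made-up numbers:

```python
# Minimal sketch of thin-provisioning accounting: volumes are promised more
# capacity than the pool physically holds, and space is consumed only on write.
# The 100 TB pool and 80 TB volume requests are made-up numbers.
class ThinPool:
    def __init__(self, physical_tb: float):
        self.physical_tb = physical_tb
        self.provisioned_tb = 0.0  # capacity promised to users
        self.consumed_tb = 0.0     # capacity actually written

    def provision_volume(self, requested_tb: float) -> None:
        self.provisioned_tb += requested_tb  # no physical space is reserved yet

    def write(self, tb: float) -> None:
        if self.consumed_tb + tb > self.physical_tb:
            raise RuntimeError("pool exhausted -- time to add physical capacity")
        self.consumed_tb += tb

pool = ThinPool(physical_tb=100)
pool.provision_volume(requested_tb=80)
pool.provision_volume(requested_tb=80)   # over-commit: 160 TB promised on 100 TB of disk
pool.write(tb=35)
print(f"promised {pool.provisioned_tb} TB, written {pool.consumed_tb} TB of {pool.physical_tb} TB physical")
```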
Intermountain has also embarked on initiatives to better interpret its data retention needs, evaluating how long data should be maintained and how many copies are needed. This involves examining policy requirements to understand what must be retained, and finding opportunities to save storage costs. “It comes down to interpretation on how long we need to retain it. If we have the discipline to go after that and define it, there is an opportunity, from a healthcare regulation perspective, to help with our growth,” he says.
Stay tuned for a feature on data storage that will appear in the October issue of Healthcare Informatics.