Coping With a Prolonged EHR Downtime

Sept. 28, 2018

How ready is your health system to survive a prolonged EHR downtime? Do your clinicians have alternate ways to record their prescriptions, lab orders and progress notes? One hospital learned several lessons about its preparedness in 2015, and has since redoubled its efforts to have procedures in place to cope in the case of a similar incident in the future.

On Friday, March 20, 2015, the EHR system at Boston Children's Hospital crashed and remained down until Wednesday, March 25. Jonathan Hron, M.D., a pediatric hospitalist and physician lead for inpatient informatics, recalled the valiant efforts of clinicians to innovate and cope without the health IT systems they relied upon. “There were a lot of broken processes,” he said. “Fax machines were unable to keep up with the volume of orders. We learned a lot about our gaps during this downtime.”

During a recent talk in the Clinical Informatics Lecture Series put on by the Department of Biomedical Informatics at Harvard Medical School, Dr. Hron described the timeline of events and some lessons learned.

Early Friday morning, a flash drive in a storage array in the data center failed. Two technicians went to the Needham, Mass., data center to replace the failed drive, but mistakenly replaced the wrong one, he said. That led to a cascade of events. The mistake caused some data corruption in a database. They were able to briefly get the system up and running again, but the corruption was copied onto the redundant database, and by Saturday morning it was clear that the redundant copy was unusable as well. Over the next three or four days they attempted to recover the system, working with three different vendors to fix the problem. They finally realized they had to restore the system from an outdated backup. “We rebuilt the system with that backup copy,” Hron recalled. By Tuesday evening the system was restored, and testing began to make sure it was functioning properly. Following a staged system re-entry across the enterprise, by Wednesday evening all users were back online. “We have been through several downtimes of four or six hours before,” he said, “but four or five days is just a whole different ball game. You really have to change the way you approach it.”

Hron then described some of the impact. The lab ordering and documentation systems were completely down. “Over the course of that four to five days we saw a deficit of 40,000 to 50,000 lab orders and a deficit of around 11,000 electronic notes,” he said. “We had our friends at Brigham & Women’s Hospital running labs for urgent things.” A lot of ambulatory providers deferred labs until the system was back online.

They missed about 60,000 medication orders that had to be back-documented later, Hron said. “These are big numbers. All of that had to be done on paper.”

Clinicians were forced to write orders on paper and fax them over to the lab; the lab had to read their handwriting, and fax results back to them.

“There is a reliance on technology that is hard to quantify,” Hron said, “but when you ask a new intern to write out prescription orders, they don’t have a framework in their minds about how to do that. They are used to going to an admission order set that tells you what to do with required fields for things like frequency and refills. There was a lot of reminding that had to happen. It was one of the times our more senior clinicians felt very useful because they could help out the younger ones.”

The emergency management team put together a very detailed outline of what happened and why it happened, and developed a list of 80 action items across 10 domains that the hospital had to follow up on and do better in the future.

The post-mortem identified a lot of broken and outdated processes and a lot of challenges with communications. Fax machines just couldn’t keep up with the volume, partly because these systems have been electronic for so long. “Our command center was overcrowded, and it was hard to hear people trying to communicate with satellite providers and everybody on the floor,” he said. “It was at times chaotic.”

Hron identified five “morals” of the downtime story at Boston Children’s in 2015.  

Moral No. 1: Nobody is safe from downtime events like this. He noted that soon after the downtime at Boston Children’s, United Airlines grounded flights due to IT issues. Corporate America has learned this in spades. Healthcare has been faced with some of the same struggles. This was predictable. The expansion of EHR implementations across the country was clearly going to increase the number of safety events. He quoted Prof. Dean Sittig as saying the more health IT you have, the more prepared you have to be. It is just an inevitability that these systems are going to go down.

Moral No. 2: Downtime is a dynamic process, not a set-it-and-forget-it sort of thing. Hron referenced the value of ONC-sponsored SAFER guides. (SAFER stands for Safety Assurance Factors for EHR Resilience.) “There has to be a process for reviewing these downtime policies over time, and your system is not stagnant,” he said. “Your organization is growing and you are implementing new technology. You need to update your downtime procedures to match your needs.” Basic questions include: When do you call a downtime? Who is in charge? How will everyone be notified?

Moral No. 3: Downtime procedures must be scalable. “Our lab system has been electronic since the 1980s,” Hron said. “There are areas where the volume of work our pharmacy and lab are doing [is] so much more than they were on paper. Because they have been computerized so long, there is that loss of institutional knowledge of how to manage it.”

Moral No. 4: Don’t be afraid to innovate. In a different, shorter downtime, the hospital was piloting a HIPAA-compliant text-messaging system and that turned out to be a good time to give that system a try with everybody. Another example of innovation from the emergency department involved patient tracking during the downtime. They opened an Excel spreadsheet and an Adobe Connect session that everyone could sign into. It mimicked the patient tracking system. “This was a nice example of innovating in the moment,” Hron said.

Moral No. 5: Learn from your mistakes. “We have seen improvement in our planned downtimes over the past few years,” Hron said. Boston Children’s now has downtime code cards on each unit so everyone has the same paperwork. They worked on enhancing their downtime procedures and downtime website. They created instructions for physicians, nurses and others, with links to the forms they need. They created specific recommendations around documentation and what a paper order should look like. In addition, the hospital now holds a daily operational briefing in which the chief operating officer sits down with representatives from every area in the hospital to talk about any concern about safety events. The hospital now has a dashboard on system availability and tracks it over time, and in July 2017 moved to remote hosting.

Boston Children’s has continued its effort toward becoming a “high-reliability” organization. “It is an enterprise-wide commitment,” Hron said, “to embrace failure and learn from mistakes.”
