We’ve covered the powerful Hadoop open source platform many times, but Hadoop is a complex Big Data-crunching system that many people have a hard time understanding. Hortonworks, a spin-off of Yahoo that focuses on all things Hadoop, has an interesting case study posted that illustrates why Hadoop is becoming more important to so many organizations. The case study involves the UC Irvine Medical Center and how it uses Hadoop to contend with mountains of healthcare data.
The healthcare system, of course, sits at the center of the whole Big Data trend. Much healthcare data isn’t even digitized, and what is digitized often isn’t organized, searchable, or useful.
According to the U.C. Irvine case study:
"The Hadoop ecosystem is modular and within those modules lies the functionality to build algorithms for surveillance, detection and notification of conditions such as sepsis or the prediction of potential 30-day readmits. Other use cases we are working on include monitoring “Sink Time”, that is how much time caregivers spend washing their hands; patient throughput with the ability to capture actual hand off times; patient scorecards pushed to the patient portal and the ability to discover the unknown unknowns in our data."
Yes, it’s true. Hadoop is used to monitor how often healthcare providers wash their hands.
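To make the "Sink Time" idea concrete, here is a minimal sketch of the kind of aggregation such a system performs. The event format, caregiver IDs, and values are all hypothetical, and plain Python stands in for Hadoop's actual map/reduce machinery; the case study doesn't describe the real pipeline.

```python
from collections import defaultdict

# Hypothetical hand-hygiene ("sink time") events: (caregiver_id, seconds_at_sink).
# In a real deployment these would stream from sensors into the cluster;
# here we aggregate a small in-memory sample with the same map/reduce shape.
events = [
    ("rn_104", 22), ("md_317", 15), ("rn_104", 30),
    ("md_317", 8),  ("rn_220", 25),
]

def map_phase(records):
    """Emit one (key, value) pair per observed wash event."""
    for caregiver, seconds in records:
        yield caregiver, seconds

def reduce_phase(pairs):
    """Sum sink time per caregiver, mirroring a reducer's role."""
    totals = defaultdict(int)
    for caregiver, seconds in pairs:
        totals[caregiver] += seconds
    return dict(totals)

sink_time = reduce_phase(map_phase(events))
print(sink_time)  # {'rn_104': 52, 'md_317': 23, 'rn_220': 25}
```

The point is the shape of the computation: map each raw event to a keyed value, then reduce by key, which is exactly the pattern Hadoop parallelizes across a cluster.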
Here is an inventory of what U.C. Irvine tracks with Hadoop:
- all ancillary HL7 feeds (without the need for modification)
- EMR-generated data
- genomic data
- financial data
- RTLS data from assets
- patient and caregiver data
- smart pump data
- incremental physiological monitoring measurements (in one-minute increments)
- ventilator data in increments of one minute or less
- temperature and humidity data
That’s a lot! As Jon Buys noted in his recent Quick Overview of Hadoop:
"The problem Hadoop solves is how to store and process big data. When you need to store and process petabytes of information, the monolithic approach to computing no longer makes sense. Single disks are too slow, and the larger single machines needed to process the data effectively are prohibitively expensive to most companies building web scale applications. Hadoop uses commodity hardware, and of course open source software, to distribute the data across multiple machines, using their combined power and storage to overcome monolithic bottlenecks."
These benefits are specifically cited in the U.C. Irvine case study. The Medical Center identified Hadoop as a promising platform because it is open source and therefore cost-effective, and because Hadoop distributes both data and computation, avoiding the bottlenecks of a monolithic system.
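The split-and-combine idea Buys describes can be sketched in a few lines. This is a toy illustration, not Hadoop's API: the data is sharded across simulated "nodes", each node computes a partial result on only its own slice, and the partials are merged at the end, so no single machine ever needs the whole dataset.

```python
# Stand-in for a large stream of measurements (e.g. one-minute vitals readings).
readings = list(range(1, 101))

def shard(data, n_nodes):
    """Round-robin the records across n_nodes worker partitions."""
    partitions = [[] for _ in range(n_nodes)]
    for i, record in enumerate(data):
        partitions[i % n_nodes].append(record)
    return partitions

def node_partial_sum(partition):
    """Work each node performs locally on its own slice of the data."""
    return sum(partition)

partials = [node_partial_sum(p) for p in shard(readings, n_nodes=4)]
total = sum(partials)  # merge step, analogous to the reduce phase
print(total)  # 5050
```

Because each partition is processed independently, the partial sums could run on four separate commodity machines at once; the only coordination needed is the cheap final merge. That independence is what lets Hadoop scale out instead of up.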
You can find much more on Hadoop in action at U.C. Irvine here.