Apache Hive-Based Big Data Analysis of HealthCare

The Electronic Health Record (EHR) stores valuable patient information
in digital form. The amount of data in the EHR is increasing
due to government mandates and technological innovation. Patient data
are recorded using sensors and medical reports. Given huge amounts of
heterogeneous data in the EHR, there is a need for effective methods to
store and analyze these data for meaningful interpretation. This study
focuses on techniques for analyzing big data in the EHR and retrieving
the required information from it. Multiple Hive queries are run against
data stored in the Hadoop Distributed File System (HDFS) to extract
valuable information. The study also proposes and demonstrates the use
of Tableau as a data analysis tool for presenting that information
in the form of visual graphs.

Zhenlin Kan, Xinru Cheng, Seung Hyun Kim, Yuting Jin