Keke Chen (Advisor), Guozhu Dong (Committee Member), Mateen Rizki (Other), Thomas Wischgoll (Committee Member)
Master of Science in Computer Engineering (MSCE)
With the development and deployment of ubiquitous information sensing, mobile devices, wireless sensor networks, RFID readers, simulation, and computer-generated software logs, big data have become precious resources for scientific study, business intelligence, and national security. Visual cluster analysis, one of the most intuitive and effective analysis methods, remains a significant challenge for big datasets. First, existing visualization models need to be updated to process big data in parallel. Second, processing big data inevitably brings large latency, which conflicts with the requirement of interactivity. In this thesis, we develop the CloudVista framework to address the common problems with data reduction methods and the conflict between the latency caused by processing big data and the interactivity desired in visual cluster exploration. The framework has three components: (1) the visual frame data structure, together with the previously developed VISTA visualization model, for parallel processing; (2) the RandGen algorithm, which generates batches of meaningful visual frames; and (3) a workflow that minimizes the cost of big data processing. The CloudVista demonstration system is designed and implemented with web services and Hadoop/MapReduce, assuming the entire dataset is stored in the cloud. Finally, we present visualization results and performance evaluation results based on the demonstration system.
Department or Program
Department of Computer Science and Engineering
Year Degree Awarded
Copyright 2012, all rights reserved. This open access ETD is published by Wright State University and OhioLINK.