CloudVista: Interactive and Economical Visual Cluster Analysis for Big Data in the Cloud

Document Type


Publication Date



Analysis of big data has become an important problem for many business and scientific applications, among which clustering and visualizing clusters in big data raise some unique challenges. This demonstration presents the CloudVista prototype system to address the problems with big data caused by using existing data reduction approaches. It promotes a whole-big-data visualization approach that preserves the details of clustering structure. The prototype system has several merits. (1) Its visualization model is naturally parallel, which guarantees the scalability. (2) The visual frame structure minimizes the data transferred between the cloud and the client. (3) The RandGen algorithm is used to achieve a good balance between interactivity and batch processing. (4) This approach is also designed to minimize the financial cost of interactive exploration in the cloud. The demonstration will highlight the problems with existing approaches and show the advantages of the CloudVista approach. The viewers will have the chance to play with the CloudVista prototype system and compare the visualization results generated with different approaches.