Travis Doom (Committee Member), Jack Jean (Committee Member), Meilin Liu (Advisor)
Master of Science in Computer Engineering (MSCE)
MapReduce is a programming model proposed by Google for processing large datasets. It is efficient and applicable in many areas, such as social networks, scientific research, and electronic business. As a result, MapReduce frameworks have been implemented on a growing number of platforms, including Phoenix (multicore CPU), MapCG (GPU), and StreamMR (GPU). However, these frameworks have limitations: they do not handle collisions in the map phase or unbalanced workloads in the reduce phase. To improve the performance of MapReduce on GPGPUs, this thesis proposes and develops B_MapCG, a workload-balanced MapReduce framework for GPUs based on MapCG. B_MapCG reduces the number of collisions that occur when key-value pairs are inserted in the map phase and addresses the unbalanced workload in the reduce phase. The framework is evaluated on a Tesla K40 GPU with four benchmarks and eight datasets. The experimental results show that B_MapCG achieves substantial performance improvements over MapCG on all four benchmarks, in both the map phase and the reduce phase.
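To make the reduce-phase imbalance concrete, the following is a minimal CPU-side sketch in Python, not the B_MapCG implementation. The abstract does not describe the balancing algorithm, so the greedy largest-first assignment here is an assumed stand-in heuristic, and all function names (`map_phase`, `group_by_key`, `assign_round_robin`, `assign_balanced`, `reduce_phase`) are hypothetical. It runs a word count and compares how many values each of two workers would reduce under a naive and a load-aware key assignment.

```python
from collections import defaultdict


def map_phase(documents):
    """Emit one (word, 1) pair per word, as in a word-count map function."""
    pairs = []
    for doc in documents:
        for word in doc.split():
            pairs.append((word, 1))
    return pairs


def group_by_key(pairs):
    """Collect all values that share a key (the shuffle/group step)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def assign_round_robin(groups, num_workers):
    """Naive assignment: keys go to workers in turn, ignoring group size.

    Returns the number of values each worker must reduce.
    """
    loads = [0] * num_workers
    for i, key in enumerate(sorted(groups)):
        loads[i % num_workers] += len(groups[key])
    return loads


def assign_balanced(groups, num_workers):
    """Assumed heuristic (not from the thesis): give the largest remaining
    group to the currently least-loaded worker."""
    loads = [0] * num_workers
    for key in sorted(groups, key=lambda k: len(groups[k]), reverse=True):
        target = loads.index(min(loads))
        loads[target] += len(groups[key])
    return loads


def reduce_phase(groups):
    """Sum the values of each key."""
    return {key: sum(values) for key, values in groups.items()}


documents = ["a a a c c b", "a a c c d"]
groups = group_by_key(map_phase(documents))
counts = reduce_phase(groups)

print(counts)                         # {'a': 5, 'c': 4, 'b': 1, 'd': 1}
print(assign_round_robin(groups, 2))  # [9, 2]
print(assign_balanced(groups, 2))     # [6, 5]
```

With skewed key frequencies, the naive assignment leaves one worker with 9 of the 11 values, while the load-aware assignment splits them 6 and 5. On a GPU the same skew would leave some threads idle while others process long value lists, which is the kind of imbalance the abstract refers to.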
Department or Program
Department of Computer Science and Engineering
Copyright, all rights reserved. My ETD will be available under the "Fair Use" terms of copyright law.