RASP-Boost: Confidential Boosting-Model Learning with Perturbed Data in the Cloud
Mining large datasets requires intensive computing resources and data mining expertise, which may be unavailable to many users. With cloud computing resources now widely available, data mining tasks can be moved to the cloud or outsourced to third parties to save costs. In this new paradigm, data and model confidentiality becomes the major concern for the data owner. Data owners have to understand the potential trade-offs among client-side costs, model quality, and confidentiality to justify outsourcing solutions. In this paper, we propose the RASP-Boost framework to address these problems in confidential cloud-based learning. The RASP-Boost approach works with our previously developed Random Space Data Perturbation (RASP) method to protect data confidentiality, and uses the boosting framework to overcome the difficulty of learning high-quality classifiers from RASP-perturbed data. We develop several cloud-client collaborative boosting algorithms. These algorithms require low client-side computation and communication costs, and the client does not need to stay online while the models are being learned. We have thoroughly studied the confidentiality of the data, the model, and the learning process under a practical security model. Experiments on public datasets show that the RASP-Boost approach can provide high-quality classifiers while preserving high data and model confidentiality and requiring low client-side costs.
& Guo, S. (2015). RASP-Boost: Confidential Boosting-Model Learning with Perturbed Data in the Cloud. IEEE Transactions on Cloud Computing, PP(99).