Publication Date

2015

Document Type

Dissertation

Committee Members

Keke Chen (Committee Member), Michael Raymer (Committee Member), Shaojun Wang (Advisor), Xinhui Zhang (Committee Member)

Degree Name

Doctor of Philosophy (PhD)

Abstract

In this thesis, we discuss two issues in the learning to rank area, choosing an effective objective loss function and constructing effective regression trees in the gradient boosting framework, as well as a third issue, applying learning to rank models to statistical machine translation. First, list-wise learning to rank methods either directly optimize performance measures or optimize surrogate functions with smaller gaps between the optimized loss and the performance measure, so it is generally believed that they should outperform point-wise and pair-wise learning to rank methods. However, in real-world applications, state-of-the-art practical learning to rank systems, such as MART and LambdaMART, do not come from the list-wise camp. One cause may be that several list-wise methods work well on the popular but very small LETOR datasets yet fail on the real-world datasets that are typically used for training practical systems. We propose a list-wise learning to rank method based on a list-wise surrogate function, the Plackett-Luce (PL) model. The PL model has a convex loss, which guarantees a global optimum, and is provably consistent with certain performance measures such as the NDCG score. In our experiments with the PL model, however, we observe that its performance is unstable: when the data has rich enough features it gives very good results, but on data with scarce features it fails badly. For example, when we apply the PL model with a linear scoring function on the Microsoft 30K dataset, its NDCG@1 score is 7.6 points worse than the average of several linear systems. This motivates our new ranking system, PLRank, which is suitable for any dataset through a mapping from feature space into tree space that provides more expressive power. PLRank is trained in the gradient boosting framework, and it is simple to implement.
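As an illustration only (not the dissertation's implementation), the Plackett-Luce negative log-likelihood for a single query can be sketched as follows; the function name and interface are hypothetical:

```python
import math

def plackett_luce_loss(scores, ranking):
    """Negative log-likelihood of a ranking under the Plackett-Luce model.

    scores  : model scores, one per document in the query
    ranking : document indices ordered from most to least relevant

    Hypothetical sketch: the PL model picks documents one at a time,
    each with probability softmax(score) over the documents remaining.
    The loss is convex in the scores, matching the abstract's claim.
    """
    loss = 0.0
    for i in range(len(ranking)):
        # log of the softmax normalizer over the documents not yet placed
        remaining = ranking[i:]
        log_z = math.log(sum(math.exp(scores[j]) for j in remaining))
        loss += log_z - scores[ranking[i]]
    return loss
```

Under this sketch, a ranking that agrees with the scores incurs a smaller loss than a reversed one, which is the behavior a list-wise surrogate is meant to reward.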
It has the same time complexity as LambdaMART, and runs slightly faster in practice. Moreover, we extend three other list-wise surrogate functions in the gradient boosting framework for a fair and full comparison, and we find that the PL model has particular advantages. Our experiments are conducted on the two largest publicly available real-world datasets, the Yahoo 2010 challenge and Microsoft 30K. The results show that this is the first time, at the single-model level, that a list-wise system matches or surpasses state-of-the-art point-wise and pair-wise ones, MART, LambdaMART, and McRank, on real-world datasets. Second, industry-level applications of learning to rank models have been dominated by the gradient boosting framework, which fits each tree using the least square error (SE) principle. Another tree-fitting principle, (robust) weighted least square error ((R)WSE), has been widely used in classification, as in LogitBoost and its variants, but has not been reformulated for learning to rank tasks. For both principles, there is a lack of deep analysis of their relationship in the learning to rank setting. Motivated by AdaBoost, we propose a new principle named least objective loss based error (OLE) that enables us to analyze several important learning to rank systems: \emph{we prove that (R)WSE is actually a special case of OLE for derivative additive loss functions, and that OLE, (R)WSE, and SE are equivalent for the MART system}. Under the guidance of the OLE principle, we implement three typical and strong systems and conduct experiments on two real-world datasets. The results show that our proposed OLE principle improves on most results obtained with SE. Third, margin infused relaxed algorithms (MIRAs) dominate model tuning in statistical machine translation when feature sets are large, but they are also notorious for their implementation complexity. We int...
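To make the contrast between the SE and (R)WSE tree-fitting principles concrete, here is a minimal sketch of the two leaf-value rules, assuming a standard gradient-boosting setup; the function names are hypothetical and this is not the dissertation's code. SE fits a leaf to the mean of the first-order residuals, while the weighted variant takes a Newton-style step, weighting gradients by second-order terms:

```python
def leaf_value_se(residuals):
    """Least square error (SE): the leaf value minimizing
    sum_i (r_i - v)^2 is simply the mean residual."""
    return sum(residuals) / len(residuals)

def leaf_value_wse(gradients, hessians):
    """Weighted least square error (WSE), sketched as a Newton step:
    the leaf value minimizing sum_i h_i * (-g_i/h_i - v)^2 is
    -sum(g) / sum(h). The small floor guards a degenerate leaf."""
    return -sum(gradients) / max(sum(hessians), 1e-12)
```

When all second-order terms are equal, the two rules coincide up to a constant factor, which is consistent in spirit with the abstract's claim that the principles become equivalent for MART.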

Page Count

103

Department or Program

Department of Computer Science and Engineering

