Constrained Classification on Structured Data


Most standard learning algorithms, such as Logistic Regression (LR) and the Support Vector Machine (SVM), are designed for i.i.d. (independent and identically distributed) data. They therefore do not work effectively on tasks that involve non-i.i.d. data, such as “region segmentation”. (For example, the “tumor vs. non-tumor” labels in a medical image are correlated, in that adjacent pixels typically have the same label.) This has motivated work on random fields, which has produced classifiers for such non-i.i.d. data that are significantly better than standard i.i.d.-based classifiers. However, these random field methods are often too slow to train on the tasks they were designed to solve. This paper presents a novel variant, Pseudo Conditional Random Fields (PCRFs), that is built on i.i.d. learners, which allows efficient training, but that also incorporates correlations, as random fields do. We demonstrate that this system is as accurate as other random field variants, but significantly faster to train.


Presented at the 23rd AAAI Conference on Artificial Intelligence, Chicago, IL, July 13-17, 2008.