Document Type

Conference Proceeding

Publication Date

2006

Abstract

We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein mentions in biological texts, and show that incorporating unlabeled data improves the performance of the supervised CRF in this case.
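The combined objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy linear-chain model whose per-position label scores (`emit`) and label-transition scores (`trans`) are already computed, a hypothetical trade-off weight `gamma` on the entropy term, and no parameter regularizer. Probabilities are obtained by brute-force enumeration of all label sequences, which is only feasible for very short sequences. A practical CRF would use dynamic programming instead.

```python
import itertools
import math


def sequence_score(emit, trans, labels):
    """Unnormalized log-score of one label sequence under a linear-chain model."""
    score = sum(emit[t][y] for t, y in enumerate(labels))
    score += sum(trans[a][b] for a, b in zip(labels, labels[1:]))
    return score


def label_distribution(emit, trans):
    """p(y | x) for every label sequence, by enumeration (toy sizes only)."""
    num_labels = len(trans)
    seqs = list(itertools.product(range(num_labels), repeat=len(emit)))
    scores = [sequence_score(emit, trans, s) for s in seqs]
    top = max(scores)  # subtract the max before exponentiating, for stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return {s: math.exp(sc - log_z) for s, sc in zip(seqs, scores)}


def conditional_entropy(emit, trans):
    """Entropy of p(y | x) for one unlabeled sequence."""
    dist = label_distribution(emit, trans)
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)


def semi_supervised_objective(labeled, unlabeled, trans, gamma):
    """Labeled conditional log-likelihood minus gamma times unlabeled entropy.

    `labeled` is a list of (emit, labels) pairs; `unlabeled` is a list of emit
    matrices. `gamma` is an assumed trade-off weight, not a value from the paper.
    """
    log_lik = sum(
        math.log(label_distribution(emit, trans)[tuple(y)]) for emit, y in labeled
    )
    entropy = sum(conditional_entropy(emit, trans) for emit in unlabeled)
    return log_lik - gamma * entropy
```

Maximizing this quantity rewards both fitting the labeled sequences and making confident predictions on the unlabeled ones. Because the entropy term makes the objective non-concave, an ascent procedure started from a supervised model can only be expected to reach a local optimum.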

Comments

This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, http://dx.doi.org/10.3115/1220175.1220202.

This paper was presented at the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, in Sydney, Australia, July 2006.

Repository Citation

Jiao, F., Wang, S., Lee, C., Greiner, R., & Schuurmans, D. (2006). Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling. Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, 209-216.
https://corescholar.libraries.wright.edu/knoesis/100

DOI

10.3115/1220175.1220202

Included in

Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, Science and Technology Studies Commons

Kno.e.sis Publications

Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling
