Adaptation of Relation Extraction Rules to New Domains

Feiyu Xu, Hans Uszkoreit, Hong Li, Niko Felger

In: Proceedings of the Poster Session of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), 5/2008.


This paper presents strategies for improving the extraction performance for less prominent relations with the help of rules learned for similar relations for which a large amount of data with suitable properties is available. The rules are learned by DARE, a minimally supervised machine learning system. Starting from a small set of semantic seed examples, DARE extracts linguistic patterns labeled with semantic roles from parsed news texts. A performance analysis across different experimental domains shows that data properties play an important role for DARE: data redundancy and the connectivity between instances and pattern rules strongly influence recall. However, most real-world data sets do not exhibit the small-world property. We therefore propose three strategies for overcoming unfavorable data properties in a domain by exploiting a similar domain with favorable properties. The first two strategies stay within the same corpus and use the learned rules to extract new, similar relations; the third adapts the learned rules to a new corpus. All three strategies show that a popular relation with favorable data properties can help less popular or less frequently mentioned relations with poorer data resources.
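The seed-based bootstrapping that DARE builds on can be illustrated with a minimal sketch: known relation instances (seeds) are matched against a corpus to induce patterns, and the patterns are then applied to harvest new instances, which feed the next round. The toy corpus, the slot tokens `<ARG1>`/`<ARG2>`, and the string-level pattern induction below are illustrative assumptions; DARE itself operates over parsed sentences and learns grammar rules labeled with semantic roles, not flat string templates.

```python
import re

def extract_pattern(sentence, seed):
    """Induce a crude textual pattern by replacing the seed's
    arguments with slot tokens (illustrative stand-in for DARE's
    syntactically grounded rule extraction)."""
    a, b = seed
    if a in sentence and b in sentence:
        return sentence.replace(a, "<ARG1>").replace(b, "<ARG2>")
    return None

def match_pattern(sentence, pattern):
    """Instantiate a pattern against a sentence; return the
    extracted argument pair, or None if it does not match."""
    regex = (re.escape(pattern)
             .replace(re.escape("<ARG1>"), r"(\w+)")
             .replace(re.escape("<ARG2>"), r"(\w+)"))
    m = re.fullmatch(regex, sentence)
    return (m.group(1), m.group(2)) if m else None

def bootstrap(corpus, seeds, rounds=2):
    """Alternate between learning patterns from known instances
    and extracting new instances with the learned patterns."""
    instances = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # Learn patterns from sentences mentioning known instances.
        for sent in corpus:
            for inst in list(instances):
                p = extract_pattern(sent, inst)
                if p:
                    patterns.add(p)
        # Apply learned patterns to harvest new instances.
        for sent in corpus:
            for p in patterns:
                hit = match_pattern(sent, p)
                if hit:
                    instances.add(hit)
    return instances, patterns

corpus = [
    "Curie won Nobel",
    "Einstein won Nobel",
    "Einstein received Nobel",
    "Bohr received Nobel",
]
instances, patterns = bootstrap(corpus, {("Curie", "Nobel")})
```

In this toy run the single seed yields the pattern `<ARG1> won <ARG2>`, which finds the Einstein instance; that instance in turn yields `<ARG1> received <ARG2>` in the next round, which finds the Bohr instance. This duality between instances and patterns is exactly where the connectivity and redundancy properties discussed in the abstract become decisive: sparse domains break the chain early.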



German Research Center for Artificial Intelligence