We will present the state of the art in intelligent information
extraction (IE). The lecture is divided into four major topics:
introduction, core technologies, machine learning (ML) methods and
applications.
We start with a historical overview and explain the different tasks and
evaluation methods of IE (e.g., template filling, domain ontologies).
We summarize the core IE
functionality by contrasting rule-based and corpus-based system design.
This will also cover advanced NLP aspects such as the integration of
shallow and deep processing. Second, participants will be confronted
with major IE challenges with respect to domain adaptivity (e.g.,
portability) and multi-linguality.
Consequently, we then focus on advanced ML methods for the different IE
tasks under various dimensions (supervised, unsupervised,
multi-lingual). Finally, we present different exciting applications
that embed IE as a major component, viz. open-domain question
answering, text summarization, text data mining, and Semantic Web
services.
H. Cunningham, D. Maynard, K. Bontcheva, V. Tablan. GATE: A
Framework and Graphical Development Environment for Robust NLP Tools
and Applications. Proceedings of the 40th Anniversary Meeting of
the Association for Computational Linguistics
(ACL'02). Philadelphia, July 2002.
Information Extraction Technologies for German Texts
Berthold Crysmann, Anette Frank, Bernd Kiefer, Stefan Müller,
Günter Neumann, Jakub Piskorski, Ulrich Schäfer, Melanie Siegel,
Hans Uszkoreit, Feiyu Xu, Markus Becker and Hans-Ulrich Krieger. An
integrated architecture for shallow and deep processing. In
Proceedings of ACL-2002, Association for Computational Linguistics
40th Anniversary Meeting, University of Pennsylvania, Philadelphia,
July 2002.
Machine Learning for Named Entity Recognition
D. Bikel, S. Miller, R. Schwartz, and R. Weischedel. Nymble: a
high-performance learning name-finder. In Proceedings of the Fifth
Conference on Applied Natural Language Processing, pages 194-201,
Washington, D.C., 1997.
M. Collins and Y. Singer. Unsupervised
models for named entity classification. In Proceedings of the
Joint SIGDAT Conference on Empirical Methods in Natural Language
Processing and Very Large Corpora, 1999.