DFKI-LT - DeepBank: A Dynamically Annotated Treebank of the Wall Street Journal

Daniel Flickinger, Yi Zhang, Valia Kordoni
DeepBank: A Dynamically Annotated Treebank of the Wall Street Journal
Proceedings of the Eleventh International Workshop on Treebanks and Linguistic Theories, Pages 85-96, Lisbon, Portugal, Edições Colibri, Lisbon, 2012
 
This paper describes a large ongoing effort, nearing completion, to annotate the text of all 25 Wall Street Journal sections included in the Penn Treebank, using a hand-written broad-coverage grammar of English, manual disambiguation, and a PCFG approximation for the sentences not yet successfully analyzed by the grammar. These grammar-based annotations are linguistically rich, including both fine-grained syntactic structures grounded in the Head-driven Phrase Structure Grammar framework and logically sound semantic representations expressed in Minimal Recursion Semantics. The linguistic depth of these annotations on a large and familiar corpus should enable a variety of NLP-related tasks, including more direct comparison of grammars and parsers across frameworks, identification of sentences exhibiting linguistically interesting phenomena, and training of more accurate robust parsers and parse-ranking models that will also perform well on texts in other domains.
 