Deep Dependency-Oriented Analysis with Non-Discrete Constraints

The aim of this project is to improve the existing methodology for generic deep linguistic analysis, i.e. the syntactic and semantic analysis needed for many language technology applications. A dependency grammar model will be developed that extends the representations of successful data-driven dependency parsing schemes with additional elements of linguistic and cognitive sophistication, such as a typed feature system, explicit soft constraints, the use of both semantic and syntactic dependencies, and means for producing partial results incrementally.

The knowledge incorporated in the existing German HPSG grammar of the Lab and in the Stanford ERG will be imported into the planned fully lexicalized dependency grammar. This will be possible because of the rigorous and consistent use of the multiple-inheritance type hierarchy as the sole basis for all encoded linguistic knowledge. By redefining all lexical types, the existing lexicons will be automatically converted to the new format. The existing German and English HPSG grammars will also serve as a baseline for comparison.
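As a rough illustration of what a multiple-inheritance type hierarchy provides, the following minimal sketch stores each type with its immediate supertypes and checks subsumption by walking upward. All type names, the `PARENTS` table and the `LEXICON` entry are hypothetical and are not taken from the HPSG grammars or the ERG.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy: each type lists its immediate supertypes.
PARENTS: Dict[str, List[str]] = {
    "sign": [],
    "verb": ["sign"],
    "transitive": ["sign"],
    "finite": ["sign"],
    # A lexical type that inherits from several supertypes at once.
    "trans-fin-verb": ["verb", "transitive", "finite"],
}

# Hypothetical lexicon: each word form points to exactly one lexical type.
LEXICON: Dict[str, str] = {"sees": "trans-fin-verb"}


def supertypes(type_name: str) -> Set[str]:
    """Return the type itself plus all direct and inherited supertypes."""
    seen = {type_name}
    stack = [type_name]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general: str, specific: str) -> bool:
    """True if `general` is `specific` or one of its supertypes."""
    return general in supertypes(specific)
```

Under these assumptions, a lexicon entry only names its lexical type, so redefining that type in `PARENTS` changes what every entry of that type inherits without touching the entries themselves.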

Coverage of the grammars will be extended by learning from dependency banks, either native or converted from suitable treebanks. The lexicon will be extended by data-driven lexical-type prediction.

The parsing will be incremental and local (within a window of 3-5 words). The local decision making will be based on preferences learned from dependency banks. Several alternative parsing models will be implemented and tested, all of them influenced in spirit by the transition-based approaches to dependency parsing.
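The idea of incremental, window-local attachment can be sketched as follows. This is a simplified greedy illustration, not one of the project's parsing models: the function names, the `score` callback (standing in for preferences learned from a dependency bank) and the hand-written preference table are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterator, List


def _reaches(heads: Dict[int, int], start: int, target: int) -> bool:
    """Follow head links upward from `start`; True if `target` is reached."""
    node = start
    while node in heads:
        node = heads[node]
        if node == target:
            return True
    return False


def windowed_parse(
    words: List[str],
    score: Callable[[str, str], float],  # score(head_word, dependent_word)
    window: int = 4,
) -> Iterator[Dict[int, int]]:
    """Read words left to right and yield a partial analysis after each one.

    The analysis maps a dependent's index to its head's index. Attachments
    are only considered among the newest word and the `window - 1` words
    immediately before it.
    """
    heads: Dict[int, int] = {}
    for i in range(len(words)):
        lo = max(0, i - window + 1)
        # Step 1: earlier headless words in the window may take word i as head.
        for j in range(lo, i):
            if j not in heads and score(words[i], words[j]) > 0:
                heads[j] = i
        # Step 2: word i may attach to the best earlier word in the window,
        # skipping candidates that already depend on word i (cycle check).
        candidates = [j for j in range(lo, i) if not _reaches(heads, j, i)]
        if candidates:
            best = max(candidates, key=lambda j: score(words[j], words[i]))
            if score(words[best], words[i]) > 0:
                heads[i] = best
        yield dict(heads)  # snapshot of the partial result so far


# Hypothetical preference table standing in for learned preferences.
PREFS = {("dog", "the"): 1.0, ("barks", "dog"): 1.0}


def toy_score(head: str, dependent: str) -> float:
    return PREFS.get((head, dependent), 0.0)


partial_results = list(windowed_parse(["the", "dog", "barks"], toy_score))
```

Each yielded dictionary is a usable partial result, and words that fall outside the window can no longer be attached, which is the cost of keeping every decision local.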

Grammar development will distinguish between a strict, tightly regimented core grammar that can serve as the starting point for many applications and robust, application-specific extensions of this core grammar.

The new approach will be tested in two applications: (i) a diagnostic grammar checker for exercises and exams in computer-assisted language learning and (ii) information extraction of complex relations, including events and opinions.

Publications about the project

Andrea Moro, Roberto Navigli, Leonhard Hennig, Hans Uszkoreit

In: Journal of Web Semantics: Science, Services and Agents on the World Wide Web, Special Issue on Knowledge Graphs. Elsevier, 2016.

Andrea Moro, Roberto Navigli, Hong Li, Hans Uszkoreit

In: ICAART 2015 - Proceedings of the 7th International Conference on Agents and Artificial Intelligence. International Conference on Agents and Artificial Intelligence (ICAART-15), January 10-12, Lisbon, Portugal. SciTePress, 2015.

Leonhard Hennig, Hans Uszkoreit

In: 53rd Annual Meeting of the Association for Computational Linguistics. Annual Meeting of the Association for Computational Linguistics (ACL-15), July 27-30, Beijing, China. ACL, 2015.

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz