
Project | IMPRESS

Improving Embeddings with Semantic Knowledge

Virtually all NLP systems today use vector representations of words, known as word embeddings. Likewise, systems that process language combined with vision or other sensory modalities rely on multimodal embeddings. While embeddings do capture some form of semantic relatedness, the exact nature of that relatedness remains unclear, and this lack of precise semantic information can hurt downstream tasks.
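The point about relatedness being underspecified can be illustrated with a minimal sketch. The toy vectors and the `cosine` helper below are purely illustrative assumptions, not output of any real embedding model: cosine similarity tells us that two words point in similar directions, but not *which* semantic relation (synonymy, hypernymy, mere co-occurrence) that similarity reflects.

```python
import math

def cosine(u, v):
    # Cosine similarity: how closely two embedding directions align,
    # the usual proxy for semantic relatedness between words.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 4-dimensional "embeddings" (hand-picked values for illustration).
emb = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "dog": [0.8, 0.9, 0.2, 0.0],
    "car": [0.1, 0.0, 0.9, 0.8],
}

# "cat" is closer to "dog" than to "car", but the score alone does not
# say whether the relation is taxonomic, topical, or something else.
print(cosine(emb["cat"], emb["dog"]) > cosine(emb["cat"], emb["car"]))
```

Injecting explicit semantic and common-sense knowledge, as IMPRESS proposes, is one way to make such similarity scores reflect better-defined relations.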

The goals of IMPRESS are to investigate the integration of semantic and common-sense knowledge into linguistic and multimodal embeddings and to assess its impact on selected downstream tasks. IMPRESS will also develop open-source software and lexical resources, using video activity recognition as a practical testbed. Furthermore, while there is a growing body of NLP research on languages other than English, most research on multimodal embeddings still focuses on English. IMPRESS will therefore consider a multilingual extension of the developed methods to handle French, German and English.


  1. DFKI
  2. INRIA

Publications about the project

  1. Multilingual coreference resolution: Adapt and Generate

    Tatiana Anikina; Natalia Skachkova; Anna Mokhova

    In: Zdeněk Žabokrtský; Maciej Ogrodniczuk (eds.). Proceedings of the CRAC 2023 Shared Task on Multilingual Coreference Resolution. Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC-2023), located at EMNLP 2023, December 6-7, Singapore, Singapore, Pages 19-33, Association for Computational Linguistics, 12/2023.


BMBF - Federal Ministry of Education and Research
