TrendMiner

Project

Large-scale, Cross-lingual Trend Mining and Summarisation of Real-time Media Streams

Duration:
11/01/2011 - 10/31/2014

The recent massive growth in online media and the rise of user-authored content (e.g. weblogs, Twitter, Facebook) have led to challenges in how to access and interpret these strongly multilingual data in a timely, efficient, and affordable manner.

Scientifically, streaming online media pose new challenges, due to their shorter, noisier, and more colloquial nature. Moreover, they form a temporal stream strongly grounded in events and context. Consequently, existing language technologies fall short on accuracy, scalability and portability.

The goal of this project is to deliver innovative, portable open-source real-time methods for cross-lingual mining and summarisation of large-scale stream media.

TrendMiner will achieve this through an inter-disciplinary approach, combining deep linguistic methods from text processing, knowledge-based reasoning from web science, machine learning, economics, and political science.

No expensive human-annotated data will be required, because time-series data (e.g. financial markets, political polls) will be used as a proxy. A key novelty will be weakly supervised machine learning algorithms for automatic discovery of new trends and correlations. Scalability and affordability will be addressed through a cloud-based infrastructure for real-time text mining from stream media.

Results will be validated in two high-profile case studies: financial decision support (with analysts, traders, regulators, and economists) and political analysis and monitoring (with politicians, economists, and political journalists).

The techniques will be generic with many business applications: business intelligence, customer relations management, community support. The project will also benefit society and ordinary citizens by enabling enhanced access to government data archives, summarisation of online health information, and tracking of hot societal issues.

Partners

Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, Germany (Coordinator)
The University of Sheffield, United Kingdom
Ontotext AD, Bulgaria
University of Southampton, United Kingdom
Stichting Internet Memory Foundation, The Netherlands
Eurokleis S.R.L., Italy
Sora Ogris & Hofinger GmbH, Austria
Hardik Fintrade Pvt Ltd., India

Keyfacts

Involved research areas
Speech and Language Technology,
Multilinguality and Language Technology
Website
http://www.trendminer-project.eu

Publications about the project

From Tale to Speech: Ontology-based Emotion and Dialogue Annotation of Fairy Tales with a TTS Output
Christian Eisenreich; Jana Ott; Tonio Süßdorf; Christian Willms; Thierry Declerck
In: Proceedings of ISWC 2014. International Semantic Web Conference (ISWC-14), 13th, October 19-23, Riva del Garda, Italy, Springer, 10/2014.

A Cross-Lingual Correcting and Completive Method for Multilingual Ontology Labels
Dagmar Gromann; Thierry Declerck
In: Paul Buitelaar; Philipp Cimiano. Towards the Multilingual Semantic Web. Pages 227-242, ISBN 978-3-662-43584-7, Springer, Heidelberg, New York, Dordrecht, London, 9/2014.

TrendMiner: Large-scale Cross-lingual Trend Mining Summarization of Real-time Media Streams
Paloma Martínez; Isabel Segura; Thierry Declerck; José L. Martínez
In: Proceedings of the XXX Conference of the Spanish Society for Natural Language Processing. Conference of the Spanish Society for Natural Language Processing (SEPLN-14), September 16-19, Girona, Spain, SEPL, 9/2014.

Sponsors

EU - European Union