Real-Time Discovery and Geospatial Visualization of Mobility and Industry Events from Large-Scale, Heterogeneous Data Streams

Leonhard Hennig, Philippe Thomas, Renlong Ai, Johannes Kirschnick, He Wang, Jakob Pannier, Nora Zimmermann, Sven Schmeier, Feiyu Xu, Jan Ostwald, Hans Uszkoreit

In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-16), August 7-12, Berlin, Germany. ACL, 2016.


Monitoring mobility- and industry-relevant events is important in areas such as personal travel planning and supply chain management, but extracting events pertaining to specific companies, transit routes and locations from heterogeneous, high-volume text streams remains a significant challenge. We present Spree, a scalable system for real-time, automatic event extraction from social media, news and domain-specific RSS feeds. Our system is tailored to a range of mobility- and industry-related events, and processes German texts within a distributed linguistic analysis pipeline implemented in Apache Flink. The pipeline detects and disambiguates highly ambiguous domain-relevant entities, such as street names, and extracts various events with their geo-locations. Event streams are visualized on a dynamic, interactive map for monitoring and analysis.
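
The abstract describes a pipeline that detects ambiguous entities such as street names, disambiguates them, and emits events with geo-locations. The following is a minimal, single-process Python sketch of that general idea, not the Spree implementation or its Apache Flink code. The gazetteer entries, approximate coordinates, trigger words, and all function and class names are hypothetical and chosen only for illustration; disambiguation here is a naive check for a city name in the same message.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

# Hypothetical gazetteer: one street name maps to candidates in several cities.
# Coordinates are rough, illustrative values.
GAZETTEER = {
    "Hauptstraße": [
        {"city": "Berlin", "lat": 52.48, "lon": 13.35},
        {"city": "Heidelberg", "lat": 49.41, "lon": 8.71},
    ],
}

# Hypothetical German trigger words mapped to event types.
TRIGGERS = {"Stau": "traffic_jam", "Streik": "strike", "Sperrung": "road_closure"}


@dataclass
class Event:
    event_type: str
    street: str
    city: str
    lat: float
    lon: float


def disambiguate(street: str, text: str) -> Optional[dict]:
    """Return the first candidate whose city is mentioned in the text, if any."""
    for candidate in GAZETTEER.get(street, []):
        if candidate["city"] in text:
            return candidate
    return None


def extract_events(messages: Iterable[str]) -> Iterator[Event]:
    """Process messages one at a time, as a streaming operator would."""
    for text in messages:
        for trigger, event_type in TRIGGERS.items():
            if trigger not in text:
                continue
            for street in GAZETTEER:
                if street not in text:
                    continue
                place = disambiguate(street, text)
                # Drop mentions that cannot be resolved to a single location.
                if place is not None:
                    yield Event(event_type, street, place["city"],
                                place["lat"], place["lon"])
```

Because `extract_events` is a generator over an iterable of messages, it can be fed from any stream source; in a distributed setting each stage (trigger matching, entity lookup, disambiguation) would typically become its own operator.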


hennig_acl_2016.pdf (pdf, 712 KB)

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz