
Publication

Boosting Object Representation Learning via Motion and Object Continuity

Quentin Delfosse; Wolfgang Stammer; Thomas Rothenbacher; Dwarak Vittal; Kristian Kersting
In: Danai Koutra; Claudia Plant; Manuel Gomez Rodriguez; Elena Baralis; Francesco Bonchi (Eds.). Machine Learning and Knowledge Discovery in Databases: Research Track - European Conference, ECML PKDD 2023, Turin, Italy, September 18-22, 2023, Proceedings, Part IV. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Pages 610-628, Lecture Notes in Computer Science, Vol. 14172, Springer, 2023.

Abstract

Recent unsupervised multi-object detection models have shown impressive performance improvements, largely attributed to novel architectural inductive biases. Unfortunately, despite their good object localization and segmentation capabilities, their object encodings may still be suboptimal for downstream reasoning tasks, such as reinforcement learning. To overcome this, we propose to exploit object motion and continuity (objects do not pop in and out of existence). This is accomplished through two mechanisms: (i) providing temporal loss-based priors on object locations, and (ii) a contrastive object continuity loss across consecutive frames. Rather than developing an explicit deep architecture, the resulting unsupervised Motion and Object Continuity (MOC) training scheme can be instantiated using any object detection model baseline. Our results show large improvements in the performances of variational and slot-based models in terms of object discovery, convergence speed and overall latent object representations, particularly for playing Atari games. Overall, we show clear benefits of integrating motion and object continuity for downstream reasoning tasks, moving beyond object representation learning based only on reconstruction as well as evaluation based only on instance segmentation quality.
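
To illustrate the idea behind mechanism (ii), the following is a minimal sketch of a contrastive continuity loss over object encodings from two consecutive frames. It is a generic InfoNCE-style formulation written in NumPy, not the loss defined in the paper. The function name `continuity_contrastive_loss`, the `temperature` parameter, and the assumption that row i of both arrays encodes the same object are all hypothetical choices made for this example.

```python
import numpy as np

def continuity_contrastive_loss(z_t, z_next, temperature=0.1):
    """Generic InfoNCE-style loss (illustrative, not the paper's formulation).

    z_t, z_next: arrays of shape (num_objects, dim) holding object encodings
    for frames t and t+1. Assumption: row i in both arrays is the same object.
    """
    # L2-normalise so the dot product below is a cosine similarity.
    z_t = z_t / np.linalg.norm(z_t, axis=1, keepdims=True)
    z_next = z_next / np.linalg.norm(z_next, axis=1, keepdims=True)

    # Pairwise similarities between every object at t and every object at t+1.
    logits = z_t @ z_next.T / temperature

    # Numerically stable log-softmax over the candidates in frame t+1.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for object i is (i, i); all other pairs act as negatives.
    return float(-np.mean(np.diag(log_probs)))

# Toy encodings: four objects with mutually orthogonal codes.
codes = np.eye(4)
aligned_loss = continuity_contrastive_loss(codes, codes)
shuffled_loss = continuity_contrastive_loss(codes, codes[[1, 2, 3, 0]])
```

In this toy setup the loss is small when each object keeps its encoding across frames and large when the encodings are permuted, which is the behaviour a continuity objective is meant to encourage.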

Further Links