Publication
SONAR-SLT: Multilingual Sign Language Translation via Language-Agnostic Sentence Embedding Supervision
Yasser Hamidullah; Shakib Yazdani; Cennet Oguz; Cristina España Bonnet; Josef van Genabith
In: Proceedings of the Tenth Conference on Machine Translation. Conference on Machine Translation (WMT-2025), November 8-9, Suzhou, China, Association for Computational Linguistics, 2025.
Abstract
Sign language translation (SLT) is typically trained with text in a single spoken language, which limits scalability and cross-language generalization. Earlier approaches have replaced gloss supervision with text-based sentence embeddings, but so far these remain tied to a specific language and modality. In contrast, we employ language-agnostic, multimodal embeddings trained on text and speech from multiple languages to supervise SLT, enabling direct multilingual translation.
To address data scarcity, we propose a coupled augmentation method that combines multilingual target augmentation with video-level perturbations, improving model robustness. Experiments show consistent BLEURT gains over text-only embedding supervision, with larger improvements in low-resource settings. Our results demonstrate that language-agnostic embedding supervision, combined with coupled augmentation, provides a scalable and semantically robust alternative to traditional SLT training.
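The two ideas summarized above, supervising with a sentence embedding and coupling target-side with video-side augmentation, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss or the perturbations, so the cosine-distance loss, the frame dropping, the Gaussian noise, and all function and parameter names below are assumptions chosen for illustration.

```python
import math
import random


def cosine_loss(pred, target):
    """Cosine distance between a predicted and a target sentence embedding.

    Assumption: the abstract does not name the training loss; cosine
    distance is one common choice for embedding supervision.
    """
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / norm


def perturb_video(frames, rng, drop_prob=0.1, noise_std=0.01):
    """Hypothetical video-level perturbation: random frame drop plus noise.

    `frames` is a list of per-frame feature vectors (lists of floats).
    At least one frame is always kept.
    """
    kept = [f for f in frames if rng.random() >= drop_prob] or [frames[0]]
    return [[x + rng.gauss(0.0, noise_std) for x in f] for f in kept]


def coupled_augment(frames, target_embeddings, rng):
    """Pair one perturbed video with the embedding of one target language.

    `target_embeddings` maps a language code to the sentence embedding of
    the translation in that language (assumed to be precomputed).
    """
    lang = rng.choice(sorted(target_embeddings))
    return perturb_video(frames, rng), lang, target_embeddings[lang]
```

In such a setup, each training step would draw a fresh (perturbed video, target language) pair from the same source example, so the sign encoder sees varied inputs while being pulled toward embeddings that are meant to share one language-agnostic space.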