Turkish Sign Language Recognition Through Self-Supervised Learning Features and Heatmap
Onur Şero, Oğulcan Özdemir, Lale Akarun
2025 33rd Signal Processing and Communications Applications Conference (SIU)
Abstract
Sign Language Recognition (SLR) poses a significant challenge due to its complexity and variability. Sign languages (SLs) rely on hand gestures, upper body movements, and facial expressions to convey meaning. Deep learning models have been applied effectively in the field of SLR. In this study, we propose a model that combines Self-Supervised Learning (SSL) features with heatmap-based features to recognize isolated Turkish Sign Language videos from the BosphorusSign22k dataset. The SSL approach DINO is used to extract features of hand gestures and facial expressions, whereas the heatmap-based representation captures upper body movement. While hand gestures carry most of the sign information, adding facial expressions and upper body posture significantly improved recognition performance. In our experiments, we observed that comparable or even better representations can be extracted from sign language videos using generic SSL approaches alone, without resorting to hand- and face-specific methods.
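The heatmap-based representation of upper body movement mentioned above is commonly built by rendering each detected body keypoint as a small 2D Gaussian. The sketch below illustrates that general technique in pure Python; the function name, grid size, and sigma are illustrative assumptions, not the paper's exact implementation.

```python
import math

def keypoint_heatmap(cx, cy, size=8, sigma=1.5):
    """Render one body keypoint (cx, cy) as a 2D Gaussian heatmap on a
    size x size grid. Names and parameters are illustrative assumptions,
    not the paper's exact configuration."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]

# Example: a shoulder keypoint at grid cell (3, 2). In practice, one such
# map per joint would be stacked per video frame and fed to the recognizer
# alongside the SSL features.
hm = keypoint_heatmap(3, 2)
# The response peaks (value 1.0) at the keypoint and decays with distance.
```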