Analysis of institutional authors

Gil-Martín M. (Author); Martín-Fernández I. (Author); Esteban-Romero S. (Author)

Proceedings Paper

Hand Gesture Recognition Using MediaPipe Landmarks and Deep Learning Networks

Published in: International Conference on Agents and Artificial Intelligence, vol. 3, pp. 24-30 (2025-01-01). DOI: 10.5220/0013053500003890

Authors: Gil-Martín M; Marini MR; Martín-Fernández I; Esteban-Romero S; Cinque L

Affiliations

Abstract

Advanced Human-Computer Interaction techniques are commonly used in multiple application areas, from entertainment to rehabilitation. In this context, this paper proposes a framework to recognize hand gestures using a limited number of landmarks extracted from video images. The hand gesture recognition system comprises an image processing module, which extracts and processes the coordinates of 21 hand points called landmarks, and a deep neural network module, which models and classifies the hand gestures. The landmarks are extracted automatically with the MediaPipe software. The experiments were carried out on the IPN Hand dataset in a user-independent scenario using Subject-Wise Cross Validation. They cover different landmark-based formats, normalizations, lengths of the gesture representation, and numbers of landmarks used as inputs. The system obtains significantly better accuracy when using the raw coordinates of the 21 landmarks over 125 timesteps with a light Recurrent Neural Network architecture (80.56 ± 1.19 %) or the hand anthropometric measures (82.20 ± 1.15 %) than when using the speed of the hand landmarks through the gesture (72.93 ± 1.34 %). The study also examined the effect of different landmark-based normalizations of the raw coordinates, obtaining an accuracy of 83.67 ± 1.12 % when using the wrist landmark of each frame as the reference, and 84.66 ± 1.09 % when using the wrist landmark of the first video frame of the current gesture. In addition, the proposed solution provided high recognition performance even when using only the coordinates of 6 (82.15 ± 1.16 %) or 4 (81.46 ± 1.17 %) specific hand landmarks, with the wrist landmark of the first video frame of the current gesture as the reference.
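The landmark representations compared in the abstract can be illustrated with a minimal sketch. It assumes each gesture is already available as a list of frames, each holding 21 (x, y, z) landmark tuples with the wrist at index 0, as in MediaPipe's hand model. The function names are hypothetical. The zero-padding strategy is also an assumption, since the abstract does not say how gestures are brought to 125 timesteps. This is not the authors' implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) of one landmark
Frame = List[Point]                 # 21 landmarks per frame
WRIST = 0                           # wrist index in MediaPipe's hand model


def normalize_to_wrist(gesture: List[Frame], per_frame: bool) -> List[Frame]:
    """Express every landmark relative to a wrist position.

    per_frame=True  subtracts the wrist of the same frame.
    per_frame=False subtracts the wrist of the gesture's first frame.
    """
    first_wrist = gesture[0][WRIST]
    out = []
    for frame in gesture:
        ref = frame[WRIST] if per_frame else first_wrist
        out.append([tuple(c - r for c, r in zip(point, ref)) for point in frame])
    return out


def landmark_speed(gesture: List[Frame]) -> List[Frame]:
    """Frame-to-frame displacement of each landmark (one frame shorter)."""
    return [
        [tuple(b - a for a, b in zip(p_prev, p_next))
         for p_prev, p_next in zip(prev, nxt)]
        for prev, nxt in zip(gesture, gesture[1:])
    ]


def fix_length(gesture: List[Frame], timesteps: int = 125) -> List[Frame]:
    """Truncate or zero-pad (assumed strategy) to a fixed number of frames."""
    n_points = len(gesture[0])
    missing = timesteps - len(gesture)  # negative when truncating
    padding = [[(0.0, 0.0, 0.0)] * n_points for _ in range(missing)]
    return gesture[:timesteps] + padding


# Toy gesture: 2 frames, 21 landmarks, hand shifted by 1.0 along x in frame 2.
frame0 = [(float(i), 0.0, 0.0) for i in range(21)]
frame1 = [(float(i) + 1.0, 0.0, 0.0) for i in range(21)]
gesture = [frame0, frame1]

first_ref = normalize_to_wrist(gesture, per_frame=False)
frame_ref = normalize_to_wrist(gesture, per_frame=True)
speed = landmark_speed(gesture)
fixed = fix_length(first_ref, timesteps=125)
```

In this toy example the first-frame reference keeps the wrist's displacement across the gesture, while the per-frame reference pins the wrist to the origin in every frame and retains only the hand shape. That difference is one plausible reading of why the two normalizations score differently, though the abstract itself does not give an explanation.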

Keywords

Deep learning; Hand gesture recognition; Human-computer interaction; MediaPipe landmarks

Quality index

Impact and social visibility

In terms of influence and social adoption, the following can be highlighted as of 2025-06-02, based on mention and interaction metrics provided by agencies that specialize in calculating "Alternative or Social Metrics":

  • Bookmarks, code forks, additions to favorite lists for repeated reading, and general views suggest that readers are using the publication as a basis for their current work, which may be an early indicator of more formal academic citations to come. This is supported by the "Capture" indicator, which totals 3 (PlumX).

Leadership analysis of institutional authors

This work was carried out in international collaboration with researchers from Italy.

There is a significant leadership presence, as some of the institution’s authors appear as first or last author, detailed as follows: First Author (GIL MARTIN, MANUEL).