APC

2 438,00 Euros

License and Use

Open Access

Analysis of institutional authors

Huertas-García, Á. (Corresponding Author); Martín García, Alejandro (Author); Huertas-Tato, J. (Author); Camacho, D. (Corresponding Author)

November 7, 2022

Exploring Dimensionality Reduction Techniques in Multilingual Transformers

Published in: Cognitive Computation, 15(2): 590-612, 2023-03-01. DOI: 10.1007/s12559-022-10066-8

Authors:

Huertas-García, A; Martín, A; Huertas-Tato, J; Camacho, D

Affiliations

Univ Politecn Madrid, Dept Sistemas Informat, Madrid, Spain - Author
Universidad Politécnica de Madrid - Author

Abstract

In scientific literature and industry, semantic and context-aware Natural Language Processing-based solutions have been gaining importance in recent years. The possibilities and performance shown by these models when dealing with complex Human Language Understanding tasks are unquestionable, from conversational agents to the fight against disinformation in social networks. In addition, considerable attention is also being paid to developing multilingual models to tackle the language bottleneck. An increase in size has accompanied the growing need to provide more complex models implementing all these features without being conservative in the number of dimensions required. This paper aims to provide a comprehensive account of the impact of a wide variety of dimensional reduction techniques on the performance of different state-of-the-art multilingual siamese transformers, including unsupervised dimensional reduction techniques such as linear and nonlinear feature extraction, feature selection, and manifold techniques. In order to evaluate the effects of these techniques, we considered the multilingual extended version of Semantic Textual Similarity Benchmark (mSTSb) and two different baseline approaches, one using the embeddings from the pre-trained version of five models and another using their fine-tuned STS version. The results evidence that it is possible to achieve an average reduction of 91.58 % ± 2.59 % in the number of dimensions of embeddings from pre-trained models requiring a fitting time 96.68 % ± 0.68 % faster than the fine-tuning process. Besides, we achieve 54.65 % ± 32.20 % dimensionality reduction in embeddings from fine-tuned models. The results of this study will significantly contribute to the understanding of how different tuning approaches affect performance on semantic-aware tasks and how dimensional reduction techniques deal with the high-dimensional embeddings computed for the STS task and their potential for other highly demanding NLP tasks.
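As an illustration of the kind of pipeline the abstract describes, the following is a minimal sketch of one unsupervised feature selection approach: keeping the highest-variance embedding dimensions and comparing cosine similarity before and after. It is not the authors' method; the abstract does not detail their specific techniques. The synthetic embeddings, the helper names (`select_top_variance`, `project`, `cosine`, `reduction_pct`), and the choice of 32 dimensions reduced to 4 are all assumptions made for the example.

```python
import math
import random
from statistics import pvariance


def select_top_variance(embeddings, k):
    """Return the (sorted) indices of the k dimensions with the highest variance."""
    n_dims = len(embeddings[0])
    variances = [pvariance([row[j] for row in embeddings]) for j in range(n_dims)]
    ranked = sorted(range(n_dims), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def project(vector, kept):
    """Keep only the selected dimensions of one embedding."""
    return [vector[j] for j in kept]


def cosine(u, v):
    """Cosine similarity, the usual score for comparing sentence embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def reduction_pct(original_dims, reduced_dims):
    """Percentage of dimensions removed."""
    return 100.0 * (1.0 - reduced_dims / original_dims)


# Synthetic stand-in for sentence embeddings (assumption, not real model output):
# 200 vectors of 32 dimensions, where only the first 4 carry much variance.
rng = random.Random(0)
scales = [5.0] * 4 + [0.1] * 28
embeddings = [[rng.gauss(0.0, s) for s in scales] for _ in range(200)]

kept = select_top_variance(embeddings, k=4)
full_sim = cosine(embeddings[0], embeddings[1])
reduced_sim = cosine(project(embeddings[0], kept), project(embeddings[1], kept))

print("kept dimensions:", kept)
print("reduction: %.2f %%" % reduction_pct(32, 4))
print("cosine similarity (full vs reduced):", round(full_sim, 3), round(reduced_sim, 3))
```

In this toy setting the similarity score barely changes after dropping the low-variance dimensions, which is the property a reduction technique needs to preserve on a task such as STS. Real embeddings would come from a multilingual transformer and would be scored against human similarity judgments.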

Keywords

deep; language models; multilingual transformers; natural language processing; semantic textual similarity; component analysis; dimensionality reduction

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Cognitive Computation. Owing to its progression and the impact it has achieved in recent years according to WoS (JCR), the journal has become a reference in its field. In 2023, the year of publication, it ranked 76/310 in the Neurosciences category, placing it in Q1 (first quartile).
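The quartile follows from the rank by simple arithmetic. The sketch below assumes the common convention that a journal ranked X out of Y falls in Q1 when X/Y is at most 0.25, in Q2 when it is at most 0.50, and so on; the function name `jcr_quartile` is hypothetical and the exact boundary handling used by the indexing agency is an assumption.

```python
def jcr_quartile(rank, total):
    """Quartile (1 to 4) for a journal ranked `rank` out of `total`, where 1 is best.

    Assumes Q1 covers rank/total <= 0.25, Q2 covers <= 0.50, and so on.
    Integer ceiling division avoids floating-point issues at the boundaries.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (4 * rank + total - 1) // total


position = 76 / 310
print(round(position, 3))      # 0.245, just inside the top quarter
print(jcr_quartile(76, 310))   # 1
```

Under this convention, position 77 would still be Q1 and position 78 would be the first Q2 rank in a category of 310 journals.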

From a relative perspective, the normalized impact indicator calculated from World Citations provided by WoS (ESI, Clarivate) gives a ratio of citations received to the expected citation rate of 1.47. This indicates that, compared to works in the same discipline and the same year of publication, the work is cited above average (source consulted: ESI, Nov 13, 2025).

According to different indexing agencies, the work had accumulated the following citations as of 2026-04-27:

  • WoS: 7
  • Scopus: 8
  • Google Scholar: 15

Impact and social visibility

From the perspective of influence or social adoption, and based on mentions and interactions reported by agencies specializing in so-called "Alternative or Social Metrics," we can highlight the following as of 2026-04-27:

  • Academic use, as evidenced by the Altmetric indicator counting additions to the personal reference manager Mendeley: 45.
  • Use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggests that someone is using the publication as a basis for their current work. This may be a notable indicator of future, more formal academic citations. The claim is supported by the "Capture" indicator, which yields a total of 45 (PlumX).

For dissemination aimed at more general audiences, other more global scores include:

  • The Total Score from Altmetric: 2.
  • The number of mentions on the social network X (formerly Twitter): 2 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.
  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/88877/

As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, we can state that, as of

  • Views: 160
  • Downloads: 134

Leadership analysis of institutional authors

There is a significant leadership presence, as some of the institution's authors appear as first or last author: First Author (HUERTAS GARCÍA, ÁLVARO) and Last Author (CAMACHO FERNANDEZ, DAVID).

The corresponding authors are HUERTAS GARCÍA, ÁLVARO and CAMACHO FERNANDEZ, DAVID.
