Analysis of institutional authors

Esteban-Romero, Sergio (Corresponding Author); Martín-Fernández, Iván (Author); Gil-Martin, Manuel (Author); Fernandez-Martinez, Fernando (Author)


September 4, 2025

Synthesizing Olfactory Understanding: Multimodal Language Models for Image-Text Smell Matching

Published in: Symmetry-Basel, 17(8): 1349, 2025-08-18. DOI: 10.3390/sym17081349

Authors:

Esteban-Romero, Sergio; Martin-Fernandez, Ivan; Gil-Martin, Manuel; Fernandez-Martinez, Fernando

Affiliations

Univ Politecn Madrid UPM, Grp Tecnol Habla & Aprendizaje Automat THAU Grp, Informat Proc & Telecommun Ctr, ETSI Telecomunicac, Madrid 28040, Spain - Author

Abstract

Olfactory information, crucial for human perception, is often underrepresented compared to visual and textual data. This work explores methods for understanding smell descriptions within a multimodal context, where scent information is conveyed indirectly through text and images. We address the challenges of the Multimodal Understanding of Smells in Texts and Images (MUSTI) task by proposing novel approaches that leverage language-specific models and state-of-the-art multimodal large language models (MM-LLMs). Our core contribution is a multimodal framework using language-specific encoders for text and image data. This allows for a joint embedding space that explores the semantic symmetry between smells, texts, and images to identify olfactory-related connections shared across the modalities. While ensemble learning with language-specific models achieved good performance, MM-LLMs demonstrated exceptional potential. Fine-tuning a quantized version of the Qwen-VL-Chat model achieved a state-of-the-art macro F1-score of 0.7618 on the MUSTI task. This highlights the effectiveness of MM-LLMs in capturing task requirements and adapting to specific formats.
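The headline result above is a macro F1-score. As a reminder of how that metric is computed, here is a minimal sketch in plain Python. It assumes a binary labelling (1 = the image and the text refer to a shared smell source, 0 = they do not); the labels and predictions are invented for illustration and are not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical example: six image-text pairs, one false positive.
y_true = [1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0]
print(round(macro_f1(y_true, y_pred), 4))  # prints 0.8286
```

Because each class contributes equally regardless of its frequency, macro F1 penalizes a model that does well only on the majority class, which matters when matching and non-matching pairs are imbalanced.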

Keywords

Contrastive language-image pretraining (CLIP); Multimodal large language models (MM-LLMs); Multimodal perception; Olfactory understanding

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Symmetry-Basel, which, owing to its progression and the impact it has achieved in recent years according to WoS (JCR), has become a reference in its field. In the year of publication, 2025, the journal ranked 54/136 in the category Multidisciplinary Sciences, placing it in the second quartile (Q2). The journal is also ranked in quartile Q2 by Scopus (SJR) in the category Computer Science (Miscellaneous).
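The quartile can be checked from the rank with a short calculation. The sketch below assumes the common convention that a journal at position `rank` out of `total` falls in Q1 if rank/total is at most 0.25, in Q2 if at most 0.5, and so on; the exact rule applied by this portal is not stated, and the function name is hypothetical.

```python
def quartile_from_rank(rank, total):
    """Return 'Q1'..'Q4' for a journal ranked `rank` out of `total` in its category."""
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    # Integer ceiling of 4 * rank / total, which avoids floating-point edge cases.
    quartile = (4 * rank + total - 1) // total
    return f"Q{quartile}"


print(quartile_from_rank(54, 136))  # prints Q2
```

Here 54/136 is roughly 0.397, which lies between 0.25 and 0.5, consistent with the Q2 placement reported above.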


Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2026-04-27:

  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, as well as general views, indicates that someone is using the publication as a basis for their current work. This may be a notable indicator of future more formal and academic citations. This claim is supported by the result of the "Capture" indicator, which yields a total of: 2 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

    It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

    • The work has been submitted to a journal whose editorial policy allows Open Access publication.
    • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/90955/

    As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, we can state that, as of

    • Views: 93
    • Downloads: 97

    Leadership analysis of institutional authors

    Institutional authors hold the leading positions in the author list: First Author (ESTEBAN ROMERO, SERGIO) and Last Author (FERNANDEZ MARTINEZ, FERNANDO).

    The corresponding author is ESTEBAN ROMERO, SERGIO.


    Awards linked to the item

    Sergio Esteban-Romero's research was supported by the Spanish Ministry of Education (FPI grant PRE2022-105516). The research of Ivan Martin-Fernandez was supported by the Universidad Politecnica de Madrid (Programa Propio I+D+i). This work was funded by Project ASTOUND (101071191-HORIZON-EIC-2021-PATHFINDERCHALLENGES-01) of the European Commission and by the Spanish Ministry of Science and Innovation through the projects GOMINOLA (PID2020-118112RB-C22), TRUSTBOOST (PID2023-150584OB-C21), and BeWord (PID2021-126061OB-C43), funded by MCIN/AEI/10.13039/501100011033 and by the European Union "NextGenerationEU/PRTR".