APC: 2,860.00 euros

License and Use: Open Access

Analysis of institutional authors

  • Solarte-Pabón O (Corresponding Author)
  • Garcia-Barragan A (Author)
  • Menasalvas E (Author)
  • Robles V (Author)

August 14, 2023

Transformers for extracting breast cancer information from Spanish clinical narratives

Published in: ARTIFICIAL INTELLIGENCE IN MEDICINE, vol. 143, article 102625, 2023-09-01. DOI: 10.1016/j.artmed.2023.102625

Authors:

Solarte-Pabón, O; Montenegro, O; García-Barragán, A; Torrente, M; Provencio, M; Menasalvas, E; Robles, V

Affiliations

Centro De Tecnologia Biomedica - Author
Centro de Tecnología Biomédica, Universidad del Valle, Cali - Author
Hosp Univ Puerta Hierro Madrid, Madrid, Spain - Author
Hospital Universitario Puerta de Hierro Majadahonda - Author
Univ Politecn Madrid, Ctr Tecnol Biomed, Madrid, Spain - Author
Univ Valle, Escuela Ingn Sistemas, Cali, Colombia - Author
Universidad del Valle, Cali - Author

Abstract

The wide adoption of electronic health records (EHRs) offers immense potential as a source of support for clinical research. However, previous studies focused on extracting only a limited set of medical concepts to support information extraction in the cancer domain for the Spanish language. Building on the success of deep learning for processing natural language texts, this paper proposes a transformer-based approach to extract named entities from breast cancer clinical notes written in Spanish and compares several language models. To facilitate this approach, a schema for annotating clinical notes with breast cancer concepts is presented, and a corpus for breast cancer is developed. Results indicate that both BERT-based and RoBERTa-based language models demonstrate competitive performance in clinical Named Entity Recognition (NER). Specifically, BETO and multilingual BERT achieve F-scores of 93.71% and 94.63%, respectively. Additionally, RoBERTa Biomedical attains an F-score of 95.01%, while RoBERTa BNE achieves an F-score of 94.54%. The findings suggest that transformers can feasibly extract information in the clinical domain in the Spanish language, with the use of models trained on biomedical texts contributing to enhanced results. The proposed approach takes advantage of transfer learning techniques by fine-tuning language models to automatically represent text features and avoiding the time-consuming feature engineering process.

Keywords

breast cancer; breast neoplasms; classification; clinical narratives; deep learning; documentation; electronic health records; information storage and retrieval; multilingualism; named entity recognition (NER); natural language processing (NLP); oncology; pathology reports; recognition; records; stage

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal ARTIFICIAL INTELLIGENCE IN MEDICINE, which, owing to its progression and the impact it has achieved in recent years according to WoS (JCR), has become a reference in its field. In 2023, the year the work was published, the journal ranked 4/44 in the Medical Informatics category, placing it in Q1 (first quartile) and above the 90th percentile.

From a relative perspective, the normalized impact indicator calculated from world citations provided by WoS (ESI, Clarivate) gives the work a citation count of 3.31 relative to the expected citation rate. This indicates that the work is cited above average compared with works in the same discipline and the same year of publication. (source consulted: ESI Nov 13, 2025)

This information is reinforced by other indicators of the same type, which, although dynamic over time and dependent on the set of average global citations at the time of their calculation, consistently position the work at some point among the top 50% most cited in its field:

  • Weighted Average of Normalized Impact by the Scopus agency: 2.48 (source consulted: FECYT Mar 2025)

Specifically, according to different indexing agencies, this work had accumulated the following numbers of citations as of 2026-04-27:

  • WoS: 17
  • Scopus: 25
  • Google Scholar: 14

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2026-04-27:

  • Academic use, as evidenced by the Altmetric indicator for additions to the personal bibliographic manager Mendeley, totals 80.
  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, as well as general views, indicates that someone is using the publication as a basis for their current work. This may be a notable indicator of future more formal and academic citations. This claim is supported by the result of the "Capture" indicator, which yields a total of: 79 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

  • The Total Score from Altmetric: 8.
  • The number of mentions in news outlets: 1 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.
  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/81544/

As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, we can state that, as of

  • Views: 265
  • Downloads: 172

Leadership analysis of institutional authors

This work has been carried out with international collaboration, specifically with researchers from: Colombia.

There is a significant leadership presence, as authors from the institution appear as first and last signers: First Author (SOLARTE PABÓN, OSWALDO) and Last Author (ROBLES FORCADA, VICTOR).

The corresponding author is SOLARTE PABÓN, OSWALDO.
