GPT for medical entity recognition in Spanish

Citations

Cited 12 times in Scopus

Cited 8 times in Google Scholar

Institutional authorship analysis

Garcia-Barragan A (author or co-author); Menasalvas E (author or co-author); Robles V (author or co-author)

29 April 2024

Publications

Article

Yes

Published in: Multimedia Tools And Applications, 2024-01-01, DOI: 10.1007/s11042-024-19209-5

Authors: García-Barragán Á; González Calatayud A; Solarte-Pabón O; Provencio M; Menasalvas E; Robles V

Affiliations

Hospital Universitario Puerta de Hierro Majadahonda - Author or co-author

Universidad del Valle, Cali - Author or co-author

Universidad Politécnica de Madrid - Author or co-author

Abstract

In recent years, there has been a remarkable surge in the development of Natural Language Processing (NLP) models, particularly for Named Entity Recognition (NER). Models such as BERT have demonstrated exceptional performance, leveraging annotated corpora for accurate entity identification. However, the question arises: can newer Large Language Models (LLMs) like GPT be used without extensive annotation, thereby enabling direct entity extraction? In this study, we explore this issue, comparing fine-tuning techniques with prompting methods to assess the potential of GPT for identifying medical entities in Spanish electronic health records (EHRs). The study used a dataset of Spanish EHRs related to breast cancer and implemented both a traditional NER method using BERT and a contemporary LLM-driven approach using GPT that combines few-shot learning with the integration of external knowledge to structure the data. Both methods were evaluated within a single comprehensive pipeline, using precision, recall, and F-score as the key performance metrics. This comparative approach aimed to highlight the strengths and limitations of each method for structuring Spanish EHRs efficiently and accurately. The comparative analysis shows that the traditional BERT-based NER method and the few-shot LLM-driven approach, augmented with external knowledge, achieve comparable precision, recall, and F-score when applied to Spanish EHRs. Contrary to expectations, the LLM-driven approach, which requires minimal data annotation, performs on par with BERT in discerning complex medical terminology and contextual nuances within the EHRs.
The results of this study highlight a notable advance in NER for Spanish EHRs, with the few-shot LLM-driven approach, enhanced by external knowledge, slightly edging out the traditional BERT-based method in overall effectiveness. GPT's higher F-score and its minimal reliance on extensive data annotation underscore its potential in medical data processing.
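The abstract compares the two methods by precision, recall, and F-score. As an illustration only, the following minimal Python sketch computes these metrics at the entity level under an exact-match assumption (a predicted entity counts as correct only if its span and label both match a gold annotation). The matching criterion, the function name `precision_recall_f1`, the entity labels, and the sample data are all hypothetical; the abstract does not state how the authors scored their systems.

```python
from typing import Set, Tuple

# An entity is (start offset, end offset, label). Labels below are hypothetical.
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 under exact span-and-label match."""
    true_positives = len(gold & predicted)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: three gold entities, three predictions, one wrong label.
gold = {(0, 9, "CANCER_CONCEPT"), (15, 25, "DATE"), (30, 41, "DRUG")}
predicted = {(0, 9, "CANCER_CONCEPT"), (15, 25, "DRUG"), (30, 41, "DRUG")}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Exact matching is the strictest common choice; a relaxed (overlap-based) criterion would generally yield higher scores for the same predictions.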

Keywords

BERT; Breast cancer; EHR; GPT; Information extraction; LLM; NER

Quality indicators

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Multimedia Tools And Applications. Owing to its progression and the good impact it has achieved in recent years according to the Scopus (SJR) agency, the journal has become a reference in its field. For the year of publication, 2024, no indicators have been calculated yet, but in 2023 the journal ranked as Q1 (first quartile) in the Media Technology category.

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2025-07-29:

Google Scholar: 8
Scopus: 12

Social impact and visibility

Analysis of the leadership of institutional authors

This work was carried out with international collaboration, specifically with researchers from Colombia.

There is significant leadership, since some of the authors belonging to the institution appear as first or last author, as the detail shows: First Author (GARCIA BARRAGAN, ALVARO) and Last Author (ROBLES FORCADA, VICTOR).
