July 25, 2020

Extracting diagnostic knowledge from MedLine Plus: A comparison between MetaMap and cTAKES approaches

Published in: Current Bioinformatics, 13(6): 573-582, 2018-01-01. DOI: 10.2174/1574893612666170727094502

Authors:

Rodríguez-González, A; Costumero, R; Martinez-Romero, M; Wilkinson, MD; Menasalvas-Ruiz, E

Affiliations

Ctr Biotecnol & Genom Plantas UPM INIA - Author
Escuela Tecnica Superior de Ingenieros Informaticos, Universidad Politecnica de Madrid - Author
Instituto Nacional de Investigacion y Tecnologia Agraria y Alimentaria - Author
Stanford Univ, Stanford Ctr Biomed Informat Res - Author
Stanford University School of Medicine - Author
Univ Politecn Madrid, Ctr Tecnol Biomed - Author
Universidad Politécnica de Madrid - Author

Abstract

© 2018 Bentham Science Publishers.

Background: The development of diagnostic decision support systems (DDSS) requires a reliable and consistent knowledge base of diseases and their symptoms, signs, and diagnostic tests. Physicians are typically the source of this knowledge, but it is not always possible to obtain all the desired information from them. Other valuable sources are medical books and articles describing the diagnosis of diseases, but extracting this information is again a hard and time-consuming task.

Objective: In this paper we present the results of our research comparing two well-known tools used to perform NLP in the medical domain. We have used these tools to perform Named Entity Recognition in order to extract diagnostic terms from the texts of MedLine Plus articles.

Method: We have used Web scraping, natural language processing (NLP) techniques, a variety of publicly available sources of diagnostic knowledge, and two widely known medical concept identifiers, MetaMap and cTAKES, to extract diagnostic criteria for infectious diseases from MedLine Plus articles.

Results: A performance comparison of MetaMap and cTAKES is presented. Although the differences between the two systems are not really significant, there are some palpable differences in the results they provide.

Conclusion: The extraction of diagnostic terms is a very important task for the creation of databases containing this information. NLP systems capable of extracting those terms from texts are very valuable tools that need to be implemented and evaluated in order to obtain the maximum accuracy in this process.

Keywords

Architecture; CDSS; cTAKES; DDSS; Diagnostic knowledge; Information extraction; MetaMap; NLP; System

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Current Bioinformatics. Although the journal is classified in quartile Q4 (Agency: WoS (JCR)), its regional focus and its specialization in Mathematical & Computational Biology give it significant recognition within a specific niche of scientific knowledge at the international level.

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2026-04-27 is:

  • Google Scholar: 21
  • WoS: 5
  • Scopus: 6

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2026-04-27:

  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, as well as general views, indicates that someone is using the publication as a basis for their current work. This may be a notable indicator of future more formal and academic citations. This claim is supported by the result of the "Capture" indicator, which yields a total of: 22 (PlumX).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/44984/

As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, we can state that, as of

  • Views: 611
  • Downloads: 769

Leadership analysis of institutional authors

This work has been carried out with international collaboration, specifically with researchers from: United States of America.

There is a significant leadership presence, as some of the institution's authors appear as first or last author: First Author (RODRIGUEZ GONZALEZ, ALEJANDRO) and Last Author (Menasalvas-Ruiz E).

The corresponding author is RODRIGUEZ GONZALEZ, ALEJANDRO.
