Analysis of institutional authors

Pedrera Jiménez, Miguel (Corresponding Author)

October 24, 2022

TransformEHRs: a flexible methodology for building transparent ETL processes for EHR reuse

Published in: Methods of Information in Medicine. 61(S 02): e89-e102, 2022-01-01. DOI: 10.1055/s-0042-1757763

Authors:

Pedrera-Jimenez, Miguel; Garcia-Barrio, Noelia; Rubio-Mayo, Paula; Tato-Gomez, Alberto; Cruz-Bermudez, Juan Luis; Bernal-Sobrino, Jose Luis; Munoz-Carrero, Adolfo; Serrano-Balazote, Pablo

Affiliations

Inst Invest Sanitaria Hosp Univ 12 Octubre, Data Sci Unit, Madrid, Spain - Author
Inst Salud Carlos III, Digital Hlth Res Unit, Madrid, Spain - Author
Univ Politecn Madrid, ETSI Telecomunicac, Madrid, Spain - Author

Abstract

Background During the COVID-19 pandemic, several methodologies were designed for obtaining electronic health record (EHR)-derived datasets for research. These processes are often black boxes, in which clinical researchers are unaware of how the data were recorded, extracted, and transformed. To solve this, it is essential that extract, transform, and load (ETL) processes be based on transparent, homogeneous, and formal methodologies, making them understandable, reproducible, and auditable.
Objectives This study aims to design and implement a methodology, in accordance with the FAIR principles, for building ETL processes (focused on data extraction, selection, and transformation) for EHR reuse in a transparent and flexible manner, applicable to any clinical condition and health care organization.
Methods The proposed methodology comprises four stages: (1) analysis of secondary use models and identification of data operations, based on internationally used clinical repositories, case report forms, and aggregated datasets; (2) modeling and formalization of data operations, following the Detailed Clinical Models paradigm; (3) agnostic development of data operations, selecting SQL and R as programming languages; and (4) automation of the ETL instantiation, building a formal configuration file in XML.
Results First, four international projects were analyzed to identify the 17 operations necessary to obtain, from the EHR, datasets that meet the specifications of these projects. Each data operation was then formalized using the ISO 13606 reference model, specifying the valid data types for arguments, inputs, and outputs, and their cardinality. Next, an agnostic catalog of data operations was developed using the previously selected data-oriented programming languages. Finally, an automated ETL instantiation process was built from a formally defined ETL configuration file.
Conclusions This study has provided a transparent and flexible solution to the difficulty of making the processes for obtaining EHR-derived data for secondary use understandable, auditable, and reproducible. Moreover, the abstraction carried out in this study means that any previous EHR reuse methodology can incorporate these results.

Keywords

Agnostic; Article; Automation; Case report; Clinical article; Computer language; COVID-19; Data exchange; Data extraction; Data reusability; Design; Electronic health record; Electronic health records; Epidemiology; FAIR principles; Health care organization; Human; Humans; Informatics; Pandemic; Pandemics; Real-world data; Standards

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Methods of Information in Medicine, which, owing to its progression and the impact it has achieved in recent years according to Scopus (SJR), has become a reference in its field. In 2022, the year of publication, the journal ranked in the first quartile (Q1) of the category Advanced and Specialized Nursing.

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2025-12-21 is:

  • Google Scholar: 7
  • WoS: 1
  • Scopus: 4

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-12-21:

  • Academic use, as evidenced by the Altmetric indicator for saves in the Mendeley reference manager, totals 26.
  • Use of this contribution in bookmarks, code forks, additions to favorites lists for repeated reading, and general views indicates that someone is using the publication as a basis for their current work. This may be a notable indicator of future, more formal academic citations. This claim is supported by the "Capture" indicator, which yields a total of 26 (PlumX).

For dissemination aimed at more general audiences, other broader scores include:

  • The Total Score from Altmetric: 1.
  • The number of mentions on the social network X (formerly Twitter): 2 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.

Leadership analysis of institutional authors

There is a significant leadership presence, as some of the institution’s authors appear as first or last author, detailed as follows: First Author (Pedrera-Jimenez, Miguel).

The corresponding author is Pedrera-Jimenez, Miguel.

Awards linked to the item

Ministerio de Economia y Competitividad; Instituto de Salud Carlos III. Grants: PI18/00981, PI18/01047, PI18CIII/00019