Analysis of institutional authors

Patiño-Martínez, Marta (Author)

June 24, 2024
Measuring and Improving the Energy Efficiency of Large Language Models Inference

Published in: IEEE Access, vol. 12, pp. 80194-80207, 2024-01-01. DOI: 10.1109/ACCESS.2024.3409745

Authors:

Argerich, Mauricio Fadel; Patino-Martinez, Marta

Affiliations

Univ Politecn Madrid, Escuela Tecn Super Ingn Informat, Madrid 28040, Spain - Author

Abstract

Recent improvements in the accuracy of machine learning (ML) models in the language domain have propelled their use in a multitude of products and services, touching millions of lives daily. These new levels of accuracy have been attained mainly through exponential growth in model size, creating a new category of models known as Large Language Models (LLMs) and leading to a substantial increase in computing and energy demands. While recent studies have focused on measuring and improving the energy consumption of LLMs during training, inference has received little attention. In this article, we present an approach to profile the energy consumption of LLMs during inference and leverage it to improve energy efficiency. For this, we deploy several state-of-the-art LLMs and observe how model size, number of layers, parallelized attention, and even vocabulary size affect their energy consumption. In addition, we leverage input batch size and different quantization levels to optimize their inference energy efficiency and latency.

Keywords

Computational modeling; Deep learning; Energy consumption; Energy efficiency; Energy measurement; Graphics processing units; Large language model; Large language models; Machine learning; Software; Software measurement; Training

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in IEEE Access, a journal that, according to Scopus (SJR), has progressed steadily and achieved a good impact in recent years, becoming a reference in its field. No indicators have yet been calculated for 2024, the year of publication, but in 2023 the journal ranked in Q1 (first quartile) in the category Engineering (Miscellaneous).

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2026-04-26 is:

  • WoS: 36
  • Scopus: 52

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2026-04-26:

  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggests that readers are using the publication as a basis for their current work. This may be a notable early indicator of more formal academic citations in the future. The claim is supported by the "Capture" indicator, which yields a total of 49 (PlumX).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.
  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/86674/

Publication of the work in the institutional repository has produced usage statistics that reflect its impact. In terms of dissemination, the repository reports:

  • Views: 184
  • Downloads: 1,605

Leadership analysis of institutional authors

Institutional authors hold leading positions in the byline: the first author is Argerich, Mauricio Fadel, and the last author is Patiño-Martínez, Marta.

The corresponding author is Argerich, Mauricio Fadel.


Awards linked to the item

No Statement Available