Analysis of institutional authors

Badenes-Olmedo C (Corresponding Author); Corcho O (Author)

October 17, 2018
Publications > Proceedings Paper

Efficient clustering from distributions over topics

Published in: Proceedings of the Knowledge Capture Conference, K-CAP 2017, 2017-12-04. DOI: 10.1145/3148011.3148019

Authors:

Badenes-Olmedo, C; Redondo-García, JL; Corcho, O

Affiliations

Amazon Res, Cambridge, England - Author
Amazon Research - Author
Univ Politecn Madrid, Ontol Engn Grp, Boadilla Del Monte, Spain - Author
Universidad Politécnica de Madrid - Author

Abstract

© 2017 Copyright held by the owner/author(s). There are many scenarios where we may want to find pairs of textually similar documents in a large corpus (e.g. a researcher doing a literature review, or an R&D project manager analyzing project proposals). Discovering those connections programmatically can help experts achieve those goals, but brute-force pairwise comparison is computationally impractical when the document corpus is too large. Some algorithms in the literature divide the search space into regions containing potentially similar documents, which are then processed separately from the rest in order to reduce the number of pairs compared. However, such unsupervised methods still incur high time costs. In this paper, we present an approach that relies on the results of a topic modeling algorithm over the documents in a collection as a means to identify smaller subsets of documents in which the similarity function can then be computed. This approach has obtained promising results when identifying similar documents in the domain of scientific publications. We have compared our approach against state-of-the-art clustering techniques and with different configurations of the topic modeling algorithm. Results suggest that our approach outperforms (> 0.5) the other analyzed techniques in terms of efficiency.
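
The general idea in the abstract, restricting similarity computations to smaller subsets derived from topic distributions, can be illustrated with a minimal sketch. This is not the authors' algorithm: grouping documents by their most probable topic, using Jensen-Shannon divergence as the similarity function, the threshold value, and all function names and sample data below are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations
from math import log2


def top_topic_key(distribution, n_top=1):
    """Return the indices of the n_top most probable topics as a cluster key."""
    ranked = sorted(range(len(distribution)),
                    key=lambda i: distribution[i], reverse=True)
    return tuple(sorted(ranked[:n_top]))


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two topic distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def candidate_pairs(doc_topics, n_top=1):
    """Group documents by their top topics; pair documents only within a group."""
    clusters = defaultdict(list)
    for doc_id, dist in doc_topics.items():
        clusters[top_topic_key(dist, n_top)].append(doc_id)
    pairs = []
    for members in clusters.values():
        pairs.extend(combinations(sorted(members), 2))
    return pairs


def similar_pairs(doc_topics, threshold=0.1, n_top=1):
    """Keep candidate pairs whose divergence is at most the (assumed) threshold."""
    return [(a, b) for a, b in candidate_pairs(doc_topics, n_top)
            if js_divergence(doc_topics[a], doc_topics[b]) <= threshold]


# Hypothetical per-document topic distributions (e.g. output of a topic model).
doc_topics = {
    "d1": [0.80, 0.10, 0.10],
    "d2": [0.70, 0.20, 0.10],
    "d3": [0.10, 0.80, 0.10],
    "d4": [0.15, 0.75, 0.10],
    "d5": [0.10, 0.10, 0.80],
}

all_pairs = list(combinations(sorted(doc_topics), 2))
print(len(all_pairs), "brute-force pairs")
print(len(candidate_pairs(doc_topics)), "pairs compared after grouping")
print(similar_pairs(doc_topics))
```

In this toy corpus the grouping step reduces the number of comparisons from 10 to 2. The trade-off is that similar documents placed in different groups are never compared, so the choice of cluster key (here, `n_top`) affects recall.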

Keywords

Large-scale text analysis; Scholarly data; Semantic similarity; Topic models

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2026-04-27:

  • WoS: 2
  • Scopus: 9

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2026-04-27:

  • Academic use, as shown by the Altmetric indicator counting saves in the personal reference manager Mendeley: 25.
  • Use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggests that readers are using the publication as a basis for their current work, which may be an early indicator of future formal academic citations. The "Capture" indicator gives a total of: 25 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

  • The Total Score from Altmetric: 1.
  • The number of mentions on the social network X (formerly Twitter): 3 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.
  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/52009/

As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, the repository reports:

  • Views: 445
  • Downloads: 444

Leadership analysis of institutional authors

This work has been carried out with international collaboration, specifically with researchers from: United Kingdom.

There is a significant leadership presence, as authors from the institution appear as the first and last author: First Author (BADENES OLMEDO, CARLOS) and Last Author (CORCHO GARCIA, OSCAR).

The corresponding author is BADENES OLMEDO, CARLOS.
