June 9, 2019

Computing Semantic Similarity of Concepts in Knowledge Graphs

Published in: IEEE Transactions on Knowledge and Data Engineering, 29(1): 72-85, 2017-01-01. DOI: 10.1109/TKDE.2016.2610428

Authors:

Zhu, GG; Iglesias, CA

Affiliations

Univ Politecn Madrid, Escuela Tecn Super Ingn Telecomunicac, Avda Complutense 30, E-28040 Madrid, Spain - Author

Abstract

This paper presents a method for measuring the semantic similarity between concepts in Knowledge Graphs (KGs) such as WordNet and DBpedia. Previous work on semantic similarity methods has focused either on the structure of the semantic network between concepts (e.g., path length and depth) or only on the Information Content (IC) of concepts. We propose a semantic similarity method, namely wpath, to combine these two approaches, using IC to weight the shortest path length between concepts. Conventional corpus-based IC is computed from the distribution of concepts over a textual corpus, which requires preparing a domain corpus containing annotated concepts and has a high computational cost. As instances are already extracted from textual corpora and annotated with concepts in KGs, graph-based IC is proposed to compute IC based on the distribution of concepts over instances. Through experiments performed on well-known word similarity datasets, we show that the wpath semantic similarity method produces a statistically significant improvement over other semantic similarity methods. Moreover, in a real category classification evaluation, the wpath method shows the best performance in terms of accuracy and F score.
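
The idea of using IC to weight the shortest path length can be illustrated with a minimal Python sketch. The abstract does not state a formula, so the functional form used here, 1 / (1 + length * k^IC(lcs)), the parameter k, the toy taxonomy, and the instance counts are all assumptions made for illustration, not the authors' implementation. IC is computed in the graph-based style, from the share of instances annotated with a concept or any of its descendants.

```python
import math

# Hypothetical toy taxonomy (child -> parent); an assumption for illustration.
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "bird": "animal",
    "animal": None,
}

# Hypothetical number of instances annotated directly with each concept.
DIRECT_INSTANCES = {"dog": 40, "cat": 30, "mammal": 10, "bird": 15, "animal": 5}


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def graph_ic(concept):
    """IC = -log(p), where p is the share of instances under the concept."""
    total = sum(DIRECT_INSTANCES.values())
    count = sum(n for c, n in DIRECT_INSTANCES.items() if concept in ancestors(c))
    return -math.log(count / total)


def lcs_and_path_length(a, b):
    """Return the least common subsumer and the shortest path length via it."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for dist_a, node in enumerate(chain_a):
        if node in chain_b:
            return node, dist_a + chain_b.index(node)
    raise ValueError("concepts share no common ancestor")


def wpath_similarity(a, b, k=0.8):
    """Assumed form: path length is scaled by k ** IC(lcs), with 0 < k <= 1."""
    lcs, length = lcs_and_path_length(a, b)
    return 1.0 / (1.0 + length * k ** graph_ic(lcs))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "bird"), ("dog", "dog")]:
        print(pair, round(wpath_similarity(*pair), 4))
```

With k = 1 the weighting vanishes and the score reduces to a pure path-length measure. For smaller k, pairs whose common ancestor is more specific (higher IC) are scored as more similar than pairs at the same distance that only meet near the root.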

Keywords

DBpedia; Frequency; Information content; Knowledge graph; Ontologies; Queries; Relatedness; Representation; Semantic relatedness; Semantic similarity; Taxonomy; Web; WordNet

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal IEEE Transactions on Knowledge and Data Engineering, which, given its progression and the impact it has achieved in recent years according to the agency WoS (JCR), has become a reference in its field. In 2017, the year the work was published, the journal ranked 33/132 in the category Computer Science, Artificial Intelligence, placing it in Q1 (first quartile).
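
As a quick check of the quartile, the sketch below maps a rank within a category to a quartile, assuming the usual convention that Q1 covers ranks up to and including 25% of the category. The function name is hypothetical; integer arithmetic is used so that boundary ranks are not affected by floating-point rounding.

```python
def jcr_quartile(rank, total):
    """Quartile (1-4) for a journal ranked `rank` out of `total` in its category.

    Assumes Q1 is rank/total <= 0.25, Q2 is <= 0.50, and so on.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    # Ceiling of 4 * rank / total, computed with integers only.
    return (4 * rank + total - 1) // total


if __name__ == "__main__":
    print(jcr_quartile(33, 132))
```

Position 33 of 132 sits exactly on the 25% boundary, so under this convention it is the last rank that still counts as Q1.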

From a relative perspective, and based on the normalized impact indicator calculated from World Citations provided by WoS (ESI, Clarivate), the work's citation count normalized by the expected citation rate is 4.68. This indicates that, compared to works in the same discipline and in the same year of publication, it is cited above average. (source consulted: ESI Nov 13, 2025)

This information is reinforced by other indicators of the same type, which, although dynamic over time and dependent on the set of average global citations at the time of their calculation, consistently position the work at some point among the top 50% most cited in its field:

  • Weighted Average of Normalized Impact by the Scopus agency: 6.96 (source consulted: FECYT Mar 2025)

Specifically, according to different indexing agencies, as of 2026-04-25 this work has accumulated the following number of citations:

  • WoS: 156
  • Scopus: 215

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2026-04-25:

  • Academic use, as evidenced by the Altmetric indicator of additions to the personal bibliographic manager Mendeley, totals: 211.
  • The use of this contribution in bookmarks, code forks, and additions to favorite lists for recurrent reading, as well as general views, suggests that readers are using the publication as a basis for their current work. This may be a notable indicator of future, more formal academic citations. This claim is supported by the "Capture" indicator, which yields a total of: 211 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

  • The Total Score from Altmetric: 11.
  • The number of mentions on the social network X (formerly Twitter): 2 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/43462/

As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, we can state that, as of

  • Views: 621
  • Downloads: 1,353

Leadership analysis of institutional authors

There is a significant leadership presence, as authors from the institution appear as first or last author, detailed as follows: First Author (ZHU, GANGGAO) and Last Author (IGLESIAS FERNANDEZ, CARLOS ANGEL).


Awards linked to the item

This work is supported by the Spanish Ministry of Economy and Competitiveness under the R&D projects SEMOLA (TEC2015-68284-R) and EmoSpaces (RTC-2016-5053-7), by the Regional Government of Madrid through the project MOSI-AGIL-CM (grant P2013/ICE-3019, co-funded by EU Structural Funds FSE and FEDER), and by the European Union through the project EUROSENTIMENT (Grant Agreement No: 296277) and MixedEmotions (Grant Agreement no: 141111).