Publications
>
Article

A method for outlier detection based on cluster analysis and visual expert criteria

Publicated to:Expert Systems. 37 (e12473): e12473- - 2020-10-01 37(e12473), DOI: 10.1111/exsy.12473

Authors: Lara, Juan A; Lara, Juan A; Lizcano, David; Ramperez, Victor; Soriano, Javier

Affiliations

Abstract

Outlier detection is an important problem occurring in a wide range of areas. Outliers are the outcome of fraudulent behaviour, mechanical faults, human error, or simply natural deviations. Many data mining applications perform outlier detection, often as a preliminary step in order to filter out outliers and build more representative models. In this paper, we propose an outlier detection method based on a clustering process. The aim behind the proposal outlined in this paper is to overcome the specificity of many existing outlier detection techniques that fail to take into account the inherent dispersion of domain objects. The outlier detection method is based on four criteria designed to represent how human beings (experts in each domain) visually identify outliers within a set of objects after analysing the clusters. This has an advantage over other clustering-based outlier detection techniques that are founded on a purely numerical analysis of clusters. Our proposal has been evaluated, with satisfactory results, on data (particularly time series) from two different domains: stabilometry, a branch of medicine studying balance-related functions in human beings and electroencephalography (EEG), a neurological exploration used to diagnose nervous system disorders. To validate the proposed method, we studied method outlier detection and efficiency in terms of runtime. The results of regression analyses confirm that our proposal is useful for detecting outlier data in different domains, with a false positive rate of less than 2% and a reliability greater than 99%.

Keywords

ClusteringData miningFallsIdentificationKddOutlier detectionRiskVisual expert criteria

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work has been published in the journal Expert Systems due to its progression and the good impact it has achieved in recent years, according to the agency WoS (JCR), it has become a reference in its field. In the year of publication of the work, 2020, it was in position 31/110, thus managing to position itself as a Q2 (Segundo Cuartil), in the category Computer Science, Theory & Methods. Notably, the journal is positioned en el Cuartil Q2 para la agencia Scopus (SJR) en la categoría Artificial Intelligence.

From a relative perspective, and based on the normalized impact indicator calculated from the Field Citation Ratio (FCR) of the Dimensions source, it yields a value of: 3.55, which indicates that, compared to works in the same discipline and in the same year of publication, it ranks as a work cited above average. (source consulted: Dimensions Jun 2025)

Specifically, and according to different indexing agencies, this work has accumulated citations as of 2025-06-07, the following number of citations:

  • WoS: 11
  • Scopus: 15
  • OpenCitations: 13

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-06-07:

  • The use, from an academic perspective evidenced by the Altmetric agency indicator referring to aggregations made by the personal bibliographic manager Mendeley, gives us a total of: 31.
  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, as well as general views, indicates that someone is using the publication as a basis for their current work. This may be a notable indicator of future more formal and academic citations. This claim is supported by the result of the "Capture" indicator, which yields a total of: 30 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

  • The Total Score from Altmetric: 3.

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows open Open Access publication.
  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://oa.upm.es/63970/

As a result of the publication of the work in the institutional repository, statistical usage data has been obtained that reflects its impact. In terms of dissemination, we can state that, as of

  • Views: 349
  • Downloads: 1,103

Leadership analysis of institutional authors

There is a significant leadership presence as some of the institution’s authors appear as the first or last signer, detailed as follows: Last Author (SORIANO CAMINO, FRANCISCO JAVIER).