Paper Accepted to ICDM 2021!

Prof. Mimi Zhang and I recently had a paper accepted to the IEEE International Conference on Data Mining (ICDM) 2021 (ERA rank A, Qualis rank A1). Our paper introduces Density-Core Finding (DCF), an efficient and robust density-based clustering method.

DCF extends the popular Density Peaks Clustering (DPC) method, which detects modes as points with high density and a large distance to any point of higher density. DPC often fails to detect low-density clusters in the data. Furthermore, DPC has quadratic complexity in the number of points. The issues with DPC can be seen in the image below: multiple centers are selected from the high-density (top) cluster, while the bottom cluster is ignored.
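To make the peak-finding criterion concrete, here is a minimal sketch of the two quantities DPC computes for every point: a local density and the distance to the nearest point of higher density. This illustrates plain DPC, not DCF. The function name `dpc_scores` and the cutoff parameter `d_c` are my own choices for illustration, and the cutoff-count density is just one common way to estimate density.

```python
import numpy as np

def dpc_scores(X, d_c):
    """Return (rho, delta) for each row of X, in the style of DPC.

    rho[i]   : number of other points within distance d_c of point i.
    delta[i] : distance from point i to the nearest point with strictly
               higher rho; points with the highest rho get their largest
               distance to any point instead.
    """
    X = np.asarray(X, dtype=float)
    # Full pairwise distance matrix: this is the O(n^2) step.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))

    # Cutoff density; subtract 1 so a point does not count itself.
    rho = (dist < d_c).sum(axis=1) - 1

    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            delta[i] = dist[i].max()
    return rho, delta
```

Points where both `rho` and `delta` are large are the candidate cluster centers. Because a point in a sparse cluster has a small `rho`, it can be outranked by several points from a dense cluster, which is the failure described above.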

DCF aims to improve the applicability and efficiency of the peak-finding technique. The improvements are threefold: (1) the new algorithm is applicable to large datasets; (2) the algorithm is capable of detecting clusters of varying density; (3) the algorithm can determine the correct number of clusters, even when the number of clusters is very high.

DCF achieves this by directing the peak-finding technique to discover modal sets, rather than point modes. The concept of DCF can be seen below. On the right, the DPC method, which detects point modes, selects spurious modes from the high-density cluster while ignoring the low-density cluster. By directing DCF to detect modal sets, we recover both clusters accurately.

When we apply DCF to the toy dataset above, we recover the true clusters. The shaded instances represent the modal sets for each cluster.

In the full version of the paper, available here, we present a theoretical analysis of our approach and experimental results to verify that our algorithm works well in practice. We demonstrate a potential application of our work for unsupervised face recognition.

More about this project:

  • Manuscript: Tobin J, Zhang M. (2021) DCF: An Efficient and Robust Density-Based Clustering Method. DOI 10.1109/ICDM51629.2021.00074 (To appear in ICDM 2021)
  • Software: https://github.com/tobinjo96/DCFcluster
  • You can learn more about this project by watching my talk at ICDM 2021.
