Speaker:Professor Yan-Bin Chen (Master Program in Statistics of National Taiwan University)
Topic:Understanding Ambiguity in Image Clustering via Similarity Measures
Time:June 6 (Friday), 2025, 10:40-11:30
Place: 4F-427, Assembly Building I
Abstract
In recent years, deep neural networks have driven substantial progress in data clustering. Nevertheless, clustering ambiguous images remains a significant challenge. This talk focuses on identifying heterogeneous data structures through similarity measurements. We will present a detailed analysis of consistent neighbor distributions, which shows that clean images tend to have a higher proportion of consistent neighbors. This analysis motivates refining clustering methods by accounting for the bias and variance of data features. The approach draws on three statistical notions (bias, variance, and consistent neighbors) to address the clustering problem. We then explain how these techniques are applied to manage data heterogeneity and improve the robustness and accuracy of image clustering with deep neural networks.