Improved determination of the weights in a clustering approach based on a weighted dissimilarity measure between fuzzy data
Tomasiello S.
2022
Abstract
In the literature, a popular dissimilarity measure between two fuzzy data is defined as a weighted sum of the squared Euclidean distances between their centers and between their spreads. The flexible weights introduced by D'Urso and Giordani [Computational Statistics & Data Analysis 50 (2006) 1496-1523] may be helpful in certain situations, as shown in that paper. These flexible weights are obtained by a minimization algorithm, and a larger weight is given to the center distance than to the spread distance. In the present study, we propose fixing both weights at 0.5, which can improve the performance of clustering methods in some cases. For instance, when the distance among the spreads is larger than that among the centers, i.e., the spreads play the predominant role while the centers do not play a relevant one, a clustering method that uses equal weights would outperform one based on the flexible weights, because the latter favor the center distance. Indeed, in the absence of external conditions, fixing both weights at 0.5 would be the best strategy. To investigate this claim in depth, we provide an extensive simulation study showing whether and under what circumstances the flexible weights are (or are not) helpful. Numerical experiments on both simulated and benchmark datasets demonstrate the merit of the fixed weights of 0.5 proposed in the present paper.
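As an illustration of the measure described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each fuzzy datum is represented by a vector of centers and a vector of spreads, and that the weights multiply the two squared Euclidean distances directly, as the abstract's wording suggests; the exact form used in the cited paper may differ. The function names and the example values are hypothetical.

```python
from typing import Sequence


def squared_euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def weighted_dissimilarity(
    centers_1: Sequence[float],
    spreads_1: Sequence[float],
    centers_2: Sequence[float],
    spreads_2: Sequence[float],
    w_center: float = 0.5,  # fixed, equal weights are the default
    w_spread: float = 0.5,
) -> float:
    """Weighted sum of the squared center and spread distances.

    Assumption: the weights enter linearly, as the abstract's wording
    ("weighted sum of the squared Euclidean distances") suggests.
    """
    center_part = squared_euclidean(centers_1, centers_2)
    spread_part = squared_euclidean(spreads_1, spreads_2)
    return w_center * center_part + w_spread * spread_part


# Hypothetical pair of fuzzy data whose spreads differ more than their centers.
centers_a, spreads_a = [0.0, 0.0], [1.0, 1.0]
centers_b, spreads_b = [1.0, 0.0], [3.0, 1.0]

# Equal weights (0.5, 0.5) versus illustrative weights that favor the centers.
equal = weighted_dissimilarity(centers_a, spreads_a, centers_b, spreads_b)
center_heavy = weighted_dissimilarity(
    centers_a, spreads_a, centers_b, spreads_b, w_center=0.75, w_spread=0.25
)
print(equal, center_heavy)
```

In this example the squared spread distance (4) exceeds the squared center distance (1), so weights that favor the centers shrink the dissimilarity between the two data. This is the kind of situation in which the abstract argues that equal weights are preferable.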