article

Clustering validity checking methods: part II

Authors:
Maria Halkidi, Athens University of Economics & Business
Yannis Batistakis, Athens University of Economics & Business
Michalis Vazirgiannis, Athens University of Economics & Business

ACM SIGMOD Record, Volume 31, Issue 3, September 2002, pp. 19–27. https://doi.org/10.1145/601858.601862

Published: 01 September 2002

Abstract

Validating clustering results is an important topic in pattern recognition, and we review approaches and systems that address it. The first part of this paper presented clustering validity checking approaches based on internal and external criteria. This second part reviews clustering validity approaches based on relative criteria. We also discuss the results of an experimental study based on widely known validity indices. Finally, the paper highlights issues that recent approaches under-address and proposes research directions for the field.
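
As a rough illustration of what a relative criterion means in practice, the sketch below scores two candidate partitions of the same data with the Dunn index, one widely known validity index, and keeps the higher-scoring one. The abstract does not say which indices the study uses, so the choice of index, the toy points, and names such as `dunn_index` and `candidates` are assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    min_between = float("inf")  # closest pair of points in different clusters
    max_within = 0.0            # farthest pair of points in the same cluster
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        d = dist(p, q)
        if lp == lq:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    if min_between == float("inf") or max_within == 0.0:
        raise ValueError("need two or more clusters, one with distinct points")
    return min_between / max_within


# Hypothetical toy data: two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

# Two candidate partitions of the same data (illustrative labels only).
candidates = {
    "split_left_right": [0, 0, 1, 1],
    "split_bottom_top": [0, 1, 0, 1],
}

# Relative comparison: score every candidate, keep the best one.
scores = {name: dunn_index(points, labels) for name, labels in candidates.items()}
best = max(scores, key=scores.get)  # larger value: more compact, better separated
print(scores, best)
```

In a fuller setting the candidates would typically come from running one clustering algorithm with different parameter values, for example different numbers of clusters, rather than from hand-written labels.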


Published in

ACM SIGMOD Record Volume 31, Issue 3
September 2002
84 pages
ISSN:0163-5808
DOI:10.1145/601858

Copyright © 2002 Authors
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
clustering validation
pattern discovery
unsupervised learning
Qualifiers
- article
Article Metrics
- Total Citations: 338
- Total Downloads: 2,809
- Downloads (last 12 months): 59
- Downloads (last 6 weeks): 5