Abstract
Purpose: Completely labeled datasets of pathology slides are often difficult and time-consuming to obtain. Semi-supervised learning methods can learn reliable models from a small number of labeled instances and large quantities of unlabeled data. In this paper, we explored the potential of clustering analysis for a semi-supervised support vector machine (SVM) classifier. Method: A clustering analysis method was proposed to find regions of high density before finding the decision boundary with a supervised SVM, and it was compared with another state-of-the-art semi-supervised technique. Different percentages of labeled instances were used to train supervised and semi-supervised SVM learners on an image dataset generated from 50 whole-mount images (8 patients) of breast specimens. Their cross-validated classification performance was compared using the area under the ROC curve. Result: The proposed clustering analysis for semi-supervised learning produced a reliable classification model from small amounts of labeled data. Compared with a well-known implementation of semi-supervised SVM, our method ran much faster and produced better results.
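The abstract compares classifiers by the area under the ROC curve. As a minimal sketch of that measure only (not the authors' evaluation code), the AUC can be computed as the fraction of positive/negative pairs that the classifier ranks correctly, with ties counted as half. The function name `roc_auc` and the toy labels and scores below are hypothetical, chosen purely for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking formulation.

    labels: iterable of 0/1 ground-truth classes (1 = positive).
    scores: iterable of classifier scores, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the paper).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a sort-based rank computation gives the same value and scales better on large test folds.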
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Peikari, M., Zubovits, J., Clarke, G., Martel, A.L. (2015). Clustering Analysis for Semi-supervised Learning Improves Classification Performance of Digital Pathology. In: Zhou, L., Wang, L., Wang, Q., Shi, Y. (eds) Machine Learning in Medical Imaging. MLMI 2015. Lecture Notes in Computer Science, vol 9352. Springer, Cham. https://doi.org/10.1007/978-3-319-24888-2_32
DOI: https://doi.org/10.1007/978-3-319-24888-2_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24887-5
Online ISBN: 978-3-319-24888-2
eBook Packages: Computer Science, Computer Science (R0)