Abstract
Clustering algorithms are used to find communities of nodes that belong to the same group. In image processing, this grouping process is known as image segmentation. The clustering problem is also closely connected to machine learning, because a solution to the clustering problem may be used to propagate labels from observed data to unobserved data. In general network analysis, identifying a grouping allows the nodes within each group to be analyzed as separate entities. In this chapter, we use the tools of discrete calculus to examine both the targeted clustering problem (i.e., finding a specific group) and the untargeted clustering problem (i.e., discovering all groups). We additionally show how to apply these clustering models to higher-order cells, e.g., to cluster edges.
Notes
1. In the rest of this chapter we treated the data as univariate in order to simplify the exposition, with the understanding that all of the machinery could also be applied to multivariate data. However, since k-means is almost exclusively applied to multivariate data, we have adopted a multivariate view of data in this section. It is therefore assumed that each node (data point) is associated with a tuple of data rather than a scalar.
2. Some authors have tried to incorporate spatial location into k-means by including the pixel coordinates in the feature vector to which k-means is applied. This device can mitigate the problem described here in certain circumstances, but it does not generalize to applications in which the network has no embedding or the embedding is complicated, as in the gene expression example in Sect. 6.5.4 or the geospatial example in Sect. 5.9.4.
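The device described in note 2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the chapter's implementation: the function names, the `spatial_weight` parameter, and the deterministic farthest-first initialization are all introduced here for the example, and the data is a toy two-region image. Each pixel's feature tuple is its intensity followed by its weighted row and column coordinates, and ordinary Lloyd iterations are then run on those tuples.

```python
import numpy as np

def pixel_features(image, spatial_weight):
    """Build one feature tuple per pixel: (intensity, w*row, w*col)."""
    rows, cols = np.indices(image.shape)
    return np.column_stack([
        image.ravel().astype(float),
        spatial_weight * rows.ravel(),
        spatial_weight * cols.ravel(),
    ])

def farthest_first_centers(features, k):
    """Deterministic initialization (an assumption of this sketch):
    start at the first point, then repeatedly add the point that is
    farthest from the centers chosen so far."""
    centers = [features[0]]
    d2 = ((features - centers[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        centers.append(features[d2.argmax()])
        d2 = np.minimum(d2, ((features - centers[-1]) ** 2).sum(axis=1))
    return np.array(centers, dtype=float)

def kmeans(features, k, n_iter=50):
    """Lloyd iterations on an (n, d) array; returns (labels, centers)."""
    centers = farthest_first_centers(features, k)
    for _ in range(n_iter):
        # Assign every point to its nearest center (squared Euclidean distance).
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        new_centers = centers.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy image: dark left half, bright right half.
image = np.zeros((4, 8))
image[:, 4:] = 1.0
features = pixel_features(image, spatial_weight=0.1)
labels, centers = kmeans(features, k=2)
segmentation = labels.reshape(image.shape)
```

Setting `spatial_weight` to zero recovers intensity-only k-means, while larger values favor spatially compact clusters. The sketch relies on every node having usable coordinates, which is exactly the limitation the note points out for networks with no embedding.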
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Grady, L.J., Polimeni, J.R. (2010). Clustering and Segmentation. In: Discrete Calculus. Springer, London. https://doi.org/10.1007/978-1-84996-290-2_6
DOI: https://doi.org/10.1007/978-1-84996-290-2_6
Publisher Name: Springer, London
Print ISBN: 978-1-84996-289-6
Online ISBN: 978-1-84996-290-2
eBook Packages: Computer Science (R0)