
Maximum Matching Sans Maximal Matching: A New Approach for Finding Maximum Matchings in the Data Stream Model

Algorithmica

Abstract

The problem of finding a maximum size matching in a graph (known as the maximum matching problem) is one of the most classical problems in computer science. Despite a significant body of work dedicated to the study of this problem in the data stream model, the state-of-the-art single-pass semi-streaming algorithm for it is still a simple greedy algorithm that computes a maximal matching, and thereby obtains a \({1}/{2}\)-approximation. Some previous works described two- and three-pass algorithms that improve over this approximation ratio by using their second and third passes to improve the above-mentioned maximal matching. One contribution of this paper continues this line of work by presenting new three-pass semi-streaming algorithms that work along these lines and obtain improved approximation ratios of 0.6111 and 0.5694 for triangle-free and general graphs, respectively. Unfortunately, a recent work of Konrad and Naidu (Approximation, randomization, and combinatorial optimization. Algorithms and techniques, APPROX/RANDOM 2021, August 16–18, 2021. LIPIcs, vol 207, pp 19:1–19:18, 2021. https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.19) shows that the strategy of constructing a maximal matching in the first pass and then improving it in further passes has limitations. Additionally, this technique is unlikely to get us closer to single-pass semi-streaming algorithms that obtain a better than \({1}/{2}\)-approximation. Therefore, it is interesting to come up with algorithms that do something else with their first pass (we term such algorithms non-maximal-matching-first algorithms). No such algorithms were previously known, and the main contribution of this paper is a description of such algorithms that obtain approximation ratios of 0.5384 and 0.5555 in two and three passes, respectively, for general graphs. The main significance of our results is not in the numerical improvements, but in demonstrating the potential of non-maximal-matching-first algorithms.


Notes

  1. A maximal matching is a matching that is inclusion-wise maximal. It is well known that the size of any maximal matching is a \({1}/{2}\)-approximation for the size of a maximum matching.

  2. There is also an impossibility result for general two-pass semi-streaming algorithms due to [13], but it is much weaker.

  3. We recall that every bipartite graph is triangle-free, and therefore, the same result is obtained also for bipartite graphs.

  4. Intuitively, the charge assigned to the connected components of u and v is proportional to the “blame” that can be assigned to them if \((u, v)\) ends up outside P. For example, an isolated edge could not alone prevent \((u, v)\) from being added to P, but two such edges (one intersecting u and the other intersecting v) could, together, prevent \((u, v)\) from being added to P. Therefore, we assign a charge of \({1}/{2}\) to isolated edges. Observation 3.3 is based on this intuition.

  5. One of the two \(M^*\) edges intersecting \(C_u\) is guaranteed to be an edge of the triangle itself since the triangle is not counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\). Since such edges cannot be counted by \({\text {(}\#\text {component-free)}}\), we get that the edge \((u, v)\) can exclude at most 2 edges of \({\text {(}\#\text {component-free)}}\) rather than 3. However, we ignore this observation as it does not lead to a better approximation guarantee.

  6. A path P is an augmenting path for a matching M if \(M \oplus E(P)\) is a valid matching of size \(|M| + 1\).
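Notes 1 and 6 can be illustrated with a minimal Python sketch. It assumes the stream is an iterable of vertex pairs; the function names `greedy_maximal_matching` and `augment` and the four-vertex example are hypothetical and do not appear in the paper.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_maximal_matching(stream: Iterable[Edge]) -> List[Edge]:
    """One pass over the stream: keep an edge iff both endpoints are still free."""
    matched: Set[Hashable] = set()
    matching: List[Edge] = []
    for u, v in stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def augment(matching: Iterable[Edge], path_edges: Iterable[Edge]) -> Set[frozenset]:
    """Return the symmetric difference M (+) E(P) as a set of undirected edges."""
    m = {frozenset(e) for e in matching}
    p = {frozenset(e) for e in path_edges}
    return m ^ p


# Path a-b-c-d whose middle edge arrives first (illustrative input).
# Greedy keeps only (b, c); the maximum matching {(a, b), (c, d)} has size 2,
# so the 1/2 ratio of Note 1 is attained on this input.
path_stream = [("b", "c"), ("a", "b"), ("c", "d")]
greedy = greedy_maximal_matching(path_stream)

# The whole path is an augmenting path for the greedy matching (Note 6).
augmented = augment(greedy, path_stream)
```

On this input the greedy matching has one edge, and augmenting along the path yields a matching with two edges.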

References

  1. Konrad, C., Naidu, K.K.: On two-pass streaming algorithms for maximum bipartite matching. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2021, August 16–18, 2021. LIPIcs, vol. 207 (2021), pp. 19:1–19:18. https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.19

  2. Balinski, M.L., Gonzalez, J.: Maximum matchings in bipartite graphs via strong spanning trees. Networks 21(2), 165–179 (1991). https://doi.org/10.1002/net.3230210203

  3. Edmonds, J.: Maximum matching and a polyhedron with 0, 1-vertices. J. Res. Natl. Bur. Stand. B 69, 125–130 (1965)

  4. Hopcroft, J.E., Karp, R.M.: An \(n^{5/2}\) algorithm for maximum matchings in bipartite graphs. SIAM J. Comput. 2(4), 225–231 (1973). https://doi.org/10.1137/0202019

  5. Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: On graph problems in a semi-streaming model. Theor. Comput. Sci. 348(2–3), 207–216 (2005). https://doi.org/10.1016/j.tcs.2005.09.013

  6. Kapralov, M.: Space lower bounds for approximating maximum matching in the edge arrival model. In: Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, January 10–13, 2021 (2021), pp. 1874–1893. https://doi.org/10.1137/1.9781611976465.112

  7. Goel, A., Kapralov, M., Khanna, S.: On the communication and streaming complexity of maximum bipartite matching. In: Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2012), pp. 468–485. https://doi.org/10.1137/1.9781611973099.41

  8. Kapralov, M.: Better bounds for matchings in the streaming model. In: Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013), pp. 1679–1697. https://doi.org/10.1137/1.9781611973105.121

  9. Konrad, C.: A simple augmentation method for matchings with applications to streaming algorithms. In: 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS). LIPIcs, vol. 117 (2018), pp. 74:1–74:16. https://doi.org/10.4230/LIPIcs.MFCS.2018.74

  10. Kale, S., Tirodkar, S.: Maximum matching in two, three, and a few more passes over graph streams. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2017, August 16-18, 2017, Berkeley, CA, USA. LIPIcs, vol. 81 (2017), pp. 15:1–15:21. https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2017.15

  11. Konrad, C., Magniez, F., Mathieu, C.: Maximum matching in semi-streaming with few passes. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques—15th International Workshop, APPROX 2012, and 16th International Workshop, RANDOM 2012, Cambridge, MA, USA, August 15–17, 2012. Proceedings. Lecture Notes in Computer Science, vol. 7408, pp. 231–242 (2012). https://doi.org/10.1007/978-3-642-32512-0_20

  12. Esfandiari, H., Hajiaghayi, M., Monemizadeh, M.: Finding large matchings in semi-streaming. In: Domeniconi, C., Gullo, F., Bonchi, F., Domingo-Ferrer, J., Baeza-Yates, R., Zhou, Z., Wu, X. (eds.) IEEE International Conference on Data Mining Workshops, ICDM Workshops 2016, December 12–15, 2016, Barcelona, Spain, pp. 608–614. IEEE Computer Society (2016). https://doi.org/10.1109/ICDMW.2016.0092

  13. Assadi, S.: A two-pass (conditional) lower bound for semi-streaming maximum matching. In: Naor, J.S., Buchbinder, N. (eds.) ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 708–742. SIAM (2022). https://doi.org/10.1137/1.9781611977073.32

  14. Azarmehr, A., Behnezhad, S., Roghani, M.: Fully dynamic matching: \((2-\sqrt{2})\)-approximation in polylog update time. CoRR abs/2307.08772 (2023). https://doi.org/10.48550/arXiv.2307.08772

  15. Kapralov, M., Khanna, S., Sudan, M.: Approximating matching size from random streams. In: Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2014), pp. 734–751. https://doi.org/10.1137/1.9781611973402.55

  16. Cormode, G., Jowhari, H., Monemizadeh, M., Muthukrishnan, S.: The sparse awakens: streaming algorithms for matching size estimation in sparse graphs. In: 25th Annual European Symposium on Algorithms, ESA 2017, September 4–6, 2017, Vienna, Austria. LIPIcs, vol. 87, pp. 29:1–29:15 (2017). https://doi.org/10.4230/LIPIcs.ESA.2017.29

  17. Esfandiari, H., Hajiaghayi, M., Liaghat, V., Monemizadeh, M., Onak, K.: Streaming algorithms for estimating the matching size in planar graphs and beyond. ACM Trans. Algorithms 14(4), 48:1-48:23 (2018). https://doi.org/10.1145/3230819

  18. McGregor, A., Vorotnikova, S.: Planar matching in streams revisited. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pp. 17:1–17:12 (2016). https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2016.17

  19. McGregor, A., Vorotnikova, S.: A simple, space-efficient, streaming algorithm for matchings in low arboricity graphs. In: 1st Symposium on Simplicity in Algorithms (SOSA). OASICS, vol. 61, pp. 14:1–14:4 (2018). https://doi.org/10.4230/OASIcs.SOSA.2018.14

  20. Assadi, S., Khanna, S., Li, Y.: On estimating maximum matching size in graph streams. In: Klein, P.N. (ed) Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16–19, pp. 1723–1742. SIAM (2017). https://doi.org/10.1137/1.9781611974782.113

  21. Assadi, S., Kol, G., Saxena, R.R., Yu, H.: Multi-pass graph streaming lower bounds for cycle counting, max-cut, matching size, and other problems. In: Irani, S. (ed) 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16–19, 2020, pp. 354–364. IEEE (2020). https://doi.org/10.1109/FOCS46700.2020.00041

  22. Assadi, S., N, V.: Graph streaming lower bounds for parameter estimation and property testing via a streaming XOR lemma. In: Khuller, S., Williams, V.V. (eds.) STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, June 21–25, 2021, pp. 612–625. ACM (2021). https://doi.org/10.1145/3406325.3451110

  23. Chitnis, R., Cormode, G., Esfandiari, H., Hajiaghayi, M., McGregor, A., Monemizadeh, M., Vorotnikova, S.: Kernelization via sampling with applications to finding matchings and related problems in dynamic graph streams. In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2016), pp. 1326–1344. https://doi.org/10.1137/1.9781611974331.ch92

  24. Assadi, S., Behnezhad, S., Khanna, S., Li, H.: On regularity lemma and barriers in streaming and dynamic matching. In: Saha, B., Servedio, R.A. (eds.) 55th Annual ACM Symposium on Theory of Computing (STOC), pp. 131–144. ACM (2023). https://doi.org/10.1145/3564246.3585110

  25. McGregor, A.: Finding graph matchings in data streams. In: Approximation, Randomization and Combinatorial Optimization, Algorithms and Techniques, 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX) and 9th International Workshop on Randomization and Computation (RANDOM) (2005), pp. 170–181. https://doi.org/10.1007/11538462_15

  26. Ahn, K.J., Guha, S.: Access to data and number of iterations: Dual primal algorithms for maximum matching under resource constraints. ACM Trans. Parallel Comput. 4(4), 17:1-17:40 (2018). https://doi.org/10.1145/3154855

  27. Assadi, S., Jambulapati, A., Jin, Y., Sidford, A., Tian, K.: Semi-streaming bipartite matching in fewer passes and optimal space. arXiv e-prints pp. arXiv–2011 (2020)

  28. Assadi, S., Liu, S.C., Tarjan, R.E.: An auction algorithm for bipartite matching in streaming and massively parallel computation models. In: 4th Symposium on Simplicity in Algorithms (SOSA), pp. 165–171 (2021). https://doi.org/10.1137/1.9781611976496.18

  29. Fischer, M., Mitrovic, S., Uitto, J.: Deterministic (1+\(\epsilon \))-approximate maximum matching with poly(1/\(\epsilon \)) passes in the semi-streaming model and beyond. In: Leonardi, S., Gupta, A. (eds.) STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20–24, 2022, pp. 248–260. ACM (2022). https://doi.org/10.1145/3519935.3520039

  30. Huang, S., Su, H.: \((1-\varepsilon )\)-approximate maximum weighted matching in poly\((1/\varepsilon , \log n)\) time in the distributed and parallel settings. In: Oshman, R., Nolin, A., Halldórsson, M.M., Balliu, A. (eds.), ACM Symposium on Principles of Distributed Computing (PODC), pp. 44–54. ACM (2023). https://doi.org/10.1145/3583668.3594570

  31. Assadi, S.: A simple \((1-\epsilon )\)-approximation semi-streaming algorithm for maximum (weighted) matching. CoRR abs/2307.02968 (2023). https://doi.org/10.48550/arXiv.2307.02968

  32. Assadi, S., Behnezhad, S.: Beating two-thirds for random-order streaming matching. In: 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021, July 12–16, 2021. LIPIcs, vol. 198, pp. 19:1–19:13 (2021). https://doi.org/10.4230/LIPIcs.ICALP.2021.19

  33. Bernstein, A.: Improved bounds for matching in random-order streams. In: Czumaj, A., Dawar, A., Merelli, E. (eds.) 47th International Colloquium on Automata, Languages, and Programming, ICALP 2020, July 8–11, 2020, Saarbrücken, Germany. LIPIcs, vol. 168, pp. 12:1–12:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020). https://doi.org/10.4230/LIPIcs.ICALP.2020.12

  34. Crouch, M.S., Stubbs, D.M.: Improved streaming algorithms for weighted matching, via unweighted matching. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM). LIPIcs, vol. 28, pp. 96–104 (2014). https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.96

  35. Epstein, L., Levin, A., Mestre, J., Segev, D.: Improved approximation guarantees for weighted matching in the semi-streaming model. SIAM J. Discrete Math. 25(3), 1251–1265 (2011). https://doi.org/10.1137/100801901

  36. Zelke, M.: Weighted matching in the semi-streaming model. Algorithmica 62(1–2), 1–20 (2012). https://doi.org/10.1007/s00453-010-9438-5

  37. Paz, A., Schwartzman, G.: A (\(2 + \epsilon \))-approximation for maximum weight matching in the semi-streaming model. ACM Trans. Algorithms 15(2), 18:1-18:15 (2019). https://doi.org/10.1145/3274668

  38. Bernstein, A., Dudeja, A., Langley, Z.: A framework for dynamic matching in weighted graphs. In: Khuller, S., Williams, V.V. (eds.) STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, June 21–25, 2021, pp. 668–681. ACM (2021). https://doi.org/10.1145/3406325.3451113

Author information

Contributions

All authors contributed equally.

Corresponding author

Correspondence to Moran Feldman.

Ethics declarations

Conflict of interest

The research leading to these results received funding from Israel Science Foundation (ISF) grants no. 1357/16 and 459/20. The authors have no other relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

An extended abstract version of this work appeared in APPROX 2022.

Appendices

Appendix A Three-Pass Non-MMF Algorithm

In this section we prove Theorem 1.2, which we repeat below for convenience.

Theorem 1.2

There exists a non-MMF 3-pass (\({5}/{9} = {1}/{2} + {1}/{18}\))-approximation semi-streaming algorithm for finding a maximum size matching in a general graph.

The algorithm used for proving Theorem 1.2 is a modified version of Algorithm 1 that appears as Algorithm 8. It obtains an improved approximation ratio at the cost of making an additional pass (i.e., it makes 3 passes). The first pass of Algorithm 8 is identical to the first pass of Algorithm 1; however, the second and third passes of Algorithm 8 each consider only one of the two kinds of edges that are considered together in the second pass of Algorithm 1.

To describe these passes in more detail, we use the terminology defined in Sect. 3 for describing Algorithm 1. In the second pass of Algorithm 8, we construct a set \(A_1\) in the same way in which this is done by Algorithm 1, i.e., by greedily adding to \(A_1\) edges that connect a connection vertex of a naïve partial triangle with an isolated vertex. Then, in the third pass of Algorithm 8, we greedily collect into another set, termed \(A_2\), edges that connect connection vertices of two distinct naïve partial triangles. We stress that the construction of \(A_2\) by Algorithm 8 is slightly different from the construction of the set carrying the same name in Algorithm 1. Upon termination of its third pass, Algorithm 8 outputs a maximum matching in the set of all the edges that it kept.

Algorithm 8: Maximum Matching via Greedy Triangles - 3 passes
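The second and third passes described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the paper's pseudocode: the first pass is assumed to have already produced the map `component_of` (connection vertex to the identifier of its partial triangle in \((V, P)\)) and the set `isolated` (isolated vertices of \((V, P)\)); a partial triangle is treated as naïve while no kept edge of \(A_1\) or \(A_2\) intersects it; and each isolated vertex is assumed to be used by at most one edge of \(A_1\). All names are hypothetical, and the final maximum matching computation over the kept edges is omitted.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def second_pass(stream: Iterable[Edge],
                component_of: Dict[Vertex, int],
                isolated: Set[Vertex]) -> Tuple[List[Edge], Set[int]]:
    """Greedily collect A_1: connection vertex of a naive partial triangle
    joined to a still-unused isolated vertex (assumed rule)."""
    a1: List[Edge] = []
    touched: Set[int] = set()          # partial triangles that stopped being naive
    used_isolated: Set[Vertex] = set()
    for u, v in stream:
        # Try both orientations: x is the connection vertex, y the isolated one.
        for x, y in ((u, v), (v, u)):
            comp = component_of.get(x)
            if (comp is not None and comp not in touched
                    and y in isolated and y not in used_isolated):
                a1.append((x, y))
                touched.add(comp)
                used_isolated.add(y)
                break
    return a1, touched


def third_pass(stream: Iterable[Edge],
               component_of: Dict[Vertex, int],
               non_naive: Set[int]) -> List[Edge]:
    """Greedily collect A_2: edges between connection vertices of two distinct
    partial triangles that are both still naive after the second pass."""
    a2: List[Edge] = []
    touched = set(non_naive)
    for u, v in stream:
        cu, cv = component_of.get(u), component_of.get(v)
        if cu is None or cv is None or cu == cv:
            continue
        if cu not in touched and cv not in touched:
            a2.append((u, v))
            touched.update((cu, cv))
    return a2


# Illustrative input: three paths of length 2 (a-b-c, d-e-f, g-h-i), whose end
# vertices are taken to be the connection vertices, and one isolated vertex x.
component_of = {"a": 0, "c": 0, "d": 1, "f": 1, "g": 2, "i": 2}
isolated = {"x"}
stream = [("a", "x"), ("c", "d"), ("f", "g"), ("x", "i")]

a1, non_naive = second_pass(stream, component_of, isolated)
a2 = third_pass(stream, component_of, non_naive)
```

In this example the edge (c, d) is not added to \(A_2\) because the partial triangle a-b-c is already intersected by an edge of \(A_1\); this is the kind of "lost potential" that the analysis below accounts for.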

The proof of Observation 3.1 applies to Algorithm 8 as well, and therefore, Algorithm 8 is a semi-streaming algorithm. Below we concentrate on analyzing the approximation guarantee of this algorithm. It is important to note that the analysis of the approximation ratio of Algorithm 1 up to Lemma 3.5 only depends on the behavior of the algorithm during its first pass, and therefore, applies also to Algorithm 8 since the two algorithms have identical first passes.

In principle, the proof of Lemma 3.6 applies also to Algorithm 8 since this proof is based on the method used by Algorithm 1 to construct the set \(A_1\), and this set is constructed in the same way by the two algorithms. However, it turns out that in this section we need a slightly stronger version of Lemma 3.6. Specifically, Lemma 3.6 includes the value \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\) in one of its terms. This value counts the number of connected components in \((V, P)\) that are triangles and do not include as one of their edges any edge of \(M^*\). Each such connected component is intersected by at most a single edge of \(A_1\) or \(A_2\), and in this section we need to count separately the connected components of this kind that intersect edges from each one of these sets.

Formally, we let \({\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}\) be the number of connected components of \((V, P)\) that (1) are triangles, (2) do not include any edge of \(M^*\) as one of their three edges, and (3) intersect an edge of \(A_1\). Similarly, \({\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\) is the number of connected components of \((V, P)\) that (1) are triangles, (2) do not include any edge of \(M^*\), and (3) intersect an edge of \(A_2\). Since every partial triangle in \((V, P)\) intersects at most a single edge of \(A_1 \cup A_2\), and the sets \(A_1\) and \(A_2\) are disjoint, we immediately get from these definitions \({\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}+ {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\le {\text {(}\#\text {non-}{M^*}\text {-triangles)}}\). Furthermore, it is not difficult to verify that the proof of Lemma 3.6 in fact implies the following stronger version of the lemma.

Lemma A.1

(Stronger version of Lemma 3.6) It holds that

$$\begin{aligned} 3|A_1| \ge {\text {(}\#\text {component-free)}}- {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}\hspace{5.0pt}. \end{aligned}$$

Lemma A.1 lower bounds the size of the set \(A_1\). Our next objective is to find a lower bound also for the size of \(A_2\). As a first step towards this goal, we upper bound the number of edges that have a potential to be added to \(A_2\) immediately after the first pass of Algorithm 8, but lose this potential during the second pass of the algorithm. To formalize this notion, let us recall that \({\text {(}\#\text {component-component)}}\) is the number of edges of \(M^*\) that connect connection vertices of two distinct partial triangles of \((V, P)\) (see Fig. 2i). Intuitively, \({\text {(}\#\text {component-component)}}\) counts edges that have a potential to be added to \(A_2\); however, for such an edge to really end up in \(A_2\), it is required that the two partial triangles it intersects remain naïve after the second pass. Therefore, the size of the “lost potential” is the number of edges that are counted by \({\text {(}\#\text {component-component)}}\), but intersect at least one partial triangle of \((V, P)\) that is also intersected by an edge of \(A_1\). In the following, we denote this number by \({\text {(}\#\text {lost-component-component)}}\).

Lemma A.2

It holds that

$$\begin{aligned} {\text {(}\#\text {lost-component-component)}}&\le 3|A_1| - {\text {(}\#\text {component-free)}}\\&\quad + {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}\hspace{5.0pt}. \end{aligned}$$

Proof

The proof of this lemma is similar to the proof of Lemma 3.6; however, we give it in full for completeness.

We say that an edge e of \(M^*\) that is counted by either \({\text {(}\#\text {component-free)}}\) or \({\text {(}\#\text {component-component)}}\) is excluded by an edge \(f \in A_1\) if e and f intersect the same connected component of \((V, P)\). One can observe that every edge e counted by \({\text {(}\#\text {component-free)}}\) is excluded by some edge of \(A_1\) (possibly itself) when Algorithm 8 terminates because otherwise Algorithm 8 would have added e to \(A_1\), which would have resulted in e excluding itself. Therefore, the number of edges counted by \({\text {(}\#\text {component-component)}}\) that are excluded by some edge of \(A_1\), which is exactly \({\text {(}\#\text {lost-component-component)}}\), can be upper bounded by the difference \(|J| - {\text {(}\#\text {component-free)}}\), where J is the set of edges counted by either \({\text {(}\#\text {component-free)}}\) or \({\text {(}\#\text {component-component)}}\) that are excluded by the edges of \(A_1\). In other words,

$$\begin{aligned} {\text {(}\#\text {lost-component-component)}}\le |J| - {\text {(}\#\text {component-free)}}\hspace{5.0pt}. \end{aligned} \tag{A1}$$

Let \((u, v)\) be an edge of \(A_1\), and assume without loss of generality that v is the endpoint of this edge which is an isolated vertex of \((V, P)\). This implies that u is a connection vertex of a connected component \(C_u\) of \((V, P)\) which is either a path of length 2 or a triangle. If \(C_u\) is a path of length 2, then the edge \((u, v)\) can exclude only edges counted by either \({\text {(}\#\text {component-free)}}\) or \({\text {(}\#\text {component-component)}}\) that intersect either v or a connection vertex of \(C_u\), and there can be only 3 such edges because \(M^*\) is a matching (see Fig. 3a). Next, consider the case in which \(C_u\) is a triangle which is not counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\). In this case there can be at most 2 edges of \(M^*\) intersecting \(C_u\), and therefore, even though \((u, v)\) can exclude any edge of \({\text {(}\#\text {component-free)}}\) or \({\text {(}\#\text {component-component)}}\) intersecting \(C_u\) or v, there can be only 3 such edges (see Fig. 3b). It remains to consider the case in which \(C_u\) is a triangle counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\). In this case, \((u, v)\) can again exclude every edge of \({\text {(}\#\text {component-free)}}\) or \({\text {(}\#\text {component-component)}}\) that intersects \(C_u\) or v, and this time there can be at most 4 such edges (see Fig. 3c). Combining all the above, we get that the number |J| of edges excluded by all the edges of \(A_1\) is at most

$$\begin{aligned}&3|A_1| + |\{e \in A_1 \mid e \,\text {intersects a triangle counted by}\, {\text {(}\#\text {non-}{M^*}\text {-triangles)}}\}|\\&\quad = 3|A_1| + {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}\hspace{5.0pt}, \end{aligned}$$

where the equality holds because a triangle counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\) is counted also by \({\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}\) if and only if some edge of \(A_1\) intersects it. Plugging the last upper bound on |J| into Inequality (A1) completes the proof of the lemma. \(\square \)

We can now prove the promised lower bound on the size of \(A_2\).

Lemma A.3

It holds that

$$\begin{aligned} 4|A_2|&\ge {\text {(}\#\text {component-component)}}- {\text {(}\#\text {lost-component-component)}}\\&\quad - {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\hspace{5.0pt}. \end{aligned}$$

Proof

Recall that \({\text {(}\#\text {lost-component-component)}}\) counts a subset of the edges that are counted by \({\text {(}\#\text {component-component)}}\). Let D be the set of edges (of \(M^*\)) counted by \({\text {(}\#\text {component-component)}}\) but not by \({\text {(}\#\text {lost-component-component)}}\). We say that an edge \(e \in D\) is excluded by an edge \(f \in A_2\) if e and f intersect the same connected component of \((V, P)\). One can observe that every edge \(e \in D\) is excluded by some edge of \(A_2\) (possibly itself) when Algorithm 8 terminates because otherwise Algorithm 8 would have added e to \(A_2\), which would have resulted in e excluding itself. Therefore, we can upper bound the size of D by counting the number of edges excluded by the edges of \(A_2\).

Let \((u, v)\) be an edge of \(A_2\), and let \(C_u\) and \(C_v\) be the connected components of \((V, P)\) that include u and v respectively. Notice that since \((u, v) \in A_2\), both \(C_u\) and \(C_v\) must be either paths of length 2 or triangles. The edge \((u, v)\) excludes every edge of D that intersects either \(C_u\) or \(C_v\). The number of \(D \subseteq M^*\) edges that intersect \(C_u\) can be at most 2, unless \(C_u\) is a triangle counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\), in which case there might be 3 edges of D intersecting \(C_u\). Since a similar claim applies to \(C_v\), we get that the number of edges excluded by all the edges of \(A_2\) is at most

$$\begin{aligned} 4|A_2| + \sum _{e \in A_2} T(e) = 4|A_2| + {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\hspace{5.0pt}, \end{aligned}$$

where T(e) is the number of triangles counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\) that intersect e, and the equality holds since a triangle is counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\) if and only if it is both counted by \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\) and intersects an edge of \(A_2\). As explained above, the last expression is an upper bound on the size of D. Therefore, we get

$$\begin{aligned}&{\text {(}\#\text {component-component)}}- {\text {(}\#\text {lost-component-component)}}= |D|\\&\quad \le 4|A_2| + {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\hspace{5.0pt}. \end{aligned}$$

The lemma now follows by rearranging this inequality. \(\square \)

Corollary A.4

We have

$$\begin{aligned} 12|A_1| + 12|A_2|&\ge 4{\text {(}\#\text {component-free)}}+ 3{\text {(}\#\text {component-component)}}\\&\quad - 4{\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}- 3{\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\hspace{5.0pt}. \end{aligned}$$

Proof

Plugging Lemma A.2 into Lemma A.3, we get

$$\begin{aligned} 4|A_2|&\ge {\text {(}\#\text {component-component)}}- {\text {(}\#\text {lost-component-component)}}\\&\quad - {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\\&\ge {\text {(}\#\text {component-component)}}- (3|A_1| - {\text {(}\#\text {component-free)}}\\&\quad + {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}) - {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\hspace{5.0pt}. \end{aligned}$$

Rearranging the last inequality, and multiplying it by 3, yields

$$\begin{aligned} 9|A_1| + 12|A_2|&\ge 3{\text {(}\#\text {component-component)}}+ 3{\text {(}\#\text {component-free)}}\\&\quad - 3{\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}- 3{\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\hspace{5.0pt}. \end{aligned}$$

The corollary now follows by adding Lemma A.1 to the last inequality. \(\square \)
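The bookkeeping in this proof can be double-checked mechanically. The Python sketch below is ours and not part of the paper: each inequality "left side ≥ right side" is encoded as the coefficients of "left side minus right side", which must be non-negative, and the code verifies that Lemma A.1 plus three times the sum of Lemmata A.2 and A.3 gives exactly Corollary A.4. The short term names (`cf`, `cc`, `lost`, `t1`, `t2`) are hypothetical abbreviations of the quantities used above.

```python
from typing import Dict, Tuple

Ineq = Dict[str, int]  # coefficients of an expression that is >= 0

# Lemma A.1:  3|A_1| - cf + t1 >= 0
lemma_a1: Ineq = {"A1": 3, "cf": -1, "t1": 1}
# Lemma A.2:  3|A_1| - cf + t1 - lost >= 0
lemma_a2: Ineq = {"A1": 3, "cf": -1, "t1": 1, "lost": -1}
# Lemma A.3:  4|A_2| - cc + lost + t2 >= 0
lemma_a3: Ineq = {"A2": 4, "cc": -1, "lost": 1, "t2": 1}
# Corollary A.4:  12|A_1| + 12|A_2| - 4cf - 3cc + 4t1 + 3t2 >= 0
corollary_a4: Ineq = {"A1": 12, "A2": 12, "cf": -4, "cc": -3, "t1": 4, "t2": 3}


def combine(*weighted: Tuple[int, Ineq]) -> Ineq:
    """Non-negative integer combination of inequalities (zero terms dropped)."""
    total: Ineq = {}
    for weight, ineq in weighted:
        assert weight >= 0, "only non-negative weights preserve '>= 0'"
        for term, coeff in ineq.items():
            total[term] = total.get(term, 0) + weight * coeff
    return {term: coeff for term, coeff in total.items() if coeff != 0}


derived = combine((1, lemma_a1), (3, lemma_a2), (3, lemma_a3))
```

The `lost` term cancels in the combination, which is why \({\text {(}\#\text {lost-component-component)}}\) does not appear in the corollary.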

Let us now define \(L_2 = {\text {(}\#\text {single)}}+ {\text {(}\#\text {double)}}+ {\text {(}\#\text {triangle)}}+ |A_1| + |A_2|\). The following lemma shows that one can obtain an approximation guarantee for Algorithm 8 by lower bounding \(L_2\). Since the proof of this lemma is very similar to the proof of Lemma 3.11, we omit it.

Lemma A.5

Algorithm 8 outputs a matching of size at least \(L_2\).

It remains now to lower bound \(L_2\), which we do in the next lemma. Together with Lemma A.5 and the above observation that Algorithm 8 is a semi-streaming algorithm, this lemma completes the proof of Theorem 1.2.

Lemma A.6

\(L_2 \ge {5}/{9} |M^*|\).

Proof

Observe that

$$\begin{aligned} 12L_2&= 12{\text {(}\#\text {single)}}+ 12{\text {(}\#\text {double)}}+ 12{\text {(}\#\text {triangle)}}+ 12|A_1| + 12|A_2|\\&\ge 12{\text {(}\#\text {single)}}+ 12{\text {(}\#\text {double)}}+ 12{\text {(}\#\text {triangle)}}+ 4{\text {(}\#\text {component-free)}}\\&\quad + 3{\text {(}\#\text {component-component)}}- 4{\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}\\&\quad - 3{\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\\&\ge 12{\text {(}\#\text {single)}}+ 12{\text {(}\#\text {double)}}+ 12{\text {(}\#\text {triangle)}}+ 4{\text {(}\#\text {component-free)}}\\&\quad + 3{\text {(}\#\text {component-component)}}- 4{\text {(}\#\text {non-}{M^*}\text {-triangles)}}\hspace{5.0pt}, \end{aligned}$$

where the first inequality follows from Corollary A.4; and the second inequality holds since we already observed that \({\text {(}\#\text {non-}{M^*}\text {-triangles)}}\ge {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_1}\text {)}}+ {\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\), and the value \({\text {(}\#\text {non-}{M^*}\text {-triangles-}{A_2}\text {)}}\) is non-negative by definition.

To further develop the last inequality, we recall that the analysis from Sect. 3 up until, and including, Lemma 3.4 applies to Algorithm 8 as well. Therefore,

$$\begin{aligned} 12L_2&\ge 12{\text {(}\#\text {single)}}+ 12{\text {(}\#\text {double)}}+ 12{\text {(}\#\text {triangle)}}+ 4{\text {(}\#\text {component-free)}}\\&\quad + 3{\text {(}\#\text {component-component)}}- 4{\text {(}\#\text {non-}{M^*}\text {-triangles)}}\\&\ge \tfrac{28}{3}{\text {(}\#\text {single)}}+ 4{\text {(}\#\text {double)}}+ 4{\text {(}\#\text {triangle)}}+ \tfrac{20}{3}{\text {(}\#\text {component-free)}}\\&\quad + \tfrac{25}{3}{\text {(}\#\text {component-component)}}+ \tfrac{8}{3}{\text {(}\#\text {single-single)}}\\&\quad + 4{\text {(}\#\text {single-component)}}+ \tfrac{8}{3}{\text {(}\#\text {middle)}}- 4{\text {(}\#\text {non-}{M^*}\text {-triangles)}}\\&\ge \tfrac{28}{3}{\text {(}\#\text {single)}}+ \tfrac{20}{3}{\text {(}\#\text {component-free)}}+ \tfrac{25}{3}{\text {(}\#\text {component-component)}}\\&\quad + \tfrac{8}{3}{\text {(}\#\text {single-single)}}+ 4{\text {(}\#\text {single-component)}}+ \tfrac{20}{3}{\text {(}\#\text {middle)}}\\&\ge \tfrac{20}{3}{\text {(}\#\text {component-free)}}+ \tfrac{25}{3}{\text {(}\#\text {component-component)}}+ 12{\text {(}\#\text {single-single)}}\\&\quad + \tfrac{26}{3}{\text {(}\#\text {single-component)}}+ \tfrac{20}{3}{\text {(}\#\text {middle)}}\\&\ge \tfrac{20}{3}|M^*| + \tfrac{5}{3}{\text {(}\#\text {component-component)}}+ \tfrac{16}{3}{\text {(}\#\text {single-single)}}\\&\quad + 2{\text {(}\#\text {single-component)}}\hspace{5.0pt}, \end{aligned}$$

where the second inequality holds by Inequality (1), the third inequality follows from Inequality (3) (of Lemma 3.4), the fourth inequality follows from Inequality (4) (of Lemma 3.4), and the last inequality holds by Inequality (2) (of Lemma 3.4).

The lemma now follows by rearranging the last inequality and observing that \({\text {(}\#\text {single-single)}}\), \({\text {(}\#\text {component-component)}}\) and \({\text {(}\#\text {single-component)}}\) are all non-negative values by definition. \(\square \)
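For completeness, the rearrangement can be spelled out as follows; it uses only the last inequality above and the non-negativity of the three dropped terms.

$$\begin{aligned} 12L_2&\ge \tfrac{20}{3}|M^*| + \tfrac{5}{3}{\text {(}\#\text {component-component)}}+ \tfrac{16}{3}{\text {(}\#\text {single-single)}}+ 2{\text {(}\#\text {single-component)}}\\&\ge \tfrac{20}{3}|M^*| \quad \Longrightarrow \quad L_2 \ge \tfrac{1}{12} \cdot \tfrac{20}{3}|M^*| = \tfrac{5}{9}|M^*|\hspace{5.0pt}. \end{aligned}$$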

Appendix B Tight Example for the Analysis of Algorithm 1

Recall that Theorem 1.1 was proved in Sect. 3 by showing that the approximation ratio of Algorithm 1 is at least 7/13. In this section we describe an example that leads Algorithm 1 to construct sets \(A_1\) and \(A_2\) such that neither \((V, P \cup A_1)\) nor \((V, P \cup A_2)\) gives a better than 7/13-approximation. Our example is not a tight example for Algorithm 1 since this algorithm returns the largest matching in \((V, P \cup A_1 \cup A_2)\), which is larger in this case than the largest matching in either \((V, P \cup A_1)\) or \((V, P \cup A_2)\). However, our example does show that any analysis aiming to prove a better approximation ratio for Algorithm 1 will have to prove some synergy between the augmentations enabled by \(A_1\) and \(A_2\).

Our example consists of two gadgets. The first gadget is described in Fig. 6a. This gadget includes an optimal solution of size 8. However, when Algorithm 1 is executed on this gadget, it constructs sets \(A_1\) and \(A_2\) such that the maximum matchings in the parts of \((V, P \cup A_1)\) and \((V, P \cup A_2)\) corresponding to the gadget are only of sizes 4 and 5, respectively. The second gadget of our example is described in Fig. 6b. This gadget includes an optimal solution of size 18. However, when Algorithm 1 is executed on this gadget, it constructs sets \(A_1\) and \(A_2\) such that the maximum matchings in the parts of \((V, P \cup A_1)\) and \((V, P \cup A_2)\) corresponding to the gadget are only of sizes 10 and 9, respectively.

Fig. 6 The two gadgets of the example studied in Appendix B. The solid lines are the edges of P, the dotted lines labeled either \(A_1\) or \(A_2\) are the edges of these sets, respectively, and the dashed lines are the edges of \(M^*\). It is assumed that the edges of P arrive first in the stream, followed by the edges of \(A_2\), the edges of \(A_1\), and finally the edges of \(M^*\).

Our complete example includes an independent copy of each one of the two gadgets. Thus, the complete example includes a matching of size \(8 + 18 = 26\). However, after executing Algorithm 1 on this example, \((V, P \cup A_1)\) only includes a matching of size \(4 + 10 = 14\) and \((V, P \cup A_2)\) only includes a matching of size \(5 + 9 = 14\). Hence, the maximum matchings in both these graphs only give an approximation ratio of \(14/26 = 7/13\).
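The arithmetic of the combined example can be checked with exact fractions. The following Python sketch only re-computes the numbers stated above; the dictionary keys are hypothetical labels.

```python
from fractions import Fraction

# Sizes stated in the text for the gadgets of Fig. 6a and Fig. 6b.
gadgets = [
    {"opt": 8, "P_A1": 4, "P_A2": 5},    # first gadget
    {"opt": 18, "P_A1": 10, "P_A2": 9},  # second gadget
]

opt = sum(g["opt"] for g in gadgets)
ratio_a1 = Fraction(sum(g["P_A1"] for g in gadgets), opt)
ratio_a2 = Fraction(sum(g["P_A2"] for g in gadgets), opt)
```

Each gadget on its own is favorable to one of the two sets (for instance, 5/8 for \(A_2\) in the first gadget and 5/9 for \(A_1\) in the second), and it is only the combination of the two gadgets that brings both ratios down to 7/13.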

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Cite this article

Feldman, M., Szarf, A. Maximum Matching Sans Maximal Matching: A New Approach for Finding Maximum Matchings in the Data Stream Model. Algorithmica 86, 1173–1209 (2024). https://doi.org/10.1007/s00453-023-01190-4

  • DOI: https://doi.org/10.1007/s00453-023-01190-4
