ABSTRACT
We prove conditional near-quadratic running time lower bounds for approximate Bichromatic Closest Pair with Euclidean, Manhattan, Hamming, or edit distance. Specifically, unless the Strong Exponential Time Hypothesis (SETH) is false, for every δ>0 there exists a constant ε>0 such that computing a (1+ε)-approximation to the Bichromatic Closest Pair requires Ω(n^{2−δ}) time. In particular, this implies a near-linear query time lower bound for Approximate Nearest Neighbor search with polynomial preprocessing time.
Our reduction uses the recently introduced Distributed PCP framework, but obtains improved efficiency using Algebraic Geometry (AG) codes. Efficient PCPs from AG codes have been constructed in other settings before, but our construction is the first to yield new hardness results.
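To make the problem statement concrete, the following is a minimal sketch of the trivial quadratic-time baseline for Bichromatic Closest Pair under Hamming distance, the kind of algorithm that the lower bound says cannot be improved to n^{2−δ} time even for (1+ε)-approximation, assuming SETH. It is illustrative only and is not the paper's reduction; the names `hamming`, `bichromatic_closest_pair`, and `is_approximation` are hypothetical helpers introduced here.

```python
from typing import Sequence, Tuple

Point = Sequence[int]


def hamming(a: Point, b: Point) -> int:
    """Number of coordinates in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def bichromatic_closest_pair(
    reds: Sequence[Point], blues: Sequence[Point]
) -> Tuple[int, int, int]:
    """Return (distance, red_index, blue_index) of a closest red-blue pair.

    Brute force: examines all len(reds) * len(blues) pairs, i.e. quadratic
    time in the number of points (times the dimension).
    """
    best = None
    for i, r in enumerate(reds):
        for j, b in enumerate(blues):
            d = hamming(r, b)
            if best is None or d < best[0]:
                best = (d, i, j)
    if best is None:
        raise ValueError("both point sets must be non-empty")
    return best


def is_approximation(candidate: float, optimum: float, eps: float) -> bool:
    """True if candidate is within a (1 + eps) factor of the optimum distance."""
    return candidate <= (1 + eps) * optimum


# Hypothetical toy instance: two red and two blue points in {0,1}^4.
reds = [(0, 0, 0, 0), (1, 1, 1, 1)]
blues = [(1, 1, 0, 0), (1, 1, 1, 0)]
result = bichromatic_closest_pair(reds, blues)
```

On this toy instance the closest pair is the second red point and the second blue point, at Hamming distance 1. A (1+ε)-approximation algorithm would be allowed to return any red-blue pair whose distance passes `is_approximation` against that optimum.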
Hardness of Approximate Nearest Neighbor Search. STOC’18, June 25–29, 2018, Los Angeles, CA, USA.
Index Terms
- Hardness of approximate nearest neighbor search