Abstract
Given a non-deterministic finite automaton (NFA) A with m states, and a natural number n (presented in unary), the #NFA problem asks to determine the size of the set L(A,n) of words of length n accepted by A. While the corresponding decision problem of checking the emptiness of L(A,n) is solvable in polynomial time, the #NFA problem is known to be #P-hard. Recently, the long-standing open question of whether there is an FPRAS (fully polynomial time randomized approximation scheme) for #NFA was resolved by Arenas, Croquevielle, Jayaram, and Riveros in [ACJR19]. The authors demonstrated the existence of an FPRAS with a time complexity of $\tilde{O}(m^{17} n^{17} \cdot 1/\varepsilon^{14} \cdot \log(1/\delta))$, for a given tolerance ε and confidence parameter δ.
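To make the problem statement concrete, the following minimal sketch contrasts exact counting with the emptiness check. It is an illustration only, not the algorithm of [ACJR19] or of this work: the transition encoding, the function names `count_exact` and `is_nonempty`, and the example automaton are all assumptions introduced here. Exact counting tracks, for every reachable subset of states, how many words lead to it, which can take time exponential in m; the emptiness check only tracks which states are reachable and stays polynomial.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical encoding (not from the paper): (state, symbol) -> successor states.
Transitions = Dict[Tuple[int, str], Set[int]]


def count_exact(delta: Transitions, alphabet: Iterable[str], initial: int,
                accepting: Set[int], n: int) -> int:
    """Exact |L(A,n)| via on-the-fly subset construction (worst case exponential in m)."""
    alphabet = list(alphabet)
    # layer[S] = number of words of the current length whose set of reachable states is S.
    layer: Dict[FrozenSet[int], int] = {frozenset([initial]): 1}
    for _ in range(n):
        nxt: Dict[FrozenSet[int], int] = {}
        for states, count in layer.items():
            for a in alphabet:
                succ: Set[int] = set()
                for q in states:
                    succ |= delta.get((q, a), set())
                if succ:  # words with no run can never be accepted, so drop them
                    key = frozenset(succ)
                    nxt[key] = nxt.get(key, 0) + count
        layer = nxt
    return sum(c for states, c in layer.items() if states & accepting)


def is_nonempty(delta: Transitions, alphabet: Iterable[str], initial: int,
                accepting: Set[int], n: int) -> bool:
    """Polynomial-time check that L(A,n) is non-empty, by layered reachability."""
    alphabet = list(alphabet)
    reach = {initial}
    for _ in range(n):
        reach = {p for q in reach for a in alphabet for p in delta.get((q, a), set())}
    return bool(reach & accepting)


# Example NFA over {0, 1} accepting words that contain at least one '1'.
DELTA: Transitions = {
    (0, "0"): {0}, (0, "1"): {0, 1},
    (1, "0"): {1}, (1, "1"): {1},
}

if __name__ == "__main__":
    print(count_exact(DELTA, "01", 0, {1}, 3))   # 7 of the 8 words of length 3
    print(is_nonempty(DELTA, "01", 0, {1}, 3))   # True
```

In this example the word 111 has three accepting runs but is counted once, which is why simply counting accepting paths does not solve #NFA.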
Given the prohibitively high time complexity in terms of each of the input parameters, and considering the widespread application of approximate counting (and sampling) in various tasks in Computer Science, a natural question arises: is there a faster FPRAS for #NFA that can pave the way for the practical implementation of approximate #NFA tools? In this work, we answer this question in the positive. We demonstrate that significant improvements in time complexity are achievable, and propose an FPRAS for #NFA that is more efficient in terms of both time and sample complexity.
A key ingredient in the FPRAS due to Arenas, Croquevielle, Jayaram, and Riveros [ACJR19] is inter-reducibility of sampling and counting, which necessitates a closer look at the more informative measure: the number of samples maintained for each pair of state q and length i ≤ n. In particular, the scheme of [ACJR19] maintains $O(m^{7} n^{7}/\varepsilon^{7})$ samples per pair of state and length. In the FPRAS we propose, we systematically reduce the number of samples required for each state to be only poly-logarithmically dependent on m, with significantly less dependence on n and ε, maintaining only $\tilde{O}(n^{4}/\varepsilon^{2})$ samples per state. Consequently, our FPRAS runs in time $\tilde{O}((m^{2} n^{10} + m^{3} n^{6}) \cdot 1/\varepsilon^{4} \cdot \log^{2}(1/\delta))$. The FPRAS and its analysis use several novel insights. First, our FPRAS maintains a weaker invariant about the quality of the estimate of the number of samples for each state q and length i ≤ n. Second, our FPRAS requires that the distribution of the samples maintained is close to the uniform distribution only in total variation distance (instead of maximum norm). We believe our insights may lead to further reductions in time complexity and thus open up a promising avenue for future work towards the practical implementation of tools for approximate #NFA.
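As a rough illustration of the two closeness measures mentioned above, the sketch below computes, for an empirical sample distribution over a finite support, its total variation distance and its maximum (sup-norm) pointwise deviation from the uniform distribution. This is one common reading of the two notions and is an assumption here; the function name `distances_from_uniform` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def distances_from_uniform(samples: Sequence[Hashable],
                           support: Sequence[Hashable]) -> Tuple[float, float]:
    """Return (total variation distance, max pointwise deviation) from uniform.

    Assumes every sample lies in `support` and that `support` has no duplicates.
    """
    counts = Counter(samples)
    uniform = 1.0 / len(support)
    diffs = [abs(counts.get(x, 0) / len(samples) - uniform) for x in support]
    tv = 0.5 * sum(diffs)   # half the L1 distance
    sup = max(diffs)        # largest deviation on any single element
    return tv, sup


if __name__ == "__main__":
    words = ["00", "01", "10", "11"]
    tv, sup = distances_from_uniform(["00", "00", "01", "10"], words)
    print(tv, sup)
```

Total variation distance aggregates the deviation over the whole support, so a distribution can be close to uniform in this sense while individual elements are relatively far from their uniform probability; a pointwise guarantee rules that out and is therefore the more demanding requirement.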
- Antoine Amarilli, Timothy van Bremen, and Kuldeep S. Meel. 2024. Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability. In 27th International Conference on Database Theory, ICDT, Vol. 290. 15:1--15:20. https://doi.org/10.4230/LIPICS.ICDT.2024.15
- Renzo Angles, Marcelo Arenas, Pablo Barceló, Aidan Hogan, Juan Reutter, and Domagoj Vrgoč. 2017. Foundations of Modern Query Languages for Graph Databases. ACM Comput. Surv., Vol. 50, 5, Article 68 (2017). https://doi.org/10.1145/3104031
- Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. 2019. Efficient Logspace Classes for Enumeration, Counting, and Uniform Generation. In Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS). 59--73. https://doi.org/10.1145/3294052.3319704
- Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. 2021. #NFA Admits an FPRAS: Efficient Enumeration, Counting, and Uniform Generation for Logspace Classes. Journal of the ACM (JACM), Vol. 68, 6 (2021), 1--40.
- Lucas Bang, Abdulbaki Aydin, Quoc-Sang Phan, Corina S. Păsăreanu, and Tevfik Bultan. 2016. String Analysis for Side Channels with Segmented Oracles. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE). 193--204.
- Alexandre Donzé, Rafael Valle, Ilge Akkaya, Sophie Libkind, Sanjit A. Seshia, and David Wessel. 2014. Machine Improvisation with Formal Specifications. In Music Technology meets Philosophy - From Digital Echos to Virtual Ethos: Joint Proceedings of the 40th International Computer Music Conference, ICMC. https://hdl.handle.net/2027/spo.bbp2372.2014.196
- Pengfei Gao, Jun Zhang, Fu Song, and Chao Wang. 2019. Verifying and Quantifying Side-Channel Resistance of Masked Software Implementations. ACM Trans. Softw. Eng. Methodol., Vol. 28, 3, Article 16 (jul 2019), 32 pages. https://doi.org/10.1145/3330392
- Vivek Gore, Mark Jerrum, Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. 1997. A Quasi-Polynomial-Time Algorithm for Sampling Words from a Context-Free Language. Inf. Comput., Vol. 134, 1 (1997), 59--74. https://doi.org/10.1006/inco.1997.2621
- Kenneth E. Iverson. 1962. A programming language. In Proceedings of the May 1--3, 1962, spring joint computer conference. 345--351.
- Mark Jerrum, Leslie G. Valiant, and Vijay V. Vazirani. 1986. Random Generation of Combinatorial Structures from a Uniform Distribution. Theor. Comput. Sci., Vol. 43 (1986), 169--188. https://doi.org/10.1016/0304-3975(86)90174-X
- Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. 1995. Counting and Random Generation of Strings in Regular Languages. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). 551--557.
- Richard M. Karp and Michael Luby. 1985. Monte-Carlo algorithms for the planar multiterminal network reliability problem. J. Complex., Vol. 1, 1 (1985), 45--64. https://doi.org/10.1016/0885-064X(85)90021-4
- Axel Legay, Benoît Delahaye, and Saddek Bensalem. 2010. Statistical Model Checking: An Overview. In Runtime Verification. 122--135.
- Burcu Kulahcioglu Ozkan, Rupak Majumdar, and Simin Oraee. 2019. Trace Aware Random Testing for Distributed Systems. Proc. ACM Program. Lang., Vol. 3, OOPSLA, Article 180 (oct 2019), 29 pages. https://doi.org/10.1145/3360606
- Seemanta Saha, Surendra Ghentiyala, Shihua Lu, Lucas Bang, and Tevfik Bultan. 2023. Obtaining Information Leakage Bounds via Approximate Model Counting. Proc. ACM Program. Lang., Vol. 7, PLDI, Article 167 (jun 2023). https://doi.org/10.1145/3591281
- Michael Sutton, Adam Greene, and Pedram Amini. 2007. Fuzzing: Brute Force Vulnerability Discovery. Addison-Wesley Professional.
- Timothy van Bremen and Kuldeep S. Meel. 2023. Probabilistic Query Evaluation: The Combined FPRAS Landscape. In PODS. 339--347.
- Carme Álvarez and Birgit Jenner. 1993. A very hard log-space counting class. Theoretical Computer Science, Vol. 107, 1 (1993), 3--30. https://doi.org/10.1016/0304-3975(93)90252-O
Index Terms
- A faster FPRAS for #NFA