Untestable Properties

Property testing is a kind of randomized approximation in which one takes a small, random sample of a structure and wishes to determine whether the structure satisfies some property or is far from satisfying the property. We focus on the testability of classes of first-order expressible properties, and in particular, on the classification of prefix-vocabulary classes for testability. The main result is the untestability of [∀∃∀, (0, 1)]₌. We also show that this class remains untestable without equality in at least one model of testing. These classes are well-known and (at least one is) minimal for untestability. We discuss what is currently known about the classification for testability and briefly compare it to other classifications.


Introduction
Testing a property can be viewed as a form of approximation where we trade accuracy for efficiency. As far as we are aware, de Leeuw et al. (1956) first formalized probabilistic machines. They showed that such machines cannot compute uncomputable properties under reasonable assumptions, but mention the possibility that probabilistic machines could perhaps be more efficient than deterministic machines. This topic attracted considerable attention, including work by Gill (1977).
Early examples of such results were presented by Freivalds (1977, 1979), including his matrix multiplication checker.
In property testing, we take a random sample of some large structure and wish to distinguish between the case that it has some desired property and the case that it is far from having the property. We focus on the testability of first-order expressible properties, and in particular on the classification of prefix-vocabulary classes of first-order logic for testability. Rubinfeld and Sudan (1996) and Blum et al. (1993) introduced the notion of property testing in the context of formal verification. The basic idea was soon extended by Goldreich et al. (1998), who represented graphs as binary functions and focused on testing graph properties. We omit a detailed history of testing; see the introduction to testing graph properties by Goldreich (2010), two surveys by Ron (2008, 2009), and older surveys by Fischer (2001) and Ron (2001).
We are particularly interested in the testability of properties expressible in subclasses of first-order logic, and review relevant work in Subsection 1.1.
Here, we show that there exist untestable graph properties expressible with quantifier prefix ∀∃∀ when equality is allowed (see Section 3 for a formal statement). In addition, we use a variation of that proof to show that this prefix remains untestable in at least one of our models even when equality is forbidden. We suspect that the class is untestable without equality in all of our models.
Taking into account the related work described in Subsection 1.1 and using the notation of Definition 7, the current classification for testability is the following (cf. Jordan and Zeugmann (2012)). We omit the result for [∀∃∀, (0, 1)] because it may depend on the choice of model.
We are especially interested in determining the testability of variants of the Gödel class (i.e., classes whose prefix contains at least ∀²∃) as this would suffice to complete the classification for the special case of predicate logic with equality. The above classification is consistent with several other well-known classifications, such as that for the finite model property (see, e.g., Chapter 6 of Börger et al. (1997)), for docility (or finite satisfiability, see Kolaitis and Vardi (2000)), and for associated 0-1 laws for fragments of existential second-order logic (see Kolaitis and Vardi (2000)). It would be interesting to know if the classification for testability coincides with one of these classifications.
The rest of the paper is organized as follows: First, we review related work in Subsection 1.1. Definitions and notation are in Section 2. The proof of untestability for [∀∃∀, (0, 1)]₌ is presented in Section 3, while the case without equality is considered in Section 4.

Alon et al. (2001) proved that the regular languages are testable, implying that monadic first-order is testable given the well-known results of Büchi (1960) or McNaughton and Papert (1971). Alon et al. (2000) were the first to directly consider the classification problem for testability, restricted to properties of undirected, loop-free graphs. They showed the testability of all such properties expressible by prenex sentences with quantifier prefix ∃*∀*, and also proved that there exists an untestable property expressible with quantifier prefix ∀*∃* (examining the proof shows that ∀¹²∃⁵ suffices).
It is easy to show that [∀∃∀, (0, 1)] (even without equality) has infinity axioms¹. Vedø (1997) showed that a 0-1 law does not hold for second-order existential logic when the first-order part is in this class (again, even without equality).
The current paper sharpens some (prefixes ∀²∃∀, ∀∃∀∃, ∀∃∀²) of the results of Jordan and Zeugmann (2012), and so we briefly outline the improvement that allows us to minimize the prefix considered. The untestable property considered there is closely related to the untestable property of Alon et al. (2000), but modified to minimize the number of quantifiers used. These properties are essentially first-order expressible versions of checking an explicitly given isomorphism between two graphs². In fact, restricting the properties to checking an explicitly given isomorphism between undirected, bipartite graphs maintains hardness for testing. See Figure 1(a) for an example where the goal is to check whether the directed edges give an isomorphism between the two bipartite subgraphs. However, graph isomorphism seems to require one to discuss at least four vertices simultaneously (because one wishes to state that an edge is present iff its image is present and the edges are disjoint in general). Sharing one of the partitions would seem to remove the need for four quantifiers. See Figure 1(b) for an example where the goal is again to check whether the directed edges give an isomorphism between the two halves of the graph. The resulting property is perhaps closer to a variant of function isomorphism, e.g., for functions f, g : {1, ..., n} → {0, 1}ⁿ where the bit i of f(j) is 1 if there is an edge from j in the leftmost partition to i in the middle partition, and likewise for g(j) and the right partition. This property is not first-order expressible, but there is a somewhat tedious first-order encoding that is sufficiently similar. Figure 1(c) gives an example of this first-order property; details are in Section 3.
The connection with function isomorphism allows us to leverage recent work on the testability of (Boolean) function isomorphism and use recent ideas and techniques from Alon and Blais (2010) to prove Lemma 2. In Section 4, we use a variation of this property that removes the use of equality.

Preliminaries
The goal in property testing is always to distinguish structures that have some property from those that are far from having the property. Here, we focus on first-order expressible properties of directed graphs and so we begin with the necessary definitions.

Definition 1. A graph is an ordered pair G = (V, E), where V is a finite set and E ⊆ V × V a binary relation defined on V.
Let N = {1, 2, 3, ...} denote the set of all natural numbers. We generally identify V with the first n natural numbers [n] := {1, ..., n} and call #(G) := |V| = n the size of a graph G. Furthermore, let G_n be the set of graphs of size n and let G := ⋃_{n≥0} G_n be the set of all (finite) graphs. Note that our graphs are directed and may contain loops.
A property P ⊆ G of graphs is any set of graphs. We are particularly interested in first-order expressible properties. Our logic is a basic first-order predicate logic with equality. There are no function or constant symbols. We focus on first-order properties of graphs, and so the only predicate symbol (besides the special symbol =) is the binary edge symbol E.
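A sentence ϕ defines a property in the natural way, namely as the set of all graphs that satisfy ϕ. We require a distance between graphs and properties, which we define in the following way. We denote the symmetric difference of sets M and N by M △ N and let E^A and E^B be the edge predicates of graphs A and B, respectively.

Definition 2. Let A and B be two graphs with the same universe of size n. The distance between A and B is dist(A, B) := |E^A △ E^B| / n².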
The distance generalizes to properties, dist(A, P) := min_{B∈P} dist(A, B). Definition 2 results in a typical model of testing based on the dense graph model introduced by Goldreich et al. (1998). We now proceed to the remaining testing definitions.

Definition 3. An ε-tester for property P is a randomized algorithm that makes queries for the existence of edges in a graph A. The tester must accept with probability at least 2/3 if A has P and must reject with probability at least 2/3 if dist(A, P) ≥ ε.

Definition 4. Property P is called testable if there is some function c(ε) and, for every ε > 0, an ε-tester for P that makes at most c(ε) queries.
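As a concrete reading of the distance between graphs and between a graph and a property, the following Python sketch works on graphs given as sets of ordered pairs over the vertices 1..n. The normalization by n² (all ordered pairs, loops included) is an assumption of this sketch, taken from the dense graph model, and the function names are illustrative only. The brute-force minimum over all graphs is feasible only for very small n.

```python
from itertools import product


def dist(edges_a, edges_b, n):
    """Fraction of the n*n ordered pairs on which two edge relations differ.

    Assumption: the symmetric difference is normalized by n**2.
    """
    differing = set(edges_a) ^ set(edges_b)  # symmetric difference
    return len(differing) / (n * n)


def dist_to_property(edges_a, n, has_property):
    """Brute-force dist(A, P): minimum of dist(A, B) over all B on 1..n in P.

    Returns None if no graph of size n has the property.
    """
    pairs = list(product(range(1, n + 1), repeat=2))
    best = None
    for mask in range(2 ** len(pairs)):
        edges_b = {p for k, p in enumerate(pairs) if (mask >> k) & 1}
        if has_property(edges_b, n):
            d = dist(edges_a, edges_b, n)
            best = d if best is None else min(best, d)
    return best


def loop_free(edges, n):
    """Hypothetical example property: the graph has no loops."""
    return all(u != v for (u, v) in edges)
```

For instance, a graph on two vertices with one loop and one other edge is at distance 1/4 from the loop-free graphs, since deleting the loop changes one of the four ordered pairs.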
Note that the query complexity is bounded by a function that does not depend on the size of the graphs. We allow different ε-testers for each ε > 0 and so this is a non-uniform model. However, we are focused on proving untestability and our results hold even in the non-uniform case.
Next, we will define indistinguishability, a relation on properties introduced by Alon et al. (2000) that preserves testability. However, testers can focus on loops and distinguish between structures that have an asymptotically small difference (because the number of loops is asymptotically dominated by the number of non-loops). We therefore begin with an alternative definition of distance. In the following, ⊕ denotes exclusive-or.
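Definition 5. Let n ∈ N, n ≥ 2, and let U be any universe such that |U| = n. Furthermore, let A = (U, E^A) and B = (U, E^B) be any two graphs with universe U. The mr-distance between A and B, written mrdist(A, B), measures the differences in loops and in non-loops separately (comparing E^A and E^B with ⊕), so that a difference confined to loops is no longer dominated by the number of non-loops.

Definition 6. Two properties P and Q of graphs are indistinguishable if they are closed under isomorphisms and for every ε > 0 there exists an N_ε such that for any graph A with universe of size n ≥ N_ε, if A has P then mrdist(A, Q) ≤ ε, and if A has Q then mrdist(A, P) ≤ ε.

An important property of indistinguishability is that it preserves testability. The proof of the following is analogous to that given in Alon et al. (2000).

Lemma 1. Let P and Q be indistinguishable properties. Then P is testable if and only if Q is testable.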
In fact, as the proof constructs an ε-tester for P by iterating an ε/2-tester for Q three times, one can also relate the query complexities of P and Q.
Definition 5 (mrdist) is a distance measure that can be used in place of Definition 2 (dist) when defining testability. The resulting model makes testing strictly more difficult than using dist, see Jordan and Zeugmann (2012). We refer to properties that are testable using mrdist as mr-testable. In Section 3, we prove the untestability of ∀∃∀ with equality in both models³. However, our proof for the class without equality (cf. Section 4) is restricted to proving it is not mr-testable. We suspect that this restriction can be removed.
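To illustrate why a loop-sensitive distance makes testing harder, the Python sketch below contrasts a dense-model distance with one possible form of mrdist. Both normalizations are assumptions of this sketch and are not taken from the text: dist divides the number of differing pairs by n², while the hypothetical mrdist takes the larger of the loop fraction (out of n) and the non-loop fraction (out of n(n − 1)).

```python
def dist(edges_a, edges_b, n):
    """Assumed dense-model distance: differing ordered pairs over n**2."""
    return len(set(edges_a) ^ set(edges_b)) / (n * n)


def mrdist(edges_a, edges_b, n):
    """Assumed loop-sensitive distance for n >= 2.

    Loops and non-loops are normalized separately and the maximum is taken;
    this exact form is an assumption made for illustration.
    """
    differing = set(edges_a) ^ set(edges_b)
    loops = sum(1 for (u, v) in differing if u == v)
    non_loops = len(differing) - loops
    return max(loops / n, non_loops / (n * (n - 1)))


# Two graphs on 1..n that differ exactly on all n loops.
n = 100
all_loops = {(v, v) for v in range(1, n + 1)}
small = dist(all_loops, set(), n)    # n / n**2, vanishes as n grows
large = mrdist(all_loops, set(), n)  # stays at 1 for every n
```

Under these assumptions, graphs that differ only on loops become arbitrarily close under dist as n grows, yet remain at maximal mrdist, which matches the remark that loops are asymptotically dominated by non-loops.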
Many proofs of hardness for testability rely on Yao's Principle (1977), an interpretation of von Neumann's minimax theorem for randomized computation.For completeness, we state the version that we use.

Principle 1 (Yao's Principle).
If there is an ε ∈ (0, 1) and a distribution over G_n such that all deterministic testers with complexity c have an error-rate greater than 1/3 for property P, then property P is not testable with complexity c.

The definition of "testable" is of course our usual one involving random testers. In general, one seeks to show that for sufficiently large n and some increasing function c := c(n), there is a distribution of inputs such that all deterministic testers with complexity c have error-rates greater than 1/3.
Finally, we briefly define the notation we use to specify prefix-vocabulary classes. See Börger et al. (1997) for details and related material.

Definition 7. Let Π be a string over the four-character alphabet {∃, ∀, ∃*, ∀*}. Then [Π, (0, 1)]₌ is the set of sentences in prenex normal form which satisfy the following conditions:
1. The quantifier prefix is contained in the regular language given by Π (for technical reasons, one usually treats ∃ and ∀ as matching the relevant quantifier and also the empty string).
2. There are zero (0) monadic predicate symbols.
3. In addition to the equality predicate (=), there is at most one (1) binary predicate symbol.
4. There are no other predicate symbols.
That is, [Π, (0, 1)]₌ is the set of prenex sentences in the logic defined above whose quantifier prefixes match Π. If the second component of the specification is all, then conditions two and three are removed (any number of predicate symbols with any arities are acceptable).
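The prefix condition of Definition 7 can be sketched in Python by translating Π into a regular expression. Following the convention stated there, a plain ∃ or ∀ matches that quantifier or the empty string, while ∃* and ∀* match any number of repetitions. The function names are illustrative, and quantifier prefixes are assumed to be given as strings over the two characters ∃ and ∀.

```python
import re


def prefix_class_regex(pi):
    """Translate a specification such as '∀∃∀' or '∃*∀*' into a regex."""
    parts = []
    i = 0
    while i < len(pi):
        q = pi[i]
        if q not in ("∃", "∀"):
            raise ValueError(f"unexpected symbol {q!r} in prefix specification")
        if i + 1 < len(pi) and pi[i + 1] == "*":
            parts.append(q + "*")  # any number of this quantifier
            i += 2
        else:
            parts.append(q + "?")  # this quantifier or the empty string
            i += 1
    return re.compile("".join(parts))


def prefix_matches(pi, prefix):
    """True if the quantifier prefix lies in the language given by pi."""
    return prefix_class_regex(pi).fullmatch(prefix) is not None
```

For example, `prefix_matches("∀∃∀", "∃∀")` holds because the leading ∀ may match the empty string, whereas `prefix_matches("∀∃∀", "∀∀∃")` does not.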

The Case with Equality
Our goal in this section is Theorem 1, which states that the class [∀∃∀, (0, 1)]₌ is untestable.
We begin by outlining the proof. First, we define P_f, a property expressible in the class [∀∃∀, (0, 1)]₌ which, as described in Subsection 1.1, is in some sense a somewhat tedious but first-order expressible variant of checking (explicit) isomorphism of undirected bipartite graphs in tripartite graphs. We then define a variant P_2, in which the isomorphism is not explicitly given and we must test whether there exists some suitable isomorphism. Although this increases the complexity of deciding the problem from checking an isomorphism to finding one, it does not change hardness for testing. We show that P_2 and P_f are indistinguishable and so P_2 is testable iff P_f is testable. Finally, we prove directly that P_2 is untestable, even with o(√n) queries, using an argument based on a recent proof by Alon and Blais (2010).
Proof (Theorem 1). We begin by defining P_f. Formally, it is the set of graphs satisfying a conjunction of four clauses (see Figure 1(c) for an example). A graph satisfies this conjunction if the following conditions are all satisfied:
1. The nodes without loops form a complete subgraph.
2. For every node x with a loop, there is exactly one y without a loop such that there is an edge from x to y.
3. For every node y without a loop, there is exactly one x with a loop such that there is an edge from x to y.
4. For all nodes x, z with loops, and y the unique node without a loop such that E(x, y), it holds that E(x, z) iff E(y, z).
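The four conditions above can be checked directly on a small graph. The following Python sketch is only an illustration of those conditions, not the first-order formula itself; the representation (vertices 1..n, edges as a set of ordered pairs) and the function name are assumptions of the sketch.

```python
def has_pf(n, edges):
    """Check the four listed conditions on a graph with vertices 1..n."""
    E = set(edges)
    V = range(1, n + 1)
    loop = [v for v in V if (v, v) in E]
    noloop = [v for v in V if (v, v) not in E]

    # 1. Vertices without loops form a complete subgraph.
    if any((u, v) not in E for u in noloop for v in noloop if u != v):
        return False

    # 2. Each looped x has exactly one loop-free y with an edge from x to y.
    partner = {}
    for x in loop:
        ys = [y for y in noloop if (x, y) in E]
        if len(ys) != 1:
            return False
        partner[x] = ys[0]

    # 3. Each loop-free y has exactly one looped x with an edge from x to y.
    for y in noloop:
        if sum(1 for x in loop if (x, y) in E) != 1:
            return False

    # 4. For looped x and z: E(x, z) iff E(partner[x], z).
    return all(((x, z) in E) == ((partner[x], z) in E)
               for x in loop for z in loop)


# Vertices 1, 2 carry loops; 3, 4 do not; the matching is 1 -> 3 and 2 -> 4.
example = {(1, 1), (2, 2), (1, 2),          # edges among looped vertices
           (1, 3), (2, 4),                  # the explicit matching
           (3, 4), (4, 3),                  # loop-free vertices are complete
           (3, 1), (3, 2), (4, 2)}          # copies of the rows of 1 and 2
```

Note that condition 4 with z = x forces an edge from the partner of x back to x, which is why (3, 1) and (4, 2) appear in the example.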
Property P_2 below is similar to P_f, except that the isomorphism is not explicitly given.

Definition 8. A graph G = (V, E) has P_2 if it satisfies the following conditions: The nodes without loops form a complete subgraph. 3. There are no edges from a node with a loop to a node without a loop.

There exists a bijection
It is not difficult to show that properties P_f and P_2 are indistinguishable.
Claim 1. Properties P_f and P_2 are indistinguishable.
Graph G has P_2 and so there is a bijection b satisfying Condition 4 of Definition 8. We therefore add the edges E(i, b(i)), making the isomorphism (from V_1 to V_2) explicit. The resulting graph G_f has P_f.
We have made exactly n/2 modifications, all to non-loops, and so the distance between G and G_f is O(1/n), which is below ε for sufficiently large n. The converse is analogous; given a G that has P_f, simply remove the n/2 edges from loops to non-loops after using them to construct a suitable b.
Properties P_f and P_2 are indistinguishable and so (by Lemma 1), it suffices to show that P_2 is untestable. Lemma 2 below is stronger than necessary, and actually implies an Ω(√n) lower bound for testing P_f per the discussion following Lemma 1.
Proof (Lemma 2). The proof is via Yao's Principle (cf. Principle 1), and so we define two distributions, D_no and D_yes, and show that all deterministic testers have an error-rate greater than 1/3 for property P_2 when the input is chosen randomly from D_no with probability 1/2 and from D_yes with probability 1/2.
In the following, we consider a distribution over graphs of sufficiently large size 2n, and an arbitrarily fixed partition of the vertices into V_1 and V_2 with |V_1| = |V_2| = n (e.g., let the vertices be the integers V := {1, ..., 2n}, let V_1 := {1, ..., n}, and V_2 := V \ V_1).
We begin with D_no, defined as the following distribution:
1. Place a loop on each vertex in V_1 and place no loops in V_2.
2. Place each possible edge (except loops) in V_1 × V_1 and V_2 × V_1 uniformly and independently with probability 1/2.
That is, D_no is the uniform distribution of graphs (with this particular partition) satisfying the first three conditions of P_2. Next, we define D_yes as the following:
1. Choose uniformly a random bijection π : V_1 → V_2.
2. Place a loop on each vertex in V_1 and place no loops in V_2.
It is easy to see that D_yes generates only positive instances. Next, we show that D_no generates negative instances with high probability.

Lemma 3. Fix 0 < ε < 1/2 and let n be sufficiently large. Then, a graph drawn from D_no is ε-far from P_2 with probability 1 − o(1).

Proof (Lemma 3). The distribution D_no is the uniform distribution over graphs of size 2n with a particular partition satisfying the first three conditions of P_2. Let G_ε be the set of graphs G' of size 2n satisfying these conditions and such that dist(G', P_2) ≤ ε (regardless of the partition).
Counting the number of such graphs yields an upper bound on |G_ε| in terms of the binary entropy function (cf. Lemma 16.19 in Flum and Grohe (2006) for the bound on the summation). Distribution D_no produces each of 2^{n(n−1)} · 2^{n²} graphs with equal probability, and so the probability that D_no produces a graph in G_ε is at most |G_ε| / (2^{n(n−1)} · 2^{n²}). The approximation is asymptotically tight, which suffices.
We have shown that D_yes generates only positive instances and that (with high probability) D_no generates instances that are ε-far from P_2. Next, we show that (again, with high probability) the two distributions look the same to testers making only o(√n) queries. The proof is similar to a proof by Alon and Blais (2010). We begin by defining two random processes, P_no and P_yes, which answer queries from testers and generate instances according to D_no and D_yes, respectively.
Process P_no is defined in the following way:

We define P_yes in the same way, except for the final step. When P_yes quits or the tester finishes, it fixes the edges that were queried according to its answers, and also fixes the corresponding edges (when relevant) according to π. More precisely, for each fixed E(i, j) with i ≠ j ∈ V_1, we also fix E(π(i), j), and for fixed E(i, j) with i ∈ V_2, j ∈ V_1, we also fix E(π⁻¹(i), j), in both cases the same as our response to E(i, j) (not randomly). The remaining edges are placed as in P_no.

Note that P_no generates instances according to D_no and P_yes generates instances according to D_yes. In addition, P_yes and P_no behave identically until they quit or answer all queries. In particular, if a tester does not cause the process to quit, the distribution of responses to its queries is identical for the two processes. We show that, with high probability, a tester that makes o(√n) queries does not cause either process to quit.

Lemma 4. Let T be a deterministic tester which makes o(√n) queries, and let T interact with P_yes or P_no. In both cases, Pr[T causes the process to quit] = o(1).
Proof (Lemma 4). The condition causing the process to quit is identical in P_yes and P_no. The probability that any pair of queries E(i, j) and E(i', j') cause the process to quit is at most O(1/n). The tester makes at most o(√n) queries, so there are o(n) such pairs, and by the union bound the process quits with probability o(1).

Any deterministic tester T which makes o(√n) queries can only distinguish between D_yes and D_no with probability o(1), but it must accept D_yes with probability 2/3, and reject D_no with probability 2/3 − o(1). It is impossible for T to satisfy both conditions, and Lemma 2 follows from Principle 1.

The Case without Equality
In Section 3, we proved that [∀∃∀, (0, 1)]₌ is untestable. The formula proved untestable contains equality, and so we now consider the class without equality. The main result in this section is Theorem 2, stating that [∀∃∀, (0, 1)] is not mr-testable. Although this seems to be a tradeoff between the presence of equality and the "degree" of testability, we suspect that this class is not testable under either definition. The proof is very similar to the proof above.
Proof (Theorem 2). The proof is similar to the proof of Theorem 1. We will begin by defining a property P_f that is expressible in our class. We will then define a property P which is indistinguishable from P_f, and use Yao's Principle to show that P is not mr-testable.
A graph has property P_f if it satisfies the following conditions:
1. For every x with a loop, there is an outgoing edge to at least one y without a loop.
2. For every x without a loop, there is an incoming edge from at least one y with a loop.
3. There are no edges between vertices without loops.
4. For every x with a loop, there is an edge to at least one y without a loop such that for all z with loops, the following holds: There is a directed edge from y to z iff there are an odd number⁵ of directed edges between x and z.
5. For every x without a loop, there is an incoming edge from at least one y with a loop such that for all z with loops, the following holds: There is a directed edge from x to z iff there are an odd number of directed edges between y and z.
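One possible way to write these conditions as a single sentence with prefix ∀∃∀ and without equality is sketched below. This is a reconstruction for illustration and not necessarily the formula used by the authors. It assumes that "an odd number of directed edges between u and v" is read as E(u, v) ⊕ E(v, u). The first clause encodes condition 3, and the other two encode conditions 4 and 5; their witnesses y also supply the edges between looped and loop-free vertices required at the start of the list.

```latex
% Sketch only: an equality-free sentence with prefix \forall\exists\forall.
% Assumption: "odd number of directed edges between u and v" means E(u,v) \oplus E(v,u).
\begin{align*}
\forall x\, \exists y\, \forall z :\;
 & \bigl[(\neg E(x,x) \wedge \neg E(z,z)) \rightarrow \neg E(x,z)\bigr] \\
 \wedge\; & \bigl[E(x,x) \rightarrow \bigl(\neg E(y,y) \wedge E(x,y) \wedge
   (E(z,z) \rightarrow (E(y,z) \leftrightarrow (E(x,z) \oplus E(z,x))))\bigr)\bigr] \\
 \wedge\; & \bigl[\neg E(x,x) \rightarrow \bigl(E(y,y) \wedge E(y,x) \wedge
   (E(z,z) \rightarrow (E(x,z) \leftrightarrow (E(y,z) \oplus E(z,y))))\bigr)\bigr]
\end{align*}
```

The exclusive-or is only shorthand here: a ⊕ b can be written as ¬(a ↔ b), so no connective outside the usual first-order syntax is needed.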
More formally, P_f is the set of graphs that satisfy a formula of our class expressing these five conditions. Next, we define a property P that we will show to be indistinguishable from P_f. A graph has property P if it satisfies the following conditions:
1. There is a partition of the vertices into (non-empty) V_1, V_2.
2. All vertices in V_1 have loops and no vertices in V_2 have loops.
3. There are no edges in V_2 × V_2.
4. There exist functions f : V_1 → V_2 and g : V_2 → V_1 satisfying the following: For all x, z ∈ V_1, there is an edge from f(x) to z iff there are an odd number of directed edges between x and z. For all x ∈ V_2 and z ∈ V_1, there is an edge from x to z iff there are an odd number of directed edges between g(x) and z.
It is not difficult to show that P and P_f are indistinguishable.
Lemma 5. Properties P and P_f are indistinguishable.
Proof (Lemma 5). Let G be a graph with property P_f and let ε > 0 be arbitrarily fixed. Then, G also has property P, that is, mrdist(G, P) = 0. In the other direction, if the graph G has property P, then we can satisfy property P_f by adding at most O(n) (non-loop) edges from x to f(x) and g⁻¹(y) to y. Thus, mrdist(G, P_f) < ε for sufficiently large graphs.
Indistinguishability preserves testability (cf. Lemma 1) and so it suffices to show that P is untestable. Lemma 6 below is stronger than necessary and actually implies an Ω(√n) lower bound for testing P_f per the discussion following Lemma 1.

Lemma 6. There is an ε with 0 < ε < 1/2 such that any mr-style ε-tester for P must perform Ω(√n) queries.
Proof (Lemma 6). The proof is via Yao's Principle (cf. Principle 1) and so we must define a distribution of inputs and show that all deterministic ε-testers have an error rate greater than 1/3 for P on inputs from the distribution. For our distribution, we will draw from a distribution D_no with probability 1/2 and from a distribution D_yes with probability 1/2.
In the following, we consider distributions over graphs with sufficiently large vertex set [2n] and an arbitrarily fixed partition of the vertices into V_1 and V_2, each of size n. We begin with D_no, defined as the following distribution:
1. Place loops on all vertices in V_1 and no loops in V_2.
2. For each ordered pair in V_1 × V_2, place a directed edge with probability 1/2.
3. For each unordered pair {i, j}, i, j ∈ V_1 and i ≠ j, with probability 1/2 place no edge, and with probability 1/2 place a single directed edge, from i to j if i ≤ j and from j to i if j ≤ i.
4. For each ordered pair in V_2 × V_1, with probability 1/2 place the directed edge and with probability 1/2 do not.
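The four sampling steps for D_no in this section can be sketched in Python as follows. The concrete partition V_1 = {1, ..., n} and V_2 = {n + 1, ..., 2n}, the function name, and the use of Python's `random.Random` as the source of fair coin flips are assumptions of this sketch.

```python
import random


def sample_d_no(n, rng=None):
    """Sample an edge set on vertices 1..2n following the four steps of D_no.

    Assumed partition: V_1 = {1..n}, V_2 = {n+1..2n}.
    """
    if rng is None:
        rng = random.Random()
    v1 = range(1, n + 1)
    v2 = range(n + 1, 2 * n + 1)

    # Step 1: loops on all of V_1, none on V_2.
    edges = {(i, i) for i in v1}

    # Step 2: each ordered pair in V_1 x V_2 independently with probability 1/2.
    for i in v1:
        for j in v2:
            if rng.random() < 0.5:
                edges.add((i, j))

    # Step 3: each unordered pair in V_1 gets either no edge or a single
    # edge directed from the smaller to the larger vertex.
    for i in v1:
        for j in v1:
            if i < j and rng.random() < 0.5:
                edges.add((i, j))

    # Step 4: each ordered pair in V_2 x V_1 independently with probability 1/2.
    for i in v2:
        for j in v1:
            if rng.random() < 0.5:
                edges.add((i, j))

    return edges
```

Every sample has no edges inside V_2 and no edge from a larger to a smaller vertex inside V_1, so the only free choices are the n² + n(n − 1)/2 + n² coin flips of steps 2 to 4.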
Note that D_no is the uniform distribution over the graphs satisfying the conditions listed in the proof of Lemma 7 below. Distribution D_yes generates only positive instances for P. Now, we show that with high probability, D_no generates instances that are ε-far.

Lemma 7. Let ε > 0 be sufficiently small and n be sufficiently large. Then, a graph drawn from D_no is ε-far from P with probability 1 − o(1).

Proof (Lemma 7). Distribution D_no is the uniform distribution over graphs of size 2n with a fixed partition V_1, V_2 satisfying the following:
1. All vertices in V_1 have loops and no vertices in V_2 have loops.
2. There are no undirected edges between vertices with loops.
3. There are no edges between vertices without loops.
4. All directed edges (x, y) between vertices with loops satisfy x ≤ y.
We want a small upper bound on the probability of a graph being drawn from D_no that is not ε-far from P. Since D_no is the uniform distribution over a certain class of graphs, this probability is the fraction of graphs in that class that are not ε-far from P. The number of distinct graphs produced by D_no is $2^{\binom{n}{2}} \cdot 2^{2n^2} = 2^{2.5n^2 - n/2}$. Let G_{2n} be the set of graphs with vertices [2n] that have property P and are not ε-far from all graphs in D_no. Then, Note that any graph G that is not ε-far from all graphs in D_no must have loops on n − εn ≤ j ≤ n + εn vertices. Therefore, Using the (asymptotically tight) estimate $\binom{2n}{n} \approx 4^n / \sqrt{\pi n}$, we see that (2) is approximately
$$\frac{2\varepsilon n + 1}{\sqrt{\pi n}} \, 2^{2n + (n+\varepsilon n)^2 + (n+\varepsilon n)(n+\varepsilon n-1) + (n+\varepsilon n)\log(n+\varepsilon n)}.$$
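Two numeric facts used in this counting argument can be sanity-checked with a few lines of Python: the exponent identity C(n, 2) + 2n² = 2.5n² − n/2 for the number of graphs produced by D_no, and the quality of the estimate of the central binomial coefficient. The function names are illustrative, and the ratio is computed with floating-point arithmetic, so it is meant only for n up to a few hundred.

```python
import math


def log2_support_size(n):
    """log2 of the number of graphs produced by D_no: C(n, 2) + 2 * n**2."""
    return math.comb(n, 2) + 2 * n * n


def central_binomial_ratio(n):
    """Ratio of C(2n, n) to the estimate 4**n / sqrt(pi * n)."""
    estimate = 4 ** n / math.sqrt(math.pi * n)
    return math.comb(2 * n, n) / estimate


# The identity holds exactly for every n; the ratio approaches 1 from below.
exponents_agree = all(
    2 * log2_support_size(n) == 5 * n * n - n for n in range(1, 50)
)
ratios = [central_binomial_ratio(n) for n in (10, 100, 400)]
```

The ratios increase towards 1 as n grows, consistent with the estimate being asymptotically tight.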
Combining this with Inequality (1) and using that

Proof (Lemma 8). The condition causing the process to quit is identical in P_yes and P_no. The probability that any fixed pair of queries E(i, j) and E(i', j') cause the process to quit is at most Pr[i = π(i)

Any deterministic tester T which makes o(√n) queries can only distinguish between D_yes and D_no with probability o(1), but it must accept D_yes with probability at least 2/3 and reject D_no with probability at least 2/3 − o(1). It is impossible for T to satisfy both conditions, so the lemma follows from Principle 1.

Conclusions
Property testing is a kind of randomized approximation, where we take a small, random sample of a structure and seek to determine whether the structure has a desired property or is far from having the property.We focused on the classification problem for testability, wherein we seek to determine exactly which prefix vocabulary classes are testable and which are not.
In particular, we focused on the testability of first-order properties expressible with quantifier prefix ∀∃∀. In Section 3, we showed that this prefix can express untestable (directed) graph properties when equality is available. Then, in Section 4 we considered the class without equality. There, we showed that this class remains mr-untestable; however, testability using dist remains open. We suspect that this class remains untestable using dist. These results sharpen some of the results of Jordan and Zeugmann (2012).
As mentioned in Subsection 1.1, the current classification for testability closely resembles several other classifications (e.g., those for the finite model property, docility and associated second-order 0-1 laws) and it would be interesting to determine whether it coincides with one of these.In particular, determining the testability of variants of the Gödel class would complete the classification for the special case of predicate logic with equality.