Normalized information distance and the oscillation hierarchy

We study the complexity of computing the normalized information distance. We introduce a hierarchy of limit-computable functions by considering the number of oscillations. This is a function version of the difference hierarchy for sets. We show that the normalized information distance is not in any level of this hierarchy, strengthening previous nonapproximability results. As an ingredient to the proof, we demonstrate a conditional undecidability result about the independence of pairs of random strings.


Normalized information distance
The normalized information distance NID is a distance measure for binary strings that is based on prefix-free Kolmogorov complexity K. Here the value K(x) is the minimum length of a string p that describes x in the sense that U(p) = x for some fixed additively optimal Turing machine U with prefix-free domain. Observe that such a machine cannot be defined on the empty string, hence all values of K are nonzero. The normalized information distance is defined as
$$\mathrm{NID}(x, y) = \frac{E(x, y)}{\max\{K(x), K(y)\}} \quad\text{where}\quad E(x, y) = \max\{K(x \mid y), K(y \mid x)\}.$$
Note that NID, being the ratio of two nonzero functions that are approximable from above, is computable in the limit, i.e., there is a computable rational-valued function f with three arguments such that for all x and y we have $\lim_{s \to \infty} f(x, y, s) = \mathrm{NID}(x, y)$.
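The following minimal sketch illustrates why a ratio of two approximations from above need not be monotone itself. The stage values are hypothetical and chosen only for illustration; they are not actual complexity values, and the function name `ratio_approximation` is an assumption of this sketch.

```python
from fractions import Fraction

def ratio_approximation(numerators, denominators):
    """Stagewise ratio of two positive, nonincreasing integer sequences."""
    # Both sequences mimic approximations from above: they never increase.
    assert all(a >= b for a, b in zip(numerators, numerators[1:]))
    assert all(a >= b for a, b in zip(denominators, denominators[1:]))
    assert min(numerators) > 0 and min(denominators) > 0
    return [Fraction(p, q) for p, q in zip(numerators, denominators)]

# Hypothetical stage values (assumption, for illustration only).
upper_E = [6, 6, 3, 3, 3]        # stand-in for approximations to E(x, y)
upper_max_K = [12, 8, 8, 8, 4]   # stand-in for approximations to max{K(x), K(y)}

values = ratio_approximation(upper_E, upper_max_K)
print(values)
```

Although both inputs only decrease, the ratio first goes up, then down, then up again, which is the kind of oscillation studied in what follows.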
Terwijn, Torenvliet, and Vitányi [9] have shown that NID can neither be computably approximated from below nor from above, i.e., such a computable approximation f of NID can neither be increasing nor decreasing in s. In particular, the function NID is not computable. In what follows, we improve on these nonapproximability results by confirming their conjecture [9, Section 5] that for any computable approximation of NID, the number of oscillations is not bounded by a constant, or, equivalently, that NID is not in the oscillation hierarchy. The oscillation hierarchy is defined as the union of the classes $\Sigma^{-1}_1, \Sigma^{-1}_2, \ldots$, where $\Sigma^{-1}_k$ is the class of all functions that have a computable approximation that initially increases and switches at most k − 1 times between increasing and decreasing. See Section 2 for formal definitions.
Related to the proof of our main result, we demonstrate that given two random strings, it is undecidable whether they are independent. In fact, this conditional undecidability result is derived in the stronger form that there is no enumeration of pairs that includes infinitely many random pairs and where all the random pairs in the enumeration are independent. The stronger result can be viewed as a conditional immunity statement and is used in the proof of our main result.

Related work
The concept of normalized information distance was introduced by Li et al. [4], and subsequently studied in a series of papers, cf. Vitányi et al. [10] and Li and Vitányi [5, Section 8.4]. It has both theoretical and practical interest. While the function NID itself is noncomputable, there are computable variants that have a number of surprising practical applications. Such variants are for example defined in terms of standard compression algorithms in place of prefix-free Kolmogorov complexity.
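As an illustration of such a computable variant, the following sketch computes the normalized compression distance, which replaces Kolmogorov complexity by the output length of a real compressor. The choice of zlib and the helper names `compressed_length` and `ncd` are assumptions of this sketch, not something fixed by the text.

```python
import zlib

def compressed_length(data: bytes) -> int:
    """Length of the zlib-compressed data, a computable stand-in for K."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance of x and y with respect to zlib."""
    cx = compressed_length(x)
    cy = compressed_length(y)
    cxy = compressed_length(x + y)
    # Extra cost of describing the pair beyond the cheaper string,
    # normalized by the cost of the more expensive string.
    return (cxy - min(cx, cy)) / max(cx, cy)
```

For inputs that share most of their content the value is close to 0, whereas for unrelated incompressible inputs it is close to 1. The values depend on the compressor and are only a heuristic stand-in for NID.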
The difference hierarchy over the computably enumerable sets, or c.e. sets, for short, was introduced by Ershov, cf. Odifreddi [6, IV.1.18] and Selivanov [7]. It is a fine hierarchy for the $\Delta^0_2$-sets, sometimes also referred to as the Boolean hierarchy. It can be seen as an effective version of a classical hierarchy introduced by Hausdorff, which is studied in descriptive set theory. An analogous hierarchy defined over NP is studied in complexity theory. When restricting attention to {0, 1}-valued functions, i.e., to sets, the oscillation hierarchy coincides with the difference hierarchy, as follows from the discussion following Definition 2.1. In particular, $\Sigma^{-1}_1$ consists of the c.e. sets, $\Sigma^{-1}_2$ contains the d.c.e. sets, i.e., the differences of c.e. sets, and in general $\Sigma^{-1}_k$ contains the k-c.e. sets. These coincidences motivate the choice of our notation for the classes of the oscillation hierarchy, as the same notation has been used for the classes of the difference hierarchy, see e.g. Selivanov [8].
Recall that a hierarchy is proper if each of its levels is strictly included in the next one. Similar to the case of sets, the oscillation hierarchy is proper and does not exhaust the class of all limit-computable functions. As in the case of sets, this can be shown by elementary diagonalization arguments and, in fact, this follows from the analogous results for sets. Theorem 5.1, our main result, asserts that NID is a natural example of a limit-computable function that is not in the oscillation hierarchy.
Note that Bennett et al. [2] have shown that E satisfies the properties of a metric up to a constant additive term. Furthermore, E is minimal among all similar distance functions [10, Theorem 3.7]. Note further that, somewhat in contrast to the definition of normalized information distance, the name information distance is used for the function D defined as
$$D(x, y) = \min\{|p| : U(p, x) = y \text{ and } U(p, y) = x\}.$$
Here U is the universal prefix-free machine used to define K. It can be shown that D and E are equal up to a logarithmic additive term [10, Corollary 3.1].

Notation
Our notation is mostly standard. For further explanations, details and background, in particular about computability theory, we refer to Odifreddi [6] and to Downey and Hirschfeldt [3]. A string is a binary word, i.e., a finite sequence over the binary alphabet {0, 1}. We use |x| to denote the length of a string x; the empty string λ is the unique string of length 0.
The set of strings is denoted by $\{0, 1\}^*$, the set of natural numbers is denoted by ω. The two latter sets are identified by the order isomorphism that takes the length-lexicographical ordering on the set of strings to the standard ordering on the natural numbers. Unless explicitly stated differently, the term set refers to a set of binary strings or, equivalently, to a subset of the natural numbers. We identify such a set A with the infinite binary sequence A(0)A(1)... where A(n) is equal to 1 if and only if n is in A. This sequence is referred to as the characteristic sequence of A.
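The identification of strings with natural numbers can be made concrete as follows. This is a minimal sketch; the function names are assumptions introduced for illustration.

```python
def number_to_string(n: int) -> str:
    """The n-th binary string in length-lexicographical order (n >= 0)."""
    # bin(n + 1) looks like '0b1...'; dropping '0b1' leaves the string.
    return bin(n + 1)[3:]

def string_to_number(x: str) -> int:
    """Inverse mapping: position of x in length-lexicographical order."""
    return int("1" + x, 2) - 1

print([number_to_string(n) for n in range(7)])
```

The empty string corresponds to 0, the strings of length 1 to the numbers 1 and 2, the strings of length 2 to the numbers 3 through 6, and so on.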
We use $\le^+$ to denote inequality up to a fixed additive constant. For example, $f(x) \le^+ g(x)$ means that there is a constant c such that we have $f(x) \le g(x) + c$ for all x in some specific set that will be clear from the context. Similar notation such as $=^+$ is defined likewise.
Enumerations of any type of objects are always meant to be effective.

Prefix-free Kolmogorov complexity
For further use, we compile some standard facts about Kolmogorov complexity. For proofs of these facts, as well as for definitions, details and further background, we refer to Li and Vitányi [5] and to Downey and Hirschfeldt [3]. For a string x, we let $x^*$ denote the program of minimum length for x that appears first in some fixed enumeration of the domain of the universal machine used to define K. By the latter condition, the string $x^*$ can be computed given x and K(x). For a string x of length n, we have
$$K(x) \le^+ n + 2\log n. \tag{2}$$
Indeed, it holds that $K(x) \le^+ n + K(n)$ for all strings x. By Chaitin's counting theorem [3, Theorem 3.7.6], there is a constant d such that for all t, for at most a fraction of $2^{-t+d}$ of all words x of length n, we have $K(x) \le n + K(n) - t$. In the special case where t is equal to K(n) + 1, we obtain that at most a fraction of $2^{-K(n)-1+d}$ of all words x of length n is nonrandom in the sense that K(x) < n. By symmetry of information, we refer to the following chain of equations
$$K(x, y) =^+ K(x) + K(y \mid x^*) =^+ K(y) + K(x \mid y^*),$$
which holds for all strings x and y. In case both strings have the same length, we have $K(xy) =^+ K(x, y)$, and symmetry of information remains valid with K(x, y) replaced by K(xy). Symmetry of information is due to Levin and Gács and also Chaitin [3, Theorem 3.10.2].

Outline
First, in Section 2 we review the limit-computable functions and introduce the oscillation hierarchy. Then, in Section 3, we derive some basic properties of NID and, in particular, reprove the known results that NID can neither be effectively approximated from below nor from above. Before demonstrating in Section 5 our main result, Theorem 5.1, we collect in Section 4 notation and facts to be used in its proof, including the already mentioned conditional immunity result, which is stated as Theorem 4.7.

Limit-computable functions
In this section we introduce notation that relates to approximations of real-valued functions on the natural numbers.
This notation extends canonically to real-valued functions with a countable domain like {0, 1} * , Q, ω × ω, or similar via the usual identification of such a domain with the set of natural numbers.In particular, this notation will be applied to approximations of the function NID, which maps pairs of strings to a rational number.
For a start, we recall the following notation from computability theory.
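Definition 2.1. An approximation of a function F : ω → R is a rational-valued function f with two arguments such that $\lim_{s \to \infty} f(x, s) = F(x)$ for all x, and such an approximation is computable if f is a computable function. The function F is limit-computable if F has a computable approximation.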
Given a computable approximation of an ω-valued function F, by rounding the values of the approximation to the nearest natural number, we obtain a computable ω-valued approximation f to F where then, in particular, for each argument x almost all values f(x, s) are equal to F(x). As a consequence, ω-valued limit-computable functions are just the limit-computable functions from computability theory, which are also called computably approximable or $\Delta^0_2$-functions. By Shoenfield's Limit Lemma [6, IV.1.17], an ω-valued function F is limit-computable if and only if F is computable with the halting problem ∅′. The three following remarks show that this equivalence is false for rational-valued functions in general, but it extends to rational-valued functions such as NID, where for given arguments one can compute a finite set of candidate rational numbers that contains the function value. The reason is that for functions of the latter type, from any effective approximation that converges in distance, we obtain an effective approximation that converges in value by rounding the approximating values to the nearest value in the set of candidates, see Remark 2.3 for details in the case of NID.
Remark 2.2. Let $W_0, W_1, \ldots$ be the standard enumeration of all c.e. sets. Fix some enumeration $(e_0, n_0), (e_1, n_1), \ldots$ of all pairs (e, n) such that n is in $W_e$, and let $W_{e,s}$ be equal to the set of all $n_i$ such that $e_i = e$ and i < s. If we let F(e) = 1 in case $W_e$ is empty, let F(e) = 0 in case $W_e$ is infinite and, otherwise, let
Note that F(e) is equal to 0 if and only if $W_e$ is infinite. The function f is computable, hence F is limit-computable.
However, as a rational-valued function, F is not computable with the halting problem because otherwise, the halting problem would decide the $\Pi^0_2$-complete index set of all e such that $W_e$ is infinite, a contradiction.
Remark 2.3. Let f be a computable approximation of NID, i.e., f converges to NID in distance in the sense that for any arguments x and y, the difference between f(x, y, s) and NID(x, y) goes to zero. By definition of NID and the upper bounds (1) and (2) on prefix-free Kolmogorov complexity, for some constant c any value of the form NID(x, y) must be contained in a finite set V(x, y) of rational numbers that can be computed from x and y. By rounding any value f(x, y, s) to the nearest value in V(x, y), breaking ties arbitrarily, we obtain an approximation $f^R$ that converges to NID not just in distance but also in value, i.e., for all x and y the approximated value $f^R(x, y, s)$ is equal to NID(x, y) for almost all s.
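The rounding step of Remark 2.3 can be sketched as follows. The candidate set and the approximating values below are hypothetical and serve only as an illustration; the function name `round_to_candidates` is an assumption of this sketch.

```python
from fractions import Fraction

def round_to_candidates(value, candidates):
    """Nearest candidate to value; ties go to the first candidate listed."""
    return min(candidates, key=lambda v: abs(v - value))

# Hypothetical finite candidate set: all fractions p/q with 0 <= p <= q <= 4.
candidates = sorted({Fraction(p, q) for q in range(1, 5) for p in range(q + 1)})

# Hypothetical approximation that converges in distance to 2/3.
approximation = [Fraction(9, 10), Fraction(7, 10), Fraction(17, 25), Fraction(667, 1000)]

rounded = [round_to_candidates(v, candidates) for v in approximation]
print(rounded)
```

Once the approximating values are closer to the limit than to any other candidate, the rounded values are constant and equal to the limit, i.e., the rounded approximation converges in value.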
Remark 2.4. A rational-valued function F is computable with the halting problem if and only if it has a computable approximation f that converges to F by value in the sense of Remark 2.3, i.e., such that for all x the value f(x, s) is equal to F(x) for almost all s. The proof is essentially the same as the proof of Shoenfield's Limit Lemma; details are omitted.

Increasing and decreasing phases
Given an approximation to some limit-computable function, for any given argument we consider the alternations of the approximation between going up and going down. By bounding the number of such changes by a constant for all arguments, we obtain a fine hierarchy for limit-computable functions.
Definition 2.5. Let f be an approximation of some function ω → R and fix some natural number x. With x understood, let
$$\delta(x, s) = f(x, s + 1) - f(x, s)$$
be the increase of f at s, and call s increasing in case δ(x, s) > 0, and call s decreasing in case δ(x, s) < 0. Furthermore, a subset of the natural numbers is monotonic in case it does not contain both increasing and decreasing indices.
For any given natural number x, phase t of f on x is defined inductively for all t > 0 as follows. Phase 1 is equal to the maximum initial segment of ω on which f is monotonic. In the induction step, assume that for some t > 1 the phases 1 through t − 1 are already defined. If the union of the latter phases is all of ω, these are the only phases of f on x. Otherwise, let $m_t$ be the maximum member in phase t − 1 and let phase t be equal to the maximum initial segment of $\omega \setminus \{0, \ldots, m_t\}$ on which f is monotonic.
The approximation f reaches at most phase t on x in case there is no phase t + 1 on x. In case the latter holds for all natural numbers x, the approximation f reaches at most phase t. A phase is increasing if it contains an increasing index, and a phase is decreasing if it contains a decreasing index. The next remark states without proofs some straightforward properties of phases.
Remark 2.6. Let f be an approximation of some function ω → R and let x be a natural number. The phases of f on x form a partition of the natural numbers into successive contiguous intervals, which are all finite unless the partition is finite, in which case exactly the last phase is infinite. In case the function s → f(x, s) is constant, there is exactly one phase, which is neither increasing nor decreasing. Otherwise, each phase is either increasing or decreasing, and increasing and decreasing phases alternate. With the possible exception of phase 1, a phase is increasing or decreasing if and only if the least index in the phase is increasing or decreasing, respectively.
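The inductive definition of phases can be sketched for a finite prefix of an approximation on a fixed argument. The sketch assumes that the increase at stage s is the difference between the values at stages s + 1 and s; the function name `phases` and the sample values are introduced for illustration only. On a finite prefix, the last phase returned may of course extend further at later stages.

```python
def phases(values):
    """Partition the indices 0 .. len(values) - 2 into successive phases."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    result, current, direction = [], [], 0
    for s, d in enumerate(deltas):
        sign = (d > 0) - (d < 0)   # +1 increasing, -1 decreasing, 0 neither
        if sign != 0 and direction != 0 and sign != direction:
            # Index s would destroy monotonicity, so a new phase starts at s.
            result.append(current)
            current, direction = [], 0
        current.append(s)
        if sign != 0:
            direction = sign
    if current:
        result.append(current)
    return result

print(phases([1, 3, 2, 2, 5]))
```

In the example, index 0 is increasing, index 1 is decreasing, index 2 is neither, and index 3 is increasing, so the prefix has reached phase 3.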

The oscillation hierarchy
The levels of the oscillation hierarchy introduced next stratify the class of limit-computable functions according to the number of alternations between increasing and decreasing phases.
Definition 2.7. Let k be a nonzero natural number. A $\Sigma^{-1}_k$-approximation is a computable approximation f of some function ω → R such that on every input the approximation in the first phase is either constant or increasing and f reaches at most phase k. The definition of $\Pi^{-1}_k$-approximation is literally the same except that the first phase is required to be decreasing instead of increasing.
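The class $\Sigma^{-1}_k$ consists of all functions that have a $\Sigma^{-1}_k$-approximation, and the class $\Pi^{-1}_k$ is defined likewise. The oscillation hierarchy is defined as
$$\bigcup_{k \ge 1} \Sigma^{-1}_k.$$
The functions in $\Sigma^{-1}_1$ and in $\Pi^{-1}_1$ are also called approximable from below and approximable from above, respectively.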

Normalizing approximations of NID
We write $\mathrm{NID}_s(x, y)$ for approximations of NID, i.e., we have $\lim_{s \to \infty} \mathrm{NID}_s(x, y) = \mathrm{NID}(x, y)$ and $\mathrm{NID}_s(x, y) \in \mathbb{Q}$.
Notions relating to approximations are extended to this notation in the natural way, e.g., such an approximation is computable if $\mathrm{NID}_s(x, y)$ is a computable function of s, x, and y. In the same fashion, let $K_s$ be some fixed computable approximation from above to K with values in the natural numbers, and similarly for conditional prefix-free Kolmogorov complexity.
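Definition 2.8. The Kolmogorov approximation $\mathrm{NID}^K_s$ to NID is defined by
$$\mathrm{NID}^K_s(x, y) = \frac{\max\{K_s(x \mid y), K_s(y \mid x)\}}{\max\{K_s(x), K_s(y)\}}.$$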
Remark 2.9. Let $\mathrm{NID}_s$ be any effective approximation of NID that reaches at most phase m and, like in Remark 2.3, let $\mathrm{NID}^R_s$ be the version of $\mathrm{NID}_s$ where the function values have been rounded to the nearest value in the set V(x, y). Then for all x and y and for almost all i, we have
$$\mathrm{NID}^R_i(x, y) = \mathrm{NID}^K_i(x, y) \tag{3}$$
because both sides of the equation converge in value to NID(x, y) in the sense of Remark 2.3. So we can fix a computable sequence i(0) < i(1) < ... such that (3) holds with i replaced by i(s) for all s.
Let $\widetilde{\mathrm{NID}}_s(x, y) = \mathrm{NID}^R_{i(s)}(x, y)$ and call $\widetilde{\mathrm{NID}}_s$ the normalized version of $\mathrm{NID}_s$. Note that $\widetilde{\mathrm{NID}}_s$ is indeed an effective approximation of NID and reaches at most phase m, too. For a proof of the latter property, observe that $\mathrm{NID}^R_s$ reaches at most phase m since the latter function may only increase in s in case $\mathrm{NID}_s$ increases, and similarly for decreasing. Thus it suffices to observe that by construction $\widetilde{\mathrm{NID}}_0(x, y), \widetilde{\mathrm{NID}}_1(x, y), \ldots$ is a subsequence of $\mathrm{NID}^R_0(x, y), \mathrm{NID}^R_1(x, y), \ldots$

Some basic properties of NID
In Section 5, we will show our main result that NID is not in the oscillation hierarchy. Before, we derive in the current section some basic properties of NID and give new proofs for the known facts [9] that NID is approximable from neither below nor above.
Proof of Lemma 3.1. For a proof of (4), observe that by definition we have NID(x, x) = K(x | x)/K(x). For fractions of the latter form, with growing length of x the denominator goes to infinity, whereas the numerator is bounded from above by a constant, so NID(x, x) tends to 0. For a proof of (5), observe that by a standard counting argument, for some constant c, all sufficiently large n and some x of length n, we have
So by definition of NID and (1), it holds for almost all n and all such x that NID(x, $0^n$)
A set is called immune if it is infinite, but it does not contain an infinite c.e. subset. Immune sets were introduced by Post, and they play an important role in computability theory, cf. Odifreddi [6].
Proof. In case the theorem were false, fix an enumeration of some infinite c.e. subset of the set under consideration. Among all strings of length at least t, let $x_t$ be the one that is enumerated first. There is a prefix machine with some coding constant c that outputs $x_{4n}$ when given the string $10^{n-1}$ as input, hence $2n \le K(x_{4n}) \le n + c$ for all n, a contradiction.
Proof. First note that $X_r$ is infinite by Lemma 3.1. Now suppose for a contradiction that $X_r$ has an infinite c.e. subset A.
For each n there are at most finitely many pairs (x, y) where |x| = |y| = n, hence by taking an appropriate effective subsequence of some fixed enumeration of A, we obtain an enumeration $(x_0, y_0)$, (x
Proof. By Proposition 3.3, the set $X_{1/3}$ defined there is immune. But if NID were approximable from below, this set would be c.e., hence could not be immune.

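Lemma 3.5.
There is no computable sequence of pairs $(x_1, y_1), (x_2, y_2), \ldots$ of strings where $|x_k| = |y_k|$ and $\mathrm{NID}(x_k, y_k) < \frac{1}{k}$ for all k.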
Proof. Assume for a contradiction that there is a sequence as in the lemma. Fix a constant $c_0$ such that for all strings x and y of equal length, the values K(x) and K(y) are both less than or equal to $K(xy) + c_0$. There is a prefix-free machine with some coding constant $c_1$ that outputs $x_{2k} y_{2k}$ when given the binary string $10^{k-1}$ as input, hence $K(x_{2k} y_{2k}) \le k + c_1$ for all k. In summary, we have a plain contradiction for all $k > c_0 + c_1$.
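The contradiction can be spelled out as follows. This is a sketch under two assumptions: that the sequence in the lemma satisfies NID($x_k$, $y_k$) < 1/k, as in the later applications of the lemma, and that conditional complexities, like all values of K, are nonzero.

```latex
\begin{align*}
1 \;\le\; E(x_{2k}, y_{2k})
  &= \mathrm{NID}(x_{2k}, y_{2k}) \cdot \max\{K(x_{2k}), K(y_{2k})\} \\
  &< \frac{1}{2k}\,\bigl(K(x_{2k}y_{2k}) + c_0\bigr)
   \;\le\; \frac{k + c_0 + c_1}{2k}
   \;<\; 1
   \qquad \text{for all } k > c_0 + c_1 .
\end{align*}
```

The last inequality holds because k + c_0 + c_1 < 2k is equivalent to k > c_0 + c_1.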
Proof. Assume for a proof by contradiction that the proposition is false. By Lemma 3.1, the values NID(x, x) tend to 0, thus by dovetailing approximations from above to the values NID(x, x) for all x, for given k one can effectively find a string $x_k$ such that $\mathrm{NID}(x_k, x_k) < \frac{1}{k}$. This contradicts Lemma 3.5.
Proof. Suppose for a contradiction that NID is in $\Sigma^{-1}_2$, that is, it has a computable approximation that starts with an increasing phase and reaches at most phase 2. Consider the pairs (x, y) of words where |x| = |y|. By Lemma 3.1, there are infinitely many such pairs (x, y) where $\mathrm{NID}(x, y) > \frac{3}{4}$. Consequently, we can effectively find infinitely many such pairs (x, y) such that the approximation of NID(x, y) attains a value strictly larger than $\frac{3}{4}$ during phase 1. If for some k ≥ 2 and almost all pairs (x, y) of the latter kind it would actually hold that $\mathrm{NID}(x, y) > \frac{1}{k}$, this would contradict Proposition 3.3. As a consequence, for every k ≥ 2 there is a pair (x, y) of words of identical length where the approximation becomes smaller than $\frac{1}{k}$ during phase 2, and for all such k, x, and y we have $\mathrm{NID}(x, y) < \frac{1}{k}$ because the approximation never reaches phase 3. For given k such x and y can be found effectively, which contradicts Lemma 3.5.

Random and independent pairs
Before we demonstrate in the next section our main result, Theorem 5.1, we collect some notation and facts used in its proof.
Proof of Lemma 4.2. By Chaitin's counting theorem, there is a constant d such that for given n, at most $2^{n-K(n)-1+d} \cdot 2^n$ many pairs have a nonrandom first component, and the same bound holds for the number of pairs with nonrandom second component.
Consequently, among the $2^{2n}$ pairs of words of length n at most $2^{2n-K(n)+d}$ are nonrandom, which is a fraction of at most ε for almost all n.
As usual, let an order be a function from the set of natural numbers to itself that is nondecreasing and unbounded.
Definition 4.3. With some computable order h understood, a pair (a, a′) of strings of equal length n is independent conditioned on a string x if we have
$$K(a' \mid a, x) \ge n - h(n) \quad\text{and}\quad K(a \mid a', x) \ge n - h(n), \tag{7}$$
and (a, a′) is independent if it is independent conditioned on the empty string.
Lemma 4.4. Let ε > 0 be a real number and let h be some order. Then there is some $n_0$ such that for all $n \ge n_0$ and for any fixed word x, all but a fraction of at most ε of the pairs (a, a′) of words of equal length n are independent conditioned on x.
Proof. Fix any natural number n and any word x. The number of words of length strictly less than n − h(n) is bounded from above by $2^{n-h(n)}$, hence for given a the latter bounds also the number of words a′ such that $K(a' \mid a, x) < n - h(n)$. As a consequence, the number of pairs (a, a′) of words of length n that do not satisfy the first inequality in (7) is at most $2^n \cdot 2^{n-h(n)}$. By symmetry, the same upper bound holds for the number of pairs (a, a′) that do not satisfy the second inequality in (7). Consequently, among the $2^{2n}$ pairs of words of length n, at most $2 \cdot 2^{2n-h(n)}$ many pairs are not independent conditioned on x, i.e., at most a fraction of $2 \cdot 2^{-h(n)}$. The latter bound is at most ε for all n larger than some appropriate number $n_0$ that does not depend on x.
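As a worked instance of the final bound, consider the order h(n) = log n, which is the choice made later in the proof of Theorem 5.1. The sketch assumes logarithms to base 2 and ignores rounding of log n to a natural number.

```latex
\[
  2 \cdot 2^{-h(n)} \;=\; 2 \cdot 2^{-\log n} \;=\; \frac{2}{n} \;\le\; \varepsilon
  \quad\Longleftrightarrow\quad
  n \;\ge\; \frac{2}{\varepsilon},
  \qquad\text{so one may take } n_0 = \left\lceil \frac{2}{\varepsilon} \right\rceil .
\]
```

For example, for ε = 1/100 all lengths n ≥ 200 suffice, independently of the word x.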

Conditional immunity
As a further ingredient to the proof of Theorem 5.1, we derive a result about the undecidability of independence of random strings.More precisely, we show that there is no algorithm that, given two random strings of the same length, can decide whether they are independent or not, where it is agreed that the algorithm may fail to converge or to give the right answer if one or both of the strings are not random.In fact, we need a stronger assertion, which will be formulated in terms of the following notion of conditional immunity.

Definition 4.5.
A set A is decidable conditional to a set C if there is a partial computable function ϕ such that for all x in C the value ϕ(x) is defined and equal to A(x). A set A is immune conditional to a set C if the set A ∩ C is infinite but does not contain an infinite set of the form B ∩ C where the set B is c.e.
Suppose that A is decidable conditional to C and that this is witnessed by the partial computable function ϕ. Then A cannot be immune conditional to C because the set A ∩ C is either finite or contains the infinite set B ∩ C where B is the c.e. set of all n where ϕ(n) is equal to 1. Hence, conditional immunity is a strong form of conditional undecidability.
Decidability and immunity conditional to the set of natural numbers are just classical decidability and immunity, respectively. Classically, a set is not immune if it is either finite or has an infinite c.e. subset. Indeed, this c.e. subset can always be assumed to be decidable, since every infinite c.e. set contains an infinite decidable subset. The following remark shows that a similar statement is false for conditional immunity.

Remark 4.6.
There are sets A and C with infinite intersection such that A is not immune conditional to C but the latter fact is not witnessed by any computable set.
For a proof, let D be any infinite subset of some c.e. set A, and let $C = D \cup \overline{A}$, where $\overline{A}$ denotes the complement of A. Then A witnesses that A is not immune conditional to C. Moreover, for any set B that witnesses the latter fact, by definition the set B ∩ C is an infinite subset of A, hence B is a subset of A that has an infinite intersection with D.
So it suffices to show that any noncomputable c.e. set A has an infinite subset D where B ∩ D is finite for any computable subset B of A. To this end, let $B_0, B_1, \ldots$ be a (noneffective) list of all computable subsets of A and let $F_e = \bigcup_{i \le e} B_i$ for all e. Each set $F_e$ is a computable subset of A, hence $A \setminus F_e$ is infinite by noncomputability of A. This implies that there is an ascending sequence $d_0 < d_1 < \cdots$ of members of A where $d_e \notin F_e$ for all e. So the set $D = \{d_e : e \ge 0\}$ is an infinite subset of A such that for all e, the intersection $D \cap B_e$ is contained in $\{d_0, \ldots, d_{e-1}\}$.
Theorem 4.7. Let r > 0 be a real number. Let R be the set of random pairs of equal length and let I be the set of pairs of equal length that are not mutually r-compressible, i.e., let
$$R = \{(x, y) : |x| = |y|,\ K(x) \ge |x| \text{ and } K(y) \ge |y|\},$$
$$I = \{(x, y) : |x| = |y| \text{ and } \bigl(K(x \mid y) > r|x| \text{ or } K(y \mid x) > r|y|\bigr)\}.$$
Then the set I is immune conditional to R.
Proof. By the discussion of Chaitin's counting theorem in the introduction and Lemma 4.2, it follows easily that the intersection of the sets R and I is infinite; details are left to the reader. So suppose for a contradiction that there exists a c.e. set B such that R ∩ B is an infinite subset of I. Fix any pair (x, y) in R ∩ B and without loss of generality assume K(y | x) > r|y|.
If we let n be equal to the length of x and y, we have for some constant c
$$K(xy) \;\ge^+\; K(x) + K(y \mid x^*) \;\ge^+\; K(x) + K(y \mid x, K(x)) \;\ge^+\; K(x) + K(y \mid x) - c \log n \;\ge\; n + rn - c \log n.$$
Here the inequalities follow, from left to right, by the variant of symmetry of information stated in the paragraph on Kolmogorov complexity, because $x^*$ can be computed given x and K(x), because applying (1) twice yields K(K(x)) < c log n for some constant c, and, finally, by assumption on the pair (x, y). Consider any n that is so large that $c \log n < \frac{r}{2} n$, and such that R ∩ B contains pairs (x, y) of words of length n. Then, on the one hand, for each such pair, we have $K(xy) \ge^+ n + \frac{r}{2} n$. On the other hand, for each such n there is such a pair $(x_n, y_n)$ where $K(x_n y_n) \le^+ n$, a contradiction. In order to obtain $(x_n, y_n)$ as claimed, let $z_n$ be the string of length n that is enumerated last in some fixed enumeration of all nonrandom strings (of all lengths). Then knowing $z_n$ one knows all random strings of length n. Thus we can compute from $z_n$ the pair $(x_n, y_n)$ that among all random pairs of strings of length n is enumerated first into B. Since $K(z_n) < n$, we have $K(x_n y_n) \le^+ n$.

NID is not in the oscillation hierarchy
Our main result Theorem 5.1 asserts that NID is not in the oscillation hierarchy, which confirms a conjecture by Terwijn, Torenvliet, and Vitányi [9].
We begin by giving an informal description of the proof of Theorem 5.1. For a proof by contradiction, we assume that there is a computable approximation $\mathrm{NID}_s$ to NID that reaches at most phase m for some natural number m. By Remark 2.9, we can assume that this approximation $\mathrm{NID}_s$ is normalized, i.e., is obtained by approximating prefix-free Kolmogorov complexity. We may thus argue, for example, that the approximated values $\mathrm{NID}_s(x, y)$ become larger in case the approximations to K(x) and K(y) become smaller while the approximations to K(x | y) and K(y | x) remain the same. By using such formulations we aim at a very rough intuitive description of the phenomena that occur, which is, however, not precise enough to provide a sketch of the formal proof.
In the proof of Theorem 5.1, we fix rational numbers α and β where β < α < 1.
In the induction step, we assume that there is an increasing phase $t_0$ during which the approximation goes above α for infinitely many pairs (a, a′) and some constant nonzero fraction of all pairs (bc, b′c′). Then we argue that this includes infinitely many pairs (ab, a′b′) such that at some later stage the pair (b, b′) appears to be at the same time random and mutually highly compressible. By the latter property and the independence condition it follows that NID(abc, a′b′c′) < β, which in turn implies that there must be a decreasing phase $t_1 > t_0$ during which the approximation goes below β for infinitely many pairs (ab, a′b′) and some constant nonzero fraction of all pairs (c, c′). Next we argue that for infinitely many of these pairs (ab, a′b′) it turns out later that the pair (b, b′) is mutually highly compressible, which together with the independence condition implies that NID(abc, a′b′c′) > α. Consequently, there must be an increasing phase $t_2 > t_1$ during which the approximation goes above α for infinitely many pairs (ab, a′b′) and a nonzero fraction of all pairs (c, c′).
Intuitively speaking, in the induction step it is argued that there are sufficiently many argument pairs (abc, a′b′c′) for which the approximation $\mathrm{NID}_s$ first goes above α during phase $t_0$, then goes below β during phase $t_1$, and finally goes again above α during phase $t_2$. This holds because there are sufficiently many pairs b and b′ that first appear to be random and mutually incompressible, then, second, appear to be random and mutually compressible, and, third, finally appear to be nonrandom and mutually compressible.
Proof. For a proof by contradiction, assume that NID is in $\Sigma^{-1}_m$ for some m > 1, i.e., has a computable approximation $\mathrm{NID}_s(x, y)$ that always starts with an increasing phase and reaches at most phase m on all arguments. In what follows, we speak of increasing phases 1, 3, ... and decreasing phases 2, 4, .... Choose the rational r > 0 so small that
For the scope of this proof, call a pair (w, w′) of words t-high in case phase t is increasing and contains some s where $\mathrm{NID}_{s+1}(w, w') > \alpha$. Observe that all sets of the form A(k, t, ε) are empty in case t > m, as well as in case phase t is decreasing, by choice of $\mathrm{NID}_s$ and by definition of t-high.
In the remainder of this proof, the notion independent conditioned on a certain word is always meant with respect to the fixed order h(n) = log n. In particular, the values h(n)/n tend to 0, hence for any constant ℓ, we have
Claim 1. There is some phase t ≤ m such that $A(m, t, \frac{1}{2m})$ is infinite.
Proof. Let n be a natural number, let a = $0^n$ and let u and u′ be any words of length $(3^m - 1)n$. Then we have $K(au) =^+ K(u)$, where the constants hidden in the notation $=^+$ do not depend on n, a, u or u′. Thus for some constant d that is again independent of the latter four parameters, in case n is sufficiently large and u and u′ are independent, we have
By the preceding discussion and Lemma 4.4, for almost all n and at least half of all pairs (u, u′) of words of length $(3^m - 1)n$, we have $\mathrm{NID}(0^n u, 0^n u') > \alpha$, hence $(0^n u, 0^n u')$ must be t′-high for some phase t′, where t′ ≤ m by assumption on the approximation $\mathrm{NID}_s$. Hence there must be some t ≤ m such that for infinitely many n for a fraction of at least
and ε > 0 such that the set A(m − j, t, ε) is infinite. This is a contradiction because the latter set must be empty as the approximation $\mathrm{NID}_s$ is assumed to reach at most phase m < t. Observe that $m - j \ge m - \frac{m}{2} \ge 1$ since m > 1.
In order to demonstrate Claim 2, fix k, $t_0$ and ε as in the assumption of the claim. For the remainder of this proof, when using the letters a, b, c, w, and n with or without decoration in the same context, we always assume that we have w = abc, |a| = n, |b| = 2n, |c| = ℓn where $\ell = 3^k - 3$, i.e., $|w| = 3^k n$.
In particular, we assume for all pairs of the form (a, a′), (ab, a′b′), or similar that the two components of the pair have equal length. By abuse of notation, quantification over words and pairs of words involving the mentioned variable names is restricted to words of the form just described. For example, if we use the phrase for all words a and b, this is meant as abbreviating the phrase for all n and all words a of length n and b of length 2n. Recall the concept of a random pair introduced in Definition 4.1.
is $t_0$-high for a fraction of at least ε/2 of all (c, c′)}.
Proof. For any pair (a, a′) in A(k, $t_0$, ε), the pair (ab, a′b′) is in $B_0$ for a fraction of at least ε/2 of all pairs (b, b′). Otherwise, if this fraction were q < ε/2, the fraction of pairs (bc, b′c′) such that (abc, a′b′c′) is $t_0$-high would be strictly less than
From the proof of Theorem 5.1 it is obvious that the examples of pairs of strings x, y forcing the changes in the approximation $\mathrm{NID}_s$ are of rather long length. It would be interesting to have a more careful analysis of these lengths.
Question 5.2. Relate the number of oscillations of approximations of NID(x, y) to the length of x and y.

CRediT authorship contribution statement
All three authors have contributed equally to this paper.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Lemma 3.1. The values of $\mathrm{NID}(x, y)$ come arbitrarily close to 0 and 1 even if the arguments are restricted to strings $x$ and $y$ of the same length. In fact, the following slightly stronger assertions hold:
\[ \lim_{n \to \infty} \max\{\mathrm{NID}(x, x) : |x| = n\} = 0, \tag{4} \]
\[ \lim_{n \to \infty} \max\{\mathrm{NID}(x, 0^n) : |x| = n\} = 1. \]
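The first of the two limits can be traced back directly to the definition of NID. The following is only a sketch of that calculation; the constant $c$ is introduced here for illustration and bounds $K(x \mid x)$ uniformly in $x$.

```latex
\[
  \mathrm{NID}(x, x)
  = \frac{\max\{K(x \mid x),\, K(x \mid x)\}}{\max\{K(x),\, K(x)\}}
  = \frac{K(x \mid x)}{K(x)}
  \le \frac{c}{K(x)} ,
\]
\[
  \max\{\mathrm{NID}(x, x) : |x| = n\}
  \le \frac{c}{\min\{K(x) : |x| = n\}}
  \xrightarrow[n \to \infty]{} 0 .
\]
```

The last step uses that only finitely many strings have complexity below any given bound, so the minimum of $K(x)$ over all $x$ of length $n$ tends to infinity.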

Definition 4.1. Let $r > 0$ be a real number and let $a$ and $a'$ be words. The word $a$ is random if $K(a) \ge |a|$, and the pair $(a, a')$ is random if $a$ and $a'$ are both random. The string $a$ is $r$-compressible if $K(a) \le r|a|$, and the pair $(a, a')$ is $r$-compressible if $a$ and $a'$ are both $r$-compressible. The pair $(a, a')$ is mutually $r$-compressible if we have $K(a \mid a') \le r|a|$ and $K(a' \mid a) \le r|a'|$.
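Since $K$ is not computable, the notions of this definition cannot be tested exactly. Purely as an illustration, the sketch below replaces $K$ by the output length of a real compressor (zlib), which is only a crude stand-in and not part of the paper. All function names are hypothetical, lengths are measured in bits, and the conditional proxy `c_cond` (extra compressed bits for `x` once `y` has been seen) is an assumption made here for the example.

```python
import hashlib
import zlib


def c_len(data: bytes) -> int:
    """Compressed length in bits under zlib: a heuristic stand-in for K, not K itself."""
    return 8 * len(zlib.compress(data, 9))


def c_cond(x: bytes, y: bytes) -> int:
    """Heuristic proxy for K(x | y): extra compressed bits needed for x after y."""
    return max(c_len(y + x) - c_len(y), 0)


def looks_random(a: bytes) -> bool:
    """Proxy for 'a is random', i.e. K(a) >= |a|."""
    return c_len(a) >= 8 * len(a)


def looks_r_compressible(a: bytes, r: float) -> bool:
    """Proxy for 'a is r-compressible', i.e. K(a) <= r|a|."""
    return c_len(a) <= r * 8 * len(a)


def looks_mutually_r_compressible(a: bytes, a2: bytes, r: float) -> bool:
    """Proxy for K(a | a') <= r|a| and K(a' | a) <= r|a'|."""
    return c_cond(a, a2) <= r * 8 * len(a) and c_cond(a2, a) <= r * 8 * len(a2)


def pseudo_random_bytes(n: int) -> bytes:
    """Deterministic, incompressible-looking test data (hypothetical helper)."""
    blocks = (hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]


if __name__ == "__main__":
    a = pseudo_random_bytes(4096)
    zeros = bytes(4096)
    print("hash stream looks random:", looks_random(a))
    print("zeros are 0.1-compressible:", looks_r_compressible(zeros, 0.1))
    print("(a, a) mutually 0.5-compressible:", looks_mutually_r_compressible(a, a, 0.5))
```

Under this proxy, a pair consisting of two copies of the same incompressible-looking string plays the role of a pair that is random and yet mutually $r$-compressible. A compressor only ever gives an upper-bound flavour of $K$, so none of these checks says anything rigorous about the definition itself.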

Lemma 4.2. Let $\varepsilon > 0$ be a real number. For almost all $n$, all but a fraction of at most $\varepsilon$ of the pairs $(a, a')$ of words of equal length $n$ are random.
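One standard route to a statement of this kind goes through the counting theorem for prefix-free complexity. The following is a sketch under that assumption; the constant $c$ is introduced here for illustration, and the paper's own proof may proceed differently.

```latex
\[
  \#\{a : |a| = n,\ K(a) < n\} \le 2^{\,n - K(n) + c}
  \quad\Longrightarrow\quad
  \frac{\#\{a : |a| = n,\ a \text{ not random}\}}{2^{n}} \le 2^{\,c - K(n)} ,
\]
\[
  \frac{\#\{(a, a') : |a| = |a'| = n,\ (a, a') \text{ not random}\}}{2^{2n}}
  \le 2 \cdot 2^{\,c - K(n)}
  \le \varepsilon
  \quad \text{for almost all } n .
\]
```

The second line is a union bound over the two components, and the final inequality holds for almost all $n$ because $K(n)$ tends to infinity.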

1 .
The proof has an inductive structure where in the induction step we consider approximations $\mathrm{NID}_s(w, w')$ for pairs of strings $w = abc$ and $w' = a'b'c'$ where $a$ and $a'$, $b$ and $b'$, as well as $c$ and $c'$ are of identical length, and where $a$ has length $n$, $b$ has length $2n$, and $c$ has length $\ell n$ for some fixed $\ell$ where $6 \le \ell \le 3^m - 3$. We use an independence condition for pairs of words where the fraction of pairs that fail to satisfy the condition among all pairs of words of length $\ell n$ tends to zero when $n$ goes to infinity. Thus if some property holds for almost all $n$ and a constant nonzero fraction of all pairs $(c, c')$ of words of length $\ell n$, then for almost all $n$

Theorem 5.1. NID is not in the oscillation hierarchy, i.e., NID is not in ${}^{-1}_{m}$ for any $m \ge 1$.

nonrandom and mutually compressible. That is, the maximum of $K(b)$ and of $K(b')$ and the maximum of $K(b \mid b')$ and $K(b' \mid b)$ appear first to be both high, second to be high and low, respectively, and, third, to be both low, where low means close to 0 and high means close to $|b|$. That such changes, which concern only the strings $b$ and $b'$, result in changes of the value of $\mathrm{NID}_s(abc, a'b'c')$ is because of independence. For an independent pair $(c, c')$, the prefix-free Kolmogorov complexity of $c$ and $c'$, as well as their mutual conditional prefix-free Kolmogorov complexity conditioned in addition on $(ab)^*$, are all so close to $|c|$ that the influence of $c$ on a value of the form $\mathrm{NID}_s(abc, a'b'c')$ can be neglected compared to the influence of $a$, $a'$, $b$, and $b'$. Since in addition the two former strings are short compared to the two latter strings, the described changes in prefix-free Kolmogorov complexity relating to $b$ and $b'$, though small compared to $|c|$, are still large enough to force $\mathrm{NID}_s(abc, a'b'c')$ below $\beta$ and above $\alpha$.

Claim 4. Claim 5.
Let $(c, c')$ be independent conditioned on $(ab, a'b')$ and let the pair $(ab, a'b')$ be $r$-compressible. Then $\mathrm{NID}(abc, a'b'c') > \alpha$.

Proof. For all sufficiently large $n$, we have
\[ K(abc) \le^{+} K(ab) + K(c \mid (ab)^*) \le^{+} r|ab| + K(c \mid \ell n) \le^{+} (3r + \ell)n, \]
\[ K(abc \mid a'b'c') \ge^{+} K(c \mid (a'b')^* (c')^*) \ge^{+} \ell n - h(\ell n), \]
where by symmetry the upper bound holds also for $K(a'b'c')$ and the lower bound holds also for $K(a'b'c' \mid abc)$. Similar to the proof of Claim 3, we obtain that there is a constant $d$ such that for all sufficiently large $n$ we have
\[ \mathrm{NID}(abc, a'b'c') \ge \frac{\ell n - h(\ell n) - d}{(\ell + 3r)n + d} = \frac{\ell - h(\ell n)/n - d/n}{\ell + 3r + d/n} \]

Infinitely many pairs $(ab, a'b')$ where the pair $(b, b')$ is random and mutually $r$-compressible are members of the set $B_0 = \{(ab, a'b') : (abc, a'b'c'$

$(x_1, y_1), \ldots$ of some infinite c.e. subset of $A$ where $|x_n| < |x_{n+1}|$ for all $n$. By the latter property and because $x_n$ and $y_n$ have equal length by definition of $X_r$, the values $\max\{K(x_n), K(y_n)\}$ tend to infinity, while the values $K(x_n \mid y_n)$ and $K(y_n \mid x_n)$ are both bounded from above by a fixed constant that does not depend on $n$. Consequently, the values $\mathrm{NID}(x_n, y_n)$ tend to 0, a contradiction.
Similarly, call the pair $t$-low in case phase $t$ is decreasing and contains some $s$ where $\mathrm{NID}_{s+1}(w, w') < \beta$. Given natural numbers $k$ and $t$, and a real number $\varepsilon$, let
\[ A(k, t, \varepsilon) = \{(a, a') : |a| = |a'| \text{ and for a fraction of at least } \varepsilon \text{ of all pairs } (u, u') \text{ of words of equal length } (3^k - 1)|a| \text{, the pair } (au, a'u') \text{ is } t\text{-high}\}. \]
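To make the notions of phase, $t$-high, and $t$-low concrete, here is a minimal sketch that operates on a finite prefix of an approximation given as a list of values. It rests on assumptions that are not taken from the paper: phases are numbered from 1, phase 1 is increasing, a step that leaves the value unchanged stays in the current phase, and $t$-high is read as the mirror image of $t$-low (an increasing phase containing some $s$ with value above $\alpha$ at stage $s+1$). The function names are hypothetical.

```python
from typing import List, Sequence


def step_phases(values: Sequence[float]) -> List[int]:
    """Label each step s -> s+1 with its phase number.

    Assumed convention: odd phases are increasing, even phases are
    decreasing, and the phase number grows by one whenever the direction
    of change switches. Steps without change stay in the current phase.
    """
    phase = 1
    labels = []
    for prev, cur in zip(values, values[1:]):
        in_increasing_phase = phase % 2 == 1
        if cur > prev and not in_increasing_phase:
            phase += 1  # switch from decreasing to increasing
        elif cur < prev and in_increasing_phase:
            phase += 1  # switch from increasing to decreasing
        labels.append(phase)
    return labels


def is_t_high(values: Sequence[float], t: int, alpha: float) -> bool:
    """Phase t is increasing and contains a step s with values[s + 1] > alpha."""
    if t % 2 == 0:
        return False
    labels = step_phases(values)
    return any(p == t and values[s + 1] > alpha for s, p in enumerate(labels))


def is_t_low(values: Sequence[float], t: int, beta: float) -> bool:
    """Phase t is decreasing and contains a step s with values[s + 1] < beta."""
    if t % 2 == 1:
        return False
    labels = step_phases(values)
    return any(p == t and values[s + 1] < beta for s, p in enumerate(labels))


if __name__ == "__main__":
    approx = [0.2, 0.5, 0.9, 0.6, 0.1, 0.4, 0.95]  # made-up values for illustration
    print(step_phases(approx))
    print(is_t_high(approx, 1, 0.8), is_t_low(approx, 2, 0.2), is_t_high(approx, 3, 0.8))
```

A true approximation $\mathrm{NID}_s$ is an infinite sequence, so a finite prefix can only show which phases have been reached so far, not how many will eventually occur.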
Before we prove Claim 2, we argue that the first two claims imply the theorem. By Claim 1, we can fix $t \le m$ such that $A(m, t, \frac{1}{2m})$ is infinite. By applying Claim 2 to the latter set for at most $m/2$ times, we obtain $j \in \{1, \ldots, m/2\}$, $t' > m$,

$\frac{1}{2m}$ of all pairs $(u, u')$ of words of length $(3^m - 1)n$ the pair $(0^n u, 0^n u')$ is $t$-high. For all such $n$, the pair $(0^n, 0^n)$ is in $A(m, t, \frac{1}{2m})$.

Claim 2. Let $k$ and $t$ be in $\{2, \ldots, m\}$, and let $\varepsilon > 0$ be a real number such that $A(k, t, \varepsilon)$ is infinite. Then $A(k - 1, t', \frac{\varepsilon}{4m^2})$ is infinite for some $t' \ge t + 2$.