MATHEMATICS An Improvement of Kolmogorov's Estimates Related to Random Number Generators and Definition of Randomness in Terms of Complexity A. L. Semenov and An. A. Muchnik Kibernetika Scientific Council, Russian Academy of sciences, ul. Vavilova 40, Moscow, 117333 e-mail: alsemenov@mtu-net.ru, muchnik@lpcs.math.msu.ru Presented by Academician Yu. I. Zhuravlev January 4, 2003 Received January 10, 2003 In the 1930s, A.N. Kolmogorov constructed a substantiation of probability theory by means of measure theory. However, not all problems related to the substantiation were solved. In 1963 [1], Kolmogorov started to develop a new approach to the problem, the theory of description complexity. In [1], he obtained upper and lower bounds for the maximal number of admissible choice rules for which a random number generator surely exists and stated the problem of obtaining an estimate of exact order. In this paper, we solve this problem of Kolmogorov; namely, we show that his lower bound is of exact order. In [1], Kolmogorov defined the notion of a rule of choice from a finite binary sequence @t. Its informal description is as follows (see [1] for the precise definition). Let us imagine that we have a set of cards; the number of cards equals the length of @t. The figures from the sequence @t are written on the card's faces; the backs of all cards are identical. First, the cards lie on their faces in the same order in which the figures are arranged in @t. The rule decides which card is to be overturned and (before the card is overturned) whether the figure written on the map should be included in the subsequence. In making the current decision, the rule can take into account the figures written on the cards already overturned. The subsequence chosen by a rule @r from a sequence @t is denoted by @r[@t]. Sometimes, it is useful to consider narrower classes of rules. The monotonic rules (considered for the first time by Church in 1940) always overturn maps in their initial order. The nonadaptive rules specify at once a set of cards, and the subsequence is formed by the figures written on these cards and arranged in the initial order. @definition 1.@ let @r@l be a set of rules for sequences of length @L. A sequences @t of length @L is called an (@n, @[epsilon])-random number generator for @R@L if each rule from @R@L chooses a subsequence @r[@t] in @t with the following property: If the length of the subsequence is not less than @n, then the fraction of zeros in this subsequences differs from 1@2 by a value smaller than @[epsilon]. The absolute value of the difference between 1@2 and the number of zeros in the sequence is called the deviation (of the fraction of zeros). @Remark.@ The results given below can be generalized to the case where 0 and 1 are encountered at frequencies close to @p and 1 -- @p, respectively, in all long subsequences chosen by simple rules. To give a precise meaning to the notion of not too complex rules, we define the complexity of a finite set to be the binary logarithm of its cardinality. Let us introduce the notation @d(@n, @[epsilon]) @ 2@n@[epsilon]@2log@2@e. @Theorem 1@ (Kolmogorov, 1963). @Consider arbitrary numbers @L (sequence length), @[epsilon] > 0 (the deviation of frequency from probability), and @n @ @[epsilon]@[--4] (sample size). For any set of rules @R@L of complexity less than @d(@n, @[epsilon])(1 -- @[epsilon]), there exists an (@n, @[epsilon])-random number generator.@ In [1], Kolmogorov only gave an outline of the proof of this theorem. The complete proof (which involve some delicacies) is contained in these authors' paper being prepared for publication in journal "Problemy Peredachi Informatsii." The following theorem is an algorithmic analog of Theorem 1 (the notion of conditional entropy was introduced by Kolmogorov in [2]). @Theorem 1@'.@ @Consider arbitrary numbers @L (sequence length), @[epsilon] > 0 (the deviation of frequency from probability), and @n @ @[epsilon]@[--4] (sample size). For the set @R@L consisting of all rules such that their conditional entropies at a known @L are smaller than @d(@n, @[epsilon])(1 -- @[epsilon]), there exists an (@n, @[epsilon])-random number generator.@ @Theorem 2@ (Kolmogorov, 1963). @Consider arbitrary numbers @L (sequence length), @[epsilon] @ (0, 1@20) (the deviation of frequency from probability), and @n @ [@[epsilon]@[--3], @L@2] (sample size). There exists a set @R@L of nonadaptive rules of complexity less than 4@n@[epsilon](1 + 5@[epsilon]) for which there exists no (@n, @[epsilon])-random number generator.@ Theorems 1 and 2 give lower and upper bounds, respectively, for the maximal number @[tau] such that, for any @L and any set of rules of complexity less than @[tau], there exists at least one (@n, @[epsilon])-random number generator of length @L. Since @d(@n, @[epsilon]) = 2@n@[epsilon]@2log@2@e is much smaller than 4@n@[epsilon](1 + 5@[epsilon]) at small @[epsilon], Kolmogorov wanted to remove the gap between the exponents of @[epsilon] in the lower and upper estimates. It turned out that the lower estimate obtained by Kolmogorov is of exact order (even for nonadaptive rules). @Theorem 3.@ @Consider arbitrary numbers @L (sequence length), @[epsilon] @ (0, 1@3) (the deviation of frequency from probability), and @n @ [2@[epsilon]@[--3]log@2@L, @L@2] (sample size). There exists a set @R@L of nonadaptive rules of complexity less than @d(@n, @[epsilon])@ for which there exists no (@n, @[epsilon])-random number generator.@ The existence of such a set of rules is proved probabilistically, as in Theorem 1. However, this time, we consider probability distribution over the rules and show that the event "for a set @R@L of rules, there exists an (@n, @[epsilon])-generator" has probability less than 1. We seek the required set of rules among the nonadaptive rules chosen by subsequences of length precisely @n; i.e., a rule is specified by an @n-element subset of the set 1, 2, ..., @L. The number of such rules is (@L@n); we introduce the uniform probability distribution on the set of these rules. Take a sequence @t of length @L. Let us estimate from below the probability that it not an (@n, @[epsilon])-generator for a randomly chosen rule @r, i.e., that the deviation in @r[@t] is not less than @[epsilon]. Suppose that the number of zeros in @t is not smaller that the number of ones (the opposite case is handled symmetrically). Consider the situation where the numbers @L@2 and @n(1@2 + @[epsilon]) are integer; the general case is easily reduced to this situation. We want to estimate from below the probability of such a deviation for which the fraction of zeros in a sample is larger than the number of ones by at least @[epsilon]; obviously, this probability is minimal when @t contains equally many zeros and ones. It is sufficient to estimate the probability of the deviation equal to precisely @[epsilon]. Obviously, this probability is @. Using the upper and lower bounds for the binomial coefficients implied by the Stirling formula and an estimate of the Shannon entropy, we conclude that the sought probability is larger than @e@[--@K], where @. The probability that, for a fixed sequence, one rule does not choose a subsequence with deviation at least @[epsilon], turns out to be smaller than 1 -- @e@[--K]. Now, let us select independently @N random rules (some of these rules may coincide). The probability that a fixed sequence @t is an (@n, @[epsilon])-generator for the set of these rules is smaller than @ (@*) (we have used the inequality (1 -- 1@x)@x < @e@[--1], which holds for @x > 1). Multiplying the value on the right-hand side of inequality (@*) by the number of sequences of length @L, we obtain a sharp upper bound for the probability of the existence of at least (@n, @[epsilon])-generator for the given set of rules; namely, @, which does not exceed 1 for @N = [@] < @e@K@L. The complexity of the set of rules under consideration is less than @. It remains to show that this value is less than @d(@n, @[epsilon])@ for @[epsilon] < 1@3 and @n > 2@[epsilon]@[--3]log@2@L. The following theorem is an algorithmic analog of Theorem 3. @Theorem 3@'.@ @Consider arbitrary integer @L (sequence length), rational @[epsilon] @ (0, 1@3) (the deviation of frequency from probability), and integer @n @ [2@[epsilon]@[--3]log@2@L, @L@2] (sample size). For the set @R@L(n, @[epsilon]) of all nonadaptive rules whose conditional entropy at given @L, @n, and @[epsilon] is smaller than @, there exists no (@n, @[epsilon])-random number generator. (Here, @C is a constant depending only on the choice of an optimal programming language.)@ By Theorem 3, for some set of nonadaptive rules of complexity less than @d(@n, @[epsilon])@, there exists no (@n, @[epsilon])-generator of length @L. Let us show that, for given @L, @n, and @[epsilon], a set of rules with this property can be constructed algorithmically. Indeed, for any set @R@L of nonadaptive rules and any sequence @t of length @L, we can determine effectively whether @t is an (@n, @[epsilon])-generator for @R@L (for this purpose, we must apply each rule from @R@L to @t and calculate the deviation). Searching through all sequences of length @L, we can determine whether there exist (@n, @[epsilon])-generators for @L. Searching through all sets of nonadaptive rules of a given size, we can find the required set (if there are several sets with the required property, we take the first in the list). The conditional (relative to @L, @n, and @[epsilon]) entropy of each rule from the found set does not exceed the complexity of the set plus the length of the program implementing the procedure described in the preceding paragraph. The addition of the remaining rules with entropy smaller than @d(@n, @[epsilon])@ + @C preserves the absence of an (@n, @[epsilon])-generator. ACKNOWLEDGMENTS The authors are grateful to N.K. Vereshchagin for interesting discussions of Kolmogorov's theory. The authors thank A.V. Chernov for great help in preparation of the paper for publication. The main content of the paper was reported at Kolmogorov seminar at Moscow State University in the spring of 2002. We are grateful to the participants of the seminar for attention. This work was financially supported by the Russian Foundation for Basic Research (project nos. 01-01-00505 and 02-01-10904). POSTSCRIPT In [1], Kolmogorov defined the notion of a table of (@n, @[epsilon], @p)-random numbers for an @R@L, which differs from an (@n, @[epsilon])-random number generator mentioned in Definition 1 in that the fractions of ones in the long samples obtained by rules from @R@L is close to @p rather than to 1@2. By @l(@n, @[epsilon]), Kolmogorov denoted the maximal number @l such that, for any @p, any @L, and any set of rules having complexity less than @l, there exists at least one table of (@n, @[epsilon], @p)-random numbers of length @L. The results obtained in [1] imply that, at sufficiently small @[epsilon] (say at @[epsilon] < 1@20) and @n @ @[epsilon]@[--4], @. Kolmogorov stated the problem of removing the gap between the exponents of @[epsilon] in the lower and upper bounds for @l(@n, @[epsilon]). Our Theorem 3 implies the inequality @, which holds at sufficiently small @[epsilon] and @n @ @[epsilon]@[--4]. (To prove this, it suffices to set @L = [@] in Theorem 3). Thus, both the upper and lower bounds for @l(@n, @[epsilon]) have the form (2log@2@e)@n(@[epsilon]@2 + @o(@[epsilon]@2)). REFERENCES 1. Kolmogorov, A.N., @Ind. J. Stat. Ser. A,@ 1963, vol. 25, part 4, pp. 369--376. Reprinted in @Theor. Comput. Sci.,@ 1998, vol. 207, pp. 387--395. Russian translation with author's addition: @Semiotika Informatika,@ 1982, vol. 18, pp. 3--13; reprinted in collection Kolmogorov A.N., @Teoriya informatsii i teoriya algoritmov@ (The Theory of Information and the Theory of Algorithms), Moscow: Nauka, 1987, pp. 204--213. 2. Kolmogorov, A.N., @Prob. Peredachi Inform.,@ 1965, vol. 1, no. 1, pp. 3--11. 1 11