Proof as a Mathematical Object: Proposals for a Research Program

The problem of representing logical implications and proofs by mathematical objects is considered. The need to develop a theory for measuring value and complexity of mathematical implications and proofs is discussed including motivations, benefits and implementation problems. Examples of mathematical considerations are given. Arguments supporting such an advance and its applications in mathematical research guidance and publication standards are given.


The aim of the article
The main aim of this article is to point out the need for, and the possibility of, interpreting and encoding logical implications (inferences, consequences) and proofs as mathematical objects, focusing on complexity and paying less attention to semantic specifics and syntactic issues. This means finding the simplest mathematical structure that faithfully represents mathematical proofs and theories, creativity and development. Motivations and possible benefits of this idea are discussed in a programmatic style. The idea is expected to lead to important advances such as proof complexity measures. It is also expected to have applications in research guidance and in the evaluation of mathematical results, which seem to be important in their own right. Arguments are mentioned which show that the proposed program will have new features compared to classical logic and computational logic (automated theorem proving). There are no theorems in this article, and most issues are discussed with a certain vagueness. This article is not intended to contribute to the literature on an established problem. The closest established problem is Hilbert's 24th problem. Some computer-theoretic aspects are discussed. The article is oriented mainly towards readers interested in advancing mathematical and computational logic.

History and the current state
The usefulness and the exceptional role of mathematics have been best expressed by the hypothetical Pythagorean saying "all is number". It metaphorically asserts that all physical objects, systems and processes may be precisely modelled mathematically. From a philosophical point of view, mathematics can be thought of as a universal epistemological framework created by the human intellect to justify knowledge, or to perform justification/regress steps, see (Pollock, 1975), for knowledge from various areas in a uniform way. An example of an intra-mathematical regress step is category theory, which maps various mathematical theories to graph theory with additional structures.
Mathematical activities have been mostly related to knowledge justification steps which are called applied mathematics: mathematical modelling for other sciences and the processing of numerical, geometrical and physical data. In this paper, the term 'model' means "mathematical model" (as opposed to "mathematical logic model"). The progress of both pure and applied mathematics has been greatly influenced by advances in the formalization and coordinatization of mathematical objects and of the mathematical language. Apart from many encoding breakthroughs, various semantic and syntactic problems of mathematical statements have been considered and solved.
In most areas of mathematics progress crucially depends on computing resources. Mathematicians also constantly use computers to check and verify theorems and to make computations termed "automated theorem proving" and "mechanical theorem proving", which are equivalent to proving statements in certain areas of mathematics, see (Robinson and Voronkov, 2001) and (Chou et al., 1994). Proposals for computer-based mathematical knowledge management systems such as the "QED manifesto", see (Wiedijk, 2007), have been made. See (Avigad et al., 2014) for a recent review of computerized theorem checking/proving. One can notice a major long-term trend in mathematics and its applications. The application areas that have precise mathematical models and are served by applied mathematics are constantly enlarging, and the models are getting more precise and rigorous. Even the most "unmathematical" notions and processes, for example those related to consciousness, cognition and psychological activities, may be subject to mathematical modelling (in particular, due to the emergence of mathematical models of cognitive processes and nervous systems), that is, to justification/regress steps in the philosophical sense. We call this trend the Pythagorean process.

Possible next steps
One can ask whether the Pythagorean process will continue and what its next steps may be. One feature that is still missing in mathematical culture is the modelling and representation of creative mathematical thinking going beyond its semantic and syntactic content: a precise expression of all mathematical implications and proofs as well-defined mathematical objects. In this paper we use the term "implication" to also denote "inference" and "consequence". Going beyond semantic and syntactic content means developing an implicational propositional calculus, that is, mapping implications to simpler mathematical objects. Another missing feature is a mathematical structure for mathematical theories, understood as collections of related mathematical results and proof techniques.
We conjecture that there will be advances in mathematics and its encoding which will allow us to go beyond the semantics and syntax of mathematical texts: to comprehensively coordinatize (map into mathematical objects) and precisely interpret proofs and mathematical theories as mathematical objects, known or new. Given a mathematical theory T (a structure containing objects of study, first-order or higher-order logic statements, proofs etc.) we may look for a mathematical object ρ(T) which would be a good model/representation of T: elements of T such as logical implications, proofs and subsets of mathematical statements in T would be defined as substructures or quotient structures of ρ(T). Additionally, we are interested in interpreting specific (extremal in a suitable sense) elements of ρ(T) as important statements in T. For any established mathematical domain such problems may be hard problems in pure mathematics; problems like this do not seem to have been posed before. Implications and proofs could also be considered as geometrical objects embedded in some suitable ambient space. This may be called a representation theory of logic; the meaning of representation here is different from that in group representation theory. The transfer from T to ρ(T) should be thought of philosophically as a regress step.
The proposed idea goes beyond standard algebraic logic and proof theory, which deal with constructions of systems of axioms, correct statements, syntactic and language problems, and problems of the expressive power of axiom and inference systems. The proposed research program also goes beyond programs such as Hilbert's program and the recent "QED manifesto" program because of its focus on implications and models of theories. Our idea can be metaphorically compared to introducing Cartesian coordinates: it assigns directions and lengths to implications, whereas standard classical logic is interested mainly in premises and conclusions. For the same reason, it goes beyond computational mathematical logic (e.g. Coq, see (Gonthier, 2008), Isabelle, see (Paulson, 1989), and other automated theorem proving systems), which deals with computerized proving or disproving of statements in a given formal language and with checking human proofs. These systems may contain "dependency graph" features which exhibit implication dependencies between statements. Programs for the creation of computerized data systems of mathematical knowledge do not focus on implications. Our proposed program may have links to proof complexity theory, e.g. to the measurement of proof size.
Such models would make it possible to increase the speed and improve the quality of progress of mathematical theories, improve the understanding of various theories, introduce canonical forms of arguments and theories, and measure and quantify mathematical results such as theorems and lemmas. They would make it possible to compare and rank different theories and improve the understanding of their relations, classify mathematical theories 'up to isomorphism', consider maps between mathematical theories, and find or construct extremal (e.g. minimal) theories. They could also be used to guide researchers and to show them the most important research directions, problems and milestones in a rigorous and quantitative way. We expect that problems and proofs which are considered "nice" or "aesthetically pleasing" have mathematically well-defined extremal properties in terms of suitable coordinatization models. Both theory building and problem posing/solving have to be formalized. Mathematical creativity, the progress of mathematics and the goal of mathematics itself have to be defined as mathematical objects. Such an advance of the Pythagorean process may generate new encodings and metalanguages for mathematical statements and proofs. It would enable mathematicians to counteract the drive towards specialization and to process larger amounts of information. This research proposal appears to be related to Hilbert's lesser-known unpublished 24th problem: find the simplest proof of a given statement, compare different proofs, design criteria for simplicity and rigour etc., see (Thiele, 2003). Finding mathematical models of proofs should be considered the main unsolved problem in mathematics today, containing Hilbert's 24th problem as a subproblem.
These models may provide one more abstraction step: they may allow logical implications to be made without focusing on the semantic content of premises and conclusions. From the computational point of view, this may allow the making of logical implications to be replaced by computations.
Recursive usage of implications in defining objects representing implications must be properly handled.
Its successes in pure mathematics may be transferred to other sciences through applied mathematics. Complexity and usefulness of different sciences and their branches should be rigorously analyzed and uniformly compared.
If future generations are interested in further mathematical research (especially in pure mathematics), then computers or their future descendants will eventually be used to perform it. Therefore we need to create theories which would interpret and model human mathematical thinking using mathematical objects that can be processed by computers, free mathematical research from human semantics, reduce mathematical goal setting and creative theorem proving to computation, and define the goal of mathematics as a computational result. The step of passing from computations to proofs and algorithms should be iterated, producing new paradigms of proofs and algorithms. It may be impossible to change human thinking, but it may be realistic to organize and emulate a mathematical research process performed by computers. Additionally, immense future computational and information storage resources will make it possible to create and maintain a "logical implication service".
Results of implication coordinatization and modelling will advance our understanding of implication making and of thinking itself to new levels, question the role of and the very need for implication making, and offer possible improvements. They may identify limitations, weaknesses and peculiarities of human thinking. If this approach is successful we may ask fundamental questions. What can be considered an advanced or future form of mathematical or general implication/consequence making? If there is such a form, how can it be implemented?
Although it is not within the scope of this paper, it can be mentioned that possible results in the proposed direction may be combined with expected advances in biology. Such advances may be related to a detailed description and understanding of the organization of brain functioning starting from the subcellular level. Logical implications must be analyzed as processes and states of brain tissues.
There may already exist scattered examples which are known to experts, and the Pythagorean process may already be proceeding in the proposed direction spontaneously. Nevertheless, relevant activities, results and examples should be integrated into a single program. Regardless of their outcome, the proposed research projects may generate nontrivial mathematical results, new mathematical structures, higher levels of abstraction, and new encodings and standards for mathematical language.

Applications
A mathematically sound method for measuring the value or complexity of mathematical results would also allow rigorous standards to be set for research publications in professionally accepted journals and other information repositories. A rigorous evaluation method based on mathematical analysis of results and techniques must be found. A mathematically justified content evaluation method would make it possible to identify truly valuable mathematical results, proof methods and research directions, and to measure and classify the creativity of research results. In section 2.4 we give descriptions of these and other possible applications.

Main research and application directions
2.1 Coordinatization of implications and proofs using predicate supports
Proofs of mathematical statements are sequences or, more generally, networks of logical implications. One approach to the study of proofs would be to study relatively simple logical implications and their networks. Research may be needed to determine the right definitions of irreducible implications, of various types of implications and their linkings, and of embeddings of the objects corresponding to implications in suitable ambient spaces, that is, a geometrization of logic.
Logical implications can be defined as instances of a relation on logical predicates in first-order or higher-order logic using the material conditional connective ⇒. The consequence relation used in mathematical logic is also a relevant notion. Given two predicates P(x) and Q(x) defined for all x ∈ X, we say that P implies Q, written P → Q, if P(x) ⇒ Q(x) is true for every x ∈ X. The support supp(A) of a predicate A may be defined as the set of argument values x for which A(x) = true. Validity of a predicate implication P → Q is equivalent to the set-theoretic inclusion of the support of P(x) in the support of Q(x): P → Q is a true statement if and only if supp(P) ⊆ supp(Q).
We could try to coordinatize the implication P → Q by set-theoretical, combinatorial, algebro-geometrical, geometrical, topological and complexity-theoretical properties of the sets supp(P) and supp(Q) such as 1) absolute and relative sizes and shapes of supp(P), supp(Q) and supp(Q)\supp(P), 2) properties of the boundaries of supp(P) and supp(Q). We conjecture that 1) the implication P → Q can be considered easy if supp(P) is a relatively small, e.g. low-dimensional, subset of supp(Q); 2) implications $P \to Q_1$ and $P \to Q_2$ can be considered distinct if $(\mathrm{supp}(Q_1) \cap \mathrm{supp}(Q_2)) \setminus \mathrm{supp}(P)$ is small.
Proofs as sequences of implications $P_1 \to P_2 \to \dots \to P_n$ may be considered as sequences of set-theoretic inclusions $\mathrm{supp}(P_1) \subseteq \mathrm{supp}(P_2) \subseteq \dots \subseteq \mathrm{supp}(P_n)$. Passing from semantics-specific implication making to constructing sequences of nested sets should be considered a computational substitute for implication making.
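The correspondence between implications and inclusions of supports can be illustrated on a finite domain. The following is only a minimal sketch: the domain X = {1, ..., 100}, the three divisibility predicates and all function names are assumptions chosen for illustration, and a check over a finite X is of course not a proof over an infinite one.

```python
def supp(pred, domain):
    """Support of a predicate: the elements of the domain where it holds."""
    return {x for x in domain if pred(x)}


def implies(p, q, domain):
    """P -> Q holds on the domain iff supp(P) is a subset of supp(Q)."""
    return supp(p, domain) <= supp(q, domain)


def is_inclusion_chain(preds, domain):
    """Check supp(P_1) <= supp(P_2) <= ... <= supp(P_n)."""
    supports = [supp(p, domain) for p in preds]
    return all(a <= b for a, b in zip(supports, supports[1:]))


# Illustrative finite domain and predicates (assumptions, not from the text).
X = range(1, 101)

def div_by_12(x): return x % 12 == 0
def div_by_6(x): return x % 6 == 0
def even(x): return x % 2 == 0

# A "proof" div_by_12 -> div_by_6 -> even as a chain of nested supports.
chain_ok = is_inclusion_chain([div_by_12, div_by_6, even], X)

# One possible size-based coordinate: |supp(P)| / |supp(Q)|.
relative_size = len(supp(div_by_12, X)) / len(supp(even, X))
```

Here the ratio of support sizes is just one candidate for the "relative size" mentioned above; other coordinates (shape, boundary, dimension) would need a domain with more structure than a finite set of integers.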
Coordinatization and modelling of logical implications may also be related or even reduced to computational complexity if computations are involved in determining the inclusion supp(P ) ⊆ supp(Q).
An additional idea is to generalize implications by defining other binary relations on sets of statements. Given predicates P(x) and Q(x), we can consider other properties of supp(P) and supp(Q) (instead of inclusion) for this purpose. For example, we can say that P almost implies Q if supp(P)\supp(Q) is small or simple in a suitable sense.
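One possible reading of "small" is a bound on the fraction of supp(P) that lies outside supp(Q). The sketch below makes that assumption explicit; the tolerance value, the finite domain and the predicates are hypothetical choices for illustration, not a proposed definition.

```python
def supp(pred, domain):
    """Support of a predicate on a finite domain."""
    return {x for x in domain if pred(x)}


def exception_ratio(p, q, domain):
    """Fraction of supp(P) lying outside supp(Q); 0.0 means P -> Q on the domain."""
    sp = supp(p, domain)
    if not sp:
        return 0.0  # an empty premise implies everything
    return len(sp - supp(q, domain)) / len(sp)


def almost_implies(p, q, domain, tolerance=0.05):
    """Assumed notion of 'almost implies': few exceptions relative to supp(P)."""
    return exception_ratio(p, q, domain) <= tolerance


# Illustrative predicates on X = {1, ..., 100} (assumptions).
X = range(1, 101)

def is_prime(x):
    return x > 1 and all(x % d for d in range(2, int(x ** 0.5) + 1))

def is_odd(x):
    return x % 2 == 1

# "prime -> odd" fails only at x = 2, so it is an almost-implication here.
ratio = exception_ratio(is_prime, is_odd, X)
```

On this domain the strict implication "prime implies odd" is false, while the relaxed relation holds with the single exception x = 2.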

Irreducible implications in propositional logic
We give a candidate definition for irreducible implications in the case of propositional logic (predicates depending on binary vectors). In this case X can be thought of as $\mathbb{Z}_2^n$, where n is the number of variables. Implications between formulas correspond to inclusions of subsets of $\mathbb{Z}_2^n$. Suppose $p(X_1, \dots, X_n)$ and $q(X_1, \dots, X_n)$ are formulas in propositional Boolean variables $X_1, \dots, X_n$ and the implication p → q is true. We call an implication $p(X_1, \dots, X_n) \to q(X_1, \dots, X_n)$ irreducible if the full disjunctive normal form (FDNF) of q has exactly one more full disjunctive term than the FDNF of p. In this case the implication p → q is not a composition of two noninvertible implications, since no set of assignments lies strictly between supp(p) and supp(q).
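Since the full disjunctive terms of a formula correspond one-to-one to its satisfying assignments, the candidate definition can be checked by brute force over the truth table. The following sketch assumes formulas are given as Python functions of n Boolean arguments; the function names and the three example formulas are illustrative only.

```python
from itertools import product


def supp(formula, n):
    """Satisfying assignments of an n-variable formula (its FDNF terms)."""
    return {bits for bits in product((0, 1), repeat=n) if formula(*bits)}


def is_irreducible(p, q, n):
    """p -> q is valid and FDNF(q) has exactly one more term than FDNF(p)."""
    sp, sq = supp(p, n), supp(q, n)
    return sp <= sq and len(sq) - len(sp) == 1


# Example formulas in two variables (assumptions chosen for illustration).
def p(x, y): return x and y   # FDNF has 1 term
def q(x, y): return x         # FDNF has 2 terms
def r(x, y): return x or y    # FDNF has 3 terms

# p -> q and q -> r are irreducible; their composition p -> r is not.
steps = (is_irreducible(p, q, 2), is_irreducible(q, r, 2), is_irreducible(p, r, 2))
```

The enumeration is exponential in n, so this is a way to experiment with the definition on small examples rather than a practical decision procedure.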

Complexity of implications in propositional logic
Complexity of formulas in propositional Boolean variables can be measured in terms of their minimal disjunctive or conjunctive forms, the structure of prime implicants, Blake canonical forms, logical depth and other circuit complexity measures, see (Brown, 1990). The complexity of an implication $p(X_1, \dots, X_n) \to q(X_1, \dots, X_n)$ can be measured in terms of changes of disjunctive (conjunctive, minimal, Blake etc.) normal forms of p and q.

Modelling implication relations using discrete mathematics, algebra and topology
In this section, we consider a few possible directions for advancing Hilbert's "logical arithmetic".

Category-theoretic approaches
Define a category Math where objects are mathematical statements and morphisms are logical implications (consequences), composition of morphisms may be the standard composition of implications. We can study Math or its subcategories with respect to problems such as concretizations, functors to and from other categories, interpretations of category-theoretic constructions such as natural transformations, adjoint functors, pushouts and pullbacks, limits, initial and terminal objects, quotients etc.

Graph-theoretic approaches
A mathematical theory (its subtheory or quotient theory) can be interpreted as a directed Activity-On-Arc type graph corresponding to the implication relation. We consider the proof graph Π = (Σ, Λ), where elements of Σ are statements (which are not interpreted as implications) and directed edges in the set Λ are relatively simple, irreducible logical implications.
Graphs are widely used in mathematical logic, see (Quispe-Cruz, 2014) for a recent work. Among other approaches, propositional formulas can be interpreted as graphs called cographs. Recently there has been an attempt to encode mathematical logic without syntax using graphs, namely to define and study combinatorial proofs in propositional logic as graph homomorphisms of a certain kind, see (Hughes, 2006).

Metric properties of proof graphs
Assume that every edge of a proof graph Π is given a weight which measures the complexity or some other well-defined property of the corresponding implication. In the simplest naive cases, weights could be positive numbers. Assume that we are given a directed path between two vertices P and Q with edges $e_1, e_2, \dots, e_n$ and weights $w_1, w_2, \dots, w_n$, which corresponds to a proof P → Q. The complexity or another measure of the proof could be defined as an appropriate function of the weights $w_1, w_2, \dots, w_n$, for example $\sum_i w_i$. Having a proof graph invariant which corresponds to a proof weight or metric, we could investigate problems such as finding all statements within a fixed distance from a given statement or axiom, or finding the distance between the premise and the conclusion. Analogues of various metric-based subgraphs such as nearest neighbour graphs can be studied. Vertices having extremal eccentricity values should be studied.
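For the naive case of positive numeric weights and the sum as the combining function, these metric notions can be sketched with a standard shortest-path computation. The small graph, its weights and the function names below are hypothetical; Dijkstra's algorithm is used here only as one familiar way to obtain distances, not as a method prescribed by the program.

```python
import heapq

# Hypothetical proof graph: statement -> list of (consequence, implication weight).
PROOF_GRAPH = {
    "A": [("B", 2.0), ("C", 5.0)],
    "B": [("C", 1.0), ("D", 4.0)],
    "C": [("D", 1.0)],
    "D": [],
}


def proof_weight(graph, path):
    """Sum of edge weights along a proof given as a list of statements."""
    total = 0.0
    for u, v in zip(path, path[1:]):
        weights = [w for (t, w) in graph[u] if t == v]
        if not weights:
            raise ValueError(f"no implication {u} -> {v}")
        total += min(weights)
    return total


def distances_from(graph, source):
    """Dijkstra: least total weight of a proof from source to each reachable statement."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def ball(graph, source, radius):
    """All statements provable from source with total weight at most radius."""
    return {v for v, d in distances_from(graph, source).items() if d <= radius}


dist_from_a = distances_from(PROOF_GRAPH, "A")
out_eccentricity = max(dist_from_a.values())  # greatest distance from A
```

With these illustrative weights the direct implication A → C (weight 5) is "longer" than the two-step proof A → B → C (weight 3), so the distance from A to C is 3 and the ball of radius 3 around A contains A, B and C.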

Vertices with special/extremal properties as valuable or low value statements
Proof graph models and other proof coordinatization ideas should rigorously identify extremal statements and extremal implication steps which are relatively more or less important than others. In particular, vertices of proof graphs having extremal properties related to connectivity, metric, centrality or other invariants may be considered as valuable "theorems". The same arguments should identify statements which can be considered of low value.
Path systems
Different paths in a proof graph between vertices P and Q represent different proofs between the corresponding statements. Having fixed vertices P and Q we can study all (P, Q)-paths, e.g. we can pose the problem of finding all (P, Q)-proofs. We can also try to find vertices with special properties, e.g. vertices which are in more than one (P, Q)-path.
In topological models for proof spaces topological ideas such as homotopy classes of path systems and homology-type invariants should be considered.
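On a small proof graph the system of all (P, Q)-paths can be enumerated directly. The sketch below uses a depth-first search over simple paths; the graph and the names `all_paths` and `shared` are assumptions made for illustration, and the enumeration can be exponential in the size of the graph.

```python
from collections import Counter

# Hypothetical unweighted proof graph: statement -> direct consequences.
GRAPH = {
    "P": ["A", "B"],
    "A": ["C"],
    "B": ["C", "Q"],
    "C": ["Q"],
    "Q": [],
}


def all_paths(graph, start, goal, prefix=None):
    """Yield every simple directed path from start to goal as a list of vertices."""
    prefix = (prefix or []) + [start]
    if start == goal:
        yield prefix
        return
    for nxt in graph[start]:
        if nxt not in prefix:  # keep paths simple (no repeated statements)
            yield from all_paths(graph, nxt, goal, prefix)


paths = list(all_paths(GRAPH, "P", "Q"))

# Intermediate statements that occur in more than one (P, Q)-proof.
counts = Counter(v for path in paths for v in path[1:-1])
shared = {v for v, c in counts.items() if c > 1}
```

In this example there are three (P, Q)-proofs, and the intermediate statements B and C each lie on two of them.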
Shortest paths
Given two statements P and Q in a proof graph we could look for (P, Q)-paths with some special or extremal properties such as the paths having minimal weight. That would correspond to finding (P, Q)-proofs with some special properties, for example, proofs of minimal total or stepwise complexity. These ideas again remind us of Hilbert's 24th problem and the "simplest proof".
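The difference between minimal total and minimal stepwise complexity can be made concrete if we assume, for illustration, that total complexity is the sum of the step weights and stepwise complexity is the weight of the hardest single step. Both costs are monotone for non-negative weights, so one Dijkstra-style search parameterized by a `combine` function covers both; the graph and all names below are hypothetical.

```python
import heapq

# Hypothetical proof graph: statement -> list of (consequence, step weight).
GRAPH = {
    "P": [("A", 4.0), ("B", 1.0)],
    "A": [("Q", 4.0)],
    "B": [("C", 1.0)],
    "C": [("Q", 5.0)],
    "Q": [],
}


def best_proof(graph, source, target, combine):
    """Search for a path minimising a monotone cost accumulated with `combine`.

    Assumes non-negative weights; returns (cost, path) or None if unreachable.
    """
    heap = [(0.0, source, [source])]
    settled = set()
    while heap:
        cost, u, path = heapq.heappop(heap)
        if u == target:
            return cost, path
        if u in settled:
            continue
        settled.add(u)
        for v, w in graph[u]:
            if v not in settled:
                heapq.heappush(heap, (combine(cost, w), v, path + [v]))
    return None


# Minimal total complexity: accumulate by addition.
total = best_proof(GRAPH, "P", "Q", lambda acc, w: acc + w)

# Minimal stepwise complexity: accumulate by the maximum single step.
stepwise = best_proof(GRAPH, "P", "Q", max)
```

With these weights the two criteria select different proofs: the three-step proof through B and C has the smaller total (7 against 8), while the two-step proof through A has the easier hardest step (4 against 5). Which criterion better captures the "simplest proof" is exactly the kind of question the program leaves open.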

Motifs and forbidden structures
We can study typical subgraphs (motifs), forbidden subgraphs and minors of proof graphs.

Algebraic approaches
Various algebraic approaches to mathematical logic are currently being pursued, see for example (Font and Jansana, 1996). The composition of implications can be interpreted as a partially defined associative binary operation on the set of implications. Because the operation is only partially defined, the implication set Λ has the structure of a partial monoid (that of the morphisms of a category) rather than of a monoid in the strict sense; algebraic questions may be asked and algebraic methods may be used to study Λ.
Another algebraic approach is to study ring homomorphisms of coordinate rings of algebraic varieties. Inclusion of predicate support sets, see 2.1, may be interpreted as morphisms of algebraic varieties which by duality induce morphisms of their coordinate rings going in the opposite direction.

Algebro-geometric approaches
The well-known coordinate method used for solving problems in Euclidean geometry and the algebraic geometry used in the mechanization of problem solving, see (Chou et al., 1994), can be thought of as precursors of more advanced theories to come.

A topological approach
A mathematical theory (Σ, Λ) can also be endowed with a topological space structure as follows. The implication binary relation → is a preorder: it is reflexive and transitive. We can view the implication relation as a specialization preorder for the Alexandrov topology τ on Σ corresponding to ←: the open sets for τ are the upper sets with respect to the relation ←. We remind the reader that a set U is an upper set with respect to ← if the conjunction of Q ∈ U and P → Q implies P ∈ U, see (Barmak, 2011). We can investigate the given mathematical theory (Σ, Λ) using topological experience and intuition, that is, study the topology τ with respect to standard problems of general and algebraic topology such as interpretations of continuity, separability, metrizability, homotopy or (co)homology invariants. Certain properties of proofs which are vaguely linked with continuity should be defined.
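For a finite theory the open sets of τ can be listed by brute force, directly from the upper-set condition quoted above. The three-statement theory and the function names in this sketch are assumptions for illustration; the enumeration over all subsets is exponential and is meant only for small examples.

```python
from itertools import combinations

# Hypothetical finite theory: (P, Q) in IMPLICATIONS means P -> Q.
STATEMENTS = ["A", "B", "C"]
IMPLICATIONS = {("A", "B"), ("B", "C")}


def preorder_closure(statements, implications):
    """Reflexive-transitive closure of the implication relation."""
    rel = set(implications) | {(s, s) for s in statements}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(rel):
            for (c, d) in list(rel):
                if b == c and (a, d) not in rel:
                    rel.add((a, d))
                    changed = True
    return rel


def open_sets(statements, implications):
    """All U such that Q in U and P -> Q together imply P in U."""
    rel = preorder_closure(statements, implications)
    result = []
    for k in range(len(statements) + 1):
        for subset in combinations(statements, k):
            u = frozenset(subset)
            if all(p in u for (p, q) in rel if q in u):
                result.append(u)
    return result


opens = open_sets(STATEMENTS, IMPLICATIONS)
```

For the chain A → B → C the open sets are the empty set, {A}, {A, B} and {A, B, C}: each open set contains, together with a statement, every premise from which it follows.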

Proof bundles
If we have predicates P(x), Q(x) where x ∈ X and an implication or proof f : P → Q for which P(x) ⇒ Q(x) is true for every x ∈ X, then the complexity of proofs, and the proofs themselves, may be different for different x ∈ X. An analogy with topological bundles can be used, the set X being the base and the proof $f_x$ for each x ∈ X being the fibre.

Induction analysis of mathematical results and theories
Mathematical results and theories should be analyzed with respect to the existence of Noetherian induction proofs. Suppose the statement ∀x ∈ X P(x) is true. Does there exist a well-founded relation R ⊆ X × X such that the statement can be proved using Noetherian (structural) induction on R? The complexity of the well-founded sets and induction steps involved can be considered as complexity and value measures.
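On a finite set, a relation is well-founded exactly when it has no cycles, and the rank (height) of an element is one simple candidate for measuring the complexity of the well-founded structure. The sketch below computes ranks and rejects relations that are not well-founded; the choice of rank as the measure, the proper-divisor relation and all names are assumptions for illustration.

```python
def ranks(elements, predecessors):
    """Rank of each element under R, where predecessors(x) lists all y with y R x.

    Minimal elements get rank 0. Raises ValueError if R has a cycle,
    i.e. if it is not well-founded on the given finite set.
    """
    rank = {}
    in_progress = set()

    def visit(x):
        if x in rank:
            return rank[x]
        if x in in_progress:
            raise ValueError("relation is not well-founded")
        in_progress.add(x)
        rank[x] = 1 + max((visit(y) for y in predecessors(x)), default=-1)
        in_progress.discard(x)
        return rank[x]

    for x in elements:
        visit(x)
    return rank


# Illustrative relation on {1, ..., 12}: y R x iff y is a proper divisor of x.
def proper_divisors(x):
    return [d for d in range(1, x) if x % d == 0]


divisor_ranks = ranks(range(1, 13), proper_divisors)
induction_depth = max(divisor_ranks.values())
```

For the proper-divisor relation the rank of x equals the number of prime factors of x counted with multiplicity, so an induction on this relation over {1, ..., 12} needs at most three nested appeals to the induction hypothesis.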

Complexity-theoretic approaches
Given an implication or a proof f : P → Q we can measure the (deterministic) complexity of f as some computational complexity measure (time or space-related) of a computational process producing f. An example of such a measure is proof size, considered in the proof complexity branch of proof theory. The value of mathematical results can be estimated by considering their impact on computational complexity (time, space, parallelizability etc.). A result can be considered valuable if it has computational value, such as a reduction of the complexity class of computational and decision problems. On the contrary, a result may be considered easy if it amounts to a polynomial-time reduction. The history of mathematics should be studied as a network of complexity reductions.

Metamathematical aspects: a mathematical justification/regress step
Since the research in this program has not even started, it may be too early to speculate about metamathematical and philosophical problems related to the regress step discussed here, such as the mathematical "problem of the criterion", see (Chisholm, 1989), (Cling, 2014), or the Münchhausen (Agrippa) trilemma, see (Albert, 1991). Determining which case of the Münchhausen trilemma applies (i.e. whether the proposed intra-mathematical regress is cyclic, infinite non-cyclic or finite) seems to be an important problem. Mapping (justifying) a mathematical theory T to a simpler mathematical object ρ(T) may induce another justification step, from the theory of ρ(T) to another mathematical object ρ(ρ(T)).

Applications: research guidance and value of mathematical texts
A mathematically sound method for measuring value or complexity of mathematical results would also allow setting rigorous standards for mathematical research and discourse. It should involve research directions and problems, the value of mathematical results and complexity of proofs.

Research guidance
Research problems and new mathematical objects are often insufficiently motivated. Mathematical research processes, problems, conjectures and research interests should be motivated by rigorous analysis based on an implication and proof modelling/coordinatization theory. Such a theory would show valuable problems, computations and/or directions which need to be studied to advance the understanding of a given domain, missing or optimal concepts that need to be introduced, proofs that need to be modified, mathematical regress steps (mappings to simpler objects) that need to be done etc. It would direct the development of mathematics, link and rank various areas of mathematics.

Control of the publishing process
Apart from guiding mathematical research and improving mathematical texts, new advances in proof modelling and complexity theory could impact the dissemination of mathematical texts.
The current competition-oriented, trend-based and partially dogmatic evaluation of results and merits cannot be considered justified in mathematics, the very source and centre of the culture of unbiased logical reasoning and numerical analysis. The lack of a rigorous evaluation theory is a sign of backwardness in the same way as the lack of mathematical modelling is such a sign in any other area. A rigorous evaluation method based on mathematical analysis of results and techniques must be found.
A rigorous proof complexity and value theory would make it possible to define and determine the values of correct results more rigorously and to set standards for them. Research result evaluation should be reduced to computation. Some of the current features of publishing which are used to cover social processes would become redundant. Such a theory may make it possible to rigorously compare and uniformize different areas, projects and activities of mathematics. It would be a helpful research tool for working mathematicians. Information about all known mathematical results could be stored in a single database. The author suggests complementing or replacing the existing publication system (journals) by a single international database which would openly, and in a certified way, evaluate sufficiently motivated and complex results.

Some concrete proposals
We formulate a few specific initial research proposals: 1) analyze the body of facts of Euclidean geometry with respect to implication modelling and the Hilbertian simplicity idea, and create a database of all nonequivalent logical steps, 2) analyze the body of combinatorics with respect to structural induction, and create a database of all nonequivalent induction arguments, 3) analyze the body of graph theory with respect to the proof bundle idea, 4) classify invariants and object properties in a mathematical domain with respect to the computational complexity (e.g. polynomial or NP-complete) of decision problems, and study the network of computational reductions, 5) introduce measures of the cognitive complexity of mathematical activities in school mathematics courses.

Conclusion
We have given some arguments in support of a proposal for possible future research in mathematical logic. Its goal can be described as a faithful mathematical representation of implications, proofs and theories. The main argument is the possibility of formalizing logical implications, mapping them into simpler mathematical objects and measuring them, thereby making nontrivial and creative mathematical theorem proving a computation. Another argument is the possibility of rigorously measuring mathematical results and of guiding mathematical research rigorously and optimally. We consider it the most important unsolved problem of modern mathematics.