1 INTRODUCTION

The paper is devoted to the prospects of a radical renewal of approaches to the study of mathematics. The renewal is based on systematic use of computer experiments (sometimes abbreviated as CEs) primarily in the work of students.

The main thesis supported by the authors and discussed in detail in [1] is as follows:

In today’s world, for the majority of students, in terms of their motivation and mathematical and general intellectual development, the mastery of general mathematical methods should have a higher priority than knowledge of theorem formulations, their proofs, and algorithms and strategies for solving known classes of problems.

In the work of mathematicians and mathematical students, we distinguish an experimental phase of mathematical activities, which involves the planning and setting of an experiment, observation, and suggestion, confirmation, and refutation of hypotheses. The world’s existing practices convince us that the computer is an extremely powerful tool for mathematical experimentation. It makes the experimental work of every student a reality. Of particular importance is the visual environment of an experiment and presentation of its results.

Attempts to prove something should logically follow from the experimental material and should be motivated by and constantly correlated with it. Proofs constructed individually by every student at their own level are also created with visual computer support: proofs are illustrated by drawings, certain calculations are carried out on a computer, etc.

We argue that mathematical experiments at the school level can be as spectacular as physical, chemical, and biological ones. In such experiments, the computer usually plays a crucial role in helping students to understand the beauty of mathematics.

Finally, even in the traditional approach to teaching, when statements and proofs are told to students and they have to memorize them (at best, with an understanding of them), a “demonstration experiment” or illustration presented by the teacher is useful (as well as demonstrations in natural sciences).

This paper gives examples of relevant educational situations and strategies both for standard topics of school and university “general” mathematics and for research projects that supplement these standard topics. Within our approach, the answer to the question “What do we teach?” is tautological: we teach mathematics rather than skills in solving standard problems. As to the question “Why?”, we refer to [2]:

Mathematics is a wonderful discipline and, no matter who is taught, our main goal is to convince them of it. … However, it is impossible to convey our feelings just by allowing the student to observe other people’s reasoning and actions, even admirable ones, like the dance of Maya Plisetskaya (“I’ll never make it anyway”); it is also not enough to confine the student to ready-made patterns, incomprehensible and uninteresting. It is necessary to help every student build their own relationship with mathematics, honest, meaningful, and enjoyable; within this relationship, they must learn to do something, understand something, and formulate something.

In a well-planned series of computer experiments (with teacher’s participation), students of all levels can learn all this, having fun and feeling proud in the case of success.

More should be said about the role of proofs. The traditional objection to the expansion of mathematical experiments is that proof-based reasoning—the basis of mathematics—is overshadowed or disappears altogether.

These objections are partially justified. However, first, the notion of proof in mathematics is not something absolute and perfectly clear (see [3], [4]). We will return to this point later. Second, a fundamental mistake is to use a uniform approach to the task of teaching proof-based reasoning to different categories of students.

For students who are far from mathematics, today’s situation reduces at best to thoughtless memorization of other people’s proofs, without any understanding, let alone “appropriation.” One of the authors (Shabat), who has worked at the Russian State University for the Humanities (RSUH) for more than 30 years, can say this with full responsibility. For these learners, understanding mathematics as an experimental discipline, in which the truth of at least some statements can be verified, is preferable to perceiving mathematics as a set of texts, sometimes in special languages (some of them labeled with words such as theorem, consequently, necessary, etc.), and to mastering mathematics as the ability to reproduce these texts. The view that some of these texts actually prove something and make it more grounded and convincing is also almost exclusively a repetition of the text and a thoughtless reproduction of the teacher’s point of view. Thus, we get a picture completely opposite to the desired one: instead of respect for the truth and its independent discovery, we get submission to an authority backed by nothing (except the exam mark).

A more serious issue is whether to keep proof-based learning (sometimes boring) for future mathematicians, physicists, IT specialists, engineers, etc. The approach we propose combines experiments, as a source of hypotheses and a tool for their verification, with proofs. Of course, a proof can be a complete enumeration of a set if the completeness of the enumeration is proved (or, for example, obvious). Developing the ability to prove something is part of a professional qualification: the lack of proof of some claim can result in ineffective work or even danger to life.

The proposed approach looks natural to students if both experiments and proofs emerge, are used, give pleasure, and provide confidence; in other words, if they become part of mathematical culture as early as elementary school. At a later age, it is useful to undermine the credibility of “obvious” “experimental facts” by refuting plausible statements, for example, that the sequence \({{x}_{{n + 1}}} = {{x}_{n}} - \frac{{f({{x}_{n}})}}{{f{\kern 1pt} '({{x}_{n}})}}\) generated by Newton’s method always (and very quickly) converges to a solution of the equation \(f(x) = 0\). Due to this “undermining,” we return to the necessity of proof.
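Such a refutation can be staged on a computer in a few lines. Here is a minimal sketch; the function and the starting point are our choices, a classical counterexample to the claim of unconditional convergence:

```python
# Plausible "experimental fact": Newton's method always converges quickly.
# Counterexample (our choice): f(x) = x^3 - 2x + 2 starting at x0 = 0.
def newton_step(f, df, x):
    return x - f(x) / df(x)

f = lambda x: x**3 - 2 * x + 2
df = lambda x: 3 * x**2 - 2

x = 0.0
orbit = [x]
for _ in range(6):
    x = newton_step(f, df, x)
    orbit.append(x)
print(orbit)  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]: a cycle, not convergence
```

The iterates oscillate between 0 and 1 and never approach the real root near \(-1.77\); the plausible claim is refuted before the student’s eyes.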

However, even with enough time and attention paid to classical proofs, we should not idealize them and believe that difficulties in teaching rigorous thinking can be overcome by returning to pre-computer standards. The fact is that the notion of full rigor, even in venerable axiomatic disciplines such as Euclidean plane geometry, requires serious reservations. A classic example of a formal gap in Euclid’s Elements (see [5]) is the statement on the nonempty intersection of circles (see Fig. 1), which underlies an “algorithm” for constructing an equilateral triangle with a given side (see, e.g., [6]). A successful attempt to fill gaps of this type was made in [7], but this book can in no way be used as a basis for a school course (even for the most advanced students). An improvement of the situation was outlined in [8], but that attempt needs to be advanced and completed.

Fig. 1. Intersection of circles.

Fig. 2. Distances from colored points to the black one.

Fig. 3. First congruence criterion.

Fig. 4. Second congruence criterion.

Fig. 5. Third congruence criterion.

Even in the best schools of the pre-computer era, different levels of rigor of proof were allowed, which certainly contradicts the “purist” professional view of proof (cf. the dialogue between Woland and the bartender in Mikhail Bulgakov’s novel about sturgeon of first- and second-grade freshness). Of course, the value of school geometry proofs lies not in reproducing “the whole Euclid,” but rather in giving every student the experience of their own activity in the most important field, mathematical proof, supported by visuality. From this point of view, the medium of visual support (paper or computer) is not so important, but computer visuality ensures a faster search for proofs for more students, expands the circle of participants, and makes their work more creative and exciting. Another argument in favor of the computer experiment is that proofs of nontrivial facts are better understood when the proof steps are verified in the same computer environment where the experiments were carried out.

Summarizing, we can formulate our position. Mathematics is not just a set of formulas, formulations, and methods whose knowledge and application skills are necessary for receiving various certificates and diplomas. It is an area of humankind’s intellectual activity, some understanding of which, based on one’s own experience, is highly desirable for a modern cultured person. The computer experiment makes it possible to gain such experience.

Various aspects of computer experiments are discussed in the main part of this paper.

2 GEOMETRY

Traditional Euclidean geometry seemed a natural field for school mathematical experiments with the use of a computer as early as the 1980s. During the past decades, many thousands of teachers and students, including in Russia, have used experimental environments such as Cabri Geometry, Live Mathematics (Geometer’s Sketchpad), Mathematical Constructor, and GeoGebra (a freeware system). Note that in the 1980s the creation of high-quality dynamic (experimental) geometry environments was a nontrivial task from both the mathematical and programming points of view and required high-level qualifications and talent, which were demonstrated by J.-M. Laborde and N. Jackiw.

In 1993, on Semenov’s initiative, Shabat became the head and, in 1993–1996, the main developer (with the participation of N.Kh. Rozov, A.V. Pantuev, et al.) of a large-scale project undertaken at the Institute of New Technologies (INT) (the main promoter of the educational philosophy of constructionism in Russia) concerning the implementation of dynamic geometry in Russian education. Within this project, all definitions, theorems, proofs, and exercises in the basic sets of Russian textbooks on geometry (coauthored by teams led by Atanasyan and Kolmogorov) were converted into the dynamic geometry format Live Geometry, implemented at the INT on the basis of Geometer’s Sketchpad. This work was continued in the Mathematical Constructor project headed by V.N. Dubrovskii. Simultaneously, educational activities with school and college students based on GeoGebra were performed by M.V. Shabanova’s team at the Lomonosov Northern (Arctic) Federal University in cooperation with a Bulgarian team of educators. Finally, an approach to dynamic (call it algorithmic) geometry based on Logo versions was developed at the INT under the direction of S.F. Soprunov in cooperation with Canadian (LCSI) and Bulgarian (PGO) researchers and teachers.

A traditional objection to geometric courses based entirely on computer experiments reads as follows: “a picture cannot replace logical reasoning.” This objection was discussed above in general form. In fact, it implies that memorizing someone else’s proof (with doubtful understanding) is fundamentally more important than a student’s own hypothesis about the truth of a geometric statement. This position, which was basic and obvious in the mass school of the 19th and 20th centuries, is becoming progressively less justified today, if not merely archaic.

We believe, as do a number of other well-known mathematicians, that independent analysis of mathematical reality (preferably visually represented) is a necessary element of the work of a mathematician and of a student learning mathematics.

It should be emphasized once again that, in the clearest form, a geometric statement is proved in the same dynamic environment in which this statement was discovered experimentally (prior to and in the course of the proof). It is preferable that the proof also be found by students, with the help of the teacher or independently. However, whatever set of proofs we deem necessary for a particular student to master, it will require less time and effort from the teacher and the student if these proofs are carried out in a visual environment, namely, on the screen in the classroom or on students' individual tablets.

Summarizing, visuality is the backbone of school geometric proofs, and the visual evidence of facts always helps to prove them. The use of dynamic geometry expands the scope of this help.

Having decided on the priority of independent observation, hypothesizing, and construction of proofs about geometric reality, we obtain a basis for revising the set of theorems in the school course of geometry. The need for such a revision of the course, which is overloaded for mass school students, is clear today.

Below, we consider only planimetric topics. However, due to computer experiments, even more significant changes can be made in the teaching of school stereometry, facilitated by modern high-definition computer graphics, the computing power of processors, and applications of virtual and augmented reality and 3D printers.

Below, we give several illustrative examples, starting with simple ones.

(a) Circle. One of the main childhood impressions of the great mathematician Alexander Grothendieck was the definition of a circle (see his famous text in [10]). Grothendieck was fascinated by the possibility of expressing perfect roundness in a rigorous formulation (according to the thirty-year experience of one of the authors, rather few modern students of a nonmathematical university know the definition of a circle as the locus of points equidistant from a given center).

Grothendieck wrote that

… around the age of twelve, I was interned in the Rieucros concentration camp (near Mende). It was there that I learned, from an inmate, Maria, who was giving me voluntary private lessons, the definition of a circle. This one had impressed me by its simplicity and its obviousness, whereas the property of “perfect roundness” of the circle appeared to me before as a mysterious reality beyond words. It was at this moment, I believe, that I glimpsed for the first time (of course, without formulating it in these terms) the creative power of a “good” mathematical definition, a formulation describing the essence. Even today, it seems that the fascination exercised over me by this power has lost none of its force.

These words were written by a mathematician famous for his “abstract” definitions in various fields of mathematics (primarily in algebraic geometry). His views on proper definitions and their understanding, sometimes emotionally charged, should be regarded as extremely authoritative.

Our discussion is concerned not so much with the relationship of learners with texts of definitions, but rather with computer experiments clarifying these definitions. In this case, we mean the not very obvious (according to both Grothendieck and the authors’ pedagogical experience) relationship between roundness and the distance to a fixed point.

The measurement of distances between a fixed point and numerous randomly selected ones, comparison of these distances with a fixed one, and the different colorings resulting from these comparisons visually and gradually form the definition of a circle in the memory of even the most “non-mathematical” students.
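Such an experiment can be imitated in a few lines of code. Below is a minimal sketch; the center, radius, tolerance, and helper names are our choices:

```python
import math
import random

# A sketch of the coloring experiment: color random points by comparing
# their distance to a fixed center with a fixed radius (all names ours).
center = (0.0, 0.0)
radius = 1.0
tolerance = 0.02

def color(point):
    d = math.dist(point, center)
    return "red" if abs(d - radius) < tolerance else "gray"

random.seed(0)
points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(10000)]
ring = [p for p in points if color(p) == "red"]
# the "red" points accumulate along the circle of the given radius
```

Plotted on the screen, the “red” points gradually trace out the circle: the definition via equal distances becomes visible.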

(b) Theory of triangles. Congruence criteria for triangles can be learned experimentally as problems of constructing triangles with given elements.

An experiment shows that constructions associated with the first and second congruence criteria are possible for various combinations of initial data, whereas in the case of the third congruence criterion, the triangle disappears from the drawing as soon as the length of any side exceeds the sum of the other two. Accordingly, the possibility, impossibility, and uniqueness of the result (the latter follows from the congruence criterion) look obvious. Moreover, the causes of this become evident as well. School “proofs” of these facts pale in the light of this evidence. However, this is due not so much to the convincingness of visuality as to the blurred foundations of school geometry.

In the same drawings, it is easy for students to experimentally discover the necessity of the “between” condition for the first criterion and the “adjacent” condition for the second criterion.

In dynamic environments, the congruence of triangles can also be examined in terms of isometries, including orientation-reversing ones. A triangle can be actually (not mentally) superimposed on another one, either directly or by reflection.
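The disappearance of the triangle in the third-criterion experiment is governed by the triangle inequality, which can also be checked directly. A minimal sketch (the function name and sample values are ours):

```python
# The triangle "disappears" exactly when the triangle inequality fails:
# a triangle with side lengths a, b, c exists iff each side is shorter
# than the sum of the other two (function name and sample values ours).
def constructible(a, b, c):
    return a < b + c and b < a + c and c < a + b

print(constructible(3, 4, 5))  # True
print(constructible(1, 2, 7))  # False: one side exceeds the sum of the others
```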

(c) Isoperimetric problem for polygons. The ratio of the squared perimeter of a polygon to its area is invariant under homothety (similarity transformations). This fact, which is unobvious for school students who are far from mathematics (the squared perimeter seems an intangible abstraction), is conclusively justified with the help of a computer experiment. The minimization of this ratio can be interpreted as Dido’s problem (see, e.g., [8]). In a dynamic environment, the problem of finding the best, in terms of this ratio, n-gon for a fixed n is very useful.

Given a polygon P, we introduce the notation

$${\text{Dido}}(P): = \frac{{{\text{perimeter}}{{{(P)}}^{2}}}}{{{\text{area}}(P)}}.$$
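This definition is easy to put into code. A minimal sketch (the helper names are ours), with the area computed by the shoelace formula:

```python
import math

# Dido(P) = perimeter(P)^2 / area(P), with the area from the shoelace
# formula (helper names are ours).
def perimeter(vertices):
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

def area(vertices):
    n = len(vertices)
    s = sum(vertices[i][0] * vertices[(i + 1) % n][1]
            - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(s) / 2

def dido(vertices):
    return perimeter(vertices) ** 2 / area(vertices)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
scaled = [(3 * x, 3 * y) for (x, y) in square]
print(dido(square), dido(scaled))  # 16.0 16.0: invariant under homothety
```

The coincidence of the two values makes the homothety invariance tangible even before any algebraic explanation.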

Experiments in dynamic geometry can yield results, for example, as shown in Fig. 6.

Fig. 6. “Dido number” experiments.

These results show that the Dido number is smaller for a “rounder” polygon. On the contrary, for polygons that are, for example, “wrinkled” or “flattened,” the Dido number can be arbitrarily large, which can be verified experimentally by drawing all possible polygons and observing how the Dido number changes for them. After finding some pattern, it is possible to suggest a sequence of polygons for which the Dido number is greater than any given bound. The task is accessible to interested eighth graders, and it is important that the computer takes over the calculations, while the students do the research and creative part.

A somewhat simpler problem is widely known: is there a triangle with sides longer than one meter (kilometer) and an area less than one square centimeter (square millimeter)? In this case, similarly, if the problem is not solved immediately, we can start experimenting by drawing a triangle, calculating its area (the computer will do this), and then moving its vertices so that the sides “remain long” while the area decreases.
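If no such triangle comes to mind, a nearly degenerate one settles the question. A sketch with our choice of coordinates (in meters):

```python
import math

# A nearly collinear triangle: sides over a kilometer, area under 1 mm^2
# (coordinates in meters; the numbers are our choice).
A, B, C = (0.0, 0.0), (1000.0, 1e-10), (2000.0, 0.0)
# shoelace area of the triangle ABC
area = abs((B[0] - A[0]) * (C[1] - A[1])
           - (C[0] - A[0]) * (B[1] - A[1])) / 2
shortest = min(math.dist(A, B), math.dist(B, C), math.dist(A, C))
print(shortest >= 1000, area < 1e-6)  # True True
```

Flattening the triangle keeps the sides long while the area shrinks without bound, exactly what the dynamic-geometry experiment shows.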

In the context of Dido’s problem, as the study continues, the question arises about the minimum possible value of the Dido number over all n-gons for a fixed n. Experiments suggest that this value is reached for a regular n-gon, although this is rather difficult to prove.

The experiments also prompt us to consider the problem as \(n \to \infty \), i.e., for polygons with an arbitrarily large number of sides. An experiment suggests that the champion among the “polygons” will be the circle. For a circle of radius r, the Dido number is given by

$${\text{Did}}{{{\text{o}}}_{{{\text{min}}}}} = \frac{{{{{(2\pi r)}}^{2}}}}{{\pi {{r}^{2}}}} = 4\pi \approx 12.56,$$

and this number bounds from below all Dido number values, which the experimenter can observe (in dynamics!). Theoretical reflection on this fact involves much good mathematics, in particular, the definition of the length of a circle.
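The convergence to \(4\pi\) can also be watched numerically. For a regular n-gon inscribed in a circle of radius 1, the perimeter is \(2n\sin \frac{\pi }{n}\) and the area is \(\frac{n}{2}\sin \frac{{2\pi }}{n}\), so the Dido number simplifies to \(4n\tan \frac{\pi }{n}\). A sketch (the function name is ours):

```python
import math

# Dido number of a regular n-gon (circumradius 1): 4 * n * tan(pi / n),
# which decreases to 4 * pi as n grows (the function name is ours).
def dido_regular(n):
    return 4 * n * math.tan(math.pi / n)

for n in (3, 4, 6, 12, 100, 1000):
    print(n, dido_regular(n))
# the printed values decrease toward 4 * pi = 12.566...
```

For \(n = 4\) the value is 16, agreeing with the square, and by \(n = 1000\) it is within a fraction of a thousandth of \(4\pi\).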

Dido’s problem is also remarkable in that it introduces the student into resource optimization tasks, which play a fundamental role in the modern world.

(d) Ceva’s theorem. Suppose that the Experimenter (a student) draws an arbitrary triangle, chooses three arbitrary points on its sides, and joins them to the opposite vertices (the joining segments are called cevians; in Fig. 7, they are depicted by dashed lines). Then the side segments are colored as shown in Fig. 7 in three colors, with alternating “thick” and “thin” segments. This is the preparation for an experiment, called the “design of an experimental setup,” with the use of the elementary graphical and computational capabilities of dynamic geometry.

Fig. 7. Ceva’s theorem.

Fig. 8. Napoleon’s theorem.

Fig. 9. Varignon’s theorem.

Fig. 10. Quadratoid theorem.

The next step is to measure the parameters of interest, namely, the segment lengths, within the “experimental setup” just built. The results can be described as follows.

In this text and its illustrations, we simplify the exposition: in actual dynamic geometry, as in usual geometric considerations, all points, segments, etc., can be named. Next, we can operate with the lengths of named segments. After that, the unnecessary notation, computations, and other details can be hidden on the screen. The result is presented in the illustrations.

The task posed by the teacher is for the students to write down two products: that of all the thick segments and that of all the thin segments.

The next phase of the experiment can be to consider well-known segments in the triangle, such as medians, bisectors, and heights, as cevians and compare two products (thick and thin) for them. It is easy to see that these products are equal to each other.

It is possible to do differently, namely, to experiment with three arbitrary segments, trying to achieve an equality.

Now, returning to our experimental setup, we can again consider the general situation and raise the question about the conditions for the equality of two products. Many students make an empirical discovery and suggest a simple hypothesis: “the products are equal when three cevians pass through a single point.”
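This hypothesis is easy to stress-test numerically as well. A sketch (the coordinates, the interior point, and the helper names are our choices) in which the three cevians are forced through a common point:

```python
# Cevians through a common interior point: compare the product of "thick"
# side segments with the product of "thin" ones (names/coordinates ours).
def intersect(p1, p2, p3, p4):
    # intersection point of line p1p2 with line p3p4
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / d
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

def dist(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
u, v = 0.3, 0.4                      # barycentric weights of an interior point
P = tuple(u * a + v * b + (1 - u - v) * c for a, b, c in zip(A, B, C))
D = intersect(A, P, B, C)            # foot of the cevian from A on BC
E = intersect(B, P, C, A)            # foot of the cevian from B on CA
F = intersect(C, P, A, B)            # foot of the cevian from C on AB
thick = dist(A, F) * dist(B, D) * dist(C, E)
thin = dist(F, B) * dist(D, C) * dist(E, A)
print(abs(thick - thin) < 1e-9)      # True: the two products agree
```

Moving the point P (or the vertices) and recomputing shows the equality persisting, which is precisely the empirical content of Ceva’s theorem.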

The experiment can motivate the proof of Ceva’s theorem and, additionally, can provide an opportunity for a historical digression about Italy and its mathematicians and engineers.

The proof, in different variants, can be constructed together with students with the help of dynamic geometry. One can begin by proving the special cases of medians, heights, and bisectors: after these lines have been built, the resulting sextuplets of segments are displayed on the screen. Here, in each of the cases, a new experiment begins with thinking about its result: properly named (better yet, colored) segments are measured, algebraic expressions for them are found (possibly with someone’s help), and the products of these expressions are repeatedly calculated in the dynamic environment, after which the corresponding algebraic identity is checked once. The corresponding expressions for medians, heights, and bisectors are given by

$$\frac{a}{2} \cdot \frac{b}{2} \cdot \frac{c}{2} = \frac{a}{2} \cdot \frac{b}{2} \cdot \frac{c}{2},$$
$$a\cos \gamma \cdot b\cos \alpha \cdot c\cos \beta = b\cos \gamma \cdot c\cos \alpha \cdot a\cos \beta ,$$
$$\frac{{ab}}{{a + c}} \cdot \frac{{bc}}{{b + a}} \cdot \frac{{ca}}{{c + b}} = \frac{{bc}}{{a + c}} \cdot \frac{{ca}}{{b + a}} \cdot \frac{{ab}}{{c + b}}.$$

Thinking about the special cases shows, for example, how strong the generalizing formulation is.

(e) Napoleon’s theorem and its generalizations. The following theorem is attributed to the emperor Napoleon Bonaparte.

Theorem 1 (Napoleon’s). The centers of the equilateral triangles erected externally on the sides of an arbitrary triangle form an equilateral triangle.

This theorem is an example of an entertaining and deep “mathematical trick”: starting with an arbitrary irregular object and applying beautiful and clear operations to it, we obtain an absolutely regular, symmetric object (of the same kind). Of course, the beauty of this fact can be fully appreciated (and independently discovered prior to its proof!) by mass school students only in a dynamic geometry environment. It is there that the student, first shifting the triangle’s vertices slightly and then, getting a taste for it, moving them to the most exotic places on the screen, sees a miracle: Napoleon’s triangle is always equilateral.

At present, we cannot expect that mass school students would prove Napoleon’s theorem by themselves even with the help of a teacher. However, many would be satisfied with observation and suggestion of hypotheses. The next step is to consider equilateral triangles erected internally with respect to the original one.
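For readers who code, the observed miracle can be rehearsed with complex numbers. A sketch (the sample triangle and the function name are ours):

```python
import cmath
import math

# Centers of equilateral triangles erected externally on the sides of an
# arbitrary triangle, in complex-number notation (names/sample ours).
def napoleon_centers(z1, z2, z3):
    rot = cmath.exp(-1j * math.pi / 3)  # -60 degrees: external side for ccw order
    centers = []
    for a, b in ((z1, z2), (z2, z3), (z3, z1)):
        apex = a + (b - a) * rot        # third vertex of the erected triangle
        centers.append((a + b + apex) / 3)
    return centers

c1, c2, c3 = napoleon_centers(0 + 0j, 5 + 1j, 1 + 4j)  # an irregular triangle
sides = [abs(c1 - c2), abs(c2 - c3), abs(c3 - c1)]
print(max(sides) - min(sides) < 1e-9)  # True: the centers form an equilateral triangle
```

Replacing the rotation by \(+60^\circ\) erects the triangles on the other side, which corresponds to the “internal” variant mentioned above.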

Related to Napoleon’s theorem is the following result, which can be not only seen but also proved by an ordinary school student.

Theorem 2 (Varignon’s). Given an arbitrary quadrilateral, the midpoints of its sides form a parallelogram.

Following our general principle of treating any school student as a working mathematician, we invite the student to construct the “midpoint” quadrilateral for various original quadrilaterals and to answer the question of whether all midpoint quadrilaterals have a common property.

Someone can even be asked to make the original quadrilateral invisible, leaving only its vertices, which can be used to shift the whole configuration.

Then the stage of proof begins. Students who fail to do it immediately can be advised to draw a diagonal and consider part of the drawing, a triangle, etc.

Those who have succeeded in the proof can be asked to consider the case of a nonconvex quadrilateral, to find (experimentally and with proof) the area of the midpoint quadrilateral, etc.
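The Varignon experiment also fits in a few lines. A sketch (the sample quadrilateral is our choice):

```python
# Midpoints of the sides of an arbitrary (even nonconvex) quadrilateral
# form a parallelogram (the sample quadrilateral is our choice).
def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

quad = [(0, 0), (7, 1), (3, 2), (1, 5)]  # nonconvex
m = [midpoint(quad[i], quad[(i + 1) % 4]) for i in range(4)]
# opposite sides of the midpoint quadrilateral are equal as vectors
v1 = (m[1][0] - m[0][0], m[1][1] - m[0][1])
v2 = (m[2][0] - m[3][0], m[2][1] - m[3][1])
print(v1 == v2)  # True
```

Each side of the midpoint quadrilateral is half a diagonal of the original one, which is exactly the hint about drawing a diagonal given above.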

This result has a peculiar history. In 2002–2003, one of the authors (Shabat) participated in teaching math classes with the use of “Live Geometry,” working with 8th-grade students of Moscow school no. 45 (now named after L.I. Milgram, its principal at that time) together with the teacher V.V. Kulagina. Along with learning the basic material, the students were invited to carry out research projects, which included Napoleon’s theorem, and it was natural to consider some more general situations. The other author (Semenov) supposed that, with such powerful tools for visual computer experiments, it would be possible to find new theorems unknown not only to school students or school teachers, but also to the mathematicians participating in the work and, possibly, even “absolutely new” results.

This supposition was largely justified. Zhenya Lisitsyn, an eighth-grader of school no. 45, experimentally discovered a “quadratoid theorem” (by a quadratoid, we mean a quadrilateral with equal and perpendicular diagonals). This theorem is stated as follows.

Theorem 3 (on the quadratoid). The centers of squares constructed externally on the sides of any quadrilateral are the vertices of a quadratoid.

In 2004, a Russian team of 100 people led by Semenov participated in the 10th International Congress on Mathematical Education in Copenhagen (see [11], [12]). During a special “Russian day,” a Russian mathematical exhibition was arranged on an area of 400 m2, where we promoted the research work of schoolchildren, illustrating its fruitfulness by Lisitsyn’s result. Quadratoids were even depicted on the delegation’s shirts. Shabat gave a presentation on this topic, which was well received. The result was treated as new, being essentially so.

Later, Shabat developed techniques for geometric research in dynamic environments together with the MSPU student Polina Makarova and the MSPU teacher Teslya [13]. An important example was generalizations of Napoleon’s theorem. It was then that Polina found that “Lisitsyn’s theorem” is a result known as van Aubel’s theorem, and it was proved in the 19th century (see [14]).
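The quadratoid (van Aubel) configuration can likewise be checked numerically. A sketch in complex-number notation (the sample quadrilateral and all names are ours):

```python
# Centers of squares erected externally on the sides of a quadrilateral:
# the segments PR and QS are equal and perpendicular (sample ours).
def square_center(a, b):
    # center of the square built on segment [a, b], to the right of a -> b
    return (a + b) / 2 + (b - a) * (-1j) / 2

quad = [0 + 0j, 6 + 1j, 5 + 5j, 1 + 4j]  # counterclockwise, so "right" is outside
P, Q, R, S = (square_center(quad[i], quad[(i + 1) % 4]) for i in range(4))
d1, d2 = P - R, Q - S
equal = abs(abs(d1) - abs(d2)) < 1e-9
perpendicular = abs((d1 * d2.conjugate()).real) < 1e-9
print(equal, perpendicular)  # True True: PQRS is a quadratoid
```

Dragging the vertices of the original quadrilateral in a dynamic environment corresponds here to changing the entries of `quad`; the two booleans stay true.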

Some other generalizations of Napoleon’s theorem can be found in [13].

The next section (until the transition to algebra), written by Shabat, shows how higher algebra, namely, complex numbers and matrices, can be used to prove theorems of elementary geometry.

This circumstance makes the field of elementary geometry especially valuable in the training of mathematics teachers. Today, this training is overloaded with branches of mathematics that have nothing to do with school even in their formulations; students are given material that they will never need and poorly assimilate and that does not contribute to their mathematical, intellectual, and cultural development.

Our approach is associated with considering transformations of sets of polygons; the same approach was developed in [15], which is a wonderful work combining simplicity and depth.

3 NAPOLEON–DAVIS TRANSFORMATION

For any positive integer \(n \in \mathbb{N}\) we introduce a (simplicial) cone of polygons

$$\begin{gathered} {{\mathcal{P}}_{n}}: = \{ z = ({{z}_{1}}, \ldots ,{{z}_{n}})|{{z}_{1}}, \ldots ,{{z}_{n}} \\ ~{\text{are the vertices of a convex}}\,\,n{\text{ - gon}}\} \subset {{\mathbb{C}}^{n}}. \\ \end{gathered} $$

Obviously, the set \({{\mathcal{P}}_{n}}\) is nonempty only for \(n \geqslant 3\).

For an arbitrary angle \(\alpha \in (0,\pi )\), the Napoleon–Davis transformation is defined as

$$\mathcal{N}{{\mathcal{D}}_{{n,\alpha }}}:{{\mathcal{P}}_{n}} \to {{\mathcal{P}}_{n}},$$

and it maps every polygon to a new one whose vertices lie outside the initial polygon at the apexes of isosceles triangles constructed on the sides of the initial one, with equal angles \(\alpha \) at the apexes opposite to these sides (see Fig. 11).

Fig. 11.

For \(n = 3\) and \(\alpha = \frac{{2\pi }}{3}\), this is the transformation attributed to Napoleon: in the introduced notation, Napoleon’s theorem states that the image \(\mathcal{N}{{\mathcal{D}}_{{3,\frac{{2\pi }}{3}}}}({{\mathcal{P}}_{3}})\) consists of the points of \({{\mathbb{C}}^{3}}\) corresponding to equilateral triangles.

Proposition. The Napoleon–Davis transformation extends to a linear mapping

$$\mathcal{N}{{\mathcal{D}}_{{n,\alpha }}}:{{\mathbb{C}}^{n}} \to {{\mathbb{C}}^{n}}.$$

If \({{w}_{k}}\) is the apex of the isosceles triangle based on the side \([{{z}_{{k - 1}}}{{z}_{k}}]\), then this mapping is given by the formula

$${{w}_{k}} = \frac{{\text{i}}}{2}\frac{{{{{\text{e}}}^{{ - {\text{i}}\frac{\alpha }{2}}}}{{z}_{{k - 1}}} - {{{\text{e}}}^{{{\text{i}}\frac{\alpha }{2}}}}{{z}_{k}}}}{{\sin \frac{\alpha }{2}}},$$

where \(k \in \mathbb{Z}/n\mathbb{Z}\).

Proof. By definition of the Napoleon–Davis transformation, we have the equalities

$$\frac{{{{z}_{{k - 1}}} - {{w}_{k}}}}{{{{z}_{k}} - {{w}_{k}}}} = {{{\text{e}}}^{{{\text{i}}\alpha }}},$$

for all \(k \in \mathbb{Z}/n\mathbb{Z}\), which yield the proposition. Q.E.D.
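The formula of the Proposition can be checked numerically against Napoleon’s theorem. A sketch (the function name and the sample triangle are ours):

```python
import cmath
import math

# The Proposition's formula for the Napoleon-Davis transformation, applied
# with n = 3, alpha = 2*pi/3 to an arbitrary triangle (names/sample ours).
def nd(zs, alpha):
    n = len(zs)
    s = math.sin(alpha / 2)
    return [0.5j * (cmath.exp(-0.5j * alpha) * zs[k - 1]
                    - cmath.exp(0.5j * alpha) * zs[k]) / s
            for k in range(n)]

w = nd([0 + 0j, 5 + 1j, 1 + 4j], 2 * math.pi / 3)
sides = [abs(w[0] - w[1]), abs(w[1] - w[2]), abs(w[2] - w[0])]
print(max(sides) - min(sides) < 1e-9)  # True: the image triangle is equilateral
```

Note that Python’s `zs[k - 1]` for `k = 0` wraps around to the last vertex, which matches the cyclic indexing by \(k \in \mathbb{Z}/n\mathbb{Z}\).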

Corollary. In the standard basis, the Napoleon–Davis transformation is given by a matrix proportional to the matrix

$$N{{D}_{{n,\alpha }}}: = \left( {\begin{array}{*{20}{c}} { - {{{\text{e}}}^{{\frac{{{\text{i}}\alpha }}{{\text{2}}}}}}}&0& \ldots & \ldots & \ldots &{{{{\text{e}}}^{{ - \frac{{{\text{i}}\alpha }}{{\text{2}}}}}}} \\ {{{{\text{e}}}^{{ - \frac{{{\text{i}}\alpha }}{{\text{2}}}}}}}&{ - {{{\text{e}}}^{{\frac{{{\text{i}}\alpha }}{{\text{2}}}}}}}&0& \ldots & \ldots &0 \\ 0&{{{{\text{e}}}^{{ - \frac{{{\text{i}}\alpha }}{{\text{2}}}}}}}&{ - {{{\text{e}}}^{{\frac{{{\text{i}}\alpha }}{{\text{2}}}}}}}&0& \ldots &0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0&0& \ldots &0&{{{{\text{e}}}^{{ - \frac{{{\text{i}}\alpha }}{{\text{2}}}}}}}&{ - {{{\text{e}}}^{{\frac{{{\text{i}}\alpha }}{{\text{2}}}}}}} \end{array}} \right).$$

Geometrically meaningful generalizations of Napoleon’s theorem are associated with finding parameters of the Napoleon–Davis transformation for which the results of this transformation have special properties, i.e., the transformation is not surjective. Since it is linear, it is not surjective if and only if it is degenerate.

Main theorem. The Napoleon–Davis transformation \(\mathcal{N}{{\mathcal{D}}_{{n,\alpha }}}\) is degenerate if and only if \({{{\text{e}}}^{{{\text{i}}n\alpha }}} = 1\).

Proof. We apply a simple auxiliary result.

Lemma. The determinant of the following \(n \times n\) matrix is given by the formula

$$\det \left( {\begin{array}{*{20}{c}} { - p}&0& \ldots & \ldots & \ldots &q \\ q&{ - p}&0& \ldots & \ldots &0 \\ 0&q&{ - p}&0& \ldots &0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0&0& \ldots &0&q&{ - p} \end{array}} \right) = {{( - 1)}^{n}}({{p}^{n}} - {{q}^{n}}).$$

Proof. It is based on induction on \(n\) with the cofactor expansion along the first column. Q.E.D.

To prove the main theorem, we need to set \(p = {{{\text{e}}}^{{\frac{{{\text{i}}\alpha }}{{\text{2}}}}}}\) and \(q = {{{\text{e}}}^{{ - \frac{{{\text{i}}\alpha }}{{\text{2}}}}}}\) in the lemma. Q.E.D.
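The main theorem can also be probed numerically before (or after) the proof. Below is a minimal sketch in Python (used here as a stand-in for whatever CE tool is at hand): we build the matrix from the corollary and compute its determinant for \(n = 6\) with \(\alpha = \frac{2\pi}{3}\) (where \({{\text{e}}^{{\text{i}}n\alpha }} = 1\)) and with \(\alpha = \frac{2\pi}{5}\) (where it is not).

```python
import cmath
import math

def nd_matrix(n, alpha):
    """Matrix of the Napoleon-Davis transformation ND_{n,alpha} (see the corollary)."""
    p = cmath.exp(1j * alpha / 2)   # diagonal entries are -p
    q = cmath.exp(-1j * alpha / 2)  # subdiagonal and corner entries are q
    m = [[0j] * n for _ in range(n)]
    for k in range(n):
        m[k][k] = -p
        m[k][(k - 1) % n] = q       # row k picks up z_{k-1} with coefficient q
    return m

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, d = len(m), 1 + 0j
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[piv][c]) < 1e-14:
            return 0j               # matrix is (numerically) singular
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return d

d_degenerate = det(nd_matrix(6, 2 * math.pi / 3))  # m = 3 divides n = 6
d_regular = det(nd_matrix(6, 2 * math.pi / 5))     # m = 5 does not divide 6
```

In agreement with the lemma, \(\left| {\det } \right| = \left| {{{p}^{n}} - {{q}^{n}}} \right| = 2\left| {\sin \frac{{n\alpha }}{2}} \right|\), so the first determinant vanishes and the second does not.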

The main theorem implies that, when we deal with transformations of n-gons, it suffices to consider only angles that are multiples of \(\frac{{2\pi }}{n}\); according to the historical tradition, we interpret these angles as the central angles of regular polygons. In other words, on the sides of arbitrary n-gons, we will construct regular m-gons.

Regular m-gons and angles \(\alpha = \frac{{2\pi }}{m}\). According to our main theorem, the transformation \(\mathcal{N}{{\mathcal{D}}_{{n,\frac{{2\pi }}{m}}}}\) is not surjective if and only if \({{{\text{e}}}^{{\frac{{2\pi {\text{i}}n}}{m}}}} = 1\), i.e., m divides n.

We indicate several pairs \((m,n)\) for which the polygons from the image of the Napoleon–Davis transformation \(\mathcal{N}{{\mathcal{D}}_{{n,\frac{{2\pi }}{m}}}}\) can be described geometrically.

Degenerate case m = 1. The transformation \(\mathcal{N}{{\mathcal{D}}_{{n,\alpha }}}\) with α = 2π has the form

$${\text{N}}{{{\text{D}}}_{{n,2\pi }}} = \left( {\begin{array}{*{20}{c}} 1&0& \ldots & \ldots & \ldots &{ - 1} \\ { - 1}&1&0& \ldots & \ldots &0 \\ 0&{ - 1}&1&0& \ldots &0 \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 0&0& \ldots &0&{ - 1}&1 \end{array}} \right).$$

For example,

$${\text{N}}{{{\text{D}}}_{{3,2\pi }}} = \left( {\begin{array}{*{20}{c}} 1&0&{ - 1} \\ { - 1}&1&0 \\ 0&{ - 1}&1 \end{array}} \right),$$

i.e.,

$${\text{N}}{{{\text{D}}}_{{3,2\pi }}}: = \left( \begin{gathered} {{z}_{1}} \\ {{z}_{2}} \\ {{z}_{3}} \\ \end{gathered} \right) \mapsto \left( \begin{gathered} {{z}_{1}} - {{z}_{3}} \\ {{z}_{2}} - {{z}_{1}} \\ {{z}_{3}} - {{z}_{2}} \\ \end{gathered} \right).$$

This mapping takes a triangle to a sequence of its directed sides. The mapping \({\text{N}}{{{\text{D}}}_{{n,2\pi }}}\) has a similar interpretation for n-gons with an arbitrary \(n > 3\).

Degenerate case m = 2. Although the construction of a regular 2-gon on sides of an arbitrary n-gon has no traditional geometric meaning, the angle \(\alpha = \pi \) can be substituted into the transformation formula to obtain

$${{w}_{k}} = \frac{{{{z}_{{k - 1}}} + {{z}_{k}}}}{2},$$

i.e., we obtain a transformation of a polygon into the one formed by the midpoints of the sides of the original polygon!

This construction is discussed in detail in [15]. The main result concerning the image of the Napoleon–Davis transformation in this case is known as Varignon’s theorem, which was mentioned above.

In our language, the Napoleon–Davis transformation \({\text{N}}{{{\text{D}}}_{{4,\pi }}}\) maps an arbitrary quadrilateral to a parallelogram.
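Varignon's statement is easy to confirm with a three-line computer experiment; the following Python sketch checks that the midpoints of the sides of a random quadrilateral form a parallelogram (opposite sides coincide as vectors).

```python
import random

random.seed(1)  # fixed seed: a reproducible "random" quadrilateral
z = [complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(4)]
# w_k = (z_{k-1} + z_k)/2; Python's negative indexing closes the polygon.
w = [(z[k - 1] + z[k]) / 2 for k in range(4)]
# In a parallelogram w_1 w_2 w_3 w_4, the sides w_1 w_2 and w_4 w_3 are equal vectors.
side_a = w[1] - w[0]
side_b = w[2] - w[3]
```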

Case \(m = n = 3\). This is where we meet what is known as Napoleon’s theorem.

In our notation, it is verified as follows:

$${\text{N}}{{{\text{D}}}_{{3,\frac{{2\pi }}{3}}}} = \left( {\begin{array}{*{20}{c}} {{{{\text{e}}}^{{ - \,\frac{{\pi i}}{3}}}}}&0&{ - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}} \\ { - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}}&{{{{\text{e}}}^{{ - \,\frac{{\pi i}}{3}}}}}&0 \\ 0&{ - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}}&{{{{\text{e}}}^{{ - \,\frac{{\pi i}}{3}}}}} \end{array}} \right),$$

so if we introduce the notation

$${\text{N}}{{{\text{D}}}_{{3,\frac{{2\pi }}{3}}}} \cdot \left( \begin{gathered} {{z}_{1}} \hfill \\ {{z}_{2}} \hfill \\ {{z}_{3}} \hfill \\ \end{gathered} \right) = \left( \begin{gathered} {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{1}} - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{3}} \\ - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{1}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{2}} \\ - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{2}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{3}} \\ \end{gathered} \right) = :\left( \begin{gathered} {{w}_{1}} \\ {{w}_{2}} \\ {{w}_{3}} \\ \end{gathered} \right),$$

then

$$\frac{{{{w}_{3}} - {{w}_{2}}}}{{{{w}_{2}} - {{w}_{1}}}} = \frac{{( - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{2}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{3}}) - ( - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{1}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{2}})}}{{( - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{1}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{2}}) - ({{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{1}} - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{3}})}}$$
$$\begin{gathered} = \frac{{{{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{1}} - {{z}_{2}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{3}}}}{{ - {{z}_{1}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{2}} + {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{3}}}} \\ {\text{ = }}\,{{{\text{e}}}^{{\frac{{\pi i}}{3}}}}\frac{{{{z}_{1}} - {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{2}} + {{{\text{e}}}^{{ - \frac{{2\pi i}}{3}}}}{{z}_{3}}}}{{ - {{z}_{1}} + {{{\text{e}}}^{{ - \frac{{\pi i}}{3}}}}{{z}_{2}} + {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}{{z}_{3}}}} = - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}} \\ \end{gathered} $$

(the last reduction holds, since \({{{\text{e}}}^{{ - \frac{{2\pi i}}{3}}}} = - {{{\text{e}}}^{{\frac{{\pi i}}{3}}}}\)). This is Napoleon’s theorem.
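The computation above can be replayed numerically for a random triangle; in the Python sketch below, the ratio \(\frac{{{{w}_{3}} - {{w}_{2}}}}{{{{w}_{2}} - {{w}_{1}}}}\) comes out equal to \(-{{\text{e}}^{\frac{{\pi {\text{i}}}}{3}}}\), which is exactly the equilaterality condition.

```python
import cmath
import random

random.seed(2)
a = cmath.exp(1j * cmath.pi / 3)  # e^{i pi/3}; note 1/a = e^{-i pi/3}
z1, z2, z3 = (complex(random.uniform(-3, 3), random.uniform(-3, 3))
              for _ in range(3))
# The rows of ND_{3, 2pi/3} applied to (z1, z2, z3), as displayed above:
w1 = z1 / a - a * z3
w2 = -a * z1 + z2 / a
w3 = -a * z2 + z3 / a
ratio = (w3 - w2) / (w2 - w1)   # should equal -e^{i pi/3} for every triangle
```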

Case \(m = n = 4\). We have

$${\text{N}}{{{\text{D}}}_{{4,\frac{\pi }{2}}}}: = \left( {\begin{array}{*{20}{c}} { - {{{\text{e}}}^{{\frac{{\pi {\text{i}}}}{{\text{4}}}}}}}&0&0&{{{{\text{e}}}^{{ - \,\frac{{\pi {\text{i}}}}{{\text{4}}}}}}} \\ {{{{\text{e}}}^{{ - \frac{{\pi {\text{i}}}}{{\text{4}}}}}}}&{ - {{{\text{e}}}^{{\frac{{\pi {\text{i}}}}{{\text{4}}}}}}}&0&0 \\ 0&{{{{\text{e}}}^{{ - \,\frac{{\pi {\text{i}}}}{{\text{4}}}}}}}&{ - {{{\text{e}}}^{{\frac{{\pi {\text{i}}}}{{\text{4}}}}}}}&0 \\ 0&0&{{{{\text{e}}}^{{ - \,\frac{{\pi {\text{i}}}}{{\text{4}}}}}}}&{ - {{{\text{e}}}^{{\frac{{\pi {\text{i}}}}{{\text{4}}}}}}} \end{array}} \right).$$

Now, introducing the notation

$${\text{N}}{{{\text{D}}}_{{4,\frac{\pi }{2}}}} \cdot \left( \begin{gathered} {{z}_{1}} \hfill \\ {{z}_{2}} \hfill \\ {{z}_{3}} \hfill \\ {{z}_{4}} \hfill \\ \end{gathered} \right) = \left( \begin{gathered} - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{1}} + {{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{4}} \\ {{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{1}} - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{2}} \\ {{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{2}} - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{3}} \\ {{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{3}} - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{4}} \\ \end{gathered} \right) = :\left( \begin{gathered} {{w}_{1}} \\ {{w}_{2}} \\ {{w}_{3}} \\ {{w}_{4}} \\ \end{gathered} \right),$$

we can calculate

$$\frac{{{{w}_{4}} - {{w}_{2}}}}{{{{w}_{3}} - {{w}_{1}}}} = \frac{{({{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{3}} - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{4}}) - ({{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{1}} - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{2}})}}{{({{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{2}} - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{3}}) - ( - {{{\text{e}}}^{{\frac{{\pi i}}{4}}}}{{z}_{1}} + {{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}{{z}_{4}})}}$$
$$ = \frac{{ - {{{\text{e}}}^{{ - \,\frac{{\pi i}}{4}}}}({{z}_{1}} - {\text{i}}{{z}_{2}} - {{z}_{3}} + {\text{i}}{{z}_{4}})}}{{{{{\text{e}}}^{{\frac{{\pi i}}{4}}}}({{z}_{1}} - {\text{i}}{{z}_{2}} - {{z}_{3}} + {\text{i}}{{z}_{4}})}} = - {{{\text{e}}}^{{ - \,\frac{{\pi i}}{2}}}} = {\text{i}}.$$

We have established the following analogue of Napoleon’s theorem:

The Napoleon–Davis transformation \({\text{N}}{{{\text{D}}}_{{4,\frac{\pi }{2}}}}\) maps an arbitrary quadrilateral to a quadratoid.
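As with Napoleon's theorem, this analogue admits a one-screen numerical check. The Python sketch below applies the matrix \({\text{N}}{{{\text{D}}}_{{4,\frac{\pi }{2}}}}\) displayed above to a random quadrilateral and verifies the quadratoid property: the image has equal, mutually perpendicular diagonals, i.e., \({{w}_{4}} - {{w}_{2}} = {\text{i}}({{w}_{3}} - {{w}_{1}})\).

```python
import cmath
import random

random.seed(3)
b = cmath.exp(1j * cmath.pi / 4)  # e^{i pi/4}; 1/b = e^{-i pi/4}
z1, z2, z3, z4 = (complex(random.uniform(-3, 3), random.uniform(-3, 3))
                  for _ in range(4))
# Rows of the matrix ND_{4, pi/2}:
w1 = -b * z1 + z4 / b
w2 = z1 / b - b * z2
w3 = z2 / b - b * z3
w4 = z3 / b - b * z4
diag_ratio = (w4 - w2) / (w3 - w1)   # should equal i for every quadrilateral
```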

The geometric section concludes with a comment on Napoleon’s theorem and its generalizations sent by V.N. Dubrovskii, associate professor in the Department of Mathematics of AESC MSU, who read a draft of this paper.

There is a geometric proof of the main theorem on the Napoleon–Davis transformation that is comprehensible to strong school students. In fact, it is contained in the solution to problem 19 from Yaglom’s classic book [16]. Let us try to construct an inverse transformation: given any \(n\)-gon \(P\) and an arbitrary point \({{z}_{1}}\), we begin to construct its preimage, i.e., the polygonal line \({{z}_{1}}{{z}_{2}} \ldots \) Each vertex \({{z}_{{k + 1}}}\) of this polyline must be the image of \({{z}_{k}}\) under rotation by \(\alpha \) around the corresponding vertex of \(P\). We obtain an \(n\)-segment polyline \({{z}_{1}}{{z}_{2}} \ldots {{z}_{{n + 1}}}\), not necessarily closed. If \(P\) is the image of the \(n\)-gon \({{z}_{1}}{{z}_{2}} \ldots {{z}_{n}}\), then \({{z}_{{n + 1}}} = {{z}_{1}}\) (the polyline is closed), i.e., \({{z}_{1}}\) is a fixed point of the composition of \(n\) rotations by the angle \(\alpha \). If \(n\alpha \ne 0\) mod \(2\pi \), then this composition is a rotation, i.e., \({{z}_{1}}\) is uniquely defined and, hence, the transformation is nondegenerate. However, if \(n\alpha = 0\) mod \(2\pi \), then the composition is a translation by a vector depending on \(P\). Then the following two cases are possible. For “most” polygons \(P\), the vector is nonzero; these polygons do not belong to the image of the Napoleon–Davis transformation. However, there are special \(P\) for which the vector is zero. Then the composition is the identity mapping and, for any choice of \({{z}_{1}}\), there is a (single) polyline with vertex \({{z}_{1}}\) mapped under the Napoleon–Davis transformation to \(P\), i.e., this transformation is degenerate. The next task is to describe these special polygons \(P\). In the case of triangles, we see that \(P\) is an equilateral triangle (Napoleon’s theorem). In the case of quadrilaterals, \(P\) is a quadratoid (van Aubel’s theorem).

School students can be led to the formulations of these theorems and their geometric proofs through an experiment. The task is to construct an \(n\)-gon from its image \(P\) for various values of \(\alpha .\) After completing the above-described construction in the dynamic geometry program, we try to superpose the ends \({{z}_{1}}\) and \({{z}_{{n + 1}}}\) of the polyline \({{z}_{1}}{{z}_{2}} \ldots {{z}_{{n + 1}}}\) with a mouse. If we succeed, then we have the nondegenerate case; otherwise, we see that \({{z}_{1}}\) and \({{z}_{{n + 1}}}\) are connected via parallel translation, but they can be superposed by moving the vertices of \(P.\) Moreover, it suffices to move only one vertex: this leads to both the Napoleon case and a quadratoid. Tasks based on this experiment have been implemented in the Mathematical Constructor [17]. Continuing this study, one can answer the more difficult question as to when an \(n\)-gon \(P\) is regular for \(\alpha = 2\pi {\text{/}}n\) (Napoleon–Barlotti theorem).

An important advantage of these experiments is that they form a series beginning with the simple case \(\alpha = \pi \) and lead students to the formulation of results gradually in the course of examining solutions to a not very complicated construction problem.

4 ALGEBRA

Computer algebra allows us to pass from the prescriptive school treatment of quadratic polynomials and quadratic equations to research-style generalizations.

(a) Graphs of polynomials. Few of the humanities students met by the authors (and even some mathematics teachers) know that, along with the axial symmetry of graphs of quadratic polynomials (see Fig. 12), there is a central symmetry of graphs of cubic parabolas (see Fig. 13).

Fig. 12. Axial symmetry of the parabola.

Fig. 13. Central symmetry of the cubic parabola.

Fig. 14. Central symmetry of the cubic parabola.

Fig. 15. Newton’s method.

Fig. 16. Secant line.

Students working in a dynamic mathematics environment are tasked to determine visually whether the graphs of polynomials have symmetry. The situation is especially illustrative if there are sliders on the screen, so that the coefficients of a polynomial can change gradually.

The next step is, by moving the graph, to try to superpose the axis of symmetry on the vertical axis and the center of symmetry on the origin. In doing so, the students can see how the coefficients vary in value.

The next step is to understand what the algebraic transformation is that corresponds to such a shift in the various special cases.

After that, stronger students might devise a general translation formula, while the others can assimilate the general nature of the symmetries under discussion from teacher’s explanations: given the graph of the polynomial

$$y = {{x}^{n}} + {{a}_{1}}{{x}^{{n - 1}}} + \ldots + {{a}_{n}}$$

with \(n \in \{ 2,3\} \), the substitution \(x = x'\, - \frac{{{{a}_{1}}}}{n}\) for n = 2 turns it into the graph of an even function, while, for n = 3, into the graph of an odd function shifted in the vertical direction.

It is useful to note that the graph of a cubic polynomial has a center of symmetry at the inflection point of this graph (see Fig. 14).
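This, too, can be checked by a short CE. The Python sketch below verifies the central-symmetry identity \(f({{x}_{0}} + t) + f({{x}_{0}} - t) = 2f({{x}_{0}})\) at the inflection abscissa \({{x}_{0}} = -\frac{{{{a}_{1}}}}{3}\) for a monic cubic with random coefficients.

```python
import random

random.seed(4)
a1, a2, a3 = (random.uniform(-5, 5) for _ in range(3))

def f(x):
    # A monic cubic y = x^3 + a1 x^2 + a2 x + a3
    return x**3 + a1 * x**2 + a2 * x + a3

x0 = -a1 / 3  # abscissa of the inflection point (the center of symmetry)
max_defect = max(abs(f(x0 + t) + f(x0 - t) - 2 * f(x0))
                 for t in (0.1 * k for k in range(1, 50)))
```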

In the preceding example, we productively combined formulas with visuality. However, not every computer experiment (even a school one) is built on visuality, although visuality is desirable.

What is the tool for computer experiments that expands the possibilities of dynamic geometry? Of course, it is computer algebra. Computer algebra tools are now successfully used wherever complicated analytical computation was once required. They may seem to “trivialize” school algebra. To some extent, this is true, because these tools have trivialized the professional analytical calculations discussed above. However, as in other cases, the computer opens up opportunities for creative research activities of students supported by the teacher.

In the following examples, we sometimes drop a clear and detailed description of the computer experiment conducted by students and omit teacher’s explanations, tasks, and research aims.

In detailed planning of an educational situation, it is advisable to highlight the following points:

0. motivation: why the problem is of interest to professional mathematicians and can be of interest to students;

1. the experiment to be conducted;

2. observation of results;

3. the discovery to be expected and hypotheses that may arise;

4. proofs of the hypotheses and related NON-experimental mathematics.

(b) Cubic Cardano formula over \(\mathbb{R}\). The issue of solving equations of degree higher than the second has drawn attention since ancient times (cf. doubling the cube). Closer to our time, its solution marked a renaissance in mathematics (we are able not only to reproduce ancient results, but also to yield our own results inaccessible to ancient people), which occurred later than the Renaissance in literature, architecture, and painting. The solution of cubic equations by Italian mathematicians in the 16th century led to the origin of modern algebra.

Note that cubic equations are adjacent to quadratic ones and, in individual problems of increased difficulty, they are systematically found in school mathematics, for example, in modern England. Accordingly, a natural “extracurricular” question asked by interested students is as follows:

for cubic equations, is there a formula analogous to the one for quadratic equations?

Answer: yes, but it is much more complicated.

The general cubic equation has the form

$$a{{x}^{3}} + b{{x}^{2}} + cx + d = 0$$
(1.2a)

with \(a \ne 0\). After the substitution \(x \leftarrow x - \frac{b}{{3a}}\) (mentioned above in the context of explaining the central symmetry of graphs of cubic polynomials) and subsequent division by a, Eq. (1.2a) takes the more compact form

$${{x}^{3}} + px + q = 0.$$
(1.2b)

In contrast to the Italians of the Renaissance, a modern human only needs to give the command solve to a computer algebra system. Using MAPLE as such a system, we obtain the solution

$$\begin{gathered} x = \frac{1}{6}{\kern 1pt} \sqrt[3]{{ - 108{\kern 1pt} q + 12{\kern 1pt} \sqrt {12{\kern 1pt} {{p}^{3}} + 81{\kern 1pt} {{q}^{2}}} }} \\ \, - 2{\kern 1pt} \frac{p}{{\sqrt[3]{{ - 108{\kern 1pt} q + 12{\kern 1pt} \sqrt {12{\kern 1pt} {{p}^{3}} + 81{\kern 1pt} {{q}^{2}}} }}}}. \\ \end{gathered} $$
(1.2c)

The experimenter receives a preliminary answer to the question: there is a formula with radicals, and it should be analyzed.

One of the arising difficulties (which complicated the life of the Italians in the 16th century as well) is that negative numbers may appear under the sign of the square root, although there always exists at least one real root. This fact can also be discovered experimentally: the same computer algebra systems include tools for plotting graphs. Observing these graphs leads to a hypothesis, which can be proved by combining operations with inequalities and arguments of analysis or topology. This experiment can be conducted even before calculating the roots in computer algebra.

Systematic algebraic experiments are connected with substitution of numbers for variables in the Cardano formula. Cases are distinguished when the formula numerically gives the correct answer; it can be checked by substitution. Moreover, in these cases, there is a field for experiments based on a giveaway strategy: for given \({{x}_{1}},{{x}_{2}} \in \mathbb{Q}\), we work with the polynomial \(P(x) = (x - {{x}_{1}})(x - {{x}_{2}})(x + {{x}_{1}} + {{x}_{2}})\). Application of the Cardano formula yields mysterious equalities connecting seemingly irrational numbers with rational ones. These equalities are easy to verify numerically, and the task is to UNDERSTAND them. It is useful to create one's own structured collection of such equalities.
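The giveaway strategy is easy to automate. The Python sketch below (with cmath standing in for a computer algebra system) takes the known roots 1, 2, \(-3\), forms p and q, and substitutes them into formula (1.2c); the complex square root lets the formula work even though \(12{{p}^{3}} + 81{{q}^{2}}\) is negative here.

```python
import cmath

x1, x2 = 1, 2            # chosen rational roots; the third is -(x1 + x2) = -3
x3 = -(x1 + x2)
p = x1 * x2 + x1 * x3 + x2 * x3   # = -7, by Vieta's theorem
q = -x1 * x2 * x3                 # = 6

# Formula (1.2c); ** (1/3) takes the principal complex cube root, and any
# branch works here, because the second term equals -p/(3u) with u = c/6.
c = (-108 * q + 12 * cmath.sqrt(12 * p**3 + 81 * q**2)) ** (1 / 3)
x = c / 6 - 2 * p / c
residual = x**3 + p * x + q       # should vanish up to rounding
```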

\( \bullet \) Can the denominator on the right-hand side of (1.2c) vanish? A simple analysis shows that it is possible, but only if p = 0, in which case Eq. (1.2b) is solved via simple extraction of the cube root.

Further observations show that the Cardano formula sometimes gives a correct and meaningful answer, but sometimes fails. Inspecting the graphs of cubic polynomials, we can determine the boundary between several types of real cubic equations (as in the case of quadratic equations). Special attention should be paid to the boundaries between the types (once again a parallel with quadratic polynomials!), namely, polynomials with MULTIPLE roots. In our case, these are the polynomials \(P(x) = {{(x - \alpha )}^{2}}(x + 2\alpha )\), and the Cardano formula for them should be analyzed separately.

\( \bullet \) Does formula (1.2c) remain valid if the number under the square root sign \(\sqrt {12{{p}^{3}} + 81{{q}^{2}}} \) is negative? Within rigorous school mathematics, the answer is no. However, it is in this case that a computer experiment can clarify the situation. Indeed, substituting (random) numerical values of \(p,q \in \mathbb{R}\) into the left-hand side of Eq. (1.2b), we can see the graph of this polynomial and, with its help, one or three approximate solutions of Eq. (1.2b). Substitution of the chosen values of p, q into formula (1.2c) will convince a school student that formula (1.2c) has a meaning beyond school mathematics and, in some cases, will prompt them to master complex numbers.

\( \bullet \) Why does the irrationality \(\sqrt[3]{{ - 108q + 12\sqrt {12{{p}^{3}} + 81{{q}^{2}}} }}\) enter the two terms on the right-hand side of (1.2c) in an asymmetric way? In contrast to the first two questions, this one is imprecise, and it does not have an exact answer. More precisely, the answer is that MAPLE gives formula (1.2c), which is valid; moreover, it can be verified by the program in symbolic form, in contrast to the more beautiful and traditional formula (see, e.g., [18]), whose derivation we now recall.

Substituting

$$x = u + {v}$$
(1.2d)

into Eq. (1.2b) (to create an additional degree of freedom), we obtain \({{(u + {v})}^{3}} + p(u + {v}) + q = 0\), which can be transformed into

$${{u}^{3}} + {{{v}}^{3}} + (3u{v} + p)(u + {v}) = - q.$$
(1.2e)

We postulate (using the above-mentioned freedom)

$$3u{v} + p = 0.$$
(1.2f)

Simplifying (1.2e) with the help of (1.2f) and cubing the slightly transformed equation (1.2f), we obtain the system of equations

$$\left\{ {\begin{array}{*{20}{l}} {{{u}^{3}} + {{v}^{3}} = - q} \\ {{{u}^{3}}{{v}^{3}} = - \frac{{{{p}^{3}}}}{{27}},} \end{array}} \right.$$
(1.2g)

which shows that \({{u}^{3}}\) and \({{{v}}^{3}}\) are the roots of a quadratic equation constructed using the coefficients of the original cubic equation, i.e.,

$$(\lambda - {{u}^{3}})(\lambda - {{{v}}^{3}}) \equiv {{\lambda }^{2}} + q\lambda - \frac{{{{p}^{3}}}}{{27}}$$
(1.2h)

and, therefore,

$$\{ {{u}^{3}},{{{v}}^{3}}\} = \left\{ { - \frac{q}{2} \pm \sqrt {\frac{{{{q}^{2}}}}{4} + \frac{{{{p}^{3}}}}{{27}}} } \right\},$$
(1.2i)

whence, finally, in view of (1.2d),

$$\boxed{x = \sqrt[3]{{ - \frac{q}{2} + \sqrt {\frac{{{{p}^{3}}}}{{27}} + \frac{{{{q}^{2}}}}{4}} }} + \sqrt[3]{{ - \frac{q}{2} - \sqrt {\frac{{{{p}^{3}}}}{{27}} + \frac{{{{q}^{2}}}}{4}} }}}\,.$$
(1.2j)

A comparative analysis of formulas (1.2j) and (1.2c) requires a computer experiment (CE). Formula (1.2j) is nicer and clearer than (1.2c), but (for reasons unknown to the authors) it cannot be verified in the general form with the help of MAPLE, in contrast to (1.2c). Accordingly, it is reasonable to use extensive CEs with consideration of numerous special cases and verification of both nontrivial equalities between irrationalities and approximate numerical values of the roots produced by both formulas. Moreover, it is useful to apply the giveaway strategy of analyzing the formulas with known roots, i.e., to choose numbers \({{x}_{{1,2,3}}}\) satisfying \({{x}_{1}} + {{x}_{2}} + {{x}_{3}} = 0,\) and to examine the cubic polynomials \((x - {{x}_{1}})(x - {{x}_{2}})(x - {{x}_{3}}) \equiv {{x}^{3}} + px + q\) with given roots. This work leads to a deeper understanding of Vieta’s theorem than in the standard school approaches.
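A minimal comparative CE in Python (a sketch restricted to the one-real-root case \(12{{p}^{3}} + 81{{q}^{2}} > 0\), where all radicals are real): both formulas are evaluated for \(p = q = 1\) and must give the same root. The helper cbrt is an assumption of this sketch: Python's `** (1/3)` does not handle negative real radicands, so a signed real cube root is used.

```python
import math

def cbrt(t):
    # Real (signed) cube root: cbrt(-8.0) = -2.0
    return math.copysign(abs(t) ** (1 / 3), t)

p, q = 1.0, 1.0                      # here 12 p^3 + 81 q^2 = 93 > 0

# Formula (1.2c), as produced by MAPLE:
c = cbrt(-108 * q + 12 * math.sqrt(12 * p**3 + 81 * q**2))
x_maple = c / 6 - 2 * p / c

# Formula (1.2j), the traditional Cardano form:
r = math.sqrt(p**3 / 27 + q**2 / 4)
x_cardano = cbrt(-q / 2 + r) + cbrt(-q / 2 - r)
```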

Finally, the validity of both formulas depends on the sign of the number

$$12{{p}^{3}} + 81{{q}^{2}} = 324\left( {\frac{{{{p}^{3}}}}{{27}} + \frac{{{{q}^{2}}}}{4}} \right).$$

This number is defined (as in the case of quadratic trinomials) up to multiplication by a squared rational number, and it deserves the name of the discriminant of the cubic polynomial. It is useful to understand its relation to \({{(({{x}_{1}} - {{x}_{2}})({{x}_{1}} - {{x}_{3}})({{x}_{2}} - {{x}_{3}}))}^{2}}\) for the cubic polynomial

$$a{{x}^{3}} + b{{x}^{2}} + cx + d \equiv a(x - {{x}_{1}})(x - {{x}_{2}})(x - {{x}_{3}}).$$

All these operations are labor-intensive in manual calculations, but they can be quite convincing within properly organized computer experiments.

After the corresponding classification and specifications, the Cardano formula always yields ONE root. In the case of extracting the square root of a negative number, the formula should be transformed and related to angle trisection.

The rigorous proofs carried out in the two cases (known as the reducible and irreducible cases, following archaic Italian terminology that would be impossible today) leave a sense of dissatisfaction: how can these different cases be treated so differently? This suggests the transition to complex numbers. After the theory of decomposing cubic trinomials into linear factors has been completely clarified, a natural question concerns fourth-degree polynomials. A perhaps surprising fact is that the Italians reduced them to third-degree polynomials by considering, for a fourth-degree polynomial with roots \({{x}_{1}},{{x}_{2}},{{x}_{3}},{{x}_{4}}\), a third-degree polynomial with roots

$${{y}_{1}} = {{x}_{1}}{{x}_{2}} + {{x}_{3}}{{x}_{4}},$$
$${{y}_{2}} = {{x}_{1}}{{x}_{3}} + {{x}_{2}}{{x}_{4}},$$
$${{y}_{3}} = {{x}_{1}}{{x}_{4}} + {{x}_{2}}{{x}_{3}}.$$

In the course of the extra-experimental arguments, it is useful to turn to examples from their experimental part (here, the computer is absolutely necessary, since the calculations are terribly cumbersome). After this material has been mastered, the learners are ready to deal with the fundamental theorem of algebra and Galois theory.

(c) Pell’s equation. This name belongs to the following equation in integers (see, e.g., [19]):

$${{x}^{2}} - D{{y}^{2}} = 1;$$

our consideration is restricted to the case D = 2. This is one of the simplest equations of (the least nontrivial) degree 2 in integers, but it is associated with interesting mathematics going back to ancient times [20].

The first task is to find some solutions experimentally. The next task is to find as many solutions as possible.

A natural question arises: are there arbitrarily large solutions to the equation \({{x}^{2}} - 2{{y}^{2}} = 1\)? With the help of CE, progress in this task can be achieved even by learners with minimal mathematical training and programming skills: solutions can be sought by simple enumeration. For students, this is a fine introduction into the use of a computer in number theory.
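The enumeration takes a few lines in any language; here is a Python sketch that finds every solution of \({{x}^{2}} - 2{{y}^{2}} = 1\) with \(y \leqslant {{10}^{4}}\) by testing whether \(1 + 2{{y}^{2}}\) is a perfect square.

```python
import math

solutions = []
for y in range(1, 10**4 + 1):
    t = 1 + 2 * y * y
    x = math.isqrt(t)       # integer square root, exact for big integers
    if x * x == t:
        solutions.append((x, y))
```

The search returns (3, 2), (17, 12), (99, 70), (577, 408), (3363, 2378), after which the solutions outgrow the search range.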

Next, interested learners can be told about the number ring \(\mathbb{Z} + \mathbb{Z}\sqrt 2 \), and we can introduce the conjugate \(\overline {x + y\sqrt 2 } : = x - y\sqrt 2 \) and the norm \({\text{N}}(z): = z\overline z \), so that Pell’s equation can be interpreted as

$${\text{N}}(z) = 1.$$

The multiplicative property of the norm (i.e., the identity \({\text{N}}({{z}_{1}}{{z}_{2}}) \equiv {\text{N}}({{z}_{1}}){\text{N}}({{z}_{2}})\), which can be directly checked by learners with minimal training) allows us to present an infinite sequence of solutions with the initial pair \({{x}_{1}} = 3,{{y}_{1}} = 2\) and the recurrence

$$\left\{ {\begin{array}{*{20}{l}} {{{x}_{{n + 1}}}: = 3{{x}_{n}} + 4{{y}_{n}}} \\ {{{y}_{{n + 1}}}: = 2{{x}_{n}} + 3{{y}_{n}}.} \end{array}} \right.$$

Here, CE not only helps to conclusively answer the above question about infinitely many solutions, but also offers a wide range of opportunities for establishing and checking numerous provable estimates and asymptotics.
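The recurrence is immediate to run, and the multiplicative property of the norm can be checked wholesale; below is a Python sketch (exact big-integer arithmetic is automatic in Python, as in MAPLE).

```python
x, y = 3, 2                    # the initial solution
pell = [(x, y)]
for _ in range(49):            # 50 solutions in total
    # Each step multiplies x + y*sqrt(2) by 3 + 2*sqrt(2) in the ring Z + Z*sqrt(2).
    x, y = 3 * x + 4 * y, 2 * x + 3 * y
    pell.append((x, y))

# The norm x^2 - 2 y^2 must remain equal to 1 along the whole sequence.
norm_ok = all(x * x - 2 * y * y == 1 for x, y in pell)
```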

Additionally, it is natural to discuss one application of Pell’s equation: after establishing the relation

$$\mathop {\lim }\limits_{n \to \infty } \frac{{{{x}_{n}}}}{{{{y}_{n}}}} = \sqrt 2 $$

(which is almost obvious, but nevertheless requires proof), the introduced sequence makes it possible to approximate the number \(\sqrt 2 \). Table 1 lists the values of the above-defined positive integers \({{x}_{n}},{{y}_{n}}\) and decimal approximations of the rational numbers \(\frac{{{{x}_{n}}}}{{{{y}_{n}}}}\) computed in MAPLE (which, in contrast to classic programming languages, works with arbitrarily large positive integers and yields arbitrarily accurate decimal approximations of rational numbers).

Table 1.

\(n\) | \({{x}_{n}}\) | \({{y}_{n}}\) | \(\frac{{{{x}_{n}}}}{{{{y}_{n}}}}\)
1 | 3 | 2 | 1.50000000000000000000000000000000
2 | 17 | 12 | 1.41666666666666666666666666666667
3 | 99 | 70 | 1.41428571428571428571428571428571
4 | 577 | 408 | 1.41421568627450980392156862745098
5 | 3363 | 2378 | 1.41421362489486963835155592935240
6 | 19601 | 13860 | 1.41421356421356421356421356421356
7 | 114243 | 80782 | 1.41421356242727340249065385853284
8 | 665857 | 470832 | 1.41421356237468991062629557889013
9 | 3880899 | 2744210 | 1.41421356237314199715036385699345
10 | 22619537 | 15994428 | 1.41421356237309643083203725697474
11 | 131836323 | 93222358 | 1.41421356237309508948486370619374
12 | 768398401 | 543339720 | 1.41421356237309504999928957890286
13 | 4478554083 | 3166815962 | 1.41421356237309504883694280179329
14 | 26102926097 | 18457556052 | 1.41421356237309504880272650735873
15 | 152139002499 | 107578520350 | 1.41421356237309504880171927369328
16 | 886731088897 | 627013566048 | 1.41421356237309504880168962350253
17 | 5168247530883 | 3654502875938 | 1.41421356237309504880168875068241
18 | 30122754096401 | 21300003689580 | 1.41421356237309504880168872498898

In the original color table, the stabilizing digits of the approximations are shown in red. It is useful to prove that these are significant decimals of the number \(\sqrt 2 \). One look at the table is enough to notice that the number of correct digits is approximately proportional to the approximation index. Refinements and a proof of this statement (which resulted from a simple CE!), together with mastering the constructive definition of the limit, are much more useful than memorizing the standard definition (which is not constructive).

In the following presentation, we no longer highlight the various stages of the students' and teacher's computer experiments, but give illustrative examples demonstrating the experience gained from such experiments; the reader can build an experiment to their own taste.

(d) Numerical solution of polynomial equations. The above-mentioned Newton method is highly effective and nearly universal. It is applicable to arbitrary smooth functions

$$w = f(z)$$

(we use the notation of variables typical of complex analysis, since we are going to compare the results of this and preceding sections) and relies on the recursion (see Fig. 15)

$${{z}_{{n + 1}}}: = {{z}_{n}} - \frac{{f({{z}_{n}})}}{{f'({{z}_{n}})}},$$

which has a clear geometric interpretation.

As an example, we consider the polynomial

$$f(z) = {{z}^{2}} - 2$$

and find its root \(\sqrt 2 \) by applying Newton’s method starting from the (rather rough) initial guess \({{z}_{1}}: = 2\). We have \({{z}_{{n + 1}}}: = {{z}_{n}} - \frac{{z_{n}^{2} - 2}}{{2{{z}_{n}}}}\), or

$${{z}_{{n + 1}}}: = \frac{{z_{n}^{2} + 2}}{{2{{z}_{n}}}}.$$
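The iteration can be run exactly in rational arithmetic; in the Python sketch below, Fraction plays the role of MAPLE's exact rationals, and the iterates reproduce the fractions \(\frac{3}{2},\frac{{17}}{{12}},\frac{{577}}{{408}}, \ldots \) behind the decimal approximations.

```python
from fractions import Fraction

z = Fraction(2)          # the rough initial guess z_1 = 2
iterates = [z]
for _ in range(6):
    z = (z * z + 2) / (2 * z)   # z_{n+1} = (z_n^2 + 2) / (2 z_n)
    iterates.append(z)
```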

The corresponding decimal approximations are given in Table 2.

Table 2.

\(n\) | \({{z}_{n}}\)
1 | 2
2 | 1.50000000000000000000000000000000
3 | 1.41666666666666666666666666666667
4 | 1.41421568627450980392156862745098
5 | 1.41421356237468991062629557889013
6 | 1.41421356237309504880168962350253
7 | 1.41421356237309504880168872420970

Obviously, Newton’s method as applied to the calculation of \(\sqrt 2 \) is much more efficient than the above-described method based on solving Pell’s equation. As was noted above, the number of significant digits was approximately proportional to the index of the approximation. In the case of Newton’s method, the law is less obvious:

the number of correct digits approximately doubles with each iteration.

A more careful check of this law and its more accurate formulation and justification require further CEs. Students’ interest in mathematics will manifest itself in their wanting to consider similar issues for other approximations, at least for \(\sqrt 3 \).

Finally, we note that the efficiency of algorithms, which naturally arises during work in the spirit of the considered examples, is a modern concept of importance from both theoretical and practical points of view. It is not included in the modern Unified State Examination program, but can be introduced and studied with the help of CE.

5 CALCULUS

Below, we will only discuss the possibility of illustrating the basic concepts of calculus with the help of CE.

(a) Derivative. Here, the concepts of secant and tangent lines have to be mastered graphically.

The formulation saying that the tangent line is the limit position of the secants is preserved in the memory of many graduates of modern schools, but is usually not related to visual images or rigorous mathematical concepts. Possibly, the reason is that the graphs of elementary functions (except for parabolas and hyperbolas) are difficult to draw accurately on the chalkboard; the same is true of secant and tangent lines.

The secant line of a graph is given by

$$y = f({{x}_{0}}) + \frac{{f({{x}_{0}} + h) - f({{x}_{0}})}}{h}(x - {{x}_{0}})$$
(1.3a)

which is an important formula in calculus (see Fig. 16). The difficulty in understanding it is that the letters involved

$$f;{{x}_{0}};h;x,y$$

correspond to four different levels of “constancy.”

(1) f is the function for which we discuss secant lines to its graph;

(2) x0 is the abscissa of a point through which a family of secants pass;

(3) h is a parameter of this family;

(4) x, y are the coordinates in the plane (where we work) in terms of which the equations of secants are written.

Can computer experiments help in understanding formula (1.3a)? We think they can. For this purpose, we need to give students the opportunity to see numerous families of secants: first ready-made ones and then families constructed on their own with suitable tools.

In the example given in Fig. 17, f(x) = sinx, x0 = 1, \(h \in \left\{ {1,\frac{3}{4},\frac{1}{{42}}} \right\}\).

Fig. 17. Secant lines.
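The data behind Fig. 17 can be generated numerically as well as graphically. A minimal sketch (Python; the plotting is omitted) computes the secant slopes from formula (1.3a) and shows them approaching the slope of the tangent, \(\cos 1\):

```python
import math

def secant_slope(f, x0, h):
    """Slope of the secant through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h

# the same data as in Fig. 17: f = sin, x0 = 1, shrinking h
for h in (1, 3/4, 1/42, 1e-6):
    slope = secant_slope(math.sin, 1.0, h)
    print(f"h = {h:<10} slope = {slope:.6f}")
# the slopes approach cos(1) = 0.540302..., the slope of the tangent line
```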

The understanding of the definition of the derivative

$$f'({{x}_{0}}): = \mathop {\lim }\limits_{h \to 0} \frac{{f({{x}_{0}} + h) - f({{x}_{0}})}}{h}$$
(1.3b)

is complicated by the same factors as those indicated in the discussion of formula (1.3a), but there is an additional factor, namely, the definition of the limit. Memorizing the “\(\varepsilon \)\(\delta \)” definition lies beyond the mnemonic capabilities of a modern nonmathematician with secondary education (see, though, [21]). However, the above-mentioned combination tangent is the limit position of secants, which can be rigorously defined, together with another one

derivative is the slope of the tangent,

can be understood well given enough illustrations based on CE. For example, the above figure with several secants can be supplemented with the tangent line (see Fig. 18) given by the equation

$$y = \sin 1 + (\cos 1)(x - 1),$$

which involves the unfamiliar but easily computable values \(\sin 1 = 0.841470984 \ldots \) and \(\cos 1 = 0.540302305 \ldots \).

Fig. 18. Secants and a tangent line.

As the experience of one of the authors shows, after constructing a number of pictures of this type, even a person far removed from mathematics (e.g., an RSUH student, i.e., a linguist or psychologist) masters derivatives with no less confidence than negative numbers.

The general equation of a tangent line that emerges from these considerations is

$$\boxed{y = f({{x}_{0}}) + f'({{x}_{0}})(x - {{x}_{0}})\,.}$$
(1.3c)

(b) Taylor series. This branch of classical calculus—a beautiful and deep topic of great applied importance—lies far beyond the current school curriculum.

Meanwhile, after learners have mastered the construction of tangents, i.e., linear approximations of elementary functions, it is natural to consider their approximations by polynomials of higher degrees. Answers were obtained as early as the 18th century, and this topic is easily mastered with the help of CE.

As a striking example, we consider several approximations of the sinusoid, i.e., the graph of the function \(y = \sin x\), by segments of the Taylor series

$$\sin x = x - \frac{{{{x}^{3}}}}{{3!}} + \frac{{{{x}^{5}}}}{{5!}} - \frac{{{{x}^{7}}}}{{7!}} \pm \ldots $$

(see Figs. 19–22).
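Before looking at the pictures, a student can compare the partial sums with the built-in sine numerically. A minimal sketch (Python, not from the original text):

```python
import math

def sin_taylor(x, degree):
    """Partial sum x - x^3/3! + x^5/5! - ... up to the given degree."""
    total, term = 0.0, x
    n = 1
    while n <= degree:
        total += term
        # next term: multiply by -x^2 and divide by the next two factorial factors
        term *= -x * x / ((n + 1) * (n + 2))
        n += 2
    return total

print(sin_taylor(1.0, 7))  # 0.841468...
print(math.sin(1.0))       # 0.841470...
```

Varying x and the degree reproduces the qualitative content of Figs. 19–22: near zero even the cubic polynomial is accurate, while far from zero more terms are needed.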

Fig. 19. Approximation of the sine function by a Taylor series, degree 1.

Fig. 20. Approximation of the sine function by a Taylor series, degree 3.

Fig. 21. Approximation of the sine function by a Taylor series, degree 5.

Fig. 22. Approximation of the sine function by a Taylor series, degree 7.

When demonstrating these approximations to linguistics students, one of the authors usually met with their desire to approximate something on their own, and several times heard the question "Why weren't we shown this in school?" The answer remains unknown.

Computer algebra systems can produce any number of truncated Taylor series for elementary functions, and the pattern is usually easy to guess, especially for people familiar with factorials. Exceptions are the Taylor series for the tangent function

$$\begin{gathered} \tan x = x + \frac{{{{x}^{3}}}}{3} + \frac{{2{{x}^{5}}}}{{15}} + \frac{{17{\kern 1pt} {{x}^{7}}}}{{315}} + \frac{{62{\kern 1pt} {{x}^{9}}}}{{2835}} \\ + \frac{{1382{\kern 1pt} {{x}^{{11}}}}}{{155\,925}} + \frac{{21\,844{\kern 1pt} {{x}^{{13}}}}}{{6\,081\,075}} + \frac{{929\,569{\kern 1pt} {{x}^{{15}}}}}{{638\,512\,875}} + \ldots \\ \end{gathered} $$

and the secant function

$$\begin{gathered} \frac{1}{{\cos x}} = 1 + \frac{{{{x}^{2}}}}{2} + \frac{{5{\kern 1pt} {{x}^{4}}}}{{24}} + \frac{{61{\kern 1pt} {{x}^{6}}}}{{720}} + \frac{{277{\kern 1pt} {{x}^{8}}}}{{8064}} \\ + \frac{{50\,521{\kern 1pt} {{x}^{{10}}}}}{{3\,628\,800}} + \frac{{540\,553{\kern 1pt} {{x}^{{12}}}}}{{95\,800\,320}} + \frac{{199\,360\,981{\kern 1pt} {{x}^{{14}}}}}{{87\,178\,291\,200}} + \ldots \\ \end{gathered} $$
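These coefficients need not be taken on faith from a computer algebra system. Since \(\tan {\kern 1pt} ' = 1 + {{\tan }^{2}}\), the Taylor coefficients \({{a}_{k}}\) of the tangent satisfy a simple recurrence, \((k + 1){{a}_{{k + 1}}} = [k = 0] + \sum\nolimits_i {{a}_{i}}{{a}_{{k - i}}}\); the following sketch (Python, exact rational arithmetic; not from the original text) reproduces the numerators and denominators above:

```python
from fractions import Fraction

def tan_taylor_coeffs(max_deg):
    """Coefficients a_0..a_max_deg of tan x, derived from (tan x)' = 1 + tan^2 x."""
    a = [Fraction(0)] * (max_deg + 1)
    for k in range(max_deg):
        # (k+1) a_{k+1} = [k == 0] + sum_i a_i a_{k-i}
        conv = sum(a[i] * a[k - i] for i in range(k + 1))
        a[k + 1] = (int(k == 0) + conv) / (k + 1)
    return a

a = tan_taylor_coeffs(9)
# a[1], a[3], a[5], a[7], a[9] reproduce the series above:
# 1, 1/3, 2/15, 17/315, 62/2835
```

An analogous recurrence works for \(1{\text{/}}\cos x\), and comparing such hand-rolled coefficients with the output of a computer algebra system is itself a useful CE.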

These enigmatic numbers are related to a question that was only briefly touched upon earlier while we discussed the graphs of quadratic and cubic polynomials, namely, the number of possible shapes of graphs of real polynomials of arbitrary degrees. The relationship between these numbers and the coefficients of Taylor series expansions of the tangent and secant functions can be found in [23].

After mastering several Taylor series expansions of well-known functions and checking these expansions graphically and numerically, the computer experimenter comes to see the continuation of formula (1.3c),

$$\boxed{y\, = \,f({{x}_{0}})\, + \,\frac{{f'({{x}_{0}})}}{{1!}}(x\, - \,{{x}_{0}})\, + \,\frac{{f''({{x}_{0}})}}{{2!}}{{{(x\, - \,{{x}_{0}})}}^{2}}\, + \, \ldots }$$
(1.3d)

as an indisputable pinnacle of classical calculus.

(c) Some integrals. We say a few words only about definite integrals, since the development of the (rather complicated) technique of indefinite integration seems an outdated teaching approach in an age when computer algebra systems give the answer at the press of a button. This point of view concerns teaching mathematics to (future) nonmathematicians, who should nevertheless learn to understand and check answers produced by a computer.

The first nontrivial (transcendental) integral is

$$\int\limits_{ - 1}^1 {\frac{{{\text{d}}x}}{{\sqrt {1 - {{x}^{2}}} }} = 3.141592...} \,\,.$$

Together with a rather nontrivial interpretation of this integral as the area of the unit disk, it is useful to teach a rigorously thinking student to understand the equality

$$\pi : = \int\limits_{ - 1}^1 {\frac{{{\text{d}}x}}{{\sqrt {1 - {{x}^{2}}} }}} $$

as the definition of the number \(\pi \). CEs are associated with its approximate computation.

The other two definite integrals to be considered are not expressible in elementary functions. For future mathematicians, this impossibility result is as important as the unsolvability of fifth-degree equations in radicals or the impossibility of doubling the cube with compass and straightedge; for others, this small inconvenience does not interfere with the numerical study of the integrals.

The integral

$$\int\limits_{ - \infty }^\infty {{{{\text{e}}}^{{ - {{x}^{2}}}}}{\text{d}}x = 1.772453851...} $$

plays a central role in probability theory. The approximate equality

$${{1.772453851}^{2}} \approx 3.141592654$$

puzzles the experimenter. Apparently, only the idea of working with the multiple integral

$$\int\limits_{ - \infty }^\infty {\int\limits_{ - \infty }^\infty {{{{\text{e}}}^{{ - {{x}^{2}} - {{y}^{2}}}}}{\text{d}}x{\text{d}}y} } $$

can help clarify the equality

$$\int\limits_{ - \infty }^\infty {{{{\text{e}}}^{{ - {{x}^{2}}}}}{\text{d}}x = \sqrt {\pi \,} .} $$

A separate issue is concerned with rigorous proof of this equality, which can be accessible to school students.
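Before any proof, the equality invites a numerical check. A sketch (Python, trapezoidal rule on the truncated range \([ - 10,10]\), where the discarded tails are negligible; the step size is an ad hoc choice):

```python
import math

def gauss_integral(a=-10.0, b=10.0, n=200_000):
    """Trapezoidal approximation of the integral of e^(-x^2) over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (math.exp(-a * a) + math.exp(-b * b))
    for i in range(1, n):
        x = a + i * h
        total += math.exp(-x * x)
    return total * h

val = gauss_integral()
print(val)        # 1.7724538... = sqrt(pi)
print(val * val)  # 3.1415926... : the square is pi
```

Squaring the numerical value and recognizing \(\pi \) is exactly the kind of puzzle discussed above; the two-dimensional integral then explains what the experiment discovered.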

The other integral, also important for probability theory, depends on the parameter x; we temporarily denote it by

$$x?: = \int\limits_0^\infty {{{t}^{x}}{{{\text{e}}}^{{ - t}}}{\text{d}}t} .$$

It is easy to see that the integral converges for \(x \geqslant 0\). With some integration skills, one finds that this integral can be evaluated in closed form for \(x \in \mathbb{N}\). Moreover, it turns out that

$$x? = x!$$

Now we can extrapolate the function

$$n \mapsto n!: = 1 \cdot 2 \cdot \ldots \cdot n,$$

by defining

$${{\mathbb{R}}_{{ \geqslant 0}}} \to \mathbb{R}:x \mapsto \int\limits_0^\infty {{{t}^{x}}{{{\text{e}}}^{{ - t}}}{\text{d}}t} .$$

MAPLE can construct the graph of this function (see Fig. 23) and compute

$$\left( {\frac{1}{2}} \right)! = 0.8862269255 \ldots .$$
Fig. 23. Graph of the factorial.
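The same value can be checked by direct numerical integration; the truncation point and step size below are ad hoc choices for this sketch (Python; in Python's standard library, the extrapolated factorial is available as math.gamma(x + 1)):

```python
import math

def half_factorial_integral(t_max=40.0, n=400_000):
    """Trapezoidal approximation of the integral of sqrt(t) e^(-t) over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (0.0 + math.sqrt(t_max) * math.exp(-t_max))
    for i in range(1, n):
        t = i * h
        total += math.sqrt(t) * math.exp(-t)
    return total * h

val = half_factorial_integral()
print(val)                     # ≈ 0.88622..., i.e., (1/2)!
print(math.sqrt(math.pi) / 2)  # 0.8862269...
```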

Once again, the experimenter can discover the mysterious approximate equality

$${{4(0.8862269255)}^{2}} \approx 3.141592654,$$

but this time the authors are able to explain the exact equality

$$\left( {\frac{1}{2}} \right)! = \int\limits_0^\infty {\sqrt t {{{\text{e}}}^{{ - t}}}{\text{d}}t = \frac{{\sqrt \pi }}{2}} $$

only with the help of the theory of the Euler gamma function, which relies heavily on the theory of functions of a complex variable.

6 PROBABILITY THEORY

Contrary to the dominant traditions, we believe it is possible to teach probability theory as a branch of mathematics rather than as its applied field. However, in any case, computer experiments are undoubtedly a natural part of the course.

(a) Random numbers. Stochastic CEs based on random number generators correspond to the spirit of probability theory. However, the results they produce should be interpreted as hypotheses relying on random samples and should be compared with exact results of CEs based on exhaustive search. Let us give two examples.

Monte Carlo calculation of \(\pi \). Here, for a positive integer \(r \in \mathbb{N}\), we estimate the cardinality of the set

$$\begin{gathered} {{{\text{Q}}}_{r}}: = \{ (x,y) \in \{ 0,1,2, \ldots ,r\} \\ \, \times \{ 0,1,2, \ldots ,r\} \,{\text{|}}\,{{x}^{2}} + {{y}^{2}} \leqslant {{r}^{2}}\} ; \\ \end{gathered} $$

it is useful to prove rigorously that

$$\pi = 4\mathop {\lim }\limits_{r \to \infty } \frac{{\# {{{\text{Q}}}_{r}}}}{{{{r}^{2}}}},$$

even though the equality appears evident. This sequence converges very slowly; here, it is also useful to obtain rigorous bounds. However, this inefficient method agrees well with an intuitive idea of the number π as the area of the unit disk. A curious experimenter can get interested in the search for faster methods (including the above-mentioned integral), which are abundant in both classical and modern literature.

According to the Monte Carlo method, the points of the square \(\{ 0,1,2, \ldots ,r\} \times \{ 0,1,2, \ldots ,r\} \) are chosen not consecutively one after another, but at random. With luck, we obtain better results.
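Both the exhaustive lattice count and its Monte Carlo variant fit in a few lines; a sketch (Python; the sample size and seed are arbitrary choices of this sketch):

```python
import random

def pi_lattice(r):
    """Exhaustive count of lattice points (x, y) in [0, r]^2 with x^2 + y^2 <= r^2."""
    count = sum(1 for x in range(r + 1) for y in range(r + 1)
                if x * x + y * y <= r * r)
    return 4 * count / (r * r)

def pi_monte_carlo(r, samples, seed=0):
    """The same ratio estimated from random points of the square."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if rng.randint(0, r) ** 2 + rng.randint(0, r) ** 2 <= r * r)
    return 4 * hits / samples

print(pi_lattice(1000))              # ≈ 3.14..., converging slowly to pi
print(pi_monte_carlo(10**6, 100_000))
```

Comparing how the error decreases in the two versions is itself a worthwhile CE on the slow convergence mentioned above.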

What is the probability that a fraction taken at random is reducible? Here, by applying exhaustive search, we can obtain numerical results in a similar manner to the preceding example, but to understand them, we should rely on our probabilistic intuition.

[The fraction \(\frac{a}{b}\) is reducible] \( \Leftrightarrow \)

$$\begin{gathered} \Leftrightarrow [[[a \in 2\mathbb{Z}] \wedge [b \in 2\mathbb{Z}]] \vee [[a \in 3\mathbb{Z}] \\ \, \wedge [b \in 3\mathbb{Z}]] \vee [[a \in 5\mathbb{Z}] \wedge [b \in 5\mathbb{Z}]] \vee \ldots ] \\ \end{gathered} $$
(1.4a)

(disjunction of a countable set of statements parameterized by prime numbers).

Taking the negation of \((1.4a)\) and using de Morgan’s laws, we obtain

[The fraction \(\frac{a}{b}\) is irreducible]

$$\Leftrightarrow \overline {[[[a \in 2\mathbb{Z}] \wedge [b \in 2\mathbb{Z}]] \vee [[a \in 3 \mathbb{Z}] \wedge [b \in 3\mathbb{Z}]] \vee [[a \in 5\mathbb{Z}] \wedge [b \in 5\mathbb{Z}]] \vee \ldots ]}$$
$$ \Leftrightarrow \overline {[[a \in 2\mathbb{Z}] \wedge [b \in 2\mathbb{Z}]]} \wedge \overline {[[a \in 3\mathbb{Z}] \wedge [b \in 3\mathbb{Z}]]} \wedge \overline {[[a \in 5\mathbb{Z}] \wedge [b \in 5\mathbb{Z}]]} \wedge \ldots $$
(1.4b)

(conjunction of a countable set of statements parameterized by prime numbers).

It is at this moment that we should turn on our probabilistic intuition. Assuming that the probability P on the set of pairs of positive integers makes sense (as the limit of probabilities on a finite set of fractions with bounded numerators and denominators) and that the conjuncts in (1.4b) are pairwise independent, we obtain P (random fraction is irreducible)

$$\begin{gathered} = (1 - {\mathbf{P}}([a \in 2\mathbb{Z}] \wedge [b \in 2\mathbb{Z}])) \\ \times \,\,(1 - {\mathbf{P}}([a \in 3\mathbb{Z}] \wedge [b \in 3\mathbb{Z}])) \\ \times \,\,(1 - {\mathbf{P}}([a \in 5\mathbb{Z}] \wedge [b \in 5\mathbb{Z}])) \ldots \\ \end{gathered} $$
$$ = \left( {1 - {{{\left( {\frac{1}{2}} \right)}}^{2}}} \right)\left( {1 - {{{\left( {\frac{1}{3}} \right)}}^{2}}} \right)\left( {1 - {{{\left( {\frac{1}{5}} \right)}}^{2}}} \right)\left( {1 - {{{\left( {\frac{1}{7}} \right)}}^{2}}} \right) \ldots \,\,.$$
(1.4c)

“Inverting” formula (1.4c) and, for each prime number p, using the formula for the sum of an infinite geometric progression, i.e.,

$$\frac{1}{{1 - \frac{1}{{{{p}^{2}}}}}} = 1 + \frac{1}{{{{p}^{2}}}} + \frac{1}{{{{p}^{4}}}} + \frac{1}{{{{p}^{6}}}} + \ldots ,$$

we obtain

$$\frac{1}{{{\mathbf{P}}({\text{random fraction is irreducible}})}}$$
$$\begin{gathered} = \left( {1 + \frac{1}{{{{2}^{2}}}} + \frac{1}{{{{2}^{4}}}} + \ldots } \right)\left( {1 + \frac{1}{{{{3}^{2}}}} + \frac{1}{{{{3}^{4}}}} + \ldots } \right) \\ \times \left( {1 + \frac{1}{{{{5}^{2}}}} + \frac{1}{{{{5}^{4}}}} + \ldots } \right) \ldots \,\,. \\ \end{gathered} $$
(1.4d)

According to Euler’s brilliant idea, on the right-hand side of (1.4d), it is possible to expand all brackets and, according to the fundamental theorem of arithmetic, to obtain the answer to the considered question:

$$\frac{1}{{{\mathbf{P}}({\text{random fraction is irreducible}})}} = \sum\limits_{n = 1}^\infty \frac{1}{{{{n}^{2}}}}.$$
(1.4e)

With the help of modern computational tools, the answer can be obtained numerically: since \(\sum\nolimits_{n = 1}^\infty \frac{1}{{{{n}^{2}}}}\, \approx \,1.645\), we have

$${\mathbf{P}}({\text{random fraction is irreducible}}) \approx \frac{1}{{1.645}} \approx 0.608.$$

Thus, a random fraction is irreducible with probability of about 61% and, therefore, reducible with probability of about 39%. These theoretical results can be verified in CEs with the use of random number generators.
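A deterministic version of this CE replaces the random number generator with exhaustive search over all fractions with numerator and denominator at most N; a sketch (Python; the bound N = 1000 is an arbitrary choice):

```python
import math

def irreducible_share(N):
    """Share of irreducible fractions a/b with 1 <= a, b <= N (exhaustive search)."""
    coprime = sum(1 for a in range(1, N + 1) for b in range(1, N + 1)
                  if math.gcd(a, b) == 1)
    return coprime / (N * N)

share = irreducible_share(1000)
print(share)            # ≈ 0.608...
print(6 / math.pi**2)   # 0.6079271...
```

Increasing N shows the empirical share stabilizing near the theoretical value, which supports the interpretation of P as a limit of finite probabilities.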

This result could be considered exhaustive, especially since the numerical result was obtained with accuracy traditional for probability theory.

However, as we went through, we encountered a remarkable number, namely, the sum of inverse squares \(\sum\nolimits_{n = 1}^\infty \frac{1}{{{{n}^{2}}}}\). We cannot ignore the classical mathematics associated with this number, especially taking into account the enormous possibilities of CEs in explaining it.

The problem concerning the exact value of the sum of inverse squares was posed in the 17th century by Pietro Mengoli, a little-known student of the famous Bonaventura Cavalieri (see [24]). It is now known as the Basel problem after Basel, the hometown of the Bernoulli family, who worked much on this problem (see [25]). The answer was not obtained until the 18th century (see [26]). Euler applied the identity

$$\sin (\pi x) \equiv \pi x\prod\limits_{n = 1}^\infty \left( {1 - \frac{{{{x}^{2}}}}{{{{n}^{2}}}}} \right),$$
(1.4f)

which he understood as equality of “polynomials of infinite degree” based on the coincidence of their sets of roots; in the given case, this is the set of integers. This identity, together with the above-discussed Taylor series expansion of the sine function, is a wonderful field for CEs, both graphical and numerical. For example, Fig. 24 shows the graphs of three functions:

$$y = \sin x,$$
$$y = \pi x - \frac{{{{{(\pi x)}}^{3}}}}{{3!}} + \frac{{{{{(\pi x)}}^{5}}}}{{5!}} - \frac{{{{{(\pi x)}}^{7}}}}{{7!}} + \frac{{{{{(\pi x)}}^{9}}}}{{9!}},$$
$$\begin{gathered} y = \pi x(1 - {{x}^{2}})\left( {1 - \frac{{{{x}^{2}}}}{{{{2}^{2}}}}} \right)\left( {1 - \frac{{{{x}^{2}}}}{{{{3}^{2}}}}} \right) \\ \times \;\left( {1 - \frac{{{{x}^{2}}}}{{{{4}^{2}}}}} \right)\left( {1 - \frac{{{{x}^{2}}}}{{{{5}^{2}}}}} \right)\left( {1 - \frac{{{{x}^{2}}}}{{{{6}^{2}}}}} \right) \\ \end{gathered} $$

(of course, the parameters can be varied and improved). Applying “Vieta’s theorem” to an identity following from (1.4f), namely, to

$$\sum\limits_{n = 1}^\infty {{( - 1)}^{{n - 1}}}\frac{{{{{(\pi x)}}^{{2n - 1}}}}}{{(2n - 1)!}} \equiv \pi x\prod\limits_{n = 1}^\infty \left( {1 - \frac{{{{x}^{2}}}}{{{{n}^{2}}}}} \right),$$
(1.4g)

more precisely, equating the coefficients of x3 in (1.4g), we obtain the solution to the Basel problem:

$$\sum\limits_{n = 1}^\infty \frac{1}{{{{n}^{2}}}} = \frac{{{{\pi }^{2}}}}{6}.$$
(1.4h)
Fig. 24. Decomposition of the sine function into a sum and a product.
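Both sides of identity (1.4f) are easy to evaluate numerically. A sketch (Python; the truncation lengths are arbitrary) compares the truncated product at x = 1/2 with \(\sin (\pi {\text{/}}2) = 1\) and the partial sums of the inverse squares with \({{\pi }^{2}}{\text{/}}6\):

```python
import math

def sin_product(x, terms):
    """Truncated Euler product pi*x * prod_{n=1}^{terms} (1 - x^2/n^2)."""
    prod = math.pi * x
    for n in range(1, terms + 1):
        prod *= 1 - x * x / (n * n)
    return prod

def basel_partial(terms):
    """Partial sum of 1/n^2."""
    return sum(1 / (n * n) for n in range(1, terms + 1))

print(sin_product(0.5, 100_000))  # ≈ 1 = sin(pi/2)
print(basel_partial(100_000))     # ≈ 1.6449 = pi^2/6
```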

Combining the above results yields the answer to the original question:

$$\begin{gathered} {\text{a fraction taken at random is reducible with probability}} \\ 1 - \frac{6}{{{{\pi }^{2}}}} = 0.392072 \ldots \,\,. \\ \end{gathered} $$

(b) Bernoulli trials. Along with heuristics (this time, not a CE, but actual tossing of a coin a sufficiently large number n of times), we suggest considering an exact mathematical model with the space of equiprobable outcomes

$$\Omega = \{ {\text{head,tail}}{{\} }^{n}}$$

and a random variable (for its agreement with the theory of Bernoulli trials, it is convenient to call it success)

$$\begin{gathered} \xi = {\text{su}}{{{\text{c}}}_{n}}:\Omega \to \{ 0, \ldots ,n\} :({{\omega }_{1}}, \ldots ,{{\omega }_{n}}) \\ \mapsto \# \{ i|{{\omega }_{i}} = {\text{head}}\} , \\ \end{gathered} $$

which assigns to a series of trials the number of resulting heads.

The upper envelope of the histogram of this random variable is the "graph" of the \(n\)th row of Pascal's triangle

$$k \mapsto \left( {\begin{array}{*{20}{c}} n \\ k \end{array}} \right),$$

and the central limit theorem in its simplest form states that the properly scaled form of this “graph” stabilizes as \(n \to \infty \). In view of the above-considered extrapolation of the factorial, we can talk about the graph of the actual function

$$k \mapsto \frac{{n!}}{{k!(n - k)!}}$$

defined on the interval \([0,n]\) (an alternative is CE-mastering of Stirling’s formula).

The probability-theoretic aspect of the above stabilization effect is associated with the normalization of the random variable \(\xi \):

$${\text{N}}\xi : = \frac{{\xi - {\mathbf{M}}\xi }}{{\sqrt {{\mathbf{D}}\xi } }},$$

where M is the expectation and D is the variance. In the case under consideration, \(\xi = {\text{su}}{{{\text{c}}}_{n}}\), and we have \({\mathbf{M}}\xi = \frac{n}{2}\) and \(\sqrt {{\mathbf{D}}\xi } = \frac{{\sqrt n }}{2}\), so the graph of \(z = \left( {\begin{array}{*{20}{c}} n \\ k \end{array}} \right)\) is transformed according to the formulas

$$x = \frac{{k - \frac{n}{2}}}{{\frac{{\sqrt n }}{2}}},\quad y = \frac{{\sqrt n }}{2} \cdot \frac{z}{{{{2}^{n}}}}.$$

We obtain the graph of the function

$$y = \frac{{\sqrt n }}{{{{2}^{{n + 1}}}}}\left( {\begin{array}{*{20}{c}} n \\ {\frac{{\sqrt n }}{2}x + \frac{n}{2}} \end{array}} \right).$$

According to Stirling’s formula,

$$N! \approx \sqrt {2\pi N} {{\left( {\frac{N}{{\text{e}}}} \right)}^{N}}$$

and, as \(n \to \infty \), these graphs are approximated by a Gaussian (see Fig. 25). This figure depicts

$$y = \frac{1}{{\sqrt {2\pi } }}{{{\text{e}}}^{{ - \frac{{{{x}^{2}}}}{2}}}},$$
$$y = \frac{{\sqrt n }}{{{{2}^{{n + 1}}}}}\left( {\begin{array}{*{20}{c}} n \\ {\frac{{\sqrt n }}{2}x + \frac{n}{2}} \end{array}} \right)$$

for \(n = 4,10,100\).

Fig. 25. Gaussian approximations for n = 4, 10, 100.

As we can see, the approximations are very accurate: for \({\text{|}}x{\text{|}} > 3\) and \(n \geqslant 10\), the blue and red curves are nearly indistinguishable, which reflects the three-sigma rule.

Returning from analytical and graphical considerations to numerical probability-theoretic ones, we should realize that the above-mentioned (on a slightly different scale) integral

$$1 = \frac{1}{{\sqrt {2\pi } }}\int\limits_{ - \infty }^\infty {{{{\text{e}}}^{{ - \frac{{{{x}^{2}}}}{2}}}}{\text{d}}x} $$

expresses the total probability; we should also become familiar, both graphically and numerically, with the probabilities, for \(0 \leqslant {{k}_{1}} \leqslant {{k}_{2}} \leqslant n\),

$${\mathbf{P}}({{k}_{1}} \leqslant {\text{su}}{{{\text{c}}}_{n}} \leqslant {{k}_{2}}) = \frac{1}{{{{2}^{n}}}}\sum\limits_{k = {{k}_{1}}}^{{{k}_{2}}} \left( {\begin{array}{*{20}{c}} n \\ k \end{array}} \right) \approx \frac{1}{{\sqrt {2\pi } }}\int\limits_{\frac{{{{k}_{1}} - \frac{n}{2}}}{{\frac{{\sqrt n }}{2}}}}^{\frac{{{{k}_{2}} - \frac{n}{2}}}{{\frac{{\sqrt n }}{2}}}} {{{{\text{e}}}^{{ - \frac{{{{x}^{2}}}}{2}}}}{\text{d}}x,} $$

which can be verified in both computer and actual experiments.
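This approximate equality can be checked directly. A sketch (Python, using the standard math.erf for the normal integral; no continuity correction is applied, which explains most of the residual gap):

```python
import math

def binom_prob(n, k1, k2):
    """Exact P(k1 <= suc_n <= k2) for a fair coin."""
    return sum(math.comb(n, k) for k in range(k1, k2 + 1)) / 2**n

def normal_approx(n, k1, k2):
    """Gaussian approximation from the de Moivre-Laplace formula above."""
    s = math.sqrt(n) / 2
    a, b = (k1 - n / 2) / s, (k2 - n / 2) / s
    # integral of the standard normal density expressed via the error function
    return 0.5 * (math.erf(b / math.sqrt(2)) - math.erf(a / math.sqrt(2)))

print(binom_prob(100, 45, 55))    # 0.728...
print(normal_approx(100, 45, 55)) # 0.682... (closer with a continuity correction)
```

Shifting the integration limits by 1/2 (the continuity correction) is a natural follow-up experiment that makes the two numbers nearly coincide.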

It is useful and possible to extend the above results both to Bernoulli trials with nonequiprobable outcomes and to series of pairwise independent trials with a larger number of outcomes.

7 COMBINATORICS

We discuss two traditional topics.

(a) Catalan numbers. They can be defined in many different ways, and the equivalence of these definitions is good mathematics requiring CE (see [27]).

One of the definitions of the Catalan numbers is as follows:

$${{c}_{n}}: = \# \frac{{n{\text{ - edge plane trees}}}}{{{\text{isotopy}}}}.$$

The corresponding enumeration problem is instructive and is solvable with the help of CE. To make lists of trees, it is convenient to use “Live mathematics” or GeoGebra.

The generating function

$$C(x): = 1 + \sum\limits_{n = 1}^\infty {{c}_{n}}{{x}^{n}}$$

can be defined in several equivalent ways. It is elementary, and its values can be checked using CE. Very promising are various generalizations of the Catalan numbers.
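For a CE, perhaps the most convenient of the equivalent definitions is the classical recurrence \({{c}_{{n + 1}}} = \sum\nolimits_{i = 0}^n {{c}_{i}}{{c}_{{n - i}}}\) (equivalently, \(C(x) = 1 + xC{{(x)}^{2}}\)); a sketch (Python):

```python
def catalan(n_max):
    """Catalan numbers c_0..c_n_max via c_{n+1} = sum_i c_i c_{n-i}."""
    c = [1]
    for n in range(n_max):
        c.append(sum(c[i] * c[n - i] for i in range(n + 1)))
    return c

print(catalan(6))  # [1, 1, 2, 5, 14, 42, 132]
```

Matching these numbers against hand-made lists of small plane trees is exactly the kind of confirmation experiment discussed above.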

(b) Partitio numerorum. The number of representations of a positive integer as a sum of positive integers (the partition of a pile of objects into smaller piles…) can be studied even by preschool children, but it is associated with serious "adult" mathematics. We introduce the standard notation

$$\begin{gathered} {\text{p}}(n): = \# \{ ({{x}_{1}},{{x}_{2}}, \ldots )|{{x}_{i}} \in \mathbb{N}, \\ {{x}_{1}} \geqslant {{x}_{2}} \geqslant \ldots ,n = {{x}_{1}} + {{x}_{2}} + \ldots \} \,. \\ \end{gathered} $$
(2.1a)

Compiling complete lists of partitions for small n is a nontrivial enumeration problem. As in the preceding case, it is convenient to use "Live mathematics" or GeoGebra for this purpose. Computing the values \({\text{p}}(n)\) at least for \(n \leqslant 100\) is a nontrivial computer problem.

This problem can be effectively solved with the help of the (Euler) generating function

$${\text{E}}(x): = 1 + \sum\limits_{n = 1}^\infty {\text{p}}(n){{x}^{n}}.$$
(2.1b)

A crucial role is played by the factorization of this function:

$$\begin{gathered} {\text{E}}(x) = (1 + {{x}^{1}} + {{x}^{{1 + 1}}} + \ldots ) \\ \times \,(1 + {{x}^{2}} + {{x}^{{2 + 2}}} + \ldots )(1 + {{x}^{3}} + {{x}^{{3 + 3}}} + \ldots ) \ldots , \\ \end{gathered} $$
(2.1c)

which implies that

$$\frac{1}{{{\text{E}}(x)}} = (1 - x)(1 - {{x}^{2}})(1 - {{x}^{3}}) \ldots .$$
(2.1d)

Manually expanding the (infinitely many) brackets on the right-hand side is a rather laborious task, but it can easily be done in a CE with the help of, say, MAPLE. The result, as in Euler's times, is amazing: the series is "quadratically" sparse! More precisely, it is true that

$$\begin{gathered} \frac{1}{{{\text{E}}(x)}} = 1 - x - {{x}^{2}} + {{x}^{5}} + {{x}^{7}} - {{x}^{{12}}} - {{x}^{{15}}} + {{x}^{{22}}} \\ \, + {{x}^{{26}}} - {{x}^{{35}}} - {{x}^{{40}}} + {{x}^{{51}}} + {{x}^{{57}}} - {{x}^{{70}}} - {{x}^{{77}}} \ldots . \\ \end{gathered} $$
(2.1e)

It is not so easy to guess the pattern on the right-hand side of (2.1e). It is described by Euler's pentagonal number theorem (so named for combinatorial reasons), which states that

$$\prod\limits_{n = 1}^\infty (1 - {{x}^{n}}) = \sum\limits_{k \in \mathbb{Z}} {{( - 1)}^{k}}{{x}^{{\frac{{k(3k + 1)}}{2}}}}$$
(2.1f)

Once again inverting series (2.1f), we can effectively make lists of values of p(n).
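A sketch of this inversion (Python): the pentagonal number recurrence following from (2.1f) computes p(n) using only the previously computed values.

```python
def partition_numbers(n_max):
    """p(0..n_max) via the pentagonal number recurrence derived from (2.1f)."""
    p = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        total, k = 0, 1
        while True:
            sign = -1 if k % 2 == 0 else 1
            done = True
            # generalized pentagonal numbers k(3k - 1)/2 for k = 1, -1, 2, -2, ...
            for g in (k * (3 * k - 1) // 2, k * (3 * k + 1) // 2):
                if g <= n:
                    total += sign * p[n - g]
                    done = False
            if done:
                break
            k += 1
        p[n] = total
    return p

p = partition_numbers(100)
print(p[:8])   # [1, 1, 2, 3, 5, 7, 11, 15]
print(p[100])  # 190569292
```

Since only about \(\sqrt n \) pentagonal numbers are at most n, the sparsity of (2.1e) is precisely what makes this list-making effective.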

Concerning CEs with partitions, see [28]; the sparsity of some powers of the left-hand side of (2.1f) is explained in [29].

8 TOPOLOGY

We mention two genuinely research-level (not merely educational) problems.

(a) Harer–Zagier numbers. Here, we mean random gluings of polygons. Let \({{\varepsilon }_{g}}(n)\) denote the number of orientable gluings of a 2n-gon (results of pairwise identification of its sides from which a Möbius strip cannot be cut off) that produce a surface of genus g. Enumeration of such gluings is a good computer problem, which can be accompanied by various CEs with Gaussian words.

Tables of Harer–Zagier numbers can be constructed using the well-known recurrence:

$$\begin{gathered} {{\varepsilon }_{g}}(n) = \frac{{4n - 2}}{{n + 1}}{{\varepsilon }_{g}}(n - 1) \\ + \frac{{(n - 1)(2n - 1)(2n - 3)}}{{n + 1}}{{\varepsilon }_{{g - 1}}}(n - 2). \\ \end{gathered} $$

This recurrence is important for many areas of mathematics, and several of its “professional” proofs are available. However, presumably, there are no transparent proofs with a clear combinatorial-topological meaning.
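Such a table is a few lines of code; a sketch (Python, exact arithmetic; the base value \({{\varepsilon }_{0}}(0) = 1\) and vanishing outside the admissible range are the assumed initial conditions of this sketch):

```python
from fractions import Fraction

def harer_zagier(n_max):
    """Table eps[g][n] computed from the recurrence above, with eps_0(0) = 1."""
    g_max = n_max // 2 + 1
    eps = [[Fraction(0)] * (n_max + 1) for _ in range(g_max + 1)]
    eps[0][0] = Fraction(1)

    def get(g, n):
        return eps[g][n] if g >= 0 and n >= 0 else Fraction(0)

    for n in range(1, n_max + 1):
        for g in range(g_max + 1):
            eps[g][n] = (Fraction(4 * n - 2, n + 1) * get(g, n - 1)
                         + Fraction((n - 1) * (2 * n - 1) * (2 * n - 3), n + 1)
                         * get(g - 1, n - 2))
    return eps

eps = harer_zagier(6)
# genus-0 gluings are counted by the Catalan numbers: 1, 1, 2, 5, 14, ...
# and for every n the values over all g sum to (2n - 1)!!, the total
# number of gluings of the 2n-gon
```

Rational arithmetic is used because the recurrence divides by n + 1; the resulting entries are, of course, integers, which is itself a check of the computation.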

Most likely, no fundamental structures underlying the theory of graphs \(\Gamma \) on orientable surfaces S with complements \(S{{\backslash }}\Gamma \) homeomorphic to disks (children's one-cell drawings) are known to modern science. CEs that group gluings in a meaningful way for a transparent proof of the recurrence can help discover such structures. An elementary introduction to the theory can be found in [30].

(b) Homotopy groups of spheres. Somewhat mysterious mappings of spheres

$${{{\mathbf{S}}}^{{\mathbf{m}}}} \to {{{\mathbf{S}}}^{{\mathbf{n}}}}$$

for m > n that are not homotopic to constant maps have been known for more than half a century. They begin with the complex and quaternionic Hopf fibrations

$${{{\mathbf{S}}}^{{\mathbf{3}}}} \to {{{\mathbf{S}}}^{{\mathbf{2}}}}{\text{ and }}{{{\mathbf{S}}}^{{\mathbf{7}}}} \to {{{\mathbf{S}}}^{{\mathbf{4}}}}.$$

It is unlikely that beginning mathematicians can put things in order (i.e., compute all homotopy groups of spheres) in multidimensional topology, a field in which renowned mathematicians have worked since the middle of the 20th century.

However, modern computer technologies provide tools for working with multidimensional objects that were not available 70 years ago, for example, CEs with piecewise linear mappings of spheres divided into small pieces (simplices). Possibly, young mathematicians who start thinking about questions of this kind early will develop a multidimensional topological intuition that previous generations did not possess.

Some materials can be found, for example, in [31].

9 DYNAMICS

Hard work, especially experimental, has been conducted in the proposed directions in recent decades. Nevertheless, there are open questions, and further CEs are desirable.

(a) Collatz conjecture (3n + 1 problem). We mean iterations of the mapping

$$\mathbb{N} \to \mathbb{N}:n \mapsto \left\{ {\begin{array}{*{20}{l}} {\frac{n}{2}{\text{ if }}n \in 2\mathbb{N}} \\ {3n + 1{\text{ if }}n \in 2\mathbb{N} + 1.} \end{array}} \right.$$

It is believed that any orbit comes to a cycle \(1 \mapsto 4 \mapsto 2 \mapsto 1\), but nobody has been able to prove this for decades. A direct computer study of this system is easy to perform, so we should begin with it. Further CEs would presumably be associated with statistical data processing. Specifically, they would be concerned with how the orbit lengths are distributed before reaching the basic cycle, how far a randomly taken number can go, etc.
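A direct study of the system begins with a few lines of code; a sketch (Python):

```python
def collatz_steps(n):
    """Number of steps of the 3n + 1 map needed to reach 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

# the orbit of 27 is famously long for such a small seed
print(collatz_steps(27))  # 111
print(max(collatz_steps(n) for n in range(1, 10_000)))
```

Collecting such step counts over a range of seeds produces exactly the statistical material (distribution of orbit lengths, record-setting seeds) mentioned above.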

In doing this project, a student should probably learn what 2-adic numbers are and should understand the analogy between the map considered and the tent map. It is probably useful to read [32]. We do not think that students can make significant progress in this area, but the simplicity of the problem and the possibility of experiments, visualization, etc., are fascinating.

(b) Iterations of quadratic maps. With the capabilities of modern computers, it is worth drawing a variety of Julia sets and the unique Mandelbrot set, just to admire them.
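The simplest CE here is the escape-time test underlying every picture of the Mandelbrot set; a sketch (Python; the iteration limit and the escape radius 2 are the standard choices, but the limit is otherwise arbitrary):

```python
def in_mandelbrot(c, max_iter=200):
    """Escape-time test: does z -> z^2 + c stay bounded (|z| <= 2) from z = 0?"""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True

# c = 0 and c = -1 give bounded orbits; c = 1 escapes to infinity
print(in_mandelbrot(0j), in_mandelbrot(-1 + 0j), in_mandelbrot(1 + 0j))
# → True True False
```

Running this test over a grid of complex c and coloring the points is all that is needed to draw the set and begin admiring it.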

A serious mathematical issue is the study of the Feigenbaum constant (for quadratic maps). Today its existence can be rigorously proved (see [33]). However, we do not even know whether it is rational, just as the irrationality of the number \(\pi\) remained unsettled for thousands of years.

Apparently, one of the main conditions for a successful reflection on the Feigenbaum constant is to never cease to wonder at the phenomenon of universality that it governs.

10 CONCLUSIONS

Having taught mathematics with the use of computer experiments for many years, the authors can try to draw some conclusions, particularly because one of them, over the same years, also taught mathematics intensively in a more traditional way.

A striking property of reasonable CEs is that they save the effort spent on routine operations and ensure some certainty of results, provided that the results are checked repeatedly, preferably by different people on different computers. More important, of course, is the possibility of obtaining results that are entirely inaccessible to manual calculation, enumeration, or drawing.

An important pedagogical aspect of CE is that it is not sufficient for future mathematicians to get results with CE as many times as they like: for a full understanding, they need something more (hardly just a formal proof; rather, an understanding of the picture of the world a fragment of which was seen with the help of the CE). Apparently, this parameter makes it possible to judge the prospects of a young person becoming professionally engaged in mathematics. In any case, it is no less important than the ability to solve problems of a standard type quickly and correctly.

Concerning the prospects of CEs in teaching, the authors are rather cautious. Many of the theses presented in this paper express the authors' opinion on the topic. In recent decades, the popularity of CE in geometry teaching has increased markedly. It is not clear whether this process will continue (more likely yes than no) or how soon CEs will begin to spread to other areas of mathematics.

Based on many years of experience, we can say with absolute confidence that teachers derive great pleasure from successful CEs, and this pleasure is usually transmitted to students. This pleasure (or, sometimes, its absence) is a baseline parameter in the relationship between a person and Mathematics.