The expansion of chemical space in 1826 and in the 1840s prompted the convergence to the periodic system

Significance
The number and diversity of substances constituting the chemical space triggered, in two important steps, the convergence of the periodic system toward a stable backbone structure eventually unveiled in the 1860s. The first step occurred in 1826, and the second between 1835 and 1845. Interestingly, the salient features of the periodic system of the 1860s can be detected as early as the 1840s, even when considering the effect of disagreement regarding the determination of atomic weights. The methods presented here are instrumental for studying the further evolution of the periodic system and for pondering its current shape.

We retrieved 21,521 single-step reactions with publication year before 1869 from Reaxys, accounting for 11,451 substances. By eliminating substances with unreliable formulae, e.g. those holding intervals as stoichiometric coefficients, such as Ta1.15−1.35S2, and by manually curating 245 formulae with non-integer amounts of crystallisation species, e.g. CdCO3*0.5H2O, curated as CCdHO3.5, we ended up with 11,356 substances. We associated each of these substances with its earliest publication year (in a chemical reaction) and with its molecular formula.
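The two bookkeeping steps above (discarding formulae with interval coefficients and assigning each substance its earliest publication year) can be illustrated with a minimal sketch. The regular expression, the function names and the toy reaction list are hypothetical and are not taken from the Reaxys data or from the authors' pipeline.

```python
import re

# Assumption: an "interval coefficient" looks like "1.15-1.35" (ASCII hyphen or minus sign).
INTERVAL = re.compile(r"\d+(?:\.\d+)?\s*[-\u2212]\s*\d+(?:\.\d+)?")


def has_interval_coefficient(formula: str) -> bool:
    """Return True if the formula holds an interval as a stoichiometric coefficient."""
    return INTERVAL.search(formula) is not None


def earliest_years(reactions):
    """Map each reliable formula to the earliest year in which it appears in a reaction."""
    first = {}
    for year, formulae in reactions:
        for formula in formulae:
            if has_interval_coefficient(formula):
                continue  # unreliable formula, discarded
            if formula not in first or year < first[formula]:
                first[formula] = year
    return first


# Toy data (hypothetical): (publication year, formulae taking part in the reaction).
reactions = [
    (1845, ["H2O", "NaCl"]),
    (1826, ["NaCl", "Ta1.15-1.35S2"]),
    (1860, ["H2O"]),
]
print(earliest_years(reactions))
```

In this toy example the tantalum sulfide is dropped, and NaCl is dated to 1826 because that is the earliest reaction in which it takes part.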

Disregarded elements and their separations
Er and Yt, along with In, were elements whose identity was questioned by Mendeleev and expressed as ?Er, ?Yt and ?In in his table (1). Yt was the symbol used until 1920 for Y (2), and the first Y (or Yt) reaction is from 1872. Thus, neither Mendeleev nor Meyer had clear information about the element. Er was also problematic. In 1868 it was not yet known that Er was actually a mixture of two elements, named Er and Yb later (1878). This Er was separated one year later into Ho and the current Er and Tm. The same year, Yb was found to be accompanied by current Sc. In 1886 Ho was separated into the current Ho and Dy. The 1879 Yb was found to be a mixture of current Lu and Yb in 1907 (3). Di, reported by Mendeleev as an element, was found to be a mixture of Di and Sm in 1879. One year later, Sm was separated into Sm and the current Gd; this Sm was found by 1901 to be made of current Eu and Sm. The 1879 Di turned out to be a mixture of current Pr and Nd in 1885. Therefore, we excluded Er, Yt and Di from our analysis, and the whole study is based on our findings for the 60 elements shown in Figure 1a (main text).

Evolution of some molecular fragments
We determined the temporal appearance of the molecular fragments depicted in Figure S4 by exploring the connection tables of the compounds reported in the database between 1800 and 1869. A connection table is a "listing of atoms and bonds, and other data, in tabular form" (5).

Figure S4: In the inset, M stands for a metal, with M = {Li, Be, Al, Si, Fe, Co, Zn, As, Rh, Sb, Pt, Hg, Tl, Pb, Bi}.

Quantifying similarity among chemical elements
We quantified the similarity of element x regarding element y as the fraction of substances of x in whose formulae x can be replaced by y yielding a formula that is part of the chemical space.
Hence, for an element $x$ having $s_x$ substances in the chemical space, these substances are gathered in the multiset $F_x$, each entry of which is the arranged formula of a substance $i$ containing element $x$. Arranged formulae are assigned to a reference element, whose similarity regarding other elements is to be calculated.
The multiplicity of the arranged formula of substance $i$ in $F_x$ is $m_x(i)$. By multiplicity of a formula is meant the number of times the formula shows up in the multiset, that is, the number of times the formula is found in the chemical space of element $x$.
With the lists of arranged formulae for elements $x$ and $y$, we can calculate $s(x \to y)$ as the fraction of the arranged formulae of $x$, counted with their multiplicities, that are also arranged formulae of $y$. As $|F_x|$ amounts to counting the multiplicities of arranged formulae of $x$, then $|F_x| = \sum_i m_x(i)$.
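The definition above can be made concrete with a small sketch. It assumes that an arranged formula is obtained by replacing the reference element with a placeholder, and that $s(x \to y)$ counts, with multiplicity, the arranged formulae of $x$ that also occur among those of $y$, divided by $|F_x|$. The data representation, the function names and the toy chemical space are hypothetical.

```python
from collections import Counter


def arranged(formula, element):
    """Arranged formula: `element` is replaced by the placeholder 'X'.

    A formula is given as a dict {element symbol: amount}.
    """
    return frozenset(("X" if sym == element else sym, n) for sym, n in formula.items())


def arranged_multiset(space, element):
    """Multiset F_x of arranged formulae of all substances containing `element`."""
    return Counter(arranged(f, element) for f in space if element in f)


def similarity(space, x, y):
    """s(x -> y): share of |F_x| whose arranged formulae also occur in F_y."""
    fx = arranged_multiset(space, x)
    fy = arranged_multiset(space, y)
    total = sum(fx.values())  # |F_x| = sum of multiplicities m_x(i)
    if total == 0:
        return 0.0
    shared = sum(m for form, m in fx.items() if form in fy)
    return shared / total


# Toy chemical space (hypothetical), one dict per substance.
space = [
    {"Na": 1, "Cl": 1},
    {"K": 1, "Cl": 1},
    {"Na": 2, "O": 1},
    {"K": 2, "O": 1},
    {"Na": 1, "O": 1, "H": 1},
    {"K": 1, "Mn": 1, "O": 4},
    {"K": 1, "N": 1, "O": 3},
]
print(similarity(space, "Na", "K"))  # 2 of the 3 Na formulae have a K analogue
print(similarity(space, "K", "Na"))  # 2 of the 4 K formulae have a Na analogue
```

The toy example also shows that the measure is not symmetric: the similarity of Na regarding K differs from that of K regarding Na because the two elements have different numbers of substances.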
Similarity values among chemical elements

Figure S7: Systems of chemical elements by Meyer (a: 1864, gathering together his three separate tables; b: 1868; d: 1869/70) (6,7,4) and Mendeleev (c: 1869, rotated and reflected for the sake of comparison with the other SCEs) (1). Element symbols are updated to current notation. Lines and boxes indicate similarities. The complete list of similarities for Mendeleev is found in Table S2. Line widths are proportional to the number of times the similarity is discussed by each author. Line colours are used only for the sake of clarity.

(8). Red entries correspond to similarities Mendeleev thought did not exist and the blue one to "not so well studied."

Figure S9: Stability of similarities regarding chemical space size (axis: size of chemical space sample, s%). Each row contains a given similarity observed by considering the chemical space in year y. The stability of each similarity corresponds to the percentage of appearance of such similarity in the sampled space of size s%. Colours associated with this percentage are shown on the right bar. Further details in Materials and Methods (main document).

Contrasting Meyer's and Mendeleev's systems of chemical elements with those of the chemical space (presentist approach)
We took the three systems by Meyer, which were formulated in 1864 (6), 1868 (7) and 1869/70 (4), and Mendeleev's first system, published in 1869 (1). We extracted the similarities among the elements out of these systems and contrasted them with the "most similar" relationships of the systems of elements of the respective years 1863, 1867 and 1868.
The one-year difference between each author's system of elements and the system of elements of the chemical space accounts for the time a nineteenth-century chemist needed to catch up with the literature.

(7,9). Similarities in his 1869 table are further discussed in the paper where it was published. Therefore, besides the usual vertical similarities (in the 1869 published representation corresponding to rows), we also included those similarities mentioned by Meyer (4), plus the transition metal ones: Mn, Ru, Os, Fe, Rh, Ir and Co=Ni, Pd, Pt (Figure S7). Mendeleev thoroughly discussed the similarities and even some lacks of similarity (8), both of which are listed in Table S2.

Figure S10: Chemical elements used to build up the systems of elements of the nine nineteenth-century chemists. Elements known by the year discussed in each table are shown in black, while elements undiscovered by then but known by 1869 are shown in grey. Mixtures that were thought to be elements are shown in red.

(Figure S10). Disregarded elements correspond to those not having substances in the database participating in single-step chemical reactions (Section 1).

So, the ratios we are interested in approximating as simple ratios are either 0.5077 or 1.9695.
We say that 0.5077 must be expressed by any fraction f of the form x/y, such that 0 < f ≤ 1.
Likewise, 1.9695 can be decomposed and approximated by a fraction. Having selected the order of the Farey sequence to work with, we then proceed to devise a way to quantify the accuracy of the approximation of the ratio r by the fraction f. We, therefore, calculate the relative error of the approximation.
As $F_{200}$ has 12,233 fractions (the number of fractions is $|F_n| = \frac{n(n+3)}{2} - \sum_{k=2}^{n} |F_{\lfloor n/k \rfloor}|$ (22)), we set up an order to explore those fractions, based on the aim of finding simple fractions. That is, we need fractions x/y such that both x and y are small whole numbers. We quantify such a "simplicity" of fractions by their associated "area" x × y. The smaller the area of a fraction, the simpler the fraction is. Hence, we order $F_{200}$ fractions by non-decreasing order of their area, that is, $F_{200}$ is arranged as (0/1, 1/1, 1/2, 2/1, 1/3, 3/1, ..., 199/200, 200/199). According to this order, we quantify the relative error of the approximation. To decide which fraction better approximates the ratio in study (in this example either 0.5077 or 0.9695), we further need a stopping criterion indicating the amount of error to be allowed (tolerance). We selected 20 different values of tolerance τ, from 1% to 20% of relative error.
Hence, the best fraction approximating the given ratio is that simple fraction whose error(r, f) ≤ τ. For each τ we have a best approximating fraction.
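The search described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the relative error is taken as |r − f|/r, ties in area are broken by the smaller numerator, 0/1 is skipped because the ratios are positive, and ratios above 1 are handled through the reciprocal fractions that appear in the ordering rather than through a decomposition. All function names are hypothetical.

```python
from math import gcd


def farey_length(n):
    """|F_n| = 1 + sum of Euler's totient over 1..n (brute-force totient)."""
    return 1 + sum(
        sum(1 for a in range(1, k + 1) if gcd(a, k) == 1) for k in range(1, n + 1)
    )


def simple_fractions(n):
    """Reduced fractions x/y and y/x with 1 <= x <= y <= n, by non-decreasing area x*y."""
    pairs = set()
    for y in range(1, n + 1):
        for x in range(1, y + 1):
            if gcd(x, y) == 1:
                pairs.add((x, y))
                pairs.add((y, x))
    # Assumption: ties in area are broken by the smaller numerator (1/2 before 2/1).
    return sorted(pairs, key=lambda p: (p[0] * p[1], p[0]))


def best_fraction(r, tau, n=200):
    """First fraction, in order of simplicity, whose relative error is within tau."""
    for x, y in simple_fractions(n):
        if abs(r - x / y) / r <= tau:
            return x, y
    return None  # no fraction of order n meets the tolerance


print(farey_length(5))               # 11 fractions in F_5
print(best_fraction(0.5077, 0.05))   # (1, 2)
print(best_fraction(1.9695, 0.05))   # (2, 1)
```

With a 5% tolerance both example ratios collapse onto the simplest candidates, 1/2 and 2/1. Tightening the tolerance forces the search further down the ordered list, towards fractions with larger areas.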
Similarities in the SCE of 1868 and their relationships with those of each chemist's SCE
For every chemist publishing a set of atomic weights in year $y$, known Reaxys substances ($S_{y-1}$) up to year $y - 1$ (inclusive) were retrieved and the corresponding SCE $P_{y-1}$ was obtained (see Figure 2 (main text)). Formulae of substances $S_{y-1}$ were approximated with 20 different tolerance values ($\tau$), each $\tau$ yielding an SCE with similarities gathered in $P^{\tau}_{y-1}$ (Section 13).

Figure S12: Fraction of similarities observed in each chemist's space with tolerance $\tau$ in year $y - 1$ that are observed in 1868, calculated as $|P^{\tau}_{y-1} \cap P_{1868}| / |P^{\tau}_{y-1}|$. The 20 similarity values (coloured dots) for each chemist are gathered together in a violin plot. For the sake of comparison, the similarity $|P_{y-1} \cap P_{1868}| / |P_{y-1}|$ is depicted as a black dot.
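The fraction plotted in Figure S12 is a plain set overlap, which the following sketch illustrates. Representing a similarity as an ordered pair of element symbols is an assumption made here for illustration, and the two toy sets are hypothetical, not results from the study.

```python
def fraction_recovered(p_tau, p_1868):
    """|P_tau ∩ P_1868| / |P_tau|: share of a chemist's similarities also seen in 1868."""
    p_tau, p_1868 = set(p_tau), set(p_1868)
    if not p_tau:
        return 0.0  # convention chosen here for an empty set of similarities
    return len(p_tau & p_1868) / len(p_tau)


# Toy similarity sets (hypothetical); each pair reads "x is most similar to y".
p_tau = {("Na", "K"), ("Cl", "Br"), ("Fe", "Cu")}
p_1868 = {("Na", "K"), ("Cl", "Br"), ("Ca", "Sr")}
print(fraction_recovered(p_tau, p_1868))  # 2 of the 3 similarities are shared
```

Repeating this calculation for each of the 20 tolerance values would give the 20 coloured dots of one violin, and applying it to the unperturbed similarities would give the black dot.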
According to Figure S12, differences in the systems of atomic weights appear as a major issue in the construction of SCEs during the early years of the century. For instance, in Dalton's case (1810), strong perturbations of the formulae accounting for differences in atomic weights (low tolerances) did not produce any 1868 similarity, while tiny perturbations associated with high tolerances led about 30% of the resulting similarities to match those observed in 1868 (Figure S12). Interestingly, this is a larger similarity than that of the SCE obtained with the unperturbed chemical space (our modern formulae, black dot in Dalton's violin in Figure S12), which means that Dalton's data and assumptions regarding atomic weight could actually have been used to improve the SCE. A similar behaviour is observed for Berzelius (1819) (Figure S12). These behaviours especially occurred before 1830, as a consequence of the large number of similarities resulting from those exploratory times. For instance, the black dot in