Forbidden Families of Minimal Quadratic and Cubic Configurations

A matrix is \emph{simple} if it is a (0,1)-matrix and there are no repeated columns. Given a (0,1)-matrix $F$, we say a matrix $A$ has $F$ as a \emph{configuration}, denoted $F\prec A$, if there is a submatrix of $A$ which is a row and column permutation of $F$. Let $|A|$ denote the number of columns of $A$. Let $\mathcal{F}$ be a family of matrices. We define the extremal function $\text{forb}(m, \mathcal{F}) = \max\{|A|\colon A \text{ is an }m-\text{rowed simple matrix and has no configuration } F\in\mathcal{F}\}$. We consider pairs $\mathcal{F}=\{F_1,F_2\}$ such that $F_1$ and $F_2$ have no common extremal construction and derive that individually each $\text{forb}(m, F_i)$ has greater asymptotic growth than $\text{forb}(m, \mathcal{F})$, extending research started by Anstee and Koch.


Introduction
The investigation of the extremal problem of the maximum number of edges in an $n$-vertex graph with no subgraph $H$ originated with Erdős and Stone [13] and Erdős and Simonovits [12]. There is a large and illustrious literature. A natural extension to general hypergraphs is to forbid a given trace. This latter problem, in the language of matrices, is our focus. We say a matrix is simple if it is a (0,1)-matrix and there are no repeated columns. Given a (0,1)-matrix $F$, we say a matrix $A$ has $F$ as a configuration, denoted $F \prec A$, if there is a submatrix of $A$ which is a row and column permutation of $F$. Let $|A|$ denote the number of columns in $A$. A simple (0,1)-matrix $A$ can be considered as the vertex-edge incidence matrix of a hypergraph without repeated edges. A configuration is a trace of a subhypergraph of this hypergraph. Let $A^c$ denote the 0-1-complement of a (0,1)-matrix $A$. It is easy to see that $\text{forb}(m, F) = \text{forb}(m, F^c)$.
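The definition of a configuration can be checked by brute force on small examples. The following is a minimal sketch, not an efficient algorithm; the function names `is_simple` and `has_configuration` are hypothetical, and matrices are assumed to be given as lists of rows.

```python
from collections import Counter
from itertools import permutations

def is_simple(A):
    """True if A (a list of rows) is a (0,1)-matrix with no repeated columns."""
    cols = list(zip(*A))
    entries_ok = all(x in (0, 1) for row in A for x in row)
    return entries_ok and len(set(cols)) == len(cols)

def has_configuration(F, A):
    """True if some submatrix of A is a row and column permutation of F."""
    k, m = len(F), len(A)
    target = Counter(zip(*F))  # multiset of the columns of F
    # An ordered choice of k distinct rows of A accounts for row permutations.
    for rows in permutations(range(m), k):
        # Multiset of the columns of A restricted to the chosen rows.
        avail = Counter(tuple(A[r][c] for r in rows) for c in range(len(A[0])))
        # Column permutations are free, so only multiplicities matter.
        if all(avail[col] >= mult for col, mult in target.items()):
            return True
    return False
```

The search is exponential in the number of rows of $F$, so this is only intended for checking small cases by hand.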
We recall an important conjecture from [10]. Let $I_k$ denote the $k \times k$ identity matrix, let $I_k^c$ denote the (0,1)-complement of $I_k$, and let $T_k$ denote the $k \times k$ upper triangular matrix whose $i$th column has 1's in rows $1, 2, \ldots, i$ and 0's in the remaining rows. For $p$ matrices, an $m_1 \times n_1$ matrix $A_1$, an $m_2 \times n_2$ matrix $A_2$, $\ldots$, an $m_p \times n_p$ matrix $A_p$, we define $A_1 \times A_2 \times \cdots \times A_p$ as the $(m_1 + \cdots + m_p) \times n_1 n_2 \cdots n_p$ matrix whose columns consist of all possible combinations obtained from placing a column of $A_1$ on top of a column of $A_2$ on top of a column of $A_3$, etc. For example, the vertex-edge incidence matrix of the complete bipartite graph $K_{m/2,m/2}$ is $I_{m/2} \times I_{m/2}$. Define $1_k$ to be the $k \times 1$ column of 1's and $0_\ell$ to be the $\ell \times 1$ column of 0's.
Conjecture 1.1. [10] Let $F$ be a $k \times \ell$ matrix with F = 0 1. Let $X(F)$ denote the largest $p$ such that there are choices $A_1, A_2, \ldots, A_p \in \{I_{m/p}, I^c_{m/p}, T_{m/p}\}$ so that $F \nprec A_1 \times A_2 \times \cdots \times A_p$. Then $\text{forb}(m, F) = \Theta(m^{X(F)})$.
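The building blocks $I_k$, $I_k^c$, $T_k$ and the product operation can be written down directly. The sketch below is illustrative only; the helper names (`identity`, `complement`, `tower`, `matrix_product`) are hypothetical and matrices are lists of rows.

```python
from itertools import product as cartesian

def identity(k):
    """The k x k identity matrix I_k."""
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def complement(A):
    """The (0,1)-complement of A."""
    return [[1 - x for x in row] for row in A]

def tower(k):
    """T_k: the column with 0-based index j has 1's in rows 0..j."""
    return [[1 if i <= j else 0 for j in range(k)] for i in range(k)]

def matrix_product(*mats):
    """A_1 x ... x A_p: stack one column from each factor in all possible ways."""
    cols_per_factor = [list(zip(*M)) for M in mats]
    # Concatenate the chosen column tuples top to bottom.
    cols = [sum(choice, ()) for choice in cartesian(*cols_per_factor)]
    return [list(row) for row in zip(*cols)]
```

For instance, `matrix_product(identity(2), identity(2))` has four rows and four columns, each column having one 1 among the top two rows and one 1 among the bottom two, which is the incidence matrix of $K_{2,2}$.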
We are assuming p divides m which does not affect asymptotic bounds. It is natural to extend the concepts of Avoid(m, F ) and forb(m, F ) to the case when not just a single configuration, but a family F = {F 1 , F 2 , . . . , F r } of configurations is forbidden.
One important result in this area is the following theorem of Balogh and Bollobás [11].
Theorem 1.2 (Balogh and Bollobás, 2005). For a given $k$, there is a constant $BB(k)$ such that $\text{forb}(m, \{I_k, T_k, I_k^c\}) = BB(k)$.
The best current estimate for $BB(k)$ is due to Anstee and Lu [8]: $BB(k) \le 2^{ck^2}$, where $c$ is an absolute constant, independent of $k$. It could be tempting to extend Conjecture 1.1 to the case of forbidden families as well. However, as was shown in [5], $\text{forb}(m, \{I_2 \times I_2, T_2 \times T_2\})$ is $\Theta(m^{3/2})$ despite the fact that the only products missing both $I_2 \times I_2$ and $T_2 \times T_2$ are one-fold products. An even stronger observation is made in Remark 5.10.
In the present paper we continue the investigations started in [7]. Anstee and Koch determined $\text{forb}(m, \{F, G\})$ for all pairs $\{F, G\}$ where both members are minimal quadratics, that is, both $\text{forb}(m, F) = \Theta(m^2)$ and $\text{forb}(m, G) = \Theta(m^2)$, but no proper subconfiguration of $F$ or $G$ is quadratic. We take this one step further: we consider cases when one of $F$ or $G$ is a simple minimal cubic configuration and the other is a minimal quadratic or a minimal simple cubic. Our results are summarized in Table 3. We solve all cases when the minimal simple cubic configuration has four rows. If Conjecture 8.1 of [3] is true, then there are no minimal simple cubic configurations on 5 rows. The six-rowed ones are discussed in Section 8. The remaining case is $\text{forb}(m, Q_8, F_{14})$, where we believe that the nonexistence of a common quadratic product construction indicates that the order of magnitude is $o(m^2)$.
The structure of the paper is as follows. In Section 2 product constructions and the bounds implied by them are treated. Then in Section 3 upper bounds implied by the standard induction technique ([3], Section 11) are given. These, combined with product constructions, give asymptotically sharp bounds for many pairs of configurations. Sections 4, 5, 6 and 7 deal with specific configurations. In Section 4 a stability theorem is proven for matrices avoiding the configuration $Q_3(t)$, which is a generalization of the configuration $Q_3$ (see Table 1), and this theorem is applied to prove forbidden pairs results involving $Q_3(t)$. Section 5 contains cases when one member of the forbidden pair is a block of 1's. This naturally involves extremal graph and hypergraph results, as forbidding $1_{k,1}$ restricts the hypergraph corresponding to our simple (0,1)-matrix to be of rank $k-1$, that is, edges are of size at most $k-1$. Interestingly enough, in one case we use a very recent theorem of Alon and Shikhelman [1] combined with an old fundamental result of Füredi [14]. Section 6 considers $F_9$ (see Table 2). Interestingly, some exact results are also obtained. Section 7 deals with $Q_9$ of Table 1, based on the characterization of $Q_9$-avoiding matrices of [4]. Finally, in Section 8 we observe that $\text{forb}(m, \{F, G\})$ is quadratic if $F$ is a minimal quadratic and $G$ is a 6-rowed minimal cubic in all but one case.
Throughout the paper we use standard extremal graph and hypergraph notations, such as ex(m, G) to denote the largest number of edges a graph on m vertices can have without containing a subgraph isomorphic to G, or ex (k) (m, H) for the largest number of edges a k-uniform hypergraph can have without containing a subhypergraph H. The complete k-partite k-uniform hypergraph on partite sets of sizes s 1 , . . . , s k , respectively is denoted by K(s 1 , . . . , s k ). Also, when forbidden pairs of configurations are considered, we use the notational simplification forb(m, {F, G}) = forb(m, F, G) for typesetting convenience. We allow ourselves the ambiguity of writing I × I c instead of the technically precise I m/2 × I c m/2 in product constructions.

Product Constructions
What follows are tables of all minimal quadratic configurations and simple minimal cubic configurations with 4 rows. In addition to the configurations, we have included a list of all 2-fold and 3-fold products of I, I c and T that avoid these configurations. The list of constructions avoiding quadratic configurations comes from [7], and the lists for cubic configurations are proved in Section 2, with the statement that proves the result listed under "Proposition." Note that we have not included the complements of 1 3,1 , 1 2,2 , and I 3 in this table, even though these are also minimal quadratic configurations. This is because if Q denotes any of these configurations then forb(m, Q, F ) = forb(m, Q c , F c ), which is already included in Table 3.
In addition to this, the complement of $1_{4,1}$ (which we denote by $0_{4,1}$), $F_9^c$, $F_{10}^c$, and $F_{12}^c$ are minimal simple cubic configurations, and the products avoiding these configurations are the complements of the products avoiding their complements. Table 3 contains the asymptotic values for all pairings of the configurations mentioned above when at least one of the configurations is cubic. We note that all exact results stated below hold for $m$ sufficiently large.
In this section we determine all product constructions that avoid the minimal cubic configurations mentioned above, where we note that if a configuration $A$ is avoided by the product $B$ then $A^c$ is avoided by the product $B^c$. We will then be able to obtain most of our lower bound results from the following observation:
Remark 2.1. If $F$ and $G$ are both avoided by the same $p$-fold product construction then $\text{forb}(m, F, G) = \Omega(m^p)$.
We note that proving forb(m, F, G) = Ω(m 2 ) when either F or G is a minimal quadratic configuration implies that forb(m, F, G) = Θ(m 2 ), and similarly if forb(m, F, G) = Ω(m 3 ) for F or G a minimal cubic configuration then forb(m, F, G) = Θ(m 3 ). Proposition 2.4. F 9 and F 10 are avoided by every 2-fold product not involving I, and they are contained in every 2-fold product involving I. The only 3-fold product avoiding F 9 and F 10 is I c × I c × I c .
Proof. Note that I 3 is avoided by every 2-fold product not involving I by [7], and because I 3 ≺ F 9 , F 10 it follows that these products must also avoid F 9 and F 10 . Observe that F 9 , F 10 ≺ [01] × I 3 , and hence F 9 and F 10 will be contained in any 2-fold product involving I. It follows from Lemma 2.3 that F 9 , F 10 will be contained in any 3-fold product involving T , so the only 3-fold product that can avoid these configurations is I c × I c × I c , and [3] notes that this is indeed the case. Proposition 2.6. F 11 ⊀ I × T, I c × T, T × T and it is contained in all other 2-fold products. The only 3-fold product that avoids F 11 is T × T × T .
Proof. Note that Q 9 ≺ F 11 and that Q 9 ⊀ I ×T, I c ×T , so it follows that this is also the case for F 11 . Because F 11 = I 2 × I 2 and I 2 ≺ I, I c , it follows that every 2-fold product consisting only of I's and I c 's contains F 11 . [3] notes that F 11 ⊀ T × T × T , so it also follows that F 11 ⊀ T × T . It follows from Lemma 2.5 that every 3-fold product involving an I or I c contains F 11 , so the only 3-fold product that can avoid F 11 is T × T × T . Lemma 2.7. All 2-fold products of I, I c and T avoid F 13 . All 3-fold products avoid F 12 and F c 12 Proof. Every two rows of the first three rows of F 13 contains 1 0 0 1 0 1 0 1 , and as no two rows of I, I c , or T contains this configuration, the first three rows of F 13 can not be found in any 2-fold product of these matrices. Any two rows of F 12 contains 0 1 1 1 0 1 , which again is contained in no two rows of I, I c or T , so this can not be found in any 3-fold product of these matrices. Similar logic holds for F c 12 .
Proposition 2.8. The only 3-fold product that avoids F 13 is T × T × T .
Proof. By Lemma 2.5 every 3-product involving I or I c contains F 13 , and [3] notes that F 13 ⊀ T × T × T .

Inductive Results
In this section we prove a variety of upper bounds by using two standard techniques: Theorem 1.2 and the following standard induction method. Let F be a k-rowed matrix. Suppose we have A ∈ Avoid(m, F ) such that |A| = forb(m, F ). Consider deleting a row r. Let C r (A) be the matrix that consists of the repeated columns of the matrix that is obtained when deleting row r from A. If we permute the rows of A so that r becomes the first row, then after some column permutations, A looks like this: where B r (A) are the columns that appear with a 0 on row r, but don't appear with a 1, and D r (A) are the columns that appear with a 1 but not a 0. We have that We let 1 k,ℓ denote the k × ℓ matrix where every entry is 1. Similarly, we define 0 k,ℓ to be the k × ℓ matrix where every entry is 0. We use the notation C r := C r (A) when it is clear from context what the underlying matrix A is.
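The decomposition used in the standard induction can be computed explicitly for a small simple matrix. The sketch below is only an illustration under the assumption that matrices are lists of rows; the name `standard_decomposition` is hypothetical. It returns the column sets $B_r(A)$, $C_r(A)$ and $D_r(A)$ with row $r$ deleted, and one can check on examples that $|A| = |B_r(A)| + 2|C_r(A)| + |D_r(A)|$.

```python
def standard_decomposition(A, r):
    """Split the columns of a simple matrix A by their entry in row r.

    Returns (B, C, D) as sets of columns with row r deleted:
      C: columns that appear with both a 0 and a 1 in row r,
      B: columns that appear only with a 0 in row r,
      D: columns that appear only with a 1 in row r.
    """
    def rest(col):
        # The column with its entry in row r removed.
        return col[:r] + col[r + 1:]

    cols = list(zip(*A))
    with0 = {rest(c) for c in cols if c[r] == 0}
    with1 = {rest(c) for c in cols if c[r] == 1}
    C = with0 & with1
    return with0 - C, C, with1 - C
```

Since $A$ is simple, the columns of $B_r(A)$, $C_r(A)$ and $D_r(A)$ taken together are distinct, which is what allows induction on the number of rows.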
Proof. As Q c 8 = Q 8 we see that these two values are equal, so we only address the 1 k,ℓ case. Note that I m gives the lower bound. For the upper bound, note that Q 8 = [01] × I 2 . It follows that when we apply the standard induction that C r can not contain I 2 = I c 2 . But by Theorem 1.2 if |C r | > BB(k + ℓ) we must have T k+ℓ ≺ C r , which would contradict 1 k,ℓ ⊀ A. Thus we must have |C r | ≤ BB(k + ℓ), so we can inductively assume a linear bound for forb(m, Q 8 , 1 k,ℓ ).  In Table 3 Proposition 3.3 is frequently quoted to prove Θ bounds. This is done so when common quadratic lower bound exists for F and G by product constructions listed in Table 2.
Proof. This follows from Lemma 3.2, along with the observations that 1 4  Proof. The lower bound follows from the construction T ×T , and the upper bound is a consequence of Lemma 3.2 and the observations that F 9 , F 10 , F

Avoiding Q 3 (t)
We consider a slight generalization of Q 3 where we always assume t ≥ 2 when we write Q 3 (t). We have the following result from [7]. Proof. Each of these F is contained in either I k or I c k for sufficiently large k, so Theorem 4.1 gives the upper bound, and either I m or I c m gives the lower bound. Our main result for this section will be a stability theorem which says that large Q 3 (t) avoiding matrices "look like" I × I c , and from this we will be able to prove an upper bound for forb(m, Q 3 , F 11 ), and more generally for forb(m, Q 3 (t), I r × I s ). We first introduce some terminology for the proof.
We will say that a row r is sparse when restricted to a set of columns C if, restricted to C, r has at least one 0 but fewer than t 0's (i.e. r has few 0's but is not identically 1), and we will say that a row r is dense when restricted to a set of columns C if r has at least one 1 and at least t 0's within the columns of C (i.e. r has many 0's but is not identically 0). We will say that a column c ∈ C is identified by a sparse row r if r has a 0 in column c.
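The terminology above translates directly into predicates on a row restricted to a set of column indices. This is a minimal sketch with hypothetical names (`is_sparse`, `is_dense`, `identified_columns`), assuming rows are lists of 0's and 1's and `cols` is a list of column indices.

```python
def is_sparse(row, cols, t):
    """At least one 0 but fewer than t 0's among the columns in cols."""
    zeros = sum(1 for c in cols if row[c] == 0)
    return 1 <= zeros < t

def is_dense(row, cols, t):
    """At least one 1 and at least t 0's among the columns in cols."""
    zeros = sum(1 for c in cols if row[c] == 0)
    ones = len(cols) - zeros
    return ones >= 1 and zeros >= t

def identified_columns(A, cols, t):
    """Columns of cols in which some row that is sparse on cols has a 0."""
    return {c for row in A if is_sparse(row, cols, t)
            for c in cols if row[c] == 0}
```

A row that is identically 1 on `cols` is neither sparse nor dense under these definitions.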
If $A$ is a matrix and $C$ is a set of columns (not necessarily a subset of the columns of $A$), then $A \setminus C$ will denote the set of columns in $A$ that are not in $C$. We define the matrix $Q_3(t; 0)$ to be $Q_3(t)$ without its column of 1's. Lastly, we restate Theorem 4.1 as follows: for any fixed $k$ and $t$ there exists a constant $c_{k,t}$ such that if $A$ is an $m$-rowed simple matrix with $|A| > c_{k,t}\, m$ and $Q_3(t) \nprec A$, then $t \cdot I_k \prec A$.
Theorem 4.3. Let $A \in \text{Avoid}(m, Q_3(t))$ with $|A| = \omega(m \log m)$. There exist a set of integers $\{k_1, \ldots, k_y\}$ and a set $A' = \{A'_1, \ldots, A'_y\}$ of configurations $A'_j \prec A$ such that:
(1) $k_{j+1} \le \frac{1}{2} k_j$ for all $j$, and $y \le \log m$.
(2) There exist $k_j$ rows of $A$ such that the columns of $A'_j$ restricted to these rows are columns of $I_{k_j}$.
(3) If $i$ is a column of $I_{k_j}$ and $C^j_i$ is the set of columns in $A'_j$ that are an $i$ column in the rows mentioned above, then no row restricted to $C^j_i$ is dense, and every column of $C^j_i$ is identified by some sparse row.
(4) $|A| = \Theta\left(\sum_j |A'_j|\right)$.
We first present an outline of the proof before going into the details. We are given a large $Q_3(t)$-avoiding matrix $A_0$, and as a first step we remove all rows from $A_0$ that have few 1's (for technical reasons) to get a new matrix $A_1$. We then find the largest $t \cdot I_k$ in $A_1$, and our goal is to use this as the $I_{k_1}$ base for $A'_1$. To do so, we trim $A_1$ by getting rid of all columns of $C^1_i$ that are not identified by a sparse row, as well as all rows that are dense restricted to some $C^1_i$. This gives us $A'_1$, and we repeat the process on the remaining columns of $A_1$, which form $A_2$ (after again removing rows with few 1's). It turns out that the largest $t \cdot I$ in $A_2$, $t \cdot I_{k_2}$, will satisfy $k_2 \le \frac{1}{2} k_1$, and thus we can repeat this process at most $\log m$ times. At each step we remove only $O(m)$ columns, so in total only $O(m \log m)$ columns of $A_0$ were removed. As $|A_0| = \omega(m \log m)$, the columns that remain (those of $A'$) must be asymptotically as large as our original $A_0$.
Proof. Let A 0 ∈ Avoid(m, Q 3 (t)) with |A 0 | = ω(m log m). Let R 1 denote the set of rows of A 0 that have fewer than 3t − 2 1's, and let A 1 denote A 0 with these rows removed. Note that A 1 need not be a simple matrix, but if C R1 denotes the set of columns that have a 1 in some row of R 1 , then A 1 \ C R1 will be simple. As Note that we will be working with the matrix A 1 , not its simplification A 1 \ C R1 , in order to use the fact that every row has at least 3t − 2 1's.
Define k 1 to be the largest integer such that t · I k1 ≺ A 1 . As |A 1 \ C 1 | = ω(m), Theorem 4.1 tells us that we have t · I k ≺ A 1 \ C 1 ≺ A 1 for any fixed k (so in particular we can assume that k 1 ≥ 3). Rearrange rows so that this t · I k1 appears in the first k 1 rows of A 1 .
Note that no column of A 1 can have two 1's in the first k 1 rows. Indeed, any two rows of t · I k1 for k 1 ≥ 3 induce a Q 3 (t; 0), and hence if a column had 1's in two of these rows we would have Q 3 (t) ≺ A 1 . We can thus partition the columns of A 1 as follows. We will say that a column c belongs to the set C 1 i for 1 ≤ i ≤ k 1 if c has a 1 in row i, and we will say that c ∈ C 2 if c has no 1's in these rows. We will make the additional assumption that the t · I k1 we placed in the first k 1 rows was such that |C 2 | is minimal. Note that |C 1 i | ≥ 3t − 2 for all i, as otherwise the ith row would belong to R 1 and hence not be in A 1 .
We now examine the rows that are dense in some C 1 i . Lemma 4.4. If a row r restricted to C 1 i is dense, then restricted to A 1 \ C 1 i , r has at most t − 1 1's or r is identically 1.
Proof. Assume r is dense restricted to C 1 i , i.e. it has at least t 0's and one 1 restricted to C 1 i . If r had t 1's and a 0 in A \ C 1 i , then by looking at the ith row, row r, and the relevant columns, we would find a Q 3 (t).
We would like to strengthen the above lemma to say that dense rows are either identically 0 or identically 1 outside of their C 1 i , and to do so we'll have to ignore a small number of columns of A 1 . We will say that a column c is "bad" if there exists a row r and integer i such that r is dense restricted to C 1 i , r is not identically 1 in A \ C 1 i , and c has a 1 in row r. Let C 1 denote the set of bad columns.
Proof. Each dense row r contributes at most t − 1 columns to C 1 by Lemma 4.4, and hence We now wish to ignore the dense rows of A 1 , as well as any rows of C 1 i that are not identified by a sparse row. Rearrange rows so that the bottom ℓ rows of A 1 consist of all rows that when restricted to some C 1 i are dense. Let C 1 i denote the columns of C 1 i that are not identified by a sparse row and that are not in C R1 or C 1 . Let A 1 denote A 1 restricted to the top k 1 rows, the bottom ℓ rows, and the columns of C 1 i .
Proof. Let $\hat{c}$ and $\hat{d}$ be columns of $\hat{A}_1$ with corresponding columns $c, d$ in $A_1 \setminus C_{R_1}$ (as no $\hat{C}^1_i$ columns are in $C_{R_1}$). If $\hat{c} = \hat{d}$, then clearly we must have $c, d \in C^1_i$ for some $i$. As $c \neq d$ (because $A_1 \setminus C_{R_1}$ is a simple matrix), we must have $c$ and $d$ differing in some row $r$ above the bottom $\ell$ rows, say $c$ has a 0 in row $r$ and $d$ has a 1. But this means that $r$ must be sparse (as every row between the top $k_1$ rows and bottom $\ell$ rows is either identically 0, identically 1, or sparse), and hence $c$ is identified by a sparse row, contradicting $\hat{c}$ belonging to $\hat{A}_1$.
Proof. By Lemma 4.4 (and the fact that A 1 contains no columns of C 1 ), we know that each row r restricted to C 1 i can be one of four types: r can be identically 0 restricted to A 1 \ C 1 i (in which case we will say it is a row of B i,0 ), r can be identically 1 restricted to A 1 \ C 1 i (in which case we will say it is a row of B i,1 ), or r can itself be either identically 0 or identically 1. We thus have that the matrix B i formed by restricting A 1 to the columns C 1 i and to the rows of

then these rows and columns together with any column of
, then one can find a t · I k1+1 in A 1 . Indeed, in A 1 (note that we are no longer ignoring the columns of C 1 and C R1 ), take the two rows from B i,0 that contain a Q 3 (t; 0), ignore the at most 2t − 2 columns that have 1's in these rows outside of C 1 i , and swap these rows with rows i and k 1 + 1. After performing these steps, no column of A 1 has two 1's in any of the first k 1 + 1 rows (since we removed the at most 2t − 2 columns that could pose a problem), rows i and k 1 + 1 by assumption have at least t 1's, and as every other row had at least 3t − 2 1's before ignoring the at most 2t − 2 columns, they all still have at least t 1's. Hence we have t · I k1+1 ≺ A 1 , contradicting our definition of k 1 . Thus we must have proving the statement.
We now let A ′ 1 be C 1 i after removing the columns of A 1 , C R1 , and C 1 (which in total are only of size O(m)), along with the bottom ℓ rows.
1 } meets all of the conditions of the theorem. Otherwise we can repeat our argument.
Let R 2 denote the set of rows below the first k 1 rows such that if r ∈ R 2 then r has fewer than 3t − 2 1's when restricted to C 2 , and let C R2 be the set of columns where one of these rows has a 1 in C 2 . Let A 2 be A 1 restricted to C 2 after ignoring the rows of R 2 and let k 2 be the largest integer such that t · I k2 ≺ A 2 . Note that we can assume k 2 ≥ 3.
Lemma 4.8. $k_2 \le \frac{1}{2} k_1$.
Proof. Note that any row $r$ that is part of this $t \cdot I_{k_2}$ must appear above the bottom $\ell$ rows (as restricted to $C^2$ the bottom $\ell$ rows either have fewer than $t$ 1's or they are identically 1). Thus restricted to any $C^1_i$, $r$ is either identically 0, identically 1 or sparse. We will say that a row $r$ is "mostly 1" restricted to $C^1_i$ if $r$ is identically 1 or sparse restricted to $C^1_i$ (i.e. $r$ has fewer than $t$ 0's restricted to these columns). Rearrange rows so that this $t \cdot I_{k_2}$ appears in the first $k_2$ rows.
Note that because k 2 ≥ 3, no column can have two 1's in the first k 2 rows. As any two rows that are mostly 1 restricted to any C 1 i must contain a column with 1's in both of these rows. Hence restricted to any C 1 i and the first k 2 rows, there can be at most one mostly 1 row. If row 1 ≤ j ≤ k 2 is not mostly 1 when restricted to any C 1 i , then we could use row j to create a t · I k1+1 ≺ A 1 by swapping it with our original k 1 + 1th row, contradicting the definition of k 1 . If there is precisely one i such that j restricted to C 1 i is mostly 1, then swapping row j with the original ith row gives a t · I k1 that would have given us a smaller value for |C 2 | (as at least 3t − 2 1's get added from C 2 and at most t − 1 1's are replaced by 0's of the mostly 1 row), which contradicts our choice of t · I k1 ≺ A 1 . Hence every row 1 ≤ j ≤ k 2 must be mostly 1 restricted to at least two different C 1 i , but as each C 1 i can only contribute at most one mostly 1 row we must have k 2 ≤ 1 2 k 1 . We then perform identical arguments for the corresponding C 2 i columns as we did with the C 1 i columns to get an A ′ 2 . If C 3 is defined analogous to C 2 and if We suspect that the statement of Corollary 4.9 can be strengthened to O(max |Ã|, m ), but as stated the Corollary can still be used to prove near optimal results. It is possible to get tighter upper bounds for certain configurations by using some of the additional structure provided by Theorem 4.3.
We first prove this for the case t = 2. Let A ∈ Avoid(m, Q 3 (2), I r × I c s ) with |A| = ω(m log m) and let A ′ be the corresponding set obtained from Theorem 4.3. We focus our attention on bounding |A ′ 1 |. Note that restricted to C 1 i , there must exist |C 1 i | rows that are distinct rows of I c |C 1 i | (one to identify each column of . Denote a set of such rows by R i . If there exists a set of integers {i 1 , . . . , i r } such that |R i1 ∩ · · · ∩ R ir | ≥ s, then by taking these s rows, the rows i 1 , . . . , i r and the relevant columns we can find an I r × I c s in A ′ 1 (since we have an I c s occurring simultaneously under r different I k1 columns). How large can |A ′ 1 | = |C 1 i | be given this restriction?
We rephrase this problem in terms of graph theory. We form a bipartite graph i columns, and r ∈ R corresponding to each row below the first k 1 rows. G will contain the edge v i r iff r ∈ R i . Our restriction of no set {i 1 , . . . , i r } such that |R i1 ∩ · · · ∩ R ir | ≥ s means that G does not contain a K r,s , the complete bipartite graph with vertex sets of size r and s, with the r vertices coming from C and the s vertices coming from R. Using standard arguments from extremal graph theory, this graph can have at most c|R||C| 1−1/s + d|C| ≤ cmk 1/s 1 + dk 1 edges for some constants c and d. Hence in total we have that and thus this is an asymptotic upper bound for |A| = Θ( |A ′ i |). We wish to generalize this argument for arbitrary t. The key idea is that for each set C j i we must find a set of rows Once we have this, we can perform the same graph argument on these R j i rows as we did for the R i rows above and get the same asymptotic results. The following lemma accomplishes this goal by taking B = C j i after ignoring rows that are identically 0.
Lemma 4.11. Given an integer t, let B be a matrix consisting of rows with fewer than t 0's such that every column of B has a 0 in some row. Then there exists a set of rows R of B such that: Proof. The t = 2 case is obvious (for every column take a row that has a 0 in the column), so inductively assume the statement holds up to t − 1. We wish to partition the columns of B into two sets, B 1 and B 2 . Remove the leftmost column c of B and add it to B 1 , and remove all columns c ′ of B where there exists a row r such that r has a 0 in both column c and column c ′ and add these columns to B 2 . Repeat this process until every column of B is in one of these sets, and note that B i ≥ 1 2 |B| for some i. Note that as every column of B was identified, every column of B 1 and B 2 is also identified.
If $|B_1| \ge \frac{1}{2}|B|$, then note that no row $r$ has more than one 0 in $B_1$ (if $r$ had 0's in $c, c' \in B_1$ with $c$ to the left of $c'$, then $c'$ should have been added to $B_2$), so by the $t = 2$ case we can find a set $R$ with $|R| = |B_1| \ge \frac{1}{2}|B|$ that contains an $I^c_{|R|}$. If $|B_2| \ge \frac{1}{2}|B|$, then note that $B_2$'s rows all have at most $t - 2$ 0's (as every row with a 0 in some $c'$ originally had a 0 in the corresponding $c$ column from $B_1$), so by the inductive hypothesis we can find a set $R$ with $|R| \ge 2^{2-(t-1)}|B_2| \ge 2^{2-t}|B|$ that contains an $I^c_{|R|}$.
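The column partition used in the proof of Lemma 4.11 is a simple greedy procedure, and a sketch of it may help in following the argument. The function name `partition_columns` is hypothetical, and the matrix $B$ is assumed to be a list of rows; the function returns the column indices placed in $B_1$ and $B_2$.

```python
def partition_columns(B):
    """Greedy partition from the proof: returns (B1, B2) as lists of column indices.

    Repeatedly move the leftmost remaining column c into B1, and move every
    remaining column that shares a row with a 0 in both positions into B2.
    """
    n = len(B[0])
    # For each column, the set of rows in which it has a 0.
    zero_rows = [frozenset(i for i, row in enumerate(B) if row[c] == 0)
                 for c in range(n)]
    remaining = list(range(n))
    B1, B2 = [], []
    while remaining:
        c = remaining.pop(0)
        B1.append(c)
        conflict = [d for d in remaining if zero_rows[c] & zero_rows[d]]
        B2.extend(conflict)
        remaining = [d for d in remaining if d not in conflict]
    return B1, B2
```

By construction no row has 0's in two different columns of $B_1$, which is the property the $t = 2$ case relies on.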
We can use the graph idea from the proof of Theorem 4.10 to achieve lower bounds as well.
We define a generalized product operation for matrices. Let $A$ and $B$ be simple matrices with $m_1$ and $m_2$ rows respectively and $G = G(C_A, C_B)$ a bipartite graph with the vertex set $C_A$ corresponding to the set of columns of $A$ and $C_B$ to the set of columns of $B$. We define $A \times_G B$ to be the simple matrix on $m_1 + m_2$ rows such that it contains the column defined by placing the column $a \in C_A$ on top of the column $b \in C_B$ if and only if $ab$ is an edge of $G$.
Let $G(V, W)$ be a bipartite graph on $m$ vertices such that $G$ avoids $K_{r,s}$ and such that $G$ has the maximum number of edges. Note that using the probabilistic method it is easy to show that $|E(G)| \ge \frac{1}{2}\text{ex}(m, K_{r,s})$. We claim that $A = I_{|V|} \times_G I^c_{|W|}$ avoids $I_r \times I^c_s$. Suppose instead that $I_r \times I^c_s \prec A$. Then we must have all of the $I_r$ rows coming entirely from either the $I_{|V|}$ rows of $A$ or the $I^c_{|W|}$ rows and the $I^c_s$ rows coming entirely from the other. Indeed, no two rows of the $I_{|V|}$ block of $A$ contain a column with two 1's, but every row of $I_r$ in $I_r \times I^c_s$ together with a row of $I^c_s$ contains a column of two 1's, so the $I_{|V|}$ rows can contribute to at most one of these blocks. Further note that if $s \ge 3$ then the $I^c_s$ must come from the $I^c_{|W|}$ block (as it needs a column with two 1's), and similarly if $r \ge 3$ then $I_r$ must come from the $I_{|V|}$ block (and hence again the $I^c_s$ must come from the $I^c_{|W|}$ block). Now consider $B = I_{|V|} \times_G I_{|W|}$. If $I_r \times I^c_s \prec A$ then we certainly have $I_r \times I_s \prec B$ (if $s$ or $r$ were at least 3 then the $I^c_s$ must have been in $I^c_{|W|}$ and then complemented to become an $I_s$, and if $s = r = 2$ complementing either block would still leave you with an $I_2 \times I_2$). But $I_{|V|} \times_G I_{|W|}$ is the incidence matrix of $G$, a graph that avoids $K_{r,s}$, and hence it must avoid $I_r \times I_s$, the incidence matrix of $K_{r,s}$. Thus we could not have had $I_r \times I^c_s \prec A$.
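The generalized product can be illustrated with a short sketch. It assumes that $A \times_G B$ contains the column $a$ on top of the column $b$ exactly when $ab$ is an edge of $G$, which is consistent with $I_{|V|} \times_G I_{|W|}$ being the incidence matrix of $G$. The names `identity` and `graph_product` are hypothetical, and edges are given as pairs of column indices.

```python
def identity(k):
    """The k x k identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def graph_product(A, B, edges):
    """A x_G B, assuming an edge (a, b) means: stack column a of A on column b of B."""
    cols_a = list(zip(*A))
    cols_b = list(zip(*B))
    # Deduplicate and sort the edges so the result is simple and deterministic.
    cols = [cols_a[a] + cols_b[b] for a, b in sorted(set(edges))]
    return [list(row) for row in zip(*cols)]
```

With `identity(2)` for both factors and the edges `(0, 0)`, `(0, 1)`, `(1, 0)`, every column of the result has exactly two 1's, one per side, so the result is the incidence matrix of the corresponding three-edge bipartite graph. With all four edges the result coincides with the ordinary product $I_2 \times I_2$.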

Avoiding 1 k,ℓ
In this section we study the identically 1 matrices 1 k,ℓ . We first note an immediate consequence of Theorem 1.2.
Proof. Note that 1 k,ℓ ≺ T k+ℓ , I c k+ℓ and that I 3 , F 10 ≺ I 4 and 0 k,ℓ ≺ I k+ℓ . We thus have an upper bound of BB(k + ℓ) by Theorem 1.2.
We next consider a slight generalization of a result from [7].
Proof. As a lower bound one can take all columns with fewer than k − 1 1's, along with the incidence matrix of a maximum (k − 1)-uniform H avoiding hypergraph.
For an upper bound, note that one can have at most $\binom{m}{0} + \cdots + \binom{m}{k-2}$ columns with fewer than $k-1$ 1's, and the columns with weight $k-1$ define the incidence matrix of a $(k-1)$-uniform hypergraph that avoids $H$, and hence can be no larger than $\text{ex}^{(k-1)}(m, H)$.
. . . , $s_{k-1}$)).
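The low-weight part of this count can be checked numerically. The sketch below, with hypothetical names `low_weight_columns` and `low_weight_count`, enumerates all columns of length $m$ with fewer than $k-1$ 1's and compares their number with $\binom{m}{0} + \cdots + \binom{m}{k-2}$; the hypergraph part of the construction is not modelled here.

```python
from itertools import combinations
from math import comb

def low_weight_columns(m, k):
    """All 0/1 columns of length m with fewer than k - 1 ones."""
    cols = []
    for weight in range(k - 1):
        for support in combinations(range(m), weight):
            cols.append(tuple(1 if i in support else 0 for i in range(m)))
    return cols

def low_weight_count(m, k):
    """binom(m, 0) + ... + binom(m, k - 2)."""
    return sum(comb(m, i) for i in range(k - 1))
```

None of these columns contains $k$ 1's, so they can never produce the configuration $1_{k,1}$.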
We can get similar results when considering configurations of the form 1 k,2 .
Theorem 5.4. Let F be the incidence matrix of a k-uniform complete r-partite hypergraph H with r ≥ k. Then Proof. For a lower bound, again take all columns with fewer than k 1's along with the incidence matrix of a maximum H avoiding k-uniform hypergraph. Let A be a maximum matrix of Avoid(m, 1 k,2 , F ) and let A ′ be a matrix obtained from A by taking every column with more than k 1's and removing 1's until these columns have k 1's. We claim that A ′ ∈ Avoid(m, 1 k,2 , F ). Clearly 1 k,2 ⊀ A ′ (if 1 k,2 ⊀ A then removing 1's from A can't induce this configuration) and A ′ is simple (the columns with fewer than k 1's were already distinct, and if any columns with k 1's were identical we would have a 1 k,2 ), so all that remains is to show that F ⊀ A ′ . To see this, we claim that if F ′ is the matrix obtained by changing any 0 of F to a 1 then F ′ contains a 1 k,2 . This claim is equivalent to saying that if one extends any e ∈ E(H) to e ′ = e ∪ {v} for some v ∈ V (H), v / ∈ e, then there exists an f ∈ E(H) such that |e ′ ∩ f | = k. If e contains no vertices that are in the same partition class as v, then if f is any k-subset of e ′ that includes v then f ∈ E(H) and |e ′ ∩ f | = k. If e contains a vertex v ′ that belongs to the same partition class as v, then f = e ′ \ {v ′ } ∈ E(H) with |e ′ ∩ f | = k, and thus we've proven the claim. This means that A can not contain any configuration that is obtained by taking 0's of F and changing them to 1's (since A avoids 1 k,2 ), and hence the procedure of deleting 1's from A can not induce an Thus for an upper bound of forb(m, 1 k,2 , F ), one only needs to consider matrices where each column has at most k 1's, and this clearly gives the above upper bound. forb(m, 1 k,2 , I s1 × · · · I s k ) = m 0 + · · · + m k − 1 + ex(m, K (k) (s 1 , . . . , s k )).
We note that in general $\text{forb}(m, 1_{k+1,1}, F) \neq \text{forb}(m, 1_{k,2}, F)$ when $F$ is the incidence matrix of a $k$-uniform hypergraph. That is, the statement of Theorem 5.4 can not be strengthened to include all hypergraphs as in Theorem 5.2. For example, $Q_9$ is the incidence matrix of two disjoint edges. It isn't difficult to see that the extremal number for this graph is $m - 1$, and hence $\text{forb}(m, 1_{3,1}, Q_9) = 2m$. However, the following matrix $A$ satisfies $|A| = 2m + 1$ and $A \in \text{Avoid}(m, 1_{2,2}, Q_9)$:
It should also be noted that the statement of Theorem 5.4 is not as strong as possible. For example, the theorem statement and general proof also apply to the configuration $F$ stated below, despite it not being the incidence matrix of a complete $r$-partite 3-uniform hypergraph. It would be interesting to know of a complete characterization of $k$-uniform hypergraphs that satisfy Theorem 5.4.
Unfortunately for 1 k,ℓ with ℓ > 2, this "downgrading" technique no longer works. We are, however, able to obtain some partial results.
Proof. The lower bound is simply the incidence matrix of the extremal hypergraph. We first prove the upper bound for $k = 2$ to demonstrate the general idea of the proof. Let $A$ be a maximum matrix in $\text{Avoid}(m, 1_{2,\ell}, I_r \times I_s)$ that has no columns with fewer than two 1's (and hence the forb function will be at most $O(m)$ larger than $|A|$). Let $C_i$ denote the set of columns of $A$ whose first 1 is in row $i$. Note that any row $j \neq i$ restricted to $C_i$ has at most $\ell - 1$ 1's (otherwise the row together with the $i$th would induce a $1_{2,\ell}$), and further note that each column of $C_i$ has a 1 in some row other than the $i$th (since every column has at least two 1's), i.e. every column of $C_i$ is identified by a 1. We can thus use Lemma 4.11 (after switching 0's and 1's in the lemma statement) to find a set of rows $R_i$ such that restricted to $C_i$ these rows contain an $I_{|R_i|}$ and such that $|R_i| \ge 2^{2-\ell}|C_i|$. We then define a bipartite graph with one vertex set corresponding to the $C_i$ column sets and the other vertex set corresponding to the rows of $A$, and we draw an edge between $C_i$ and $r$ if $r \in R_i$. We would like to say that if this graph contains a $K_{r,s}$ (say the $r$ vertices coming from the $C_i$ vertex set and the $s$ vertices coming from the $R_i$ vertex set, which is a non-trivial assumption we will deal with later), then $A$ contains an $I_r \times I_s$. Unfortunately, this is not true. For example, if then $A$ does not contain an $I_2 \times I_2$, despite the corresponding graph being $K_{2,2}$. The problem is that if we want to use columns from $C_i$ and $C_{i'}$ with $i < i'$, it's possible that there are 1's in the $i'$th row of $C_i$, and if these 1 columns correspond with the $I_s$ under $C_i$ then we can't actually use these columns. Fortunately, each row below the $i$th row of $C_i$ contains fewer than $\ell$ 1's, so this problem can't happen too many times. We claim that if instead of having an $I_s$ simultaneously under $r$ different $C_i$ we had an $I_{s+c_2}$, where $c_2 = (\ell - 1)\frac{r(r-1)}{2}$, simultaneously under $r$ different $C_i$, then we could find an $I_r \times I_s$.
Assume that we have this situation with the $i$'s of our $C_i$'s belonging to the ordered set $\{i_1, \ldots, i_r\}_<$, and let $R'_0$ denote the set of rows that contain the simultaneous $I_{s+c_2}$ under these $C_i$, noting that $|R'_0| = s + (\ell-1)\frac{r(r-1)}{2}$. For a row $r \in R'_0$, we will say that its corresponding column restricted to $C_{i_j}$ is the column where $r$ contains the 1 it contributes to the $I_{|R'_0|}$ in $C_{i_j}$. Note that restricted to the $r-1$ rows $\{i_2, \ldots, i_r\}$, $C_{i_1}$ contains at most $(\ell-1)(r-1)$ 1's (as each row has at most $\ell-1$ 1's). Thus if $B_1$ is the set of columns of $C_{i_1}$ with 1's in these rows we have $|B_1| \leq (\ell-1)(r-1)$. Define $R'_1 \subseteq R'_0$ to be the set of rows that have corresponding columns in $C_{i_1}$ that are not in $B_1$, and hence $|R'_1| \geq |R'_0| - (\ell-1)(r-1)$. Note that restricted to the corresponding columns of $R'_1$ and the rows $\{i_2, \ldots, i_r\}$, $C_{i_1}$ is identically 0. We can similarly define the subset $R'_2 \subseteq R'_1$ consisting of the rows whose corresponding columns in $C_{i_2}$ are 0 in the rows $\{i_3, \ldots, i_r\}$ (row $i_1$ is automatically identically 0 restricted to $C_{i_2}$, since every column of $C_{i_2}$ has its first 1 in row $i_2 > i_1$). We repeat this process until we reach the set $R'_r$, which satisfies $|R'_r| \geq s$ and for which, under each $C_{i_j}$, the corresponding columns of $R'_r$ are identically 0 in the other $i_{j'}$ rows. This gives an $I_r \times I_s$.

However, to guarantee an $I_r \times I_s$ in $A$ it is insufficient to simply guarantee the existence of a $K_{r,s+c_2}$ in the graph we constructed, since we could have the $s + c_2$ vertices coming from the $C_i$ vertex set instead of the row vertex set. To remedy this, we must increase $r$ by a suitable amount as well, namely by $c_1 = (\ell-1)\frac{s(s-1)}{2}$, as in this case a symmetric argument will guarantee our result. Thus the existence of a $K_{r+c_1,s+c_2}$ in this graph guarantees an $I_r \times I_s$, so the graph must have $O(\text{ex}(m, K_{r+c_1,s+c_2}))$ edges, and hence $|A| = O(\text{ex}(m, K_{r+c_1,s+c_2}))$ as well.
For the general problem, again consider a maximum $A$ with every column having at least $k$ 1's and define the set $C(i_1, \ldots, i_{k-1})$ to be the columns which have their first $k-1$ 1's in rows $i_1, \ldots, i_{k-1}$, where $i_j > i_{j-1}$. Again we can find rows $R(i_1, \ldots, i_{k-1})$ such that the number of rows is proportional to the number of columns of $C(i_1, \ldots, i_{k-1})$, and restricted to these rows and columns there is a large identity matrix. We can then define a $k$-uniform $k$-partite hypergraph with vertex sets $V_j$ for $1 \leq j < k$ corresponding to all possible choices of $i_j$, and vertex set $V_k$ corresponding to all rows of $A$. We then add the hyperedge $\{i_1, \ldots, i_{k-1}, r\}$ to our hypergraph iff $r \in R(i_1, \ldots, i_{k-1})$. If this hypergraph contains a $K^{(k)}(s_1 + c_1, \ldots, s_k + c_k)$, where $c_i = (\ell-1)\frac{\max_{j \neq i} s_j - 1}{2}\prod_{j \neq i} s_j$, then we claim that $A$ contains an $I_{s_1} \times \cdots \times I_{s_k}$.
Assume that this hypergraph contains a $K^{(k)}(s_1 + c_1, \ldots, s_k + c_k)$, say on the vertex sets $V'_1, \ldots, V'_k$ with $V'_j \subseteq V_j$ and $|V'_i| = s_i + c_i$ (again, an assumption we'll have to address later). First note that if $i_j \in V'_j$ and $i_{j'} \in V'_{j'}$ with $j < j'$, then $i_j < i_{j'}$. Indeed, because we have a complete $k$-partite hypergraph, $i_j \in V'_j$ and $i_{j'} \in V'_{j'}$ means that there exists an edge containing both $i_j$ and $i_{j'}$ from these vertex sets. If $j' < k$ then this edge corresponds to a column whose $j$th 1 is in row $i_j$ and whose $j'$th 1 is in row $i_{j'}$, and if $j < j'$ this only makes sense if $i_j < i_{j'}$. If $j' = k$ then the $i_{j'}$th row must come after the rows where this column has its first $k-1$ 1's by definition, and hence again $i_j < i_{j'}$. This means that for any $C(i_1, \ldots, i_{k-1})$ and any $i \in V'_j$ with $i \neq i_j$ and $j < k-1$, the $i$th row of $C(i_1, \ldots, i_{k-1})$ is identically 0 (since its $(j+1)$th row with a 1 in it comes from row $i_{j+1} > i$ and its $(j-1)$th comes from row $i_{j-1} < i$ if $j \neq 1$), and hence when choosing corresponding rows from $V'_k$ the only potential pitfall will be the rows from $V'_{k-1}$ (as it is possible for $C(i_1, \ldots, i_{k-1})$ to have 1's in row $i \neq i_{k-1}$ even if $i \in V'_{k-1}$).

For $j < k$ let $V''_j \subseteq V'_j$ be any subset with $|V''_j| = s_j$ and let $R'_0$ be the set of rows corresponding to the $I_{s_k + c_k}$ simultaneously under all of the $C(i_1, \ldots, i_{k-1})$ columns with $i_j \in V''_j$; we emphasize that the observations in the preceding paragraph show that the rows of $R'_0$ lie entirely below the rows of every $V''_j$ for $1 \leq j < k-1$. Let $i_1, \ldots, i_{k-2}$ be any fixed elements from the $V''_j$'s. Restricted to the columns $C(i_1, \ldots, i_{k-1})$, where $i_{k-1}$ varies amongst all of $V''_{k-1}$, we perform the same procedure that we used for the $k = 2$ case to obtain a set of rows $R'_1$, after removing at most $(\ell-1)\frac{s_{k-1}(s_{k-1}-1)}{2}$ rows from $R'_0$, such that for any $i_{k-1} \in V''_{k-1}$ and any corresponding column of $R'_1$, restricted to the rows $V''_{k-1} \setminus \{i_{k-1}\}$, $C(i_1, \ldots, i_{k-1})$ is identically 0. We then repeat this process for all possible sequences $i_1, \ldots, i_{k-2}$, in total removing at most $(\ell-1)\frac{s_{k-1}(s_{k-1}-1)}{2}\prod_{j<k-1} s_j$ rows (which in the worst case scenario is $(\ell-1)\frac{\max_{j \neq k} s_j - 1}{2}\prod_{j \neq k} s_j$). In the end we are left with a set $R' \subseteq R'_0$ with $|R'| \geq s_k$ such that, in the corresponding columns of any $C(i_1, \ldots, i_{k-1})$ with $i_j \in V''_j$ and restricted to the rows $V''_{k-1} \setminus \{i_{k-1}\}$, the matrix is identically 0. This gives an $I_{s_1} \times \cdots \times I_{s_k}$ in $A$. Hence the hypergraph can have at most $\text{ex}^{(k)}(m, K^{(k)}(s_1 + c_1, \ldots, s_k + c_k))$ edges, which means that overall $|A| = O(\text{ex}^{(k)}(m, K^{(k)}(s_1 + c_1, \ldots, s_k + c_k)))$.
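As a small sanity check on the padding constants, the following Python sketch evaluates $c_i = (\ell-1)\frac{\max_{j \neq i} s_j - 1}{2}\prod_{j \neq i} s_j$ and confirms that for $k = 2$ it reduces to $c_1 = (\ell-1)\frac{s(s-1)}{2}$ and $c_2 = (\ell-1)\frac{r(r-1)}{2}$. The function name `padding_constants` is hypothetical, and the formula is our reading of the definition of $c_i$ in the proof, so this is an illustration under that assumption rather than part of the argument.

```python
from math import prod

def padding_constants(sizes, ell):
    """Return [c_1, ..., c_k] for part sizes s_1..s_k (k >= 2) and parameter ell.

    Assumed formula: c_i = (ell - 1) * (M_i - 1) / 2 * prod_{j != i} s_j,
    where M_i = max_{j != i} s_j.
    """
    sizes = list(sizes)
    consts = []
    for i in range(len(sizes)):
        others = sizes[:i] + sizes[i + 1:]
        biggest = max(others)
        # biggest is one of the factors of prod(others), so we can split the
        # product as biggest * rest and use biggest * (biggest - 1) / 2,
        # which is always an integer.
        pairs = biggest * (biggest - 1) // 2
        rest = prod(others) // biggest
        consts.append((ell - 1) * pairs * rest)
    return consts

# k = 2 with (s_1, s_2) = (r, s) = (3, 4) and ell = 3:
# c_1 = 2 * (4 * 3 / 2) = 12 and c_2 = 2 * (3 * 2 / 2) = 6.
print(padding_constants([3, 4], 3))
```

The constants grow polynomially in the $s_j$ and linearly in $\ell$, so they only affect the size of the forbidden complete $k$-partite hypergraph, not the shape of the bound.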
Proposition 5.7. $\text{forb}(m, 1_{4,1}, F_{11}) = \Theta(m^{3/2})$.

Proposition 5.7 is a corollary of the following theorem, which was first proven by Füredi and Sali [16].

Theorem 5.8. Let $r \geq s \geq k-2 \geq 1$ be fixed integers. Then $\text{forb}(m, 1_{k,1}, I_r \times$ […]

For the sake of completeness we give a simpler proof extending ideas of [15]. We need the following theorem of Alon and Shikhelman. Let $\text{ex}(m, G, H)$ denote the largest possible number of subgraphs isomorphic to $G$ in an $m$-vertex graph that does not have $H$ as a subgraph.

Theorem 5.9 (Alon and Shikhelman). Let $r \geq s \geq k-1$ be fixed integers. Then $\text{ex}(m, K_k, K_{r,s}) = O(m^{k - \frac{1}{s}\binom{k}{2}})$; furthermore, if $r \geq (s-1)! + 1$ and $s \geq 2k-2$, then $\text{ex}(m, K_k, K_{r,s}) = \Theta(m^{k - \frac{1}{s}\binom{k}{2}})$.
Simpler Proof of Theorem 5.8. Let $A \in \text{Avoid}(m, 1_{k+1,1}, I_r \times I_s)$. We can inductively conclude that $\text{forb}(m, 1_{k+1,1}, I_r \times I_s)_{<k} = O(m^{k-1-\frac{1}{s}\binom{k-1}{2}})$, the base case being $k = 3$. Let $A'$ be obtained by deleting columns of sum less than $k$ from $A$. Consider the columns of $A'$ as characteristic vectors of a $k$-uniform hypergraph $\mathcal{F}$. Let $\mathcal{F}'_1$ be a largest $k$-partite subhypergraph of $\mathcal{F}$, with partite classes $V_1, V_2, \ldots, V_k$. It is well known that $|\mathcal{F}| \leq c_k|\mathcal{F}'_1|$ for some constant $c_k$. Let $H_i$ be the $(k-1)$-partite hypergraph induced by $\mathcal{F}'_1$ after ignoring $V_i$. Observe that no $H_i$ contains $K_{r,s}$ as a trace. Call a hyperedge $F \in \mathcal{F}'_1$ 1-thick if, restricted to each $H_i$, $F$ is contained in at least $r+s-2$ other hyperedges of $\mathcal{F}'_1$, and call $F$ 0-thick otherwise. There are at most $(r+s-2)|E(H_i)|$ 0-thick edges. Recursively define $\mathcal{F}'_i$ to consist of all $F \in \mathcal{F}'_{i-1}$ that are $(i-1)$-thick, and call $F \in \mathcal{F}'_i$ $i$-thick if restricted to each $H_i$ it is contained in at least $r+s-1$ hyperedges of $\mathcal{F}'_i$. By the same reasoning as before, […] by the inductive hypothesis.

On the other hand, the 2-shadow of $\mathcal{F}'_k$ cannot contain a $K_{r,s}$. Assume to the contrary that it does, consider an edge $\{x_1, x_2\}$ used in this $K_{r,s}$, and let $F_0$ be a $k$-thick edge with $\{x_1, x_2\} \subseteq F_0$. […] Otherwise, by the definition of $F_0$ being a $k$-thick edge, there exist $r+s-1$ hyperedges that are $(k-1)$-thick and that differ from $F_0$ only in the vertex set $V_1$. By the pigeonhole principle, one of these hyperedges, call it $F_1$, does not contain any vertex of $(V(K_{r,s}) \setminus \{x_1, x_2\}) \cap V_1$ and still has $\{x_1, x_2\} \subseteq F_1$. Continue this way, defining $F_i$ to be a $(k-i)$-thick hyperedge that contains $\{x_1, x_2\}$ and no vertices of $(V(K_{r,s}) \setminus \{x_1, x_2\}) \cap \bigcup_{j \leq i} V_j$; we can do this at each step by the way we defined $(k-i)$-thickness. In the end we obtain a hyperedge $F_k$ that contains $\{x_1, x_2\}$ and no other vertices of the $K_{r,s}$. We can repeat this process for each edge of the $K_{r,s}$, and thus these hyperedges contain $I_r \times I_s$ as a trace, a contradiction. Thus the 2-shadow does not have $K_{r,s}$ as a subgraph. Apply Theorem 5.9 to the graph determined by the 2-shadow of $\mathcal{F}'_k$ to obtain that the number of $K_k$ subgraphs is at most $O(m^{k-\frac{1}{s}\binom{k}{2}})$, which clearly is an upper bound for $|\mathcal{F}'_k|$. Summarising, […]

To prove the lower bound, take a graph $G$ that gives the lower bound in the Alon-Shikhelman theorem and let $\mathcal{F}$ consist of those $k$-subsets of the vertices that induce a complete graph. Since $G$ does not have a $K_{r,s}$ subgraph, $\mathcal{F}$ does not have $K_{r,s}$ as a trace, so if $A$ is the vertex-edge incidence matrix of $\mathcal{F}$, then $A \in \text{Avoid}(m, 1_{k+1,1}, I_r \times I_s)$.
Note that the upper bound in Proposition 5.7 is obtained by putting $r = s = k-1 = 2$. The lower bound in Theorem 5.8 does not give the lower bound of Proposition 5.7 directly; however, the vertex-edge incidence matrix of a maximal $C_4$-free graph works.
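To make the substitution concrete, the short Python sketch below evaluates the exponent $k - \frac{1}{s}\binom{k}{2}$ from the Alon and Shikhelman bound with exact rational arithmetic. The helper name `clique_count_exponent` is hypothetical and only illustrates the arithmetic; with $k = 3$ and $s = 2$ it recovers the exponent $3/2$ of Proposition 5.7.

```python
from fractions import Fraction
from math import comb

def clique_count_exponent(k, s):
    """Exponent k - C(k, 2) / s in the bound on ex(m, K_k, K_{r,s})."""
    return Fraction(k) - Fraction(comb(k, 2), s)

# r = s = k - 1 = 2, i.e. k = 3 and s = 2
print(clique_count_exponent(3, 2))  # 3/2
```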
Remark 5.10. Despite the largest product avoiding $1_4$ and $I_r \times I_s$ being a 1-fold product, Theorem 5.8 shows that one can make $\text{forb}(m, 1_4, I_r \times I_s) = \Theta(m^{3-\epsilon})$. Thus the best we could hope for as an extension of Conjecture 1.1 to general forbidden families is $\text{forb}(m, F, G) = o(m^p)$ if $\text{forb}(m, F) = \Theta(m^p)$ and there exists no $p$-fold product avoiding both $F$ and $G$. However, we do not dare to formulate this as a conjecture.
The following extension of Proposition 5.7 was proven in [16].
An alternate proof of this Proposition could be given using similar ideas as in the simpler proof of Theorem 5.8.
Proof. Note that $I_m$ gives the lower bound. For the upper bound, we first take a look at what our preliminary data tells us. We have that $F_9 \prec I_3 \times I^c_2$, so by Theorem 4.10 we know that $\text{forb}(m, Q_3(t), F_9) = O(m^{3/2})$. It also isn't too hard to show (using methods similar to what we'll use below) that $|\tilde{A}| = O(m)$ if $\tilde{A} \in \text{Avoid}(m, Q_3(t), F_9)$ meets all the requirements of the $A'_j$ matrices in the statement of Theorem 4.3, so we have $\text{forb}(m, Q_3(t), F_9) = O(m \log m)$ by Corollary 4.9, and this suggests that $\text{forb}(m, Q_3(t), F_9) = O(m)$. Unfortunately, this is as far as we can get using the results of Theorem 4.3. However, by following the same basic argument as in the proof of that theorem, and by using the extra information that we must also avoid $F_9$, we will be able to show the $O(m)$ result.
Let A ∈ Avoid(m, Q 3 (t), F 9 ) such that |A| is maximal and assume |A| = ω(m). Let k be the largest integer such that t · I k ≺ A (we don't consider the R 1 rows as that technical step will not be required for this proof). Rearrange rows so that this t · I k appears in the first k rows and let C i denote the set of columns with a 1 in row i and C 2 the columns with no 1's in the first k rows (and we can assume that k ≥ 3, thus having no Q 3 (t) implies that no column can have two 1's in the first k rows, so all columns belong to precisely one of these sets).
Lemma 6.2. No row r restricted to C i is identically 0.
Proof. Assume there is an $r$ such that row $r$ is identically 0 restricted to $C_i$. Consider how many 1's $r$ has in $C_2$. If $r$ has fewer than $t$ 1's, then by using the standard induction with row $r$ we see that $|C_r| \leq t-1 = O(1)$, so we could inductively conclude that $|A| = O(m)$. Otherwise there are at least $t$ 1's, in which case one could use this row to find a $t \cdot I_{k+1}$ in $A$, a contradiction.

Lemma 6.3. If row $r$ with $r > k$ has a 0 restricted to $C_i$ then it has 0's in precisely one $C_i$.

Proof. Assume $r$ has a 0 in $C_i$ and $C_{i'}$. If there is a 1 in any column of $C_{i''}$, $i'' \neq i, i'$, then by taking these columns and rows $r$, $i$, $i'$, and $i''$ we get an $F_9$. If every $C_{i''}$ is identically 0 then by Lemma 6.2 one of $C_i$, $C_{i'}$ must have a 1 in some column, say $c \in C_i$. But then taking $c$, the column with a 0 in $C_{i'}$, and any column in any other $C_{i''}$, along with the relevant rows, gives an $F_9$.
Proof. Assume |C 2 | = ω(m), in which case there must exist a Q 3 (t; 0) in C 2 and it must lie below the top k rows. But as k ≥ 3, for any two rows r 1 , r 2 ≥ k one can find a 1 1 in some C i (if r 1 has 0's in C 1 and r 2 has 0's in C 2 then neither can have 0's in C 3 by Lemma 6.3). Thus whatever rows the Q 3 (t; 0) lies in one can find a column to give a Q 3 (t), a contradiction.
Proof. Let $R_i$ denote $C_i$ restricted to its rows that are not identically 1. Note that $R_i$ is a simple matrix, and let $r_i$ denote the number of rows it has. We can't have $|C_i| > c_{3,t} r_i$ (as then we could find a $Q_3(t; 0)$ in $R_i$ and take any column of […]

Theorem 6.6. $\text{forb}(m, 1_{k,\ell}, F_9) = \Theta(m)$ provided we don't have $k = \ell = 1$.
Proof. Note that $I_m$ gives the lower bound. Let $A$ be a maximum sized matrix in $\text{Avoid}(m, 1_{k,\ell}, F_9)$ and apply the standard induction on any row $r$ to get the matrix of repeated columns $C_r$. If $|C_r| \leq BB(k+\ell+1)$ then we inductively conclude that $|A| = O(m)$. Otherwise, we must have either an $I_3$, an $I^c_{k+\ell+1}$ or a $T_{k+\ell+1}$ in $C_r$. As $1_{k,\ell} \prec I^c_{k+\ell+1}, T_{k+\ell+1}$, we must have $I_3 \prec C_r$ and hence $[0\,1] \times I_3 \prec A$. But $F_9 \prec [0\,1] \times I_3$, which contradicts $F_9 \nprec A$.
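The containment $F_9 \prec [0\,1] \times I_3$ used in the last step can be checked by brute force. The Python sketch below is only an illustration: the helper names `has_configuration` and `matrix_product` are hypothetical, and the explicit form of $F_9$ (one column with two 1's followed by an $I_2$ on two further rows) is an assumption inferred from how $F_9$ is used in this section, not a definition quoted from the paper.

```python
from itertools import permutations, product

def has_configuration(F, A):
    """True if some row and column permutation of F is a submatrix of A.

    Matrices are lists of rows with 0/1 entries. This is exhaustive
    search, so it is only suitable for very small examples.
    """
    rows_f, cols_f = len(F), len(F[0])
    for rows in permutations(range(len(A)), rows_f):
        for cols in permutations(range(len(A[0])), cols_f):
            if all(A[rows[i]][cols[j]] == F[i][j]
                   for i in range(rows_f) for j in range(cols_f)):
                return True
    return False

def matrix_product(A, B):
    """Columns of A x B: every column of A stacked on every column of B."""
    cols_a = list(zip(*A))
    cols_b = list(zip(*B))
    stacked = [ca + cb for ca, cb in product(cols_a, cols_b)]
    return [list(row) for row in zip(*stacked)]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO_ONE = [[0, 1]]
# Assumed form of F_9 (inferred from the arguments in this section).
F9 = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(has_configuration(F9, matrix_product(ZERO_ONE, I3)))  # True
```

Under this assumed form, the witnessing columns are the column with a 1 on top of the first column of $I_3$ and the two columns with a 0 on top of the remaining columns of $I_3$.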
It is possible to get a finer value for forb(m, 1 k,ℓ , F 9 ), and even an exact value in a few select cases when m is sufficiently large. We say that a column in A is an n-column if its column sum is n. We define Avoid(m, F ) =n to be the set of matrices A that avoid F and whose columns are all n-columns, and analogously we define forb(m, F ) =n . We similarly define Avoid(m, F ) ≥n and forb(m, F ) ≥n . For columns c, d we will let c ∩ d denote the set of rows that c and d both have 1's in, and we similarly define c ∪ d.
Proof. We first consider the ℓ = 2 case (the ℓ = 1 case is trivial). Assume the first column c of a matrix A ∈ Avoid(m, 1 k,2 , F 9 ) =t has all its 1's in the first t rows. For S ⊆ [t] with |S| ≤ k − 1, let C S denote the set of columns c ′ of A such that c ∩ c ′ = S, and note that every column of A belongs to precisely one such set. But note that |[t] \ S| ≥ 2, which means that for every S there exists two rows such that c has a 1 in these rows and every column of C S has 0's. Hence, below the first t rows the columns of C S can not induce an I 2 (as in these rows c is 0, so these together with the 2 rows mentioned above give an F 9 ). But C S is a simple matrix so if |C S | > BB(k + 2) it must contain a T k+2 , which in particular contains 1 k,2 . Thus |C S | ≤ BB(k + 2) for all S, and as there are fewer than 2 t such sets (and they partition all of A), we must have |A| ≤ BB(k + 2)2 t .
Proof. We have c k,1 = k, so the statement is trivially true for ℓ = 1. Assume for the purpose of induction that this result is true up to ℓ − 1 and consider a matrix A ∈ Avoid(m, 1 k,ℓ , F 9 ) ≥c k,ℓ and any column d in A. Let R 0 denote the rows where d has 0's and R 1 the rows where d has 1's. We claim that restricted to R 0 there exists no I z where z = (ℓ − 1)(c ′ k,ℓ−1 + 1) + 1. Indeed, any two columns of such a I z , say c 1 and c 2 , induce an I 2 in R 0 , and using column d as well as c 1 and c 2 would give a 0 1 0 0 0 1 , thus if there exists two rows in R 1 where c 1 and c 2 are both 0 then one could find an F 9 . As d has at least 2 ℓ−1 (k + 1) − 1 1's, we must have (restricted to R 1 ) |c 1 ∪ c 2 | ≥ 2 ℓ−1 (k + 1) − 2 (otherwise there will be at least two rows of R 1 that aren't covered by c 1 and c 2 ), and hence one of these c i must have at least 2 ℓ−2 (k + 1) − 1 = c k,ℓ−1 1's in R 1 . Thus all but at most one of the I c columns must have at least c k,ℓ−1 1's in R 1 . Let A ′ be A restricted to the R 1 rows and the columns of the I c that have at least c k,ℓ−1 1's in these rows. A ′ need not be simple, but each column can be repeated at most ℓ − 1 times before inducing a 1 k,ℓ , so there are at least c ′ k,ℓ−1 + 1 distinct columns in A ′ . But by the inductive hypothesis this means that there exists either an F 9 (in which case we're done) or a 1 k,ℓ−1 in R 1 , and using column d in addition to this would give a 1 k,ℓ . Thus there can exist no I c in R 0 , but similarly there can't exist sufficiently large I c 's or T 's (as these automatically contain 1 k,ℓ ), so restricted to R 0 there can be at most BB(c) column types.
Any column type restricted to R 0 with at least k 1's can't appear more than ℓ−1 times (as this would give a 1 k,ℓ ), and columns restricted to R 0 with fewer than k 1's must have at least c k,ℓ − (k − 1) = 2 ℓ−1 (k + 1)− 1 − (k − 1) ≥ 2 ℓ−2 (k + 1)− 1 = c k,ℓ−1 1's in R 1 (since every column of A has at least c k,ℓ 1's), and thus can't appear more than c ′ k,ℓ−1 times without inducing in R 1 either an F 9 or a 1 k,ℓ−1 (and hence a 1 k,ℓ by using column d). Thus each of the constant number of column types appears at most a constant number of times, so we have forb(m, 1 k,ℓ , F 9 ) ≥c k,ℓ ≤ BB(c)(ℓ − 1 + c ′ k,ℓ−1 ) = O(1). Lemma 6.9. For any fixed t, if A ∈ Avoid(m, 1 k,ℓ , F 9 ) =t and if c is any column of A, then there are at most Proof. The statement is trivially true for t > k (since there can only be at most O(1) such columns by Lemma 6.7) and t = 1, so assume 1 < t ≤ k. Rearrange rows so that the 1's of c appear in the first t rows of A, and for any S ⊆ [t] let C S denote the columns of A with c ∩ c ′ = S. If S is a set with |S| < t − 1, then as argued in Lemma 6.7 the columns of C S can't contain an I 2 (since there exists at least two of the first t rows with 1's in c and 0's in all of C S ) and it also can't contain a T k+ℓ+1 , so we must have |C S | ≤ BB(k + ℓ + 1), and since there are fewer than 2 t such sets of A we have |A| ≤ BB(k + ℓ + 1)2 t = O(1).
Let $A_{\neq t}$ denote the collection of columns of a matrix $A$ that are not $t$-columns.

Lemma 6.10. There exists a constant $p \in \mathbb{N}$ such that if $A \in \text{Avoid}(m, 1_{k,\ell}, F_9)$ with $|A| \geq 2pc_{k,\ell} + c'_{k,\ell}$, then there exists a unique $t \leq k$ such that $|A_{\neq t}| \leq (2p-1)k + p$. Further, there exist $t-1$ rows in which every $t$-column of $A$ has $t-1$ 1's.
Note that implicitly this statement requires that m be sufficiently large in order for |A| ≥ 2pc k,ℓ + c ′ k,ℓ .
Proof. Let p be the smallest (constant) value such that it is larger than c k,ℓ + 1, c ′ k,ℓ and all the O(1) constants obtained from Lemma 6.7 for k < t ≤ c k,ℓ and Lemma 6.9 for t ≤ k. Let t ≤ k be the smallest t such that A contains at least 2p t-columns (and at least one such t must exist by the previous lemmas and the assumption that |A| ≥ 2pc k,ℓ + c ′ k,ℓ ). We claim that this is the only such t. Indeed, by Lemma 6.9 at most p of these t-columns don't intersect in the same t − 1 rows, or in other words, at least p of these t-columns must intersect in the same t − 1 rows, say the first t − 1. Their last 1's must all be in separate rows, and this induces an I p below the first t − 1 rows. We claim that A contains no t ′ -column with t < t ′ < p − 1. Indeed, such a t ′ must contain at least two 1's outside of the first t − 1 rows (since t ′ > t), and it does not have 1's in at least two rows of the I p (since t ′ < p − 1). Take two rows where t ′ has 1's below the first t − 1 rows and two rows where t ′ does not have 1's in rows of the I p , as well as the t ′ column and the two columns of the I p that give an I 2 from the rows chosen. The t ′ column gives a the I p has only one 1 in these columns), and this gives an F 9 , so there can be no such t ′ -columns (the same argument shows that any t-column must have 1's in the first t − 1 rows). As t was chosen to be the smallest column type with at least 2p columns, in addition to the fact that forb(m, 1 k,2 , F 9 ) ≥p ≤ c ′ k,ℓ ≤ p, it is the only such column type with at least this many columns, and thus A can contain at most (2p − 1)t + p ≤ (2p − 1)k + p columns that are not t-columns. Corollary 6.11. For m sufficiently large, forb(m, 1 k,1 , F 9 ) = m + c k , where c k is some constant depending only on k.
Proof. Note that $I_m$ gives the lower bound. For any $A \in \text{Avoid}(m, 1_{k,1}, F_9)$ with $|A| \geq 2pc_{k,\ell} + c'_{k,\ell}$ and $m$ sufficiently large, Lemma 6.10 tells us that only one column type appears more than $2p$ times, say the $t$-columns for some $t \leq k$. But $|A_{=t}| \leq m-t+1$ (only this many $t$-columns can intersect in the same $t-1$ rows, and every $t$-column in $A$ does this) and $|A_{\neq t}| \leq (2p-1)k + p$, and hence $|A| \leq m-t+1+(2p-1)k+p \leq m+(2p-1)k+p$, where $(2p-1)k+p$ is a constant depending only on $k$.
Proof. Let p be the constant defined in Lemma 6.10 and let A ∈ Avoid(m, 1 k,ℓ , F 9 ) with |A| ≥ 2pc k,ℓ + c ′ k,ℓ . We claim that A contains at most ℓ − 1 columns with at least k 1's. Indeed, consider the I p in A and note that any column with at least k 1's must have 1's in all but at most one of the rows that contains the I p (as otherwise one can find an F 9 ). As p > k + ℓ, there can exist at most ℓ − 1 such columns before the columns induce a 1 k,ℓ . Thus we can reduce sufficiently large A ∈ Avoid(m, 1 k,ℓ , F 9 ) to an A ′ ∈ Avoid(m, 1 k+1,1 ) after removing at most ℓ − 1 columns, so we have forb(m, 1 k,ℓ , F 9 ) ≤ forb(m, 1 k+1,1 , F 9 ) + ℓ − 1.
It is somewhat surprising that, despite the extra care needed to deal with ℓ > 1 in our lemmas, the value of ℓ only contributes linearly to forb(m, 1 k,ℓ , F 9 ). This will also be the case for forb(m, 1 k,ℓ , Q 9 ) in the next section, and this provides some evidence that the upper bound for forb(m, 1 k,ℓ , I s1 × · · · I s k ) should asymptotically be the same as forb(m, 1 k,2 , I s1 × · · · I s k ).
The exact value of $c_k$ seems to be difficult to compute in general, but for specific (small) values of $k$ it is possible to compute.

Proposition 6.13. $c_2 = 1$.
Proof. To do better than our bound of c 2 we must use 2-columns in our construction (and hence we must use Θ(m) of them all intersecting in some row, say row 1). In such a construction, there can't be more than two 1-columns (otherwise we'd have an I 2 below row 1, and then taking any 2-column that doesn't intersect with these 1-columns gives an F 9 ) and we can only have one 0-column. Thus we must have forb(m, 1 3,1 , F 9 ) ≤ 1 + 2 + (m − 1) = m + 2, and this can be achieved by considering A with the 0-column, two 1-columns in rows 1 and 2, and all 2-columns that have 1's in row 1.
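The extremal construction in this proof is small enough to verify mechanically. The following Python sketch represents each column by the set of rows in which it has a 1 and checks, for a few values of $m$, that the construction has $m + 2$ distinct columns, has no column with three 1's, and avoids $F_9$. The function names are hypothetical, and the test for $F_9$ assumes that $F_9$ consists of a column $d$ with 1's in two rows where two other columns are 0, together with two further rows on which $d$ is 0 and the other two columns form an $I_2$ (our reading of how $F_9$ is used in these proofs).

```python
from itertools import permutations

def avoids_f9(columns):
    """Check that no three columns form the assumed F_9.

    Each column is a frozenset of the rows containing a 1.
    """
    for d, c1, c2 in permutations(columns, 3):
        free_rows = d - c1 - c2   # d has a 1, c1 and c2 have 0's
        only_c1 = c1 - c2 - d     # c1 has a 1, c2 and d have 0's
        only_c2 = c2 - c1 - d     # c2 has a 1, c1 and d have 0's
        if len(free_rows) >= 2 and only_c1 and only_c2:
            return False
    return True

def construction(m):
    """The 0-column, 1-columns in rows 1 and 2, and all 2-columns meeting row 1."""
    cols = [frozenset(), frozenset({1}), frozenset({2})]
    cols += [frozenset({1, j}) for j in range(2, m + 1)]
    return cols

for m in range(3, 9):
    A = construction(m)
    assert len(set(A)) == m + 2          # simple, with m + 2 columns
    assert max(len(c) for c in A) <= 2   # no column with three 1's
    assert avoids_f9(A)

# Adding a third 1-column creates an F_9, as argued in the proof.
print(avoids_f9(construction(5) + [frozenset({3})]))  # False
```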
Proof. Let $A$ be an extremal matrix in $\text{Avoid}(m, 1_{4,1}, F_9)$ that has a large number of 3-columns that intersect in the first two rows (which again is the only chance of a higher bound than $c_3$) and let $A'$ denote the matrix of 0-, 1-, and 2-columns in $A$. If $A'$ contains an $I_2$ below the first two rows (say in rows 3 and 4 and columns $c_1$ and $c_2$ respectively), then $c_1$ and $c_2$ restricted to rows 1 and 2 must look like $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ (they can't contain two 1's in these rows without being a 3-column, and if $c_1$ and $c_2$ both had 0's in one of these rows, say the first, then we could find an $F_9$ by considering rows 1, 2 and 3, columns $c_1$, $c_2$, a 3-column that has a 1 in row $i \neq 3, 4$, and row $i$). In this situation one can't have a third column $c_3$ of $A'$ with a 1 beyond the first two rows, as either $c_3$ has a 1 in row 3 (in which case it can't be equal to $(1, 0)^T$ in the first two rows since $c_3 \neq c_1$, and hence $c_3$ and $c_1$ contain a row of 0's in the first two rows, giving an $F_9$), row 4 (symmetric argument), or some row other than 3 and 4 (in which case $c_3$ restricted to the first two rows must be $(1, 0)^T$ to not induce an $F_9$ with $c_2$ and $(0, 1)^T$ to not induce an $F_9$ with $c_1$, which is impossible). The only other columns that would be allowed are the four columns with no 1's beyond the first two rows, so in this case we have $|A'| \leq 6$.

The only other case to consider is when all the 1's beyond the second row lie in the same row (say the third), in which case there can be at most $\binom{3}{2} + \binom{3}{1} + \binom{3}{0} = 7$ columns of $A'$, obtained by considering all columns which have fewer than three 1's in the first three rows and no 1's outside these rows. Such an $A'$ avoids $F_9$ (since $F_9$ requires four rows with 1's in them), so in total we have that $|A'| \leq 7$ and that $|A'| = 7$ can be obtained. Thus in total we have $\text{forb}(m, 1_{4,1}, F_9) \leq 7 + (m-2) = m+5$, and this can be achieved by letting $A$ have all 0-, 1- and 2-columns whose 1's lie in the first three rows and all 3-columns that have 1's in rows 1 and 2.

It turns out that the problem of avoiding $Q_9$ and $1_{k,\ell}$ has a very similar flavor to the problem of avoiding $F_9$ and $1_{k,\ell}$, and because of this we will once again be able to achieve exact results. We maintain all of our notation and terminology from the previous section.
The bound $\text{forb}(m, Q_9) = \binom{m}{2} + 2m - 1$ was proven in [4], where the following classification of $Q_9$-avoiding matrices was established (following [2]). For each $2 \leq t \leq m-2$ we can divide the rows into three disjoint sets $A_t, B_t, C_t \subseteq \{1, 2, \ldots, m\}$ so that after permuting the rows the $t$-columns can either be given as type 1: […]

We will say $t$ is of type $i$ ($i = 1$ or $i = 2$) if the $t$-columns are of type $i$.
Proof. The size of a type 1 matrix of column sum $t$ is at most $m-(t-1)$, while the size of a type 2 matrix of the same column sum is bounded by $t+1$.

Proof. By the previous lemma, $\text{forb}(m, Q_9, 1_{k,1})$ is upper bounded by $1 + m + \sum_{t=2}^{k-1}(m-(t-1)) = 1 + (k-1)m - \binom{k-1}{2}$, and this value can be achieved by having $m-(t-1)$ $t$-columns intersecting in the first $t-1$ rows, along with all columns of column sum 0 and 1.

We can extend these results for $\ell > 1$.
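The count and the construction in this proof can be checked for small parameters. The Python sketch below builds the 0-column, all 1-columns, and, for each $2 \leq t \leq k-1$, the $m-(t-1)$ $t$-columns that share their first $t-1$ rows, then compares the number of columns with $1 + (k-1)m - \binom{k-1}{2}$. The function names are hypothetical, the sum is taken up to $t = k-1$ because $1_{k,1}$ excludes columns with $k$ or more 1's, and the test for $Q_9$ assumes that $Q_9$ is a pair of columns each having two 1's in rows where the other has 0's (as suggested by the arguments in this section).

```python
def type1_construction(m, k):
    """Columns (as sets of rows, 1-indexed) of the lower bound construction."""
    cols = [frozenset()] + [frozenset({j}) for j in range(1, m + 1)]
    for t in range(2, k):
        base = set(range(1, t))  # the shared first t - 1 rows
        cols += [frozenset(base | {j}) for j in range(t, m + 1)]
    return cols

def avoids_q9(cols):
    """Assumed Q_9: two columns, each with two 1's where the other has 0's."""
    return not any(len(c - d) >= 2 and len(d - c) >= 2
                   for c in cols for d in cols)

def closed_form(m, k):
    """1 + (k - 1) m - C(k - 1, 2)."""
    return 1 + (k - 1) * m - (k - 1) * (k - 2) // 2

for m in range(6, 10):
    for k in range(2, 6):
        A = type1_construction(m, k)
        assert len(set(A)) == len(A) == closed_form(m, k)
        assert max(len(c) for c in A) <= k - 1   # no 1_{k,1}
        assert avoids_q9(A)

print(closed_form(7, 4))  # 19
```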
Proof. For the lower bound take the lower bound construction for forb(m, Q 9 , 1 k+1,1 ) given above and add in the (m − 1)-column with a 0 in the first row. This new column can't be used to make a Q 9 since it has too few 0's, and it doesn't intersect any other column in k rows so it can't be used to find a 1 k,2 . Thus this new matrix is in Avoid(m, Q 9 , 1 k,ℓ ). For the upper bound, note that if c, d are columns with at least k + 1 1's then either |c ∩ d| ≥ k (in which case we have 1 k,2 ) or there exists two rows where c has 1's and d does not and vice versa (in which case we have Q 9 ), so a matrix in Avoid(m, Q 9 , 1 k,2 ) can have at most one column that has more than k 1's.
Analyzing the ℓ > 2 case once again turns out to be significantly more difficult than the ℓ ≤ 2 cases, but nonetheless we are able to achieve some nearly tight bounds for this problem.
Proof. The size of a type 1 matrix of column sum $t$ can be at most $\ell-1$ without inducing a $1_{k,\ell}$, and the size of a type 2 matrix of the same column sum is bounded by $t+1 \leq k+\ell$.

Lemma 7.6. $\text{forb}(m, Q_9, 1_{k,\ell})_{\geq k+\ell} = \ell-1$.
Proof. Let c be a column of A ∈ Avoid(m, Q 9 , 1 k,ℓ ) ≥k+ℓ with the fewest number of 1's (say t of them). We must have |c ∩ d| ≥ t − 1 for any other d (as if d has two 0's in rows where c has 1's, by virtue of c having the fewest number of 1's d must have at least two 1's where c has 0's, giving a Q 9 ), and hence for any other ℓ − 1 columns in A there exists k rows such that c and all of these other columns have 1's in these rows (since each can have at most one 0 in the at least k + ℓ rows where c has 1's), so we must have |A| ≤ ℓ − 1.
Proof. Take the lower bound construction for forb(m, Q 9 , 1 k+1,1 ) and adjoin to this ℓ − 2 columns with column sum (k + 1) such that k of their 1's are in the first k rows and their remaining 1's are in rows k + 1 through k + ℓ − 2. Additionally adjoin ℓ − 3 columns with column sum (k + ℓ − 2) with k + ℓ − 3 of their 1's in the first k + ℓ − 2 rows excluding row k and their remaining 1's anywhere below these rows. One can't use a (k + ℓ − 2)-column to find a Q 9 (only the (k + 1)-columns and t-columns with a 1 in row k + 1 have 1's in a row where a (k + ℓ − 2)-column has a 0 in the first (k + ℓ − 2) rows, but no such row exists beyond that for these columns, and for all other t-columns there exists at most one such row beyond the first (k + ℓ − 2) and none before this) and one can't use a (k + 1)-column either (it can't be used with a t-column for t ≤ k + 1 as below the first t − 1 rows of the t-column there aren't enough 1's), so this avoids Q 9 . To find a 1 k,ℓ , first note that at most one t-column with t ≤ k could be used (as there exists no k rows where two such t-columns both have 1's). If one uses more than one (k + 1)-column to find a 1 k,ℓ , then one must use the first k rows (since these are the only rows that two distinct (k + 1)-columns agree); but there are only ℓ − 2 (k + 1)-columns and one k-column with 1's in the first k rows, and no (k + ℓ − 2)-column can be used as they each have a 0 in row k, so one can't find ℓ such columns. Thus in total one could use at most one t-column with t ≤ k, one (k + 1)-column and all ℓ − 3 (k + ℓ − 2)-columns, but this can't be used to find a 1 k,ℓ since there are at most ℓ − 1 columns.
No column with at least $k+1$ 1's can have two 0's in the first $k-1$ rows (as any $k$-column has two rows where it has 1's and this large column does not, and this large column necessarily has two rows where it has 1's and the $k$-column does not, since it has at least $k+1$ 1's and two of them aren't in the first $k-1$ rows). If a column with at least $k+1$ 1's has one 0 in the first $k-1$ rows and $k \geq 2$ then this column must cover the entire $I_p$ (otherwise we could find a column of the $I_p$ that isn't covered by the large column, and take these two columns, the rows where the $k$-column has 1's and the large column has 0's, and any rows where the large column has 1's and the other does not, to find a $Q_9$), but because $I_p$ is large we can have at most $\ell-1$ columns that cover it before inducing a $1_{k,\ell}$. We ignore these covering columns for now and restrict our attention to columns with at least $k+1$ 1's that are identically 1 in the first $k-1$ rows. Let $c$ be such a column with the fewest number of 1's and assume it has 1's in the first $k+1$ rows. As argued in the second lemma, any other column $d$ must have $|c \cap d| \geq k$, and in particular (since all the columns we're considering are identically 1 in the first $k-1$ rows) the only 0's the other columns can have are in the $k$th and $(k+1)$st rows. There can be at most $\ell-1$ columns with a 0 in the $k$th row before inducing a $1_{k,\ell}$, but if there are precisely $\ell-1$ such columns then $A$ cannot contain the $k$-column with 1's in rows 1 through $k-1$ and row $k+1$, decreasing the maximum value $p$ can take by 1, so "effectively" these columns can contribute at most $\ell-2$. Similar results hold for columns with a 0 in the $(k+1)$st row, so in total we have $|A| \leq \text{forb}(m, Q_9, 1_{k+1,1}) + 2(\ell-2) + \ell-1 = \text{forb}(m, Q_9, 1_{k+1,1}) + 3\ell-5$.

We can get a slightly larger lower bound when $k$ is sufficiently large.
Proof. If k ≥ ℓ − 1 then take the lower bound construction for forb(m, Q 9 , 1 k+1,1 ) and adjoin to this ℓ − 2 columns with column sum (k + 1) with k of their 1's in the first k rows and also adjoin ℓ − 1 (m − 1)-columns with their 0's in the first ℓ − 1 rows (which by assumption is in the first k rows). None of the (m − 1)-columns can be used to find a Q 9 (as they have too few 0's), and by the same logic as before neither can the (k + 1)-columns. To find a 1 k,ℓ , again note that at most one t-column with t ≤ k could be used and if one uses more than one (k + 1)-column to find a 1 k,ℓ , then one must use the first k rows which means no (m − 1)-column can be used (since each has a 0 in the first k rows), so again we conclude that at most one (k + 1)-column can be used. One can't use only (m − 1)-columns since there are at most ℓ − 1 of them, but if any two (m − 1)-columns are used then one can't use two of the first k rows (since each has a different 0 in these rows), and hence one can't use any of the t-columns with t ≤ k + 1 (since outside of these rows they have at most k − 1 1's). Thus the only way one can find a 1 k,ℓ is to use one (m − 1)-column, one (k + 1)-column and one k-column. If ℓ ≥ 4 then we clearly can not find a 1 k,ℓ , but if ℓ = 3 and k = 2 one could use the 2-column with 1's in row 1 and row 3, the 3-column with 1's in rows 1 through 3, and the (m − 1)-column with a 0 in row 2 to find a 1 2,3 . If k ≥ ℓ = 3 then each (m − 1)-column and k-column only share k − 1 rows with 1's in both columns, so in this case we avoid 1 k,ℓ .

Future Directions
A natural extension to this work would be to consider all simple minimal cubic configurations, not just those with 4 rows. [3] does not explicitly list these configurations, but it is possible to determine the complete list (provided a certain conjecture is true).
First, note that there exists no minimal cubic configuration with 7 or more rows. Indeed, each column of a 7-rowed matrix contains 1 4,1 or 0 4,1 , meaning the configuration can't be a minimal cubic.
Proof. Note that we need only consider configurations whose column sums are precisely 3, as otherwise the configuration will not be minimal. It is noted in [9] that the following configurations are the only six-rowed simple matrices with at least a cubic lower bound such that removing any column would make the configuration less than cubic:
Note that F 10 ≺ F 16 and F 9 ≺ F 17 , and consequently F c 10 ≺ F c 16 and F c 9 ≺ F c 17 . Thus the only configurations that could be minimal cubics are F 14 and F 15 .
Anstee and Keevash [6] note that F 14 is cubic and, moreover, that F 14 with any row removed is quadratic, so F 14 is a minimal cubic configuration. It is noted in [3] that the following configuration is quadratic:
If F ′ 7 consists of the 2nd, 3rd and 5th columns of F 7 then we note that F ′ 7 is F 15 without one of its rows (so if F 15 is a cubic configuration it must be a minimal cubic). If we apply the standard induction for forb(m, F 15 ), we must have F ′ 7 ⊀ C r (as otherwise
Proof. Note that any selection of three rows of F 14 contains 1 2,1 and 0 2,1 , but neither I nor I c contains both of these configurations, so any I or I c in a product can contribute at most 2 rows to finding F 14 . Similarly, any four rows of F 14 contain I 2 , and hence T can contribute at most 3 rows to finding F 14 in any product it is involved in. This shows that all 2-fold products except possibly T × T avoid F 14 , but it isn't too difficult to see that F 14 ≺ T 4 × T 4 ≺ T × T . Any 3-fold product involving only I's and I c 's will contain F 14 , as each of these can contribute an I 2 from two of their rows and three of these put together give F 14 . Since any product with two or more T's contains T × T and hence F 14 , the only 3-fold products that could avoid F 14 are those using precisely one T, with the remaining factors I's and I c 's. Such a product does in fact avoid F 14 : the most each I and I c can contribute is two rows that form an I 2 , which still leaves at least one I 2 to be covered by the T , and T cannot do this. To see that F 15 ⊀ I × I × T , note that any two rows of the I c 3 of F 15 contain 1 2,1 (so I can contribute to at most one row of I c 3 ) and I 2 (so T can contribute to at most one row of I c 3 ). Consequently, each of the I's and the T must contribute to precisely one row of the I c 3 . But if an I contributes to the ith row of F 15 (i ≥ 4), then the only other row it can contribute to is the (i − 3)rd row (as using any other row gives a 1 2,1 ).
But if T covers the ith row (i ≥ 4), it cannot also contribute to the (i − 3)rd row, as these two rows contain an I 2 . Thus no matter which rows of the I c 3 the I and T blocks cover, it will be impossible to cover all 6 rows of F 15 . It is not difficult to show that F 15 ≺ I × T × T by finding rows 1 and 4 in I, rows 3 and 5 in the first T and rows 2 and 6 in the second T . Similarly F 15 ≺ T × T × T by finding rows 1 and 5 in one T , rows 2 and 6 in another, and rows 3 and 4 in the last.
From these constructions we are able to show that forb(m, Q, F ) = Θ(m 2 ) where Q is a minimal quadratic configuration and F is either F 14 or F 15 , with the exception of the pairing Q = Q 8 and F = F 14 (as the only 2-fold product that avoids Q 8 is T × T , which is the only 2-fold product that contains F 14 ). We would predict based on our previous work that forb(m, Q 8 , F 14 ) = o(m 2 ), but we are unable to show this.
Question 1. What is forb(m, Q 8 , F 14 )?
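The building-block facts used in the product arguments above (neither I nor I c contains both 1 2,1 and 0 2,1 , T contains no I 2 , and each of I and I c contains I 2 ) can be sanity-checked by brute force on small cases. The following Python sketch is our own illustration, not part of the paper. The helper names identity, complement, tower and contains are hypothetical, and the size k = 5 is an arbitrary assumption. It implements the definition of F ≺ A directly: try every ordered choice of rows of A and test whether the columns of F occur, with multiplicity, among the columns of A restricted to those rows.

```python
from collections import Counter
from itertools import permutations

def identity(k):
    """k x k identity matrix I_k as a list of rows."""
    return [[1 if r == c else 0 for c in range(k)] for r in range(k)]

def complement(a):
    """(0,1)-complement of a matrix."""
    return [[1 - x for x in row] for row in a]

def tower(k):
    """T_k: the column with 0-based index c has 1's in rows 0..c and 0's below."""
    return [[1 if r <= c else 0 for c in range(k)] for r in range(k)]

def contains(f, a):
    """Brute-force test of f < a (f is a configuration of a).

    An ordered choice of rows of a accounts for row permutations of f;
    comparing multisets of columns accounts for column permutations.
    Only practical for very small matrices.
    """
    f_cols = Counter(zip(*f))
    for rows in permutations(range(len(a)), len(f)):
        a_cols = Counter(zip(*(a[r] for r in rows)))
        if all(a_cols[col] >= count for col, count in f_cols.items()):
            return True
    return False

k = 5                      # arbitrary small size, an assumption for illustration
ones_2x1 = [[1], [1]]      # the column 1_{2,1}
zeros_2x1 = [[0], [0]]     # the column 0_{2,1}

print(contains(ones_2x1, identity(k)))               # 1_{2,1} in I?
print(contains(zeros_2x1, complement(identity(k))))  # 0_{2,1} in I^c?
print(contains(identity(2), tower(k)))               # I_2 in T?
print(contains(identity(2), identity(k)))            # I_2 in I?
print(contains(identity(2), complement(identity(k))))  # I_2 in I^c?
```

A check at one small size is of course not a proof, but it matches the general reasons: columns of I have a single 1, columns of I c have a single 0, and the columns of T form a chain under containment.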
The problem of pairing F 14 and F 15 with other cubics is also a difficult question. Through the constructions we listed, it is possible to show that forb(m, F 1 , F 2 ) = Ω(m 2 ) for F 1 either F 14 or F 15 and F 2 any other simple minimal cubic configuration, and that forb(m, F 14 , F 15 ) = Θ(m 3 ), as well as forb(m, F 1 , F 2 ) = Θ(m 3 ) where F 1 is F 14 or F 15 and F 2 is F 12 or F c 12 . Unfortunately, we are unable to prove any tighter bounds.
Question 2. What is forb(m, F 1 , F 2 ) in general for F 1 = F 14 or F 15 and F 2 any simple minimal cubic configuration?
One potential route for proving these results, at least for F 14 , would be to characterize what matrices A ∈ Avoid(m, F 14 ) =t must look like, as was done for Q 9 in [4]. However, classifying the t-columns for F 14 seems to be a more difficult problem than for Q 9 .
Question 3. Is there a nice characterization of matrices A ∈ Avoid(m, F 14 ) =t ?