Structure and colour in triangle-free graphs

Motivated by a recent conjecture of the first author, we prove that every properly coloured triangle-free graph of chromatic number $\chi$ contains a rainbow independent set of size $\lceil\frac12\chi\rceil$. This is sharp up to a factor $2$. This result and its short proof have implications for the related notion of chromatic discrepancy. Drawing inspiration from both structural and extremal graph theory, we conjecture that every triangle-free graph of chromatic number $\chi$ contains an induced cycle of length $\Omega(\chi\log\chi)$ as $\chi\to\infty$. Even if one only demands an induced path of length $\Omega(\chi\log\chi)$, the conclusion would be sharp up to a constant multiple. We prove it for regular girth $5$ graphs and for girth $21$ graphs. As a common strengthening of the induced paths form of this conjecture and of Johansson's theorem (1996), we posit the existence of some $c>0$ such that for every forest $H$ on $D$ vertices, every triangle-free and induced $H$-free graph has chromatic number at most $c D/\log D$. We prove this assertion with `triangle-free' replaced by `regular girth $5$'.


Introduction
For graphs with bounded clique number ω, the tradeoff between chromatic number χ being high and there being certain induced subgraphs is a central topic in graph theory. This is the context of the famous and longstanding conjecture of, independently, Gyárfás [8] and Sumner [24], cf. [21]. This note is solely concerned with this type of problem.
Our starting point is indeed a more explicit form of this tradeoff, where the commodities are instead proper colourings and rainbow induced subgraphs (that is, ones in which all colours assigned to their vertices are distinct). This already has some history in the area: for instance, Kierstead and Trotter [15] pursued this in attempts towards the Gyárfás-Sumner Conjecture. It is interesting in its own right and some recent activity [4,10,20] has been motivated by a conjecture of this form due to the first author.

Conjecture 1.
Every properly coloured triangle-free graph of chromatic number χ contains a rainbow induced path of length χ.
By classic results [7,11,19,25,9], the statement of Conjecture 1 is true when omitting both 'triangle-free' and 'induced', or omitting 'rainbow'. It is false when omitting both 'triangle-free' and 'rainbow', as we discuss below (see Theorem 3). Babu, Basavaraju, Chandran and Francis [4] proved the statement under the extra assumption that no cycle in G has length less than χ. Scott and Seymour [20] proved a form of it for any fixed clique number ω ≥ 2, but with length f(χ) for some unbounded increasing function f instead of χ.
In the discussion at the end of their note, Scott and Seymour observed that the type of rainbow induced subgraph it makes sense to hope for in this problem setting is already rather simple: it is limited to forests of paths. In Section 2, we focus on the simplest possible structure and show the following.
Theorem 2. For each r ≥ 3 every properly coloured $K_r$-free graph of chromatic number χ contains a rainbow independent set of size $\lceil \frac12 \chi^{1/(r-2)} \rceil$.
When r = 3 this is sharp up to a factor 2. The r = 3 case is a consequence of Conjecture 1 if true, by taking every other vertex in the path. In Section 3, we discuss this result's implications for the related concept of chromatic discrepancy [3]. There appears to be room for improvement in Theorem 2 for r > 3, but much of it may come from the gap in current bounds on off-diagonal Ramsey numbers, as we next discuss.
Consider the smallest independence number α in $K_r$-free graphs as a function of the chromatic number χ. The following statement follows from the best-to-date asymptotic results on off-diagonal Ramsey numbers. For completeness, its brief derivation is included in the appendix.
This immediately yields the following result complementing Theorem 2.
For r ≥ 4, it remains possible that the bound in Theorem 2 could be increased by a logarithmic factor, so that it would match and indeed qualitatively strengthen Theorem 3. On the other hand, intuitively (based on the sharpness of Theorem 3 for r = 3), improvement by more than a logarithmic factor may be for want of a breakthrough in Quantitative Ramsey Theory. Theorems 2 and 3 hint at the following generalisation of Conjecture 1.

Conjecture 5.
For each r ≥ 3 every properly coloured $K_r$-free graph of chromatic number χ contains a rainbow induced path of length $\chi^{1/(r-2)}$.
The aforementioned result of Scott and Seymour [20, 1.3] already constitutes partial progress. The statement is true when omitting 'rainbow'. It is surprisingly difficult to bound the maximum rainbow induced path length significantly below both the maximum induced path length and the chromatic number.
The proof of Theorem 2 has affinity to a proof of Gyárfás guaranteeing induced paths of length χ in triangle-free graphs of chromatic number χ [9]. Returning to the roots concerning 'non-rainbow' structure, we have formulated, based on Theorem 3 and some intuition from the random graph [17], the following successively stronger conjectures.

Conjecture 6.
There is some c > 0 such that every triangle-free graph of chromatic number χ contains an induced path of length at least cχ log χ.

Conjecture 7.
There is some c > 0 such that every triangle-free graph of chromatic number χ > 2 contains an induced cycle of length at least cχ log χ.
By Theorem 3, each statement if true is sharp up to the respective choices of c. Either statement becomes true if one instead asks for an induced star or tree of size cχ log χ, due to Johansson's result on the chromatic number of triangle-free graphs [14]. Conjecture 7 is a quantitative strengthening of a conjecture of Gyárfás [9], and slightly stronger than [12, Conj. 7]. Gyárfás's conjecture was recently confirmed by Chudnovsky, Scott and Seymour [6], but the induced cycle lengths guaranteed in [6] are very small compared to χ log χ; see also [22].
Perhaps Conjecture 7 is difficult in general, but on the other hand, we have managed to obtain some concrete progress under the additional exclusion of one or more cycle lengths.

Theorem 8.
• There is some c > 0 such that every regular $C_4$-free graph of chromatic number χ > 2 contains an induced cycle of length at least cχ log χ.
• For each g ≥ 5 every girth g graph of chromatic number χ > 2 contains an induced cycle of length at least $3 + (\chi-1)(\chi-2)^{\lfloor (g-5)/16 \rfloor}$.
This in particular implies that Conjecture 7 holds for regular girth 5 graphs and for girth 21 graphs. We have not made much effort to optimise the constant 21, for the method we use seems unlikely to reduce it below 13 or so. Theorem 8 also asserts that each girth 5 graph has an induced cycle of length at least χ + 2, which is not far from best possible since the $C_4$-free process [5] yields n-vertex χ-chromatic $C_4$-free graphs with independence […]
Johansson's theorem [14] and Conjecture 6 together naturally prompt another possibility.

Conjecture 9.
There is some c > 0 such that for every forest H, every triangle-free graph containing no induced H has chromatic number at most $c|V(H)|/\log|V(H)|$.
If true, Conjecture 9 would constitute a common generalisation of Johansson's theorem, Conjecture 6 and the fact that χ(G) = O(α(G)/log α(G)) for every triangle-free graph G, corresponding to the cases where H is a star, a path and an independent set, respectively. The conclusion of Conjecture 9 would fail if some H were allowed to contain a cycle, since for each ℓ ≥ 3 there are $C_\ell$-free graphs of arbitrarily large chromatic number. Note also that to prove Conjecture 9 it suffices (by adding a single vertex connected to all components if needed) to prove it for all trees H. As a first step, we have proved a form of Conjecture 9 for regular girth 5 graphs (see Corollary 18).

Large rainbow independent sets
Proof of Theorem 2. We carry out an induction on r ≥ 3. Let G be a $K_r$-free graph of chromatic number χ and let $c : V(G) \to \mathbb{Z}^+$ be a proper colouring. We seek a rainbow independent set of size $\lceil \frac12 \chi^{1/(r-2)} \rceil$. Initialise G′ = G and X = ∅, and iterate the following until G′ is empty (if needed).
(i) Take an arbitrary vertex v ∈ V(G′) and add it to X.
(ii) Let $S = c^{-1}(c(v))$ and delete the vertices of S from G′.
(iii) […], then stop the procedure by outputting the largest rainbow independent set in G′[N].
Note that if r = 3, then the condition in (iii)(a) is vacuous, in which case we are directly proving the base case. If on the other hand the procedure stops in (iii)(a) (and so r ≥ 4), then since G′[N] is $K_{r-1}$-free it contains a rainbow independent set of size at least $\lceil \frac12 \chi^{1/(r-2)} \rceil$ by induction, in which case we are done. If the procedure continues until G′ is empty, then by construction the final set X is a rainbow independent set, and so it suffices to show that $|X| \ge \frac12 \chi^{1/(r-2)}$. To this end, let $S_i$ and $N_i$ be the vertex subsets S and N respectively in iteration i ∈ {1, …, |X|}. Since […], as promised.
We remark that the same argument, i.e. performing the algorithm above applied to the binomial random graph $G_{n,p}$ (together with standard facts about the model), yields the following result. It is close to best possible: in the first regime it is sharp up to a constant factor, in the second up to a log n factor. Recall that a property in $G_{n,p}$ is said to hold asymptotically almost surely (a.a.s.) if it holds with probability tending to one as n → ∞.
Theorem 10.
• […] Then a.a.s. for any proper colouring of $G_{n,p}$, there is a rainbow independent set of size $\Omega(\chi(G_{n,p}))$.
• Suppose $p = \omega(\sqrt{(\log n)/n})$ as n → ∞. Then a.a.s. for any proper colouring of $G_{n,p}$, there is a rainbow independent set of size Ω(1/p).
Proof. In the first regime, let C = 5/(2c − 1). We first prove the observation that for every vertex v in $G_{n,p}$, the probability that its neighbourhood induces a graph with maximum degree at least C is at most $2n^{-5}$ as n → ∞. Note that this probability increases as p increases, so it is sufficient to prove the statement when $p = n^{-c}/2$ where 1/2 < c < 1. By the Chernoff bound we know that for n sufficiently large […]. The probability that a neighbour u ∈ N(v) has degree at least C in the graph induced by N(v) is bounded by […]. So if $\deg(v) \le 2np = n^{1-c}$, then this is bounded by $n^{(1-2c)C} = n^{-5}$. This observation implies that with probability at least $1 - 2n^{-4}$ we have χ(G[N(v)]) < C for every vertex v and hence the algorithm gives a rainbow independent set X of size at least $\chi(G_{n,p})/(1 + C)$.
In the second regime, note that by the Chernoff bound a.a.s. every vertex degree is bounded by 2np. Also a.a.s. the independence number is at most $2p^{-1}\log np < np$. Hence a.a.s. we have $|S_i| + |N_i| < 3np$ for every i in the algorithm and hence |X| ≥ 1/(3p).
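The greedy procedure from the proof of Theorem 2 can be sketched in Python for the triangle-free case r = 3. This is only an illustration under stated assumptions: the function name, the adjacency-dictionary input and the use of `min` to pick the "arbitrary" vertex are choices made here, and it is assumed that each round discards both the colour class and the neighbourhood of the chosen vertex. Under that assumption each round removes one colour class and (as the graph is triangle-free) one independent neighbourhood, so the chromatic number drops by at most 2 per round.

```python
def greedy_rainbow_independent_set(adj, colour):
    """Greedily build a rainbow independent set (sketch of the r = 3 case).

    adj    -- dict mapping each vertex to the set of its neighbours
    colour -- dict mapping each vertex to its colour (assumed to be proper)
    """
    remaining = set(adj)
    chosen = []
    while remaining:
        v = min(remaining)  # any vertex works; min() only makes runs repeatable
        chosen.append(v)
        # Discard every remaining vertex sharing v's colour (this includes v) ...
        same_colour = {u for u in remaining if colour[u] == colour[v]}
        # ... and every neighbour of v, so later picks are non-adjacent to v.
        remaining -= same_colour | adj[v]
    return chosen


# Example: the 5-cycle with a proper 3-colouring (chromatic number 3).
cycle_adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
cycle_colour = {0: "a", 1: "b", 2: "a", 3: "b", 4: "c"}
picked = greedy_rainbow_independent_set(cycle_adj, cycle_colour)
```

On this example the returned set has at least ⌈3/2⌉ = 2 vertices, pairwise non-adjacent and with distinct colours, in line with Theorem 2 for r = 3.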

Chromatic discrepancy
In related work, the first author together with Kalyanasundaram, Sandeep and Sivadasan [3] studied the notion of chromatic discrepancy, the least over all proper colourings of the greatest difference between size and induced chromatic number taken over all rainbow subgraphs. Starting with some triangle-free graph of chromatic number χ and iterating Theorem 2, each time extracting from what remains a large rainbow independent set and all associated colour classes, one finds an induced subgraph H that is rainbow with at least χ colours and such that $\chi(H) \le \log_2 \chi + 1$.
Theorem 11. Every properly coloured triangle-free graph of chromatic number χ contains a rainbow induced subgraph on χ vertices of chromatic number at most $\log_2 \chi + 1$.
In other words, the chromatic discrepancy of any triangle-free graph of chromatic number χ is at least $\chi - \log_2 \chi - 1$. It is an open question whether the logarithmic term can be reduced to some constant independent of χ. It was conjectured [3, Qu. 4] that χ − ω is a lower bound on the chromatic discrepancy for any graph of chromatic number χ and clique number ω. Corollary 4 refutes this for every fixed ω ≥ 3; however, iterating Theorem 2 in the same way as above yields (1 − o(1))χ chromatic discrepancy as χ → ∞.
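The count $\log_2 \chi + 1$ in Theorem 11 can be checked numerically with a small Python sketch. It models only the worst case, assuming that each round yields a rainbow independent set of exactly ⌈χ′/2⌉ vertices, where χ′ is a lower bound on the chromatic number of what remains, and that deleting the colour classes of those vertices lowers this bound by exactly that amount. The function name is illustrative only.

```python
def rounds_needed(chi):
    """Rounds of extraction until chi distinctly coloured vertices are collected."""
    remaining = chi   # lower bound on the chromatic number of what is left
    collected = 0     # rainbow vertices gathered so far
    rounds = 0
    while collected < chi:
        take = (remaining + 1) // 2  # ceil(remaining / 2), as in Theorem 2 with r = 3
        collected += take
        remaining -= take            # colour classes of the chosen vertices are removed
        rounds += 1
    return rounds
```

Each round contributes one independent set, so the number of rounds bounds the chromatic number of the collected subgraph; in this model it equals ⌊log₂ χ⌋ + 1.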
Similarly iterating Theorem 10 yields the following for chromatic discrepancy of G n,p , an improvement upon [3, Thm. 4.6].
• Given 1/2 < c < 1, suppose $p = o(n^{-c})$ as n → ∞. Then a.a.s. for any proper colouring of $G_{n,p}$, there is a rainbow induced subgraph on $\chi(G_{n,p})$ vertices of chromatic number at most $O(\log \chi(G_{n,p}))$.
• Given 0 ≤ c < 1, suppose $p = \omega(n^{-c})$ as n → ∞. Then a.a.s. for any proper colouring of $G_{n,p}$, there is a rainbow induced subgraph on $\chi(G_{n,p})$ vertices of chromatic number at most $O(-\log p \cdot \max\{p, (\log n)/(np)\}\, \chi(G_{n,p}))$.
Proof. We will essentially iterate the algorithm in the proof of Theorem 2. Initialise G′′ = $G_{n,p}$ and Y = ∅, and iterate the following until Y contains at least $\chi(G_{n,p})$ vertices.
(i) Initialise G′ = G′′ and X = ∅, and iterate the following until G′ is empty.
(a) Take an arbitrary vertex v ∈ V(G′) and add it to X.
[…]
(iii) Add the vertices from X to Y.
In the first regime, we have seen in the proof of Theorem 10 that a.a.s. in every step in the algorithm the chromatic number of $N_i$ is at most C = 5/(2c − 1). So in every iteration, we have selected at least $\frac{1}{C+1}\chi(G'')$ vertices. This implies that we need to perform at most $\log \chi(G_{n,p})/\log(1+\frac{1}{C})$ iterations to create a rainbow induced subgraph on $\chi(G_{n,p})$ vertices.
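One way to see the iteration count is sketched below. It assumes that the part of the graph not yet collected always has chromatic number at least the number $r_j$ of vertices still missing after $j$ iterations; the symbols $r_j$ and $m$ are introduced only for this sketch.

```latex
\[
  r_0 = \chi(G_{n,p}), \qquad
  r_{j+1} \le r_j - \frac{r_j}{C+1} = \frac{C}{C+1}\, r_j
  \quad\Longrightarrow\quad
  r_m \le \chi(G_{n,p}) \Bigl(1+\frac{1}{C}\Bigr)^{-m},
\]
which drops below $1$ once $m > \log \chi(G_{n,p}) \big/ \log\bigl(1+\tfrac{1}{C}\bigr)$.
```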
In the second regime, a.a.s. every vertex has degree at most 2np and the chromatic number of every neighbourhood is bounded by f(n, p) := E(χ($G_{2np,p}$)) + 8np log n, due to a result of Shamir and Spencer [23]. Also $\chi(G_{n,p}) \sim \frac{np}{2\log np}$ a.a.s. So in every iteration of the algorithm, we have selected at least […]. In the other case, we have f(n, p) = O((log n)/(np) χ($G_{n,p}$)).
After that, at most $p\chi(G_{n,p})$ additional distinctly-coloured vertices are needed to form a rainbow induced subgraph on $\chi(G_{n,p})$ vertices, the resulting graph having chromatic number $O(-\log p \cdot p\chi(G_{n,p}))$.

Long induced paths and cycles
This section is devoted to establishing Theorem 8 and related results. Nota bene: the first part of this proof closely follows that of [16, Prop. 6].
Proof. Let G be a graph with minimum degree d and girth g. For a nonnegative integer r and a vertex v in G, we let $B_r(v) := \{x \in V(G) \mid d_G(x, v) \le r\}$ denote the ball of radius r centred at v. Let X be a maximal set of vertices that are pairwise at distance at least 2k + 1. Observe that the balls of radius k centred at the vertices of X are pairwise disjoint. Moreover, each vertex is at distance at most 2k from X. We extend the collection of balls $(B_k(x))_{x \in X}$ to a partition of V(G) as follows. First add each vertex at distance k + 1 from X to one of the balls to which it is adjacent. Then add each vertex at distance k + 2 from X to one of the parts constructed in the previous step. Continue in this way until all vertices of G are covered. For each x ∈ X, denote by T(x) the graph induced by the part obtained from $B_k(x)$ in this way. Because G has girth at least 4k + 2, each T(x) is an induced subtree of G. Each non-leaf of the subtree induced by $B_k(x)$ has degree at least d, so T(x) has at least $d(d-1)^{k-1}$ leaves, and thus T(x) sends at least $d(d-1)^k$ edges to other trees. Moreover, the fact that g ≥ 1 + 2 · (4k + 1) implies that $T(x_1)$ and $T(x_2)$ are joined by at most one edge, for any two distinct $x_1, x_2 \in X$. Therefore the minor G′ obtained by contracting the trees has minimum degree at least $d(d-1)^k$. Since g ≥ 1 + 4 · (4k + 1), G′ must have girth at least 5. This allows us to apply Lemma 13 (with t = 2), together with the girth 5 condition, yielding an induced cycle of length at least $3 + d(d-1)^k$ in G′. Note that for any two vertices x, y ∈ V(G′), x and y are adjacent if and only if their pre-images in G are joined by precisely one edge. We conclude that G has an induced cycle of length at least $3 + d(d-1)^k$.
We remark that Theorem 14 for k = 0 is sharp when d = 2 and d = 3, by $C_5$ and the Petersen graph, respectively. On the other hand, it is conceivable for k = 0 that one could guarantee an induced cycle of length $\Omega(d^{3/2})$ as d → ∞, which would be best possible for infinitely many values of d, due to the Erdős-Rényi orthogonal polarity graph, cf. e.g. [18].
Every graph with chromatic number χ has an induced subgraph with minimum degree at least χ − 1. The second part of Theorem 8 thus follows immediately from Theorem 14.
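As a worked illustration of how the second part of Theorem 8 can be read off from the proof above, the following Python sketch evaluates the bound $3 + d(d-1)^k$ with d = χ − 1. It assumes that k is taken as the largest integer with g ≥ 1 + 4(4k + 1), the girth condition used in that proof; the function name is hypothetical.

```python
def induced_cycle_lower_bound(chi, girth):
    """Evaluate 3 + d(d-1)^k for d = chi - 1 and the largest k with girth >= 16k + 5."""
    if chi <= 2 or girth < 5:
        raise ValueError("the bound is stated for chi > 2 and girth at least 5")
    d = chi - 1              # minimum degree of a suitable induced subgraph
    k = (girth - 5) // 16    # largest k satisfying girth >= 1 + 4 * (4k + 1)
    return 3 + d * (d - 1) ** k
```

For girth 5 this gives χ + 2, and from girth 21 onwards the bound is at least quadratic in χ, which is why girth 21 suffices for Conjecture 7.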
The following corollary (with t = 2) implies the first part of Theorem 8.
Corollary 15. For each t ≥ 2 there is some c > 0 such that every regular $K_{2,t}$-free non-forest graph of chromatic number χ contains an induced cycle of length at least cχ log χ.
Proof. Given a $K_{2,t}$-free graph with maximum degree ∆ and an arbitrary vertex v, each neighbour of v has at most t − 1 neighbours in N(v). Therefore the number of edges in the induced subgraph on the set of all neighbours of any vertex does not exceed $\frac{t-1}{2}\Delta$. This together with the result of Alon, Krivelevich and Sudakov [2] implies that every $K_{2,t}$-free graph with maximum degree ∆ has chromatic number χ = O(∆/log ∆) as ∆ → ∞, and hence ∆ = Ω(χ log χ) as χ → ∞. Now combine this with the consequence of Lemma 13 that every ∆-regular $K_{2,t}$-free non-forest graph has an induced cycle of length Ω(∆).
Lemma 13 in particular shows that girth 5 graphs contain induced paths of linear length. In fact they contain many such paths. We hope that this might become useful towards further progress on Conjectures 1 and 6.
Lemma 16. In any graph of girth at least 5 and minimum degree d ≥ 2, there are d! distinct induced paths of order d + 2 starting at any vertex.
Proof. We apply induction on d ≥ 2. Let G be a graph of girth at least 5 and minimum degree d and let v ∈ V(G). If d = 2, then there must be a cycle of G containing v. We may assume that this cycle is an induced cycle of length at least 5, and therefore v is an endvertex of two distinct induced paths of order at least 4 = d + 2. So we may assume that d ≥ 3. For any neighbour w of v, let $G_w$ denote the connected component containing w in the graph obtained by deleting v and N(v) \ {w}. No vertex of $G_w$ can have more than one neighbour in {v} ∪ N(v) \ {w}, for otherwise G would contain a triangle or a 4-cycle. It follows that the minimum degree of $G_w$ is at least d − 1. Hence induction yields that for each w ∈ N(v), there are at least (d − 1)! induced paths in $G_w$ of order d + 1, starting in w. By prepending v to these paths, we obtain at least (d − 1)! distinct induced paths of order d + 2 that start in v and pass through w. Since there are at least d choices for w, the lemma follows.
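Lemma 16 can be sanity-checked by brute force on a small example. The Python sketch below enumerates induced paths in the Petersen graph (girth 5, minimum degree d = 3); the helper names and the particular vertex labelling are assumptions made for this illustration.

```python
from math import factorial


def petersen_graph():
    """Adjacency sets of the Petersen graph: outer 5-cycle 0..4, inner pentagram 5..9."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        edges = [(i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)]
        for a, b in edges:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def induced_paths_from(adj, start, order):
    """Return all induced paths on `order` vertices that begin at `start`."""
    paths = []

    def extend(path):
        if len(path) == order:
            paths.append(tuple(path))
            return
        for w in adj[path[-1]]:
            # w must be new and adjacent to no path vertex other than the last one
            if w not in path and not (adj[w] & set(path[:-1])):
                extend(path + [w])

    extend([start])
    return paths


d = 3
paths = induced_paths_from(petersen_graph(), 0, d + 2)
```

The enumeration finds at least d! = 6 induced paths of order d + 2 = 5 starting at vertex 0, as the lemma predicts.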
Lemma 16 on induced paths can be extended to rooted induced forests as follows. Roughly speaking, the following says that in any girth 5 graph with large minimum degree, every large forest occurs many times as an induced subgraph.
Lemma 17. Let G be a graph of girth at least 5 and minimum degree d. Let T be a forest on d vertices, with t components $T_1, \dots, T_t$. For each 1 ≤ i ≤ t, let $u_i$ be any vertex of $T_i$. Let $S := \{v_1, \dots, v_t\}$ be any size t independent set of G. Then there exists an injective graph homomorphism $f : V(T) \to V(G)$ such that
• $f(u_i) = v_i$ for all 1 ≤ i ≤ t, and
• f(V(T)) induces a copy of T in G.
Proof. We apply induction on n := |V(G)|. There is nothing to prove for n = 1, so suppose n > 1 and assume the result is true for all graphs on fewer than n vertices. Let u(1), …, u(k) denote the neighbours of $u_1$ in $T_1$, and let T′ be the forest obtained from T by deleting $u_1$. Furthermore, denote by T(1), …, T(k) the components of the subforest $T_1 \setminus \{u_1\}$. Because G has no triangles or 4-cycles, any two vertices in $N(v_1)$ have no common neighbour other than $v_1$. Therefore $v_1$ has at least $|N(v_1)| - (t-1) \ge d - (t-1) \ge |V(T_1)| > k$ neighbours that are not adjacent to any vertex of S other than $v_1$. Thus there exists a set N′ := {v(1), …, v(k)} of k distinct neighbours of $v_1$, such that $S' = (S \cup N') \setminus \{v_1\}$ is an independent set of G. Let G′ denote the graph obtained from G by deleting $v_1$ and $N(v_1) \setminus N'$. Because G has girth at least five, the minimum degree of G′ is at least d − 1. Moreover, T′ is a forest on d − 1 vertices, with components T(1), …, T(k), $T_2, \dots, T_t$. Recall furthermore that S′ is an independent set of G, and hence of G′. Thus, by induction, we know that there is a mapping $f' : V(T') \to V(G')$ such that $f'(u_i) = v_i$ for all 2 ≤ i ≤ t, $f'(u(j)) = v(j)$ for all 1 ≤ j ≤ k, and f′(V(T′)) induces a copy of T′ in G′. Now we can extend f′ to the desired mapping f by defining f(x) = f′(x) for all x ∈ V(T′), and $f(u_1) = v_1$.

Corollary 18.
There is some c > 0 such that for every forest H, every regular girth 5 graph containing no induced H has chromatic number at most $c|V(H)|/\log|V(H)|$.
Proof. Let G be a ∆-regular girth 5 graph. Let H be a forest that does not occur as an induced subgraph of G. Then by Lemma 17, H must have more than ∆ vertices. Combining this with Johansson's theorem [14] yields $\chi(G) \le c'\Delta/\log\Delta \le c'|V(H)|/\log|V(H)|$, for some c′ > 0 and all sufficiently large |V(H)|. From this the corollary easily follows.
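The final step can be sketched as follows, under the assumption that ∆ ≥ 3 so that monotonicity applies; the symbols $D$ and $\Delta_0$ are shorthand introduced only for this sketch.

```latex
\[
  x \mapsto \frac{x}{\log x} \ \text{is increasing for } x \ge e,
  \qquad\text{so}\qquad
  3 \le \Delta < D := |V(H)|
  \;\Longrightarrow\;
  \frac{c'\Delta}{\log \Delta} \le \frac{c' D}{\log D}.
\]
For bounded degree, say $\Delta < \Delta_0$, one has $\chi(G) \le \Delta + 1 \le \Delta_0$,
while $D/\log D$ is bounded below by a positive constant for all $D \ge 2$,
so enlarging the constant $c$ absorbs these cases.
```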