Further approximations for Aharoni's rainbow generalization of the Caccetta-H\"{a}ggkvist conjecture

For a digraph $G$ and $v \in V(G)$, let $\delta^+(v)$ be the number of out-neighbors of $v$ in $G$. The Caccetta-H\"{a}ggkvist conjecture states that for all $k \ge 1$, if $G$ is a digraph with $n = |V(G)|$ such that $\delta^+(v) \ge k$ for all $v \in V(G)$, then $G$ contains a directed cycle of length at most $\lceil n/k \rceil$. Aharoni proposed a generalization of this conjecture, that a simple edge-colored graph on $n$ vertices with $n$ color classes, each of size $k$, has a rainbow cycle of length at most $\lceil n/k \rceil$. With Pelik\'anov\'a and Pokorn\'a, we showed that this conjecture is true if each color class has size ${\Omega}(k\log k)$. In this paper, we present a proof of the conjecture when each color class has size ${\Omega}(k)$, which improves the previous result and is only a constant factor away from Aharoni's conjecture. We also consider what happens when the condition on the number of colors is relaxed.


Introduction and preliminaries
We call a graph simple if it has no loops or parallel edges, and we call a digraph simple if its underlying undirected graph is simple. For a simple digraph $G$ and a vertex $v \in V(G)$, let $\delta^+(v)$ denote the number of out-neighbors of $v$ in $G$. A famous conjecture in graph theory is the following, due to Caccetta and Häggkvist:
Conjecture 1 ([1]). Let $n, k$ be positive integers, and let $G$ be a simple digraph on $n$ vertices with $\delta^+(v) \ge k$ for all $v \in V(G)$. Then $G$ contains a directed cycle of length at most $\lceil n/k \rceil$.
For a graph $G$ and a function $b : E(G) \to \mathbb{N}$, a rainbow cycle (with respect to $b$) is a cycle $C$ in $G$ such that for all $e, f \in E(C)$ with $e \ne f$, we have $b(e) \ne b(f)$. We will refer to $b$ as a coloring of the edges of $G$. For any $i \in \{1, \dots, K\}$, the set of edges $b^{-1}(i)$ is called a color class. For $k, K \in \mathbb{N}$, we say that $b$ has $K$ color classes of size at least $k$ if $|b^{-1}(i)| \ge k$ for all $i \in \{1, \dots, K\}$ and $b^{-1}(i) = \emptyset$ for all $i > K$. The rainbow girth of an edge-colored graph $G$ is the length of a shortest rainbow cycle in $G$.
In [3], Aharoni proposed a generalization of Conjecture 1:
Conjecture 2 ([3]). Let $n, k$ be positive integers, and let $G$ be a simple graph on $n$ vertices. Let $b$ be a coloring of the edges of $G$ with $n$ color classes of size at least $k$. Then $G$ has rainbow girth at most $\lceil n/k \rceil$.
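To make the definition of rainbow girth concrete, here is a minimal brute-force sketch in Python. The function name `rainbow_girth` and the dictionary representation of the coloring $b$ are illustrative choices rather than notation from the paper, and the search is exponential, so it is only meant for very small examples.

```python
from itertools import combinations, permutations


def rainbow_girth(n, coloring):
    """Length of a shortest rainbow cycle, or None if no rainbow cycle exists.

    `coloring` maps an edge (u, v) of a simple graph on vertices 0..n-1
    to its color. Hypothetical helper for tiny examples only.
    """
    color = {frozenset(edge): c for edge, c in coloring.items()}
    for length in range(3, n + 1):
        for verts in combinations(range(n), length):
            first, rest = verts[0], verts[1:]
            # Fix the first vertex and try every cyclic order of the others.
            for order in permutations(rest):
                cycle = (first,) + order
                edges = [frozenset((cycle[i], cycle[(i + 1) % length]))
                         for i in range(length)]
                if any(e not in color for e in edges):
                    continue  # not a cycle of the graph
                if len({color[e] for e in edges}) == length:
                    return length  # all colors on the cycle are distinct
    return None


# A 5-cycle with five distinct colors: n = 5 color classes of size k = 1.
five_cycle = {(i, (i + 1) % 5): i + 1 for i in range(5)}
# A triangle with a repeated color has no rainbow cycle at all.
two_colored_triangle = {(0, 1): 1, (1, 2): 1, (0, 2): 2}
```

On the 5-cycle the function returns 5, which equals $\lceil n/k \rceil$ for $n = 5$ and $k = 1$, so this small instance meets the bound of Conjecture 2 with equality.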
In [5], Conjecture 2 was proved for k = 2. The following approximate result shows that the conjecture holds with larger color classes: Theorem 3 ( [7]). Let k > 1 be an integer, and let G be a simple graph on n vertices. Suppose that we have a coloring of the edges of G with n color classes of size at least 301k log k. Then G has rainbow girth at most ⌈n/k⌉.
In [2], Chvátal and Szemerédi show that in a simple digraph of minimum out-degree at least $k$, there exists a directed cycle of length at most $2n/k$. Our main result is an improvement of Theorem 3, reducing the required size of the color classes from $\Omega(k \log k)$ to $\Omega(k)$.
Theorem 4. Let $k \ge 1$ be an integer, and let $G$ be a simple graph on $n$ vertices. Suppose we have a coloring of the edges of $G$ with $n$ color classes of size at least $ck$, where $c = 10^{11}$. Then $G$ has rainbow girth at most $n/k$.
Our paper is organized as follows. In the remainder of Section 1, we state some results we will use throughout the paper. In Section 2, we prove a number of results on the case where the number of colors is $n + ck$ for some constant $c$. Then, in Section 3, we prove the main result of the paper, which deals with the case of $n$ color classes. Finally, in Section 4, we present some ideas for future work.
The proof, at a high level, proceeds by a reduction from the case of n colors to the case of n + c ′ k colors for some large constant c ′ , by applying existing results for the Caccetta-Häggkvist conjecture using the method of [3]. Then, in the case of n + c ′ k colors, we use several methods to establish results which are substantially stronger.
We will make use of the following results due to Bollobás and Szemerédi [4] and Shen [6], respectively. The first deals with the girth of a simple graph, which is our primary tool in finding short cycles, while the second is an approximate result for Conjecture 1. In this paper, log denotes the logarithm with base 2.
Theorem 5 ([4]). For all $n \ge 4$ and $k \ge 2$, if $G$ is a simple graph on $n$ vertices with $n + k$ edges, then $G$ contains a cycle of length at most $\frac{2(n+k)}{3k}(\log k + \log\log k + 4)$.
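Theorem 6 ([6]). Let $n, k$ be positive integers, and let $G$ be a simple digraph on $n$ vertices with $\delta^+(v) \ge k$ for all $v \in V(G)$. Then $G$ contains a directed cycle of length at most $\lceil n/k \rceil + 73$.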
We will use the following immediate corollary of Theorem 5:
Corollary 7. For all $n \ge 4$ and $k \ge 2$, if $G$ is a simple graph on $n$ vertices with $n + k$ edges, then $G$ contains a cycle of length at most $\frac{14(n+k)}{3k}\log k$.
Proof. By Theorem 5, we have that the girth is at most: $\frac{2(n+k)}{3k}(\log k + \log\log k + 4) \le \frac{14(n+k)}{3k}\log k$, since that is equivalent to: $\log\log k + 4 \le 6\log k$.
To see that this is true, let $f(k) = 6\log k - \log\log k - 4$. Then $f(2) = 6 - 4 = 2 \ge 0$, and for all $k \ge 2$ we have: $f'(k) = \frac{6}{k\ln 2} - \frac{1}{k \ln k \ln 2} \ge 0$. It follows that $f(k) \ge 0$ for all $k \ge 2$, as desired. This proves Corollary 7.
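The inequality $\log\log k + 4 \le 6\log k$ used in this proof can also be sanity-checked numerically. The Python sketch below evaluates $f(k) = 6\log k - \log\log k - 4$ with base-2 logarithms; the names `f` and `f_prime` are illustrative, and the closed form of the derivative is obtained here by direct differentiation. It is a check over a finite range of integers and does not replace the monotonicity argument.

```python
import math


def f(k):
    """f(k) = 6 log k - log log k - 4, with logarithms in base 2."""
    return 6 * math.log2(k) - math.log2(math.log2(k)) - 4


def f_prime(k):
    """Derivative of f, computed by hand from the base-2 logarithms."""
    return 6 / (k * math.log(2)) - 1 / (k * math.log(k) * math.log(2))


checked = range(2, 10_001)
all_nonnegative = all(f(k) >= 0 for k in checked)
all_increasing = all(f_prime(k) > 0 for k in checked)
```

Both flags evaluate to true on this range, consistent with $f(2) = 2$ and $f$ being non-decreasing for $k \ge 2$.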
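The two final results we will make use of are a set of Chernoff bounds and Chebyshev's Inequality:
Theorem 8 (Chernoff bounds). Let $X_1, \dots, X_m$ be independent indicator random variables, and let $X = \sum_{i=1}^m X_i$. Then for any $\epsilon > 0$, we have: $\mathbb{P}(X \ge (1+\epsilon)\mathbb{E}(X)) \le \exp\left(-\frac{\epsilon^2 \mathbb{E}(X)}{2+\epsilon}\right)$ and $\mathbb{P}(X \le (1-\epsilon)\mathbb{E}(X)) \le \exp\left(-\frac{\epsilon^2 \mathbb{E}(X)}{2}\right)$.
Theorem 9 (Chebyshev's Inequality). Let $X$ be a random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$ we have: $\mathbb{P}(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$.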

n + ck colors
We first consider a relaxation of Conjecture 2 where we have $n + c_1 k$ color classes each of size at least $c_2 k$, for constants $c_1, c_2$ which we will specify. In this case, we obtain upper bounds for the rainbow girth that are stronger than $\lceil n/k \rceil$ to a surprising degree. For this reason, these results are interesting in their own right. They are also used in the proof of our main result in the next section. Our first result is the following:
Theorem 10. Let $k > 1$ be an integer, and let $G$ be a simple graph on $n$ vertices. Suppose we have a coloring of the edges of $G$ with $n + k$ color classes of size at least $ck$, where $c = 10^9$. Then $G$ has rainbow girth at most 6 or $G$ has rainbow girth at most: $\frac{n(\log k)^2}{10k^{3/2}} + 14\log k$.
Proof. Since the graph is simple, we have that $n^2 \ge |E(G)| \ge cnk$ and thus we may assume that $n \ge ck$. Now, we claim there exists a set of vertices $S$ with $|S| \le n\log k/(140\sqrt{k})$ such that every color class has at least one edge incident to a vertex in $S$. To see this, we let $s = \lfloor 2\log k \rfloor$ and $t = \lfloor n/(560\sqrt{k}) \rfloor$. We will iteratively construct $s$ sets of vertices $S_1, \dots, S_s$, each of size at most $t$, as follows. Suppose we have constructed $S_1, \dots, S_i$ so far. Let $T_i = \bigcup_{j=1}^{i} S_j$, and let $C_i$ denote the set of colors whose color class has no edge incident to a vertex in $T_i$. Let $H$ be a random set of $t$ vertices chosen uniformly with repetition. For any color class $a$, note that the number of vertices which are incident to an edge of color $a$ is at least $\sqrt{ck}$, since if there are at most $\sqrt{ck}$ vertices incident to edges of color $a$, the number of edges of color $a$ will be at most $ck/2$. Also, we have that $t \ge n/(560\sqrt{k}) - 1 \ge n/(1120\sqrt{k})$ since $n \ge 1120\sqrt{k}$, which is implied by $n \ge ck$. Using these two observations, we have that the expected number of colors in $C_i$ whose color class has no edges incident to the vertices of $H$ is at most: $\left(1 - \frac{\sqrt{ck}}{n}\right)^t |C_i| \le e^{-t\sqrt{ck}/n}|C_i| \le e^{-\sqrt{c}/1120}|C_i|$. Thus, we can choose $S_{i+1}$ such that $|C_{i+1}| \le e^{-\sqrt{c}/1120}|C_i|$, and iterate.
When we finish, we have a collection of sets $\{S_1, S_2, \dots, S_s\}$ such that: $|C_s| \le e^{-s\sqrt{c}/1120}(n + k) \le \frac{n\log k}{280\sqrt{k}}$, where that last inequality is true for $k \ge 2$ since: which is true for $k = 2$ and thus for all $k \ge 2$. Now, we have that $T_s$ is a set of vertices with $|T_s| \le \frac{n\log k}{280\sqrt{k}}$ such that at most $\frac{n\log k}{280\sqrt{k}}$ colors $a$ have no edge of their color class incident to any of the vertices in $T_s$. It follows that, by adding at most $\frac{n\log k}{280\sqrt{k}}$ vertices, we can find a set of vertices of size at most $\frac{n\log k}{140\sqrt{k}}$ which is incident to at least one edge of every color class, as desired.
Now, let $S$ be a set of at most $\frac{n\log k}{140\sqrt{k}}$ vertices such that $S$ is incident to at least one edge of every color. For each color $a$, choose one edge $e_a$ of color $a$ such that $e_a$ is incident to at least one vertex in $S$. Let $E$ be the set of these chosen edges $e_a$. Then $|E| = n + k$ and $E$ contains exactly one edge of each color. Now, let $H$ be the subgraph with $V(H) = \bigcup_{uv \in E}\{u, v\}$ and $E(H) = E$, and write $S = \{v_1, \dots, v_p\}$. Partition $V(H) \setminus S$ into $X_1, \dots, X_p$ such that $X_i \subseteq N_H(v_i)$ for all $1 \le i \le p$. Contract each $H_i = X_i \cup \{v_i\}$ to a single vertex (by contracting each edge of $H_i$ iteratively), and let the resulting graph be $H'$. We have that $|V(H')| = |S| \le \frac{n\log k}{140\sqrt{k}}$ and $|E(H')| = |S| + k$. Note that a rainbow cycle $C$ in $H'$ corresponds to a rainbow cycle in $G$ with length at most $3|C|$, by replacing each contracted vertex by at most a two-edge path. We may assume that $H'$ is simple, since otherwise we obtain a rainbow cycle of length at most 6 in $G$. Then applying Corollary 7 to $H'$ gives a rainbow cycle in $G$ of length at most: $3 \cdot \frac{14(|S| + k)}{3k}\log k \le \frac{n(\log k)^2}{10k^{3/2}} + 14\log k$, as desired. This proves Theorem 10.
We immediately obtain the following interesting corollary:
Corollary 11. Let $k > 1$ be an integer, and let $G$ be a simple graph on $n$ vertices. Suppose that we have a coloring of the edges of $G$ with $n + k$ color classes of size at least $ck$, where $c = 10^9$, and suppose also that $\frac{140k^{3/2}}{\log k} \le n$. Then $G$ has rainbow girth at most $\frac{n(\log k)^2}{5k^{3/2}}$.
Proof. The condition on the size of $n$ is equivalent to: $14\log k \le \frac{n(\log k)^2}{10k^{3/2}}$. We may assume that $G$ has rainbow girth at least 7, since $\frac{n(\log k)^2}{5k^{3/2}} \ge 7$ is implied by the condition. Then Theorem 10 gives that $G$ has rainbow girth at most: $\frac{n(\log k)^2}{10k^{3/2}} + 14\log k \le \frac{n(\log k)^2}{5k^{3/2}}$, as desired. This proves Corollary 11.
If we would like rainbow girth to be at most roughly $n/k^{3/2}$ (as promised by Corollary 11), it is necessary that $k^{3/2} < n$, since a simple graph cannot have rainbow girth less than three. Corollary 11 can be interpreted as saying that, for the region where it makes sense (where $k^{3/2} < n$, roughly), when we relax the number of colors slightly from $n$ to $n + k$, we obtain a much shorter rainbow cycle of length at most approximately $n/k^{3/2}$, in comparison to the tight bound of $n/k$ for the case of $n$ colors.
Next, we present a result of a similar flavor for the case where k is large relative to n.
Theorem 12. Let $k > 1$ be an integer, and let $G$ be a simple graph on $n$ vertices. Suppose we have a coloring of the edges of $G$ with $n + k$ color classes of size at least $ck$, where $c = 10^9$, and also that $140k^{10/9} \ge n$. Then $G$ has rainbow girth at most 6.
Proof. We may assume that $n \ge ck$ since otherwise we have at least $nck > n^2$ edges, which is a contradiction since $G$ is simple. Let a colorful star be a subgraph $H$ of $G$ such that $H$ is a star with at least $\frac{ck^2}{4n}$ edges such that no color appears more than $c^{2/3}k^{2/3}$ times in $E(H)$. Let a collection of colorful stars be a set $C = \{H_1, H_2, \dots, H_m\}$ of colorful stars such that every color appears in at most one of the $E(H_i)$. For a collection $C$ of colorful stars, for $1 \le i \le m$, let $v_i$ be the center of the star $H_i$, and let $V(C) = \{v_1, \dots, v_m\}$ and $E(C) = \bigcup_{i=1}^{m} E(H_i)$ be the set of all star centers and the set of all edges, respectively. Now, let $C$ be a collection of colorful stars in $G$, chosen to be maximal with respect to the number of stars. We first prove the following claim, which says that the number of colors appearing in $E(C)$ is large.
Claim 13. At most k/2 colors do not appear in E(C).
Proof. Suppose not. Let S be the set of colors which do not appear in any of the E(v i ).
. Then the number of edges of color $s$ with both ends in $H$ is at most $4c^{2/3}k^{2/3}$, so it follows that: since the last inequality is equivalent to $k \ge 8^3/c$, which is true for $k > 1$ since $c = 10^9$. Then, on average, a vertex $v$ has: , and construct a colorful star with center $v$ and $d'_s(v)$ edges of color $s$ incident with $v$ for all $s \in S$. Then we can add this star to $C$ and obtain a larger collection of colorful stars, which contradicts the maximality of $C$. It follows that there are at most $k/2$ colors which do not appear in $E(C)$, as desired. This proves Claim 13.
We now prove a second claim, which says the number of colorful stars in C is small.
Claim 14. We have $|C| < \frac{n^{1/5}}{12}$.
Proof. Suppose not; then $|C| \ge \frac{n^{1/5}}{12}$. It suffices to show a contradiction for the case where $t = |C| = \lceil \frac{n^{1/5}}{12} \rceil$, so that $\frac{n^{1/5}}{12} \le t < \frac{n^{1/5}}{12} + 1 \le \frac{n^{1/5}}{6}$ since $n \ge ck \ge 10^9 k \ge 12^5$. Now, for a colorful star $H_i$ with center $v_i$, let $M(H_i) = V(H_i) \setminus \{v_i\}$. We claim that for any two colorful stars $H_i, H_j \in C$ with centers $v_i$ and $v_j$, if $H_{ij} = M(H_i) \cap M(H_j)$, then either $v_i$ has all its edges in $H_i$ to $H_{ij}$ in the same color class, or $v_j$ has all its edges in $H_j$ to $H_{ij}$ in the same color class. Suppose not. Then without loss of generality there are two edges $e_1 = (v_i, w_1)$ and $e_2 = (v_i, w_2)$ for $w_1, w_2 \in H_{ij}$ such that $e_1$ and $e_2$ have colors $a_1$ and $a_2$ with $a_1 \ne a_2$. Let $a_3$ be the color of $(v_j, w_1)$. Then clearly $a_3$ is also the color of $(v_j, w_2)$, since otherwise we obtain a rainbow cycle of length 4. Now, consider an arbitrary edge $(v_j, w_3)$ to a vertex $w_3 \in H_{ij}$ with $w_3 \notin \{w_1, w_2\}$. We claim that $(v_j, w_3)$ must have color $a_3$. Indeed, if $(v_i, w_3)$ does not have color $a_1$ then the 4-cycle $(v_i, w_1, v_j, w_3)$ implies that $(v_j, w_3)$ has color $a_3$, and if $(v_i, w_3)$ does not have color $a_2$ then the 4-cycle $(v_i, w_2, v_j, w_3)$ implies that $(v_j, w_3)$ has color $a_3$. Since $(v_i, w_3)$ cannot have both color $a_1$ and color $a_2$, it follows that $(v_j, w_3)$ has color $a_3$ for all $w_3 \in H_{ij}$, as desired.
Then we have that: which gives a contradiction. This proves Claim 14.
the electronic journal of combinatorics 29(1) (2022), #P1.55
Now, for each color class with at least one edge in $E(C)$, we choose exactly one such edge. Let the resulting set of edges be $F$; from Claim 13, we know that $|F| \ge n + \frac{k}{2}$. Now, let $H$ be the subgraph with $V(H) = \bigcup_{uv \in F}\{u, v\}$ and $E(H) = F$, and let $S = \{v_1, v_2, \dots, v_p\}$, where $p = |S|$. Partition $V(H) \setminus S$ into $X_1, \dots, X_p$ such that $X_i \subseteq N_H(v_i)$ for all $1 \le i \le p$. Now, contract each $H_i = X_i \cup \{v_i\}$ to a single vertex, and let the resulting graph be $H'$. By Claim 14, we have that $|V(H')| < \frac{n^{1/5}}{12}$, and, since $k^{10/9} \ge n/140$ and $n \ge c = 10^9$, we obtain: $\frac{n^{9/10}}{2 \cdot 140^{9/10}} >$ Thus we obtain a rainbow cycle of length at most 2 in $H'$, which gives a rainbow cycle of length at most 6 in $H$, as desired. This proves Theorem 12.
We conclude this section with an immediate corollary of the above results which will be used in the proof of the next section:
Corollary 15. Let $k > 1$ be an integer, and let $G$ be a simple graph on $n$ vertices. Suppose we have a coloring of the edges of $G$ with $n + k$ color classes of size at least $ck$, where $c = 10^9$. Then $G$ has rainbow girth at most $n/k$.
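Proof. If $28k\log k \le n$, then by Theorem 10 we have rainbow girth at most: $\frac{n(\log k)^2}{10k^{3/2}} + 14\log k \le \frac{n}{2k} + \frac{n}{2k} = \frac{n}{k}$. To see this, note that the bound on the second term is exactly the assumption $28k\log k \le n$, while the bound on the first term is equivalent to $\log k \le \sqrt{5}k^{1/4}$. Let $f(k) = \sqrt{5}k^{1/4} - \log k$. We compute: $f'(k) = \frac{\sqrt{5}}{4}k^{-3/4} - \frac{1}{k\ln 2}$, and it follows that $f(k)$ achieves its minimum for $k \ge 2$, $k \in \mathbb{R}$ at the point $k_0 = \left(\frac{4}{\sqrt{5}\ln 2}\right)^4$. We verify that $f(k_0) \ge 0$, so it follows that $f(k) \ge 0$ for all $k \ge 2$, as desired.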
If $28k\log k > n$, we claim that $140k^{10/9} \ge n$. Indeed, $140k^{10/9} > 28k\log k$ is equivalent to $5k^{1/9} > \log k$, which is true for $k \ge 2$. To see this, by taking derivatives as before it suffices to verify that the inequality is true for $k_0$ such that $k_0^{1/9} = 9/(5\ln 2)$, which is true. Then, Theorem 12 gives that $G$ has rainbow girth at most 6. Since $n^2 \ge |E(G)| \ge cnk$, we have that $n/k \ge c \ge 6$, so it follows that $G$ has rainbow girth at most $n/k$, as desired. This proves Corollary 15.
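The two calculus checks in the proof of Corollary 15 can be reproduced numerically. The Python sketch below evaluates $\sqrt{5}k^{1/4} - \log k$ and $5k^{1/9} - \log k$ (logarithms in base 2) at the critical points stated in the proof and over a range of integers; the names `f_quarter`, `g_ninth`, `k0_f` and `k0_g` are illustrative and do not come from the paper. It only confirms the numerical claims and is not a substitute for the derivative argument.

```python
import math


def f_quarter(k):
    """sqrt(5) * k^(1/4) - log k, with log in base 2."""
    return math.sqrt(5) * k ** 0.25 - math.log2(k)


def g_ninth(k):
    """5 * k^(1/9) - log k, with log in base 2."""
    return 5 * k ** (1 / 9) - math.log2(k)


# Critical points as stated in the proof.
k0_f = (4 / (math.sqrt(5) * math.log(2))) ** 4   # minimizer of f_quarter
k0_g = (9 / (5 * math.log(2))) ** 9              # minimizer of g_ninth

min_f = f_quarter(k0_f)
min_g = g_ninth(k0_g)

# Spot check on integers as well, since k is an integer in the statement.
grid_ok = all(f_quarter(k) >= 0 and g_ninth(k) > 0 for k in range(2, 100_001))
```

Both minima come out positive, so the two inequalities hold with some slack at their critical points.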
Theorem 16. Let $k \ge 1$ be an integer, and let $G$ be a simple graph on $n$ vertices. Suppose we have a coloring of the edges of $G$ with $n$ color classes of size at least $ck$, where $c = 10^{11}$. Then $G$ has rainbow girth at most $n/k$.
Proof. If $k = 1$, then taking one edge of each color gives a rainbow cycle of length at most $n$. So we may assume $k > 1$. Also, since $G$ is simple, we have that the number of edges $|E(G)|$ satisfies $n^2 \ge |E(G)| \ge nck$, and thus we may assume that $n \ge ck$. Now, let $t = ck$. By removing edges if necessary, we may assume that every color class has exactly $t$ edges. Now, we say that a color $a$ dominates a vertex $v \in V(G)$ if there are at least $\frac{t}{100} + 8k$ edges incident to $v$ with color $a$. Call a vertex $v$ color-dominated if there exists a color $a$ which dominates $v$, and call a color $a$ vertex-dominating if there exists a vertex $v$ which is dominated by $a$. The definition is motivated by a desire to reduce to the case of the Caccetta-Häggkvist conjecture as in [3], where each color class is a star centered at a different vertex. A color being vertex-dominating means that its edges form a large star, which will be useful in applying existing approximate results for the Caccetta-Häggkvist conjecture. Now, for each vertex-dominating color $a$, pick one vertex $v_a$ dominated by $a$ (not necessarily unique), and let the resulting set of vertices be $S$.
Let $H = G \setminus S$, and suppose first that $|H| \le \frac{t}{100}$. Let $b$ be the coloring of the edges. We construct a digraph $G'$ with $V(G') = S$, and for all $i, j$ with $v_i, v_j \in S$, there is an arc from $v_i$ to $v_j$ if and only if $v_iv_j \in E(G)$ and $b(v_iv_j) = i$. Every vertex $v_i$ is incident with at least $\frac{t}{100} + 8k$ edges $e$ with $b(e) = i$, and since $|H| \le \frac{t}{100}$, there are at least $8k$ edges $e = v_iu$ with $b(e) = i$ and $u \in S$. Therefore, $\delta^+(G') \ge 8k$. Now, we claim $n/(8k) + 74 \le n/k$, which is equivalent to $n \ge \frac{592k}{7}$, which is true since $n \ge ck = 10^{11}k$.
Then, by applying Theorem 6 to $G'$ we obtain a directed cycle $K$ of length at most $\lceil n/(8k) \rceil + 73 \le n/k$ in $G'$. The edges of $G$ that correspond to arcs of $K$ form a rainbow cycle of length at most $n/k$ in $G$.
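The arithmetic behind this step can be checked exactly. The following Python sketch uses rational arithmetic to compare $n/(8k) + 74 \le n/k$ with the stated equivalent condition $n \ge 592k/7$; the function names are illustrative, and the check covers only the numerical inequality, not the construction of $G'$.

```python
from fractions import Fraction
import math


def slack_inequality_holds(n, k):
    """Whether n/(8k) + 74 <= n/k, evaluated exactly."""
    return Fraction(n, 8 * k) + 74 <= Fraction(n, k)


def directed_cycle_bound(n, k):
    """ceil(n/(8k)) + 73, the bound obtained with out-degree 8k on at most n vertices."""
    return math.ceil(Fraction(n, 8 * k)) + 73


# The equivalence n/(8k) + 74 <= n/k  <=>  7n >= 592k on a small grid.
equivalence_ok = all(
    slack_inequality_holds(n, k) == (7 * n >= 592 * k)
    for k in range(1, 6)
    for n in range(1, 600)
)
```

In the regime of the proof, $n \ge 10^{11}k$ is far above the threshold $592k/7$, so the inequality holds with a large margin.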
So we may assume that $|H| > \frac{t}{100}$. Let $r = |H|$, so we have $\frac{t}{100} < r \le n$. Let $T \subseteq H$ be a random set of vertices in $H$ where each vertex in $H$ is included in $T$ independently with probability $\frac{4k}{r}$. Now consider a color $a$ which does not dominate a vertex in $S$ (and thus does not dominate any vertex). We will show that the probability that $a$ has at least $t/100$ edges with both ends in $G \setminus T$ is at least $1 - \frac{k}{2r}$. We claim that we may assume all of the edges of $a$ have both ends in $H$. Indeed, if this is not the case, perform the following iterative process while there is still an edge $e$ of color $a$ not contained in $H$.
If $e$ has both ends in $G \setminus H$, then remove $e$. Now, note that at most 200 vertices are incident to at least $t/100$ edges of color $a$. Since $|H| > \frac{t}{100} = 10^9 k$, there exists a pair of vertices $v_1, v_2 \in H$ such that there is no edge of color $a$ between $v_1$ and $v_2$ and $v_1, v_2$ are both incident to fewer than $t/100$ edges of color $a$. Then add an edge of color $a$ between $v_1$ and $v_2$. If instead $e$ has one end in $H$, say the vertex $w$, then remove $e$ and add an edge of color $a$ from $w$ to any vertex $v \in H$ such that there is not already an edge between $v$ and $w$ of color $a$ and both $v$ and $w$ are incident to fewer than $t/100$ edges of color $a$. Repeat this process until we obtain a graph $G'$ where all the edges of $a$ have both ends in $H$. If we can show that for $G'$ the probability that $a$ has at least $t/100$ edges in $G' \setminus T$ is at least $1 - \frac{k}{2r}$, then it clearly follows that the probability that $a$ has at least $t/100$ edges in $G \setminus T$ is also at least $1 - \frac{k}{2r}$. Thus, we may assume without loss of generality that all of the edges of $a$ are contained in $H$, as claimed.
Now, let the edges of $a$ be $e_1, \dots, e_t$. Let the random variable $E_i$ have value 1 if $e_i \in G \setminus T$ and have value 0 otherwise. Let $E = \sum_{i=1}^{t} E_i$, and for a random variable $R$ let $\mathrm{Var}(R)$ denote the variance of $R$. Since $a$ is not vertex-dominating, we have that each edge $e_i$ shares an end with at most $\frac{t}{50} + 16k$ edges of the same color. It follows that each $E_i$ is dependent on at most $\frac{t}{50} + 16k$ of the variables $\{E_1, E_2, \dots, E_t\}$. Let $x = \frac{r - 4k}{r}$, and note that the probability that an edge $e_i$ is in $G \setminus T$ is simply $x^2$, so for all $1 \le i \le t$ we have that $E_i$ is a Bernoulli random variable with probability equal to $x^2$. Then it follows that $\mathrm{Var}(E_i) = x^2(1 - x^2)$ and furthermore, if $e_i$ and $e_j$ share an end, we obtain: and thus we have:
Claim 17. Let $\alpha = 1 - \frac{400}{c}$. For all $\alpha \le y < 1$ we have:
Proof. Since $t/100 < r$, we have that $\frac{100}{c} > \frac{k}{r} > 0$ and thus $\alpha = 1 - \frac{400}{c} < y < 1$.
Define $g(y)$ as follows: We claim that $f(y) \le g(y)$ for all $\alpha \le y < 1$. To see this, let $h(y) = g(y) - f(y)$. Then $h(y) \ge 0$ is equivalent to: We claim that $h_1(\alpha) \ge 0$ and $h_1'(y) \ge 0$ for all $\alpha \le y < 1$. For the first claim, since $t = ck = 10^{11}k$ and $\alpha = 1 - \frac{400}{c}$, we have that: To show $h_1'(y) \ge 0$ for all $\alpha \le y < 1$, we compute: Since $y > 0$, $h_1'(y) \ge 0$ is equivalent to $h_2(y) \ge 0$, where: Now, we claim that $h_2(\alpha) \ge 0$ and $h_2'(y) \ge 0$ for $\alpha \le y < 1$. The first claim follows from the facts $t = ck = 10^{11}k$ and $\alpha = 1 - \frac{400}{c}$: To show $h_2'(y) \ge 0$ for all $\alpha \le y < 1$, we compute: Now, $t\alpha - 3\left(\frac{t}{50} + 16k\right) + 1 \ge 0$ for all $k \ge 1$, so it follows that $h_2'(y) \ge h_2'(\alpha) \ge 0$ for all $\alpha \le y < 1$. This implies that $h_1(y) \ge 0$ for all $\alpha \le y < 1$, which in turn gives $h(y) \ge 0$ for all $\alpha \le y < 1$. Thus $f(y) \le g(y)$ for all $\alpha \le y < 1$, and we obtain: .
as desired. This completes the proof of Claim 17. Now, Claim 17 gives: .
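Let $\lambda = t\left(x^2 - \frac{1}{100}\right)$. Then we have shown that $\mathrm{Var}(E) \le \lambda^2 \frac{k}{2r}$. Let $q$ be the probability that the color $a$ has at least $t/100$ of its edges in $G \setminus T$. Note that $\mathbb{E}(E) = tx^2$, so: $1 - q = \mathbb{P}(E < t/100) \le \mathbb{P}(|E - \mathbb{E}(E)| \ge \lambda)$. Then by Theorem 9 (Chebyshev's Inequality), we have: $1 - q \le \frac{\mathrm{Var}(E)}{\lambda^2} \le \frac{k}{2r}$. We say that a color $a$ is bad if $a$ is not vertex-dominating and $a$ has fewer than $t/100$ of its edges in $G \setminus T$. Let $B$ be the set of bad colors, and let $Y = |B|$. Since $1 - q \le \frac{k}{2r}$, we have that $\mathbb{E}(Y) \le k/2$. It follows from Markov's Inequality that $\mathbb{P}(Y \ge k) \le 1/2$. Recall that $T$ was formed by choosing each vertex in $H$ independently with probability $4k/r$. Then $\mathbb{E}(|T|) = 4k$. Applying Theorem 8 yields that for all $k \ge 2$: $\mathbb{P}(|T| \ge 8k) + \mathbb{P}(|T| \le 2k) \le \exp(-4k/3) + \exp(-k/2) < 1/2$.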
Since $k > 1$ is an integer, it follows that with positive probability we have both $2k < |T| < 8k$ and $Y < k$, so there exists a set $T \subset G \setminus S$ with $|T| \le 8k$ and such that $|T| - Y \ge k$. If $G' = G \setminus T$, then since $|T| \le 8k$ it follows that for every vertex-dominating color class at least $t/100$ of its edges are in $G'$. Then we have that at least $|V(G')| + k$ colors $a$ have at least $t/100$ edges in $G'$. Applying Corollary 15 to $G'$ gives that $G'$ has rainbow girth at most $|V(G')|/k$ and thus $G$ has rainbow girth at most $n/k$, as desired. This completes the proof.
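The numerical step in the Chernoff estimate used in this proof can be checked directly. The Python sketch below assumes the common multiplicative forms $\exp(-\epsilon^2\mu/(2+\epsilon))$ and $\exp(-\epsilon^2\mu/2)$ for the upper and lower tails; this is an assumption about the exact statement of Theorem 8, chosen because with $\mu = 4k$ these forms reproduce the exponents $-4k/3$ and $-k/2$ quoted in the proof. The function names are illustrative.

```python
import math


def upper_tail(mu, eps):
    """Assumed bound on P(X >= (1 + eps) * mu)."""
    return math.exp(-eps ** 2 * mu / (2 + eps))


def lower_tail(mu, eps):
    """Assumed bound on P(X <= (1 - eps) * mu)."""
    return math.exp(-eps ** 2 * mu / 2)


def failure_bound(k):
    """Bound on P(|T| >= 8k) + P(|T| <= 2k) when E|T| = 4k."""
    mu = 4 * k
    return upper_tail(mu, 1.0) + lower_tail(mu, 0.5)


below_half = all(failure_bound(k) < 0.5 for k in range(2, 1001))
```

The bound is below $1/2$ for every $k \ge 2$ in the tested range but not for $k = 1$, which matches the restriction to $k \ge 2$ at this point of the proof.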

Further Work
There are a number of directions further research in this area can go. Here we mention a few of our favorites. One direction is to prove tight results, such as proving that Conjecture 2 is true for k = 3. Another direction is to try to improve the constant from our proof; we made little effort to optimize it.
Another interesting question is whether there exist extremal examples for Conjecture 2 which are not inherited from Conjecture 1, namely that do not have the property that for each vertex v ∈ V (G) there exists a color a whose color class is the edge set of a star centered at v. Finally, a related problem which we did not consider is a relaxation of Conjecture 2, which is the following: Conjecture 18 ( [5]). Let n, k be positive integers, and let G be a simple graph on n vertices. Let b be a coloring of the edges of G with n color classes of size at least k; then G has a cycle C of length at most ⌈n/k⌉ such that no two incident edges of C are the same color.
This conjecture is interesting because it still implies Conjecture 1, but seems like it might be substantially easier than Conjecture 2, as it deals with a local condition rather than a global condition. However, we suspect it may require different methods than those used in this paper.
Finally, nowhere in this paper did we use induction, while a number of the results for Conjecture 1 utilize induction. Is there a way to use inductive arguments in this context?

Corrigendum - added September 9, 2022
An error in the paper was pointed out to the authors via private correspondence by Yuhui Cheng, for which we are very grateful. The error is in the part of the proof of Theorem 16 which concerns the iterative process which allows us to assume without loss of generality that all the edges of a are contained in H. In particular, the issue is in the line "If instead e has one end in H, say the vertex w, then remove e and add an edge of color a from w to any vertex v ∈ H such that there is not already an edge between v and w of color a and both v and w are incident to less than t/100 edges of color a." This is not always possible; in fact it is possible that there is an edge of color a from w to every other vertex in H which is incident to less than t/100 edges of color a. There are at least |V (H)| − 200 > t/100 − 200 vertices in H incident to fewer than t/100 edges of colour a; so when this process fails, w is already adjacent with an edge of colour a to at least t/100 − 200 vertices in H.
We resolve this issue as follows. If we encounter this situation, we delete the rest of the edges of color $a$ incident with $w$ and a vertex in $G \setminus H$, and then continue the iterative process. We claim that this will delete at most $0.01t$ edges of color $a$. Indeed, note that for each vertex $w \in H$ that we delete such edges for, we delete at most $8k + 200 \le 208k$ edges, and the number of such problematic vertices is at most: $\frac{2t}{\frac{t}{100} - 200} \le \frac{2 \cdot 10^{11}}{10^9 - 200}$. Then we have that the total number of edges of color $a$ which are deleted is at most: $\frac{416 \cdot 10^{11}}{10^9 - 200}k < 0.01 \cdot 10^{11}k$, since $416 < 0.01(10^9 - 200)$. Therefore we preserve at least $0.99t$ edges from each color class. We finish by first noting that the arguments showing Corollary 15 go through with $c = 0.99 \cdot 10^9$, and also that the rest of the proof of Theorem 16 goes through with $0.99 \cdot 10^{11}k$ edges instead of $10^{11}k$ edges. It follows that the result of Theorem 16 still holds.
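The counting in the corrigendum can be verified with exact arithmetic. The Python sketch below recomputes the bound on the number of deleted edges of a single color for small values of $k$, taking $t = 10^{11}k$ as in Theorem 16; the function name `deleted_edge_bound` and the constant name `C` are illustrative.

```python
from fractions import Fraction

C = 10 ** 11  # the constant c of Theorem 16, so that t = C * k


def deleted_edge_bound(k):
    """(8k + 200) * 2t / (t/100 - 200): edges of one color that may be deleted."""
    t = C * k
    problematic_vertices = Fraction(2 * t) / (Fraction(t, 100) - 200)
    edges_per_vertex = 8 * k + 200
    return edges_per_vertex * problematic_vertices


# The deleted edges stay below 0.01 * t for each tested k.
within_budget = all(
    deleted_edge_bound(k) < Fraction(C * k, 100) for k in range(1, 201)
)

# The final comparison quoted in the text, 416 < 0.01 * (10^9 - 200).
final_comparison = 416 < Fraction(1, 100) * (10 ** 9 - 200)
```

For $k = 1$ the bound is roughly $4.2 \cdot 10^4$, far below $0.01t = 10^9$, so the $0.99t$ surviving edges per color class are available with a wide margin.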