Minimum-weight combinatorial structures under random cost-constraints

Recall that Janson showed that if the edges of the complete graph K_n are assigned independent exponentially distributed random weights, then the expected length of a shortest path between a fixed pair of vertices is asymptotically equal to (log n)/n. We consider analogous problems where edges have not only a random length but also a random cost, and we are interested in the length of the minimum-length structure whose total cost is less than some cost budget. For several classes of structures, we determine the minimum length of such a structure as a function of the cost budget, up to constant factors. Moreover, we achieve this even in the more general setting where the distributions of the weights and costs are arbitrary, so long as the density f(x) behaves like cx^γ as x → 0 for some γ ≥ 0; previously, this case was not understood even in the absence of cost constraints. We also handle the case where each edge has several independent costs associated to it, and we must simultaneously satisfy a budget on each cost. In this case, we show that the minimum-length structure obtainable is essentially controlled by the product of the cost thresholds.

2010 Mathematics Subject Classification. Primary 05C80; Secondary 05C85, 90C27.


Introduction
The first problem we study involves paths, here denoted as minimum weight paths for consistency with the remainder of the paper. Let P(i, j) denote the set of paths from vertex i to vertex j in K_n.
Constrained Minimum Weight Path (CMWP): Opt(P(1, n), C). Without the constraint c(P) ≤ C, there is a beautiful result of Janson [13] that gives a precise value for the expected minimum weight of a path when the w(e) are independent exponential mean-one random variables. With the constraints, we are only able to estimate the expected minimum weight up to a constant factor (but can do so for a more general class of distributions).
Throughout the paper we let Υ = C_1 C_2 ⋯ C_r be the product of the cost thresholds. Our results show that, for the structures we consider, this product controls the dependence of the minimum-weight structure on the vector of cost constraints. In particular, for the minimum weight path problem, we have:

Theorem 1. If nΥ^{1/β}/log^{r/β} n → ∞ and C_i ≤ 10 log n, i = 1, 2, . . . , r, then w.h.p.
Now consider the case of perfect matchings in the complete bipartite graph K_{n,n}. Let M_2 denote the set of perfect matchings in K_{n,n}.
We use Lemma 7 here to bound P(Z_i ≤ C_i). We note that a problem similar to this was studied by Arora, Frieze and Kaplan [1] with respect to the worst case.
Now consider the case of perfect matchings in the complete graph K_n. Let M_1 denote the set of perfect matchings in K_n.

Structure of the paper
We prove the above theorems in their order of statement. The upper bounds are proved as follows: we consider the random graph G_{n,p} (or bipartite graph G_{n,n,p}, or digraph D_{n,p}) for a suitably chosen p associated with the random costs. We then seek minimum weight objects contained in these random graphs. The definition of p is such that these objects, if they exist, automatically satisfy the cost constraints. For minimum weight paths we adapt the methodology of [13]. For the remaining problems we use theorems in the literature stating the high-probability existence of the required objects when each vertex independently chooses a few (close) random neighbors.
In Section 6 we consider more general distributions. We are able to extend the above theorems under some extra assumptions on the C_i.

Upper Bound for CSP
In the proof of the upper bound, we first consider weights w(e) that are independent exponential mean-one random variables. The costs will remain independent copies of Z_E^β. We will then use Hölder's inequality to obtain the final result.
3.2 The case log^{2+r/β} n / n ≤ Υ^{1/β} and C_i ≤ 10 log n, i = 1, 2, . . . , r
Suppose now that we let L = 10 log n and E_0 = {e : c_i(e) ≤ C_i/3L, i = 1, 2, . . . , r}. The proof in this case goes as follows: (i) We search for short paths that only use edges in E_0 and note that the graph ([n], E_0) is distributed as G_{n,p}.
(ii) Observe that any path using fewer than L edges of E_0 automatically satisfies the cost constraints.
(iii) A simple calculation shows that w.h.p., for every set of vertices S of size k, the number of edges between S and the remaining vertices is close to its expectation k(n − k)p; see (3) and (4).
(iv) We run Dijkstra's algorithm for finding shortest (now minimum weight) paths from vertex 1. We use Janson's argument [13] to bound the distance to the m = n/3 closest vertices V_1. We need the claim in item (iii) here.
(v) We repeat (iv), starting from vertex n, to obtain the m = n/3 closest vertices V_2. If V_1 ∩ V_2 ≠ ∅ we will have found a path of low enough weight; otherwise we claim that w.h.p. there will be a low enough weight edge joining V_1, V_2.
(vi) We then argue that the trees constructed by Dijkstra's algorithm are close to being random recursive trees, whose height is easily bounded. This shows that we can use item (ii).
(vii) We finally use Hölder's inequality to switch from the exponential weights w to the general weights.
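Steps (i)–(v) above can be illustrated by a small simulation. The sketch below is a toy illustration, not part of the proof: it builds G_{n,p} with independent Exp(1) edge weights, grows Dijkstra balls of m vertices from the two endpoints, and returns the weight of the cheapest path found through a common vertex or a crossing edge. All parameter choices are illustrative.

```python
import heapq
import math
import random

def cheapest_balls_path(n, p, m, rng):
    """Toy version of steps (i)-(v): build G_{n,p} with independent
    Exp(1) edge weights, grow Dijkstra balls of m vertices from
    vertices 0 and n-1, and return the weight of the cheapest
    0 -> n-1 path visible through the two balls."""
    # Step (i): the cost-feasible edges form a copy of G_{n,p}.
    adj = {v: [] for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                w = rng.expovariate(1.0)  # Exp(1) edge weight
                adj[u].append((v, w))
                adj[v].append((u, w))

    def ball(source):
        # Step (iv): run Dijkstra, stopping once m vertices are settled.
        done, heap = {}, [(0.0, source)]
        while heap and len(done) < m:
            d, v = heapq.heappop(heap)
            if v in done:
                continue
            done[v] = d
            for u, w in adj[v]:
                if u not in done:
                    heapq.heappush(heap, (d + w, u))
        return done

    b1, b2 = ball(0), ball(n - 1)
    # Step (v): join the balls, either through a common vertex or
    # through a single crossing edge.
    best = math.inf
    for v in set(b1) & set(b2):
        best = min(best, b1[v] + b2[v])
    for v in b1:
        for u, w in adj[v]:
            if u in b2:
                best = min(best, b1[v] + w + b2[u])
    return best
```

With, say, n = 200, p = 0.2, m = 60, the returned weight is typically of order log n/(np), in line with the bounds of this section.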
We first bound the value of p, the probability that a given edge lies in E_0:

p = P(c_i(e) ≤ C_i/3L, i = 1, 2, . . . , r) = ∏_{i=1}^r (1 − e^{−(C_i/3L)^{1/β}}),   (1)

where e is an arbitrary edge. We note that if 0 < x ≤ 1 then x/2 ≤ 1 − e^{−x} ≤ x. This implies that

Υ^{1/β}/(2^r (3L)^{r/β}) ≤ p ≤ Υ^{1/β}/(3L)^{r/β}.   (2)

We consider the random graph G_{n,p} where edges have weight given by w and costs c_i(e) ≤ C_i/3L, i = 1, 2, . . . , r. We modify Janson's argument [13].
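The elementary inequality x/2 ≤ 1 − e^{−x} ≤ x used here is easy to verify numerically; the snippet below (a standalone sanity check) scans a fine grid of (0, 1].

```python
import math

# Verify x/2 <= 1 - e^{-x} <= x for x in (0, 1], the bound used to
# squeeze each factor of the product defining p.
for i in range(1, 100001):
    x = i / 100000
    y = 1 - math.exp(-x)
    assert x / 2 <= y <= x, x
```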
We now deal with item (iii). We observe that w.h.p. for every set S of size k, e(S : S̄) ≈ k(n − k)p, where e(S : T) is the number of edges {v, w} with one end in S and the other in T. We only need to check the claim for |S| ≤ n/2. Let ε = 1/log^{1/3} n and define the approximation as in (3). Then, using the Chernoff bounds for the binomial distribution, the failure event E of (4) occurs with probability o(1).

We now continue with item (iv). We set S_1 = {1} and d_1 = 0 and consider running Dijkstra's algorithm [6]. At the end of Step k we will have computed S_k = {1 = v_1, v_2, . . . , v_k} and 0 = d_1, d_2, . . . , d_k, where d_i is the minimum weight of a path from 1 to i, i = 1, 2, . . . , k. Let there be ν_k edges from S_k to [n] \ S_k. Arguing as in [13] we see that d_{k+1} − d_k = Z_k, where Z_k is the minimum of ν_k independent exponential mean-one random variables. Also, the memoryless property of the exponential distribution implies that Z_k is independent of d_k. It follows that for k < n/2 the expectation of d_k satisfies (5) and, by the same token, its variance satisfies (6). We only pursue the use of Dijkstra's algorithm from vertex 1 for m = n/3 iterations. It follows from (5) and (6) and the Chebyshev inequality that (7) holds w.h.p.
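The distributional fact driving item (iv), that the minimum of ν independent exponential mean-one variables is itself exponential with mean 1/ν, can be checked by simulation (a standalone sanity check with illustrative parameters, not part of the proof):

```python
import random
import statistics

# The increment d_{k+1} - d_k is the minimum of nu_k independent
# Exp(1) variables, which is exponential with mean 1/nu_k.
rng = random.Random(1)
nu = 50
samples = [min(rng.expovariate(1.0) for _ in range(nu))
           for _ in range(20000)]
assert abs(statistics.mean(samples) - 1 / nu) < 0.005
```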
We next deal with item (vi). The tree built by Dijkstra's algorithm is (in a weak sense) close in distribution to a random recursive tree, i.e. vertex v_{k+1} attaches to a near-uniformly random member of S_k. Indeed, this holds assuming that E does not occur. Hence, if T is the tree constructed in the first m rounds of Dijkstra's algorithm, then w.h.p. its height is less than L; this is (8). It follows from (2), (7) and (8) that w.h.p., for every v ∈ V_1 = S_m, there exists a path P from 1 to v of weight at most λ_0.

We now deal with item (v). We next consider applying Dijkstra's algorithm to find a minimum weight path from vertex n to other vertices. Using the same argument as above, we see that we can find m vertices V_2 that are within distance λ_0 of vertex n. If V_1 ∩ V_2 ≠ ∅ then we have found a path of weight at most 2λ_0 between vertex 1 and vertex n.
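Item (vi) rests on the fact that random recursive trees are shallow: their height is known to be about e·log n w.h.p., comfortably below L = 10 log n. A minimal simulation of a plain random recursive tree (an illustration only; it is not the modified tree analyzed in the proof):

```python
import math
import random

def rrt_height(n, rng):
    """Height of a random recursive tree on n vertices: vertex k
    attaches to a uniformly random earlier vertex."""
    depth = [0] * n
    for k in range(1, n):
        depth[k] = depth[rng.randrange(k)] + 1
    return max(depth)
```

For n = 10000 the height is typically close to e·log n ≈ 25, well below 10 log n ≈ 92.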
If V_1, V_2 are disjoint then w.h.p. there is an edge of weight at most 20/np between them. Indeed, each of the m² pairs in V_1 × V_2 independently provides such an edge of E_0 with probability at least p(1 − e^{−20/np}) ≥ 10/n, so the probability that none does is at most (1 − 10/n)^{m²} = o(1). This yields a path P with w(P) ≤ 2λ_0 + 20/np. (Here we have used the fact that these edges have not been exposed by the running of the algorithm.) We now deal with item (vii). We use Hölder's inequality to yield (11). This completes the proof of Theorem 1 for this case.
3.3 The case 3ω log^{r/β} n / n ≤ Υ^{1/β} ≤ log^{2+r/β} n / n and C_i ≤ 10 log n, i = 1, 2, . . . , r
The proof is similar to that of Section 3.2, but requires some changes in places. The problem is that we can no longer assume the non-occurrence of E. Other than this, the proof follows the same strategy. Our problem therefore is to argue that w.h.p. e(S_k : S̄_k) is sufficiently large.
(a) We now have to keep track of the size of e(S_k : S̄_k) as a random process. This is equation (12).
(b) The term η_k is the number of edges between v ∉ S_k and S_k. We do not want this to be large, as it reduces e(S_{k+1} : S̄_{k+1}). So, we do not add vertices to S_k if η_k ≥ 2np, which only happens rarely.
(c) Finally, we have to work harder in the case where V_1, V_2 are disjoint. We need to use edges of slightly higher cost in order to get a low weight edge between V_1 and V_2.
Let p be as in (1), where now L = 20 log n. Note that from (2) we see that np → ∞. We again consider the random graph G_{n,p} where edges have weight given by w and costs at most C_i/3L, and we again modify Janson's argument [13]. We also restrict our search for paths, avoiding vertices of high degree.
We set S_1 = {1} and d_1 = 0. At the end of Step k we will have computed S_k = {1 = v_1, v_2, . . . , v_k} and 0 = d_1, d_2, . . . , d_k, where d_i is the minimum weight of a path from 1 to i, i = 1, 2, . . . , k. Let there be ν_k edges from S_k to [n] \ S_k. We cannot rely on E of (4) not occurring and so we need to modify the argument here.
Assumption: 1 ≤ k ≤ n_0 = 1/(3p). Modification: if our initial choice v for v_{k+1} satisfies e(v : S_k) ≥ 2np then we reject v permanently from the construction of paths from vertex 1.
The initial aim is roughly the same: we want to show that w.h.p.
The binomials here are independent. This is because the edges between v_{k+1} and S̄_k have not been exposed by the algorithm up to this point. The number of trials n_1 comes from the following: we know from the Chernoff bounds that P(Bin(n, p) ≥ 2np) ≤ e^{−np/3}.
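The Chernoff estimate quoted here, P(Bin(n, p) ≥ 2np) ≤ e^{−np/3}, can be checked against the exact binomial tail for moderate parameters (the values of n and p below are illustrative only):

```python
import math

def binom_upper_tail(n, p, k):
    """Exact P(Bin(n, p) >= k)."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k, n + 1))

# Illustrative parameters: np = 20, threshold 2np = 40.
n, p = 500, 0.04
tail = binom_upper_tail(n, p, 2 * int(n * p))
assert tail <= math.exp(-n * p / 3)  # the Chernoff bound e^{-np/3}
```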
It follows from the Markov inequality that w.h.p. there are at most ne^{−np/4} instances where the modification is invoked. This means that w.h.p. the initial choice for v_k has at least n − n_0 − ne^{−np/4} ≥ n_1 possible neighbors. We now define B_k; we need a lower bound for B_k and an upper bound for η_k. It follows that if k_0 = min{n_0, e^{ε²np/4}} then w.h.p.
For k ≥ k_0, we use the fact that e(S_k : S̄_k) is a sum of bounded random variables. Hoeffding's inequality [12] then bounds its deviation from the mean; putting t = k^{2/3}np we see that this deviation is negligible w.h.p. It then follows from (13), (16), (17) and (19) that w.h.p. (20) holds.
Arguing as in [13] we see that d_{k+1} − d_k = Z_k, where Z_k is the minimum of ν_k independent exponential mean-one random variables. Also, Z_k is independent of d_k. It follows that for k < n the expectation and variance of d_k satisfy (21) and (22). It follows from (21) and (22) and the Chebyshev inequality that w.h.p. we have d_{n_0} ≤ (1 + o(1)) log n/(np). Let V_1 denote the n_0 vertices within this distance of vertex 1.
We next consider applying Dijkstra's algorithm to find a minimum weight path from vertex n to other vertices. Using the same argument as above, we see that we can find n_0 vertices V_2 that are within distance (1 + o(1)) log n/(np) of vertex n. If V_1 ∩ V_2 ≠ ∅ then we have found a path of weight at most (2 + o(1)) log n/(np) between vertex 1 and vertex n.
If V_1, V_2 are disjoint then we will use edges of slightly higher cost, as anticipated in item (c). Given e = {x, y} ∈ V_1 : V_2, then given the history of Dijkstra's algorithm so far, either e ∈ E_0 or (23) holds. For the inequality in (24) we use the fact that we now have p ≤ log² n/n.
We now deal with the height of the Dijkstra trees. Let T be the tree constructed by Dijkstra's algorithm and let ξ_i, i ≤ k, denote the number of edges from v_i to V_1 \ S_i.
The first o(1) term here is the probability that ν_k is small, and this is covered by (20).
It follows from the above that w.h.p. there exists a path P with w(P) = O(log n/(np)). Arguing as for (11), the proof of Theorem 1 is completed in this case.

Upper Bound for CAP
Let G denote the subgraph of K_{n,n} induced by the edges that satisfy c_i(e) ≤ C_i/n for i = 1, 2, . . . , r. Let p = P(c_i(e) ≤ C_i/n, i = 1, 2, . . . , r) and note that log n/n ≪ Υ^{1/β}/(2^r n^{r/β}) ≤ p ≤ Υ^{1/β}/n^{r/β}. The approach for this and the remaining problems is: (i) Look for a small weight structure in an edge-weighted random graph; in this case the random bipartite graph G_{n,n,p}.
(ii) Use an idea of Walkup [15] to construct a random subgraph H of G that only uses edges of low weight.
(iii) Use a result from the literature that states that w.h.p. the edges of H contain a copy of the desired structure.
G is distributed as G_{n,n,p}. Note that by construction, a perfect matching M of G satisfies c_i(M) ≤ C_i, i = 1, 2, . . . , r.
Let d = np and note that because d ≫ log n the Chernoff bounds imply that w.h.p. every vertex has degree ≈ d. Now each edge of G has a weight uniform in [0, 1]. Following Walkup [15] we replace w(e), e = (x, y), by min{Z_1(e), Z_2(e)}, where Z_1(e), Z_2(e) are independent random variables chosen so that min{Z_1(e), Z_2(e)} has the same distribution as w(e). We assign Z_1(e) to x and Z_2(e) to y.
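Walkup's splitting trick can be illustrated concretely for Uniform[0,1] weights: taking Z with distribution function F(t) = 1 − √(1 − t) makes min{Z_1, Z_2} exactly Uniform[0,1] for independent copies Z_1, Z_2. A sketch (the particular cdf is chosen for this uniform illustration only):

```python
import random
import statistics

rng = random.Random(3)

def split_min():
    """Walkup-style split of a Uniform[0,1] weight: Z has cdf
    F(t) = 1 - sqrt(1 - t), sampled by inversion as 1 - (1 - U)^2,
    so min(Z1, Z2) of two independent copies has cdf
    1 - (1 - F(t))^2 = t, i.e. it is Uniform[0,1]."""
    z1 = 1 - (1 - rng.random()) ** 2
    z2 = 1 - (1 - rng.random()) ** 2
    return min(z1, z2)

samples = [split_min() for _ in range(100000)]
# Uniform[0,1] checks: mean 1/2 and P(sample <= 1/4) = 1/4.
assert abs(statistics.mean(samples) - 0.5) < 0.01
assert abs(sum(s <= 0.25 for s in samples) / 100000 - 0.25) < 0.01
```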
Let X, Y denote the bipartition of the vertices of G. Now consider the random bipartite graph H where each x ∈ X is incident to the two Z_1-smallest edges incident with x. Similarly, each y ∈ Y is incident to the two Z_2-smallest edges incident with y. Walkup [16] showed that H has a perfect matching w.h.p. The expected weight of this matching is asymptotically at most the bound in (29). This follows from (i) the expression given in Corollary 6 for the expected minimum and second minimum of d copies of Z and (ii) the fact that the matching promised in [16] is equally likely to select a minimum or a second minimum weight edge.
The weight of the selected matching is a sum of independent random variables with exponential tails and so it is concentrated around its mean.

Upper Bound for CMP
We let p, d be as in Section 4.1. We replace Walkup's result [16] by Frieze's result [8] that the random graph G_{2-out} contains a perfect matching w.h.p. The random graph G_{k-out} has vertex set [n] and each vertex v ∈ [n] independently chooses k random edges incident with v. We again replace w(e), e = (x, y), by min{Z_1(e), Z_2(e)}, where Z_1, Z_2 are independent copies of Z_W, and associate one copy with each endpoint of the edge. We consider the random graph H where each v ∈ [n] is incident to the two Z_W-smallest edges incident with v. This is distributed as G_{2-out} and we obtain an expression similar to that in (29).
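The G_{k-out} model described here is easy to generate directly. The sketch below builds G_{2-out} as an undirected graph; it is illustrative code only and has nothing to do with the proof in [8] that this graph has a perfect matching w.h.p.

```python
import random

def two_out_graph(n, rng):
    """G_{2-out}: each vertex independently chooses 2 distinct random
    neighbours; the chosen edges form an undirected graph."""
    edges = set()
    for v in range(n):
        others = [u for u in range(n) if u != v]
        for u in rng.sample(others, 2):
            edges.add((min(u, v), max(u, v)))
    return edges

rng = random.Random(7)
E = two_out_graph(100, rng)
deg = [0] * 100
for u, v in E:
    deg[u] += 1
    deg[v] += 1
# Every vertex chose 2 distinct edges, so the minimum degree is at
# least 2, and at most 2n distinct edges arise in total.
assert min(deg) >= 2 and len(E) <= 200
```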
We have concentration around the mean as in Section 4.1.

Upper bound for CSTSP/CATSP
For the symmetric case we replace w(e), e = {x, y}, by min{Z_1(e), Z_2(e)} for each edge of K_n, and for the asymmetric case we replace w(e), e = (x, y), by min{Z_1(e), Z_2(e)} for each directed edge of K_n. In both cases we associate one copy of Z_W with each endpoint of e. We define p, d as in Section 4.1 and consider either the random graph G_{n,p} or the random digraph D_{n,p}.
For the symmetric case, we consider the random graph H that includes the 3 cheapest edges associated with each vertex, cheapest with respect to Z_W(e). This is distributed as G_{3-out}, which was shown to be Hamiltonian w.h.p. by Bohman and Frieze [4]. For the asymmetric case, we consider the random digraph H that includes the 2 cheapest out-edges and the 2 cheapest in-edges associated with each vertex, cheapest with respect to Z_W(e). This is distributed as D_{2-in,2-out}, which has vertex set [n] and in which each vertex v independently chooses 2 out-neighbors and 2 in-neighbors. The random digraph D_{2-in,2-out} was shown to be Hamiltonian w.h.p. by Cooper and Frieze [5].
The expected weight of the tour promised by [4] or by [5] is asymptotically O(n^{1+rα/β−α}/Υ^{α/β}), as in Section 4.1. We have concentration around the mean as in Section 4.1.
For CMP, assuming that n = 2m, the corresponding bound holds for ε sufficiently small.

More general distributions
We follow an argument from Janson [13]. We will assume that w(e) has the distribution function F_w(t) = P(X ≤ t) of a random variable X that satisfies F_w(t) ≈ at^{1/α}, α ≤ 1, as t → 0. For the costs c_i(e) we have F_c(t) ≈ bt^{1/β}, β ≤ 1. The constants a, b > 0 can be dealt with by scaling and so we assume that a = b = 1 here. For a fixed edge and, say, w(e), we consider random variables w_<(e), w_>(e) such that w_<(e) is distributed as Z_E^{α+ε_n} and w_>(e) is distributed as Z_E^{α−ε_n}, where ε_n = 1/(10 log n). (This choice of ε_n means that n^{α+ε_n} = e^{1/10} n^α.) Then let U(e) be a uniform [0, 1] random variable and suppose that X has the distribution of F_w^{−1}(U). We couple X, w_<, w_> by generating U(e) and then setting w_<(e) = F_<^{−1}(U) = (log 1/(1 − U))^{α+ε_n}, where F_< is the distribution function of Z_E^{α+ε_n} and F_> is defined similarly. The coupling ensures that w_<(e) ≤ w(e) ≤ w_>(e) as long as w(e) ≤ ε_n.
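The sandwich produced by this coupling can be illustrated numerically. In the sketch below, F_w(t) = t^{1/α} is taken to hold exactly (an assumption made for the illustration; in the text it only holds as t → 0), and the values of α and ε are illustrative.

```python
import math
import random

# Illustrative parameters (assumptions for this sketch only).
alpha, eps = 0.5, 0.02

def coupled(u):
    """Couple X = F_w^{-1}(u), taking F_w(t) = t^{1/alpha} exactly,
    with w_< ~ Z_E^{alpha+eps} and w_> ~ Z_E^{alpha-eps}, all three
    generated from the same uniform u by inversion."""
    x = u ** alpha
    ell = math.log(1 / (1 - u))
    return ell ** (alpha + eps), x, ell ** (alpha - eps)

rng = random.Random(4)
for _ in range(10000):
    u = rng.random() * 0.01  # keep the sampled weights small
    w_lo, x, w_hi = coupled(u)
    assert w_lo <= x <= w_hi  # the sandwich w_< <= w <= w_>
```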
Given the above setup, it only remains to show that w.h.p. edges of weight w(e) > ε_n or cost c_i(e) > ε_n are not needed for the upper bounds proved above. We can ignore the lower bounds, because they only increase if we exclude long edges.
Assumptions for CMWP. For the minimum weight path problem we will assume that Υ^{1/β} ≫ log^{1+r/β} n / n, which is a log n factor larger than required for Theorem 1. We will assume that C_i = o(1), and then we only use edges of cost of order C_i/log n ≪ ε_n.
Observe that the minimum weight of a path from 1 to n is at most 4 log n/(np) w.h.p., and this is less than ε_n because of the assumption log^{1+r/β} n / n ≪ Υ^{1/β} and the definition of p (see (2)).
Assumptions for the other problems. We deal with costs by assuming that C_i = o(n/log n), i = 1, 2, . . . , r. It is then a matter of showing that w.h.p. the first few order statistics of Z_W are very unlikely to be greater than ε_n. (Z_W is defined in (28).) But in all cases this can be bounded as follows: let W_1, W_2, . . . , W_m, m ≥ n/2, be independent copies of Z_W. Then the probability that one of the first few order statistics exceeds ε_n is o(1/n). This bounds the probability of using a heavy edge at any one vertex, and inflating by n gives us the result we need.

Conclusion
We have given upper and lower bounds that hold w.h.p. for constrained versions of some classical problems in combinatorial optimization. They are within a constant factor of one another, unlike the situation with respect to spanning trees and arborescences [9], [10], where the upper and lower bounds are asymptotically equal. It is a challenge to find tight bounds for the problems considered in this paper and to allow correlation between length and cost.
We have not made any claims about E(w*(C)) because there is always a (small) probability that the problem is infeasible. It is not difficult to bound the expectation conditional on feasibility in a similar way.