Capturing the Drunk Robber on a Graph

We show that the expected time for a smart "cop" to catch a drunk "robber" on an $n$-vertex graph is at most $n + o(n)$. More precisely, let $G$ be a simple, connected, undirected graph with distinguished points $u$ and $v$ among its $n$ vertices. A cop begins at $u$ and a robber at $v$; they move alternately from vertex to adjacent vertex. The robber moves randomly, according to a simple random walk on $G$; the cop sees all and moves as she wishes, with the object of "capturing" the robber (that is, occupying the same vertex) in least expected time. We show that the cop succeeds in expected time no more than $n + o(n)$. Since there are graphs in which capture time is at least $n - o(n)$, this is roughly best possible. We note also that no function of the diameter can be a bound on capture time.


Introduction
The game of cops and robbers on graphs was introduced independently by Quilliot [10] and Nowakowski and Winkler [8], and has generated a great deal of study; see, e.g., [15,16,17,18].
In the original formulation a cop and robber move alternately and deliberately, with full information, from vertex to adjacent vertex on a graph G, with the cop trying to capture the robber and the robber trying to elude the cop. In this work, all graphs are assumed to be connected, simple (no loops or multiple edges) and undirected. A graph is said to be "cop-win" if there is a vertex u such that for every v, the cop beginning at u can capture the robber beginning at v.
In addition to their obvious role in pursuit games, cop-win graphs (which are also known as "dismantlable" graphs) have appeared in diverse places including statistical physics [3]. In the present work, we consider a variation suggested [7] by Ross Churchley of the University of Victoria, in which the robber is no longer in control of his fate; instead, at each step he moves to a neighboring vertex chosen uniformly at random. We may therefore imagine that the robber is in fact a drunk: one who is too far gone to have an objective.
On any graph, the drunk will be caught with probability one, even by a cop who oscillates on an edge, or moves about randomly; indeed, by any cop who isn't actively trying to lose.
The only issue is: how long does it take? The lazy cop will win in expected time at most $4n^3/27$ (plus lower-order terms), since that is the maximum possible expected hitting time for a random walk on an $n$-vertex graph [2]; the same bound applies to the random cop [4]. It is easy to see that the greedy cop, who merely moves toward the drunk at every step, can achieve $O(n^2)$; in fact, we will show that the greedy cop cannot in general do better. Our smart cop, however, gets her man in expected time $n + o(n)$. Note that when the adversaries play on a lollipop graph consisting of a clique of size $cn^{1/3}$ (for some constant $c \in \mathbb{R}$) with a path of length $n - cn^{1/3}$ attached at one end, with the drunk starting in the clique and the cop starting at the opposite endpoint of the path, the expected capture time will be $n - \Theta(n^{1/3}) = n - o(n)$, and we conjecture that this is worst possible.

Preliminaries
In this variation, a "move" (as in chess) will consist of a step by the cop followed by a (uniformly random) step by the drunk. Capture or "arrest" takes place when the cop lands on the drunk's vertex or vice-versa, and the capture time T is the number of the move at which this takes place.
Let us consider some examples. (1) Suppose $G$ is the path $P_n$ on $n$ vertices, with $u$ and $v$ its endpoints. Then the cop will (using any of the algorithms we consider later) move along the path until she reaches the drunk; this will take expected time about $n - \sqrt{n}$, since a random walk on a path will on average progress about distance $\sqrt{t}$ in time $t$.
(2) Let $G$ be the complete balanced bipartite graph $K_{n/2,n/2}$, with the cop and the drunk beginning on the same side. Then the poor cop will find herself always moving to the opposite side from her quarry until, finally, he accidentally runs into her; since the latter event occurs with probability about $2/n$, arrest takes on average $n/2$ steps.
The reader may feel with some justification that we are being unrealistic in not allowing the cop to stay put; in example (2), sitting for one move would enable her to catch the drunk on the next move. Ultimately, we force the cop to move at each step in order to hold her to the same constraints as her quarry's, and because it gives us the strongest results. Our bounds still apply when the cop, the drunk, or both are allowed to stay put on any move.
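Example (2) can be checked numerically. The following is a minimal simulation sketch, not part of the original argument. The names `complete_bipartite` and `capture_time` are illustrative, and the cop rule used here (capture if adjacent, otherwise step to an arbitrary neighbor) is an assumption that coincides with the forced-move greedy cop on $K_{m,m}$ when both players start on the same side.

```python
import random

def complete_bipartite(m):
    """Adjacency lists of K_{m,m}: vertices 0..m-1 on one side, m..2m-1 on the other."""
    left = list(range(m))
    right = list(range(m, 2 * m))
    adj = {v: list(right) for v in left}
    adj.update({v: list(left) for v in right})
    return adj

def capture_time(adj, cop, drunk, rng):
    """Number of moves (cop step, then drunk step) until the two share a vertex."""
    moves = 0
    while True:
        moves += 1
        # The cop must move: she captures if adjacent, otherwise picks any neighbor.
        cop = drunk if drunk in adj[cop] else rng.choice(adj[cop])
        if cop == drunk:
            return moves
        # The drunk steps to a uniformly random neighbor.
        drunk = rng.choice(adj[drunk])
        if drunk == cop:
            return moves

if __name__ == "__main__":
    m = 10  # so n = 2m = 20
    rng = random.Random(0)
    adj = complete_bipartite(m)
    times = [capture_time(adj, 0, 1, rng) for _ in range(20000)]
    print("average capture time:", sum(times) / len(times), "vs n/2 =", m)
```

Each move ends in capture with probability $1/m = 2/n$, so the sample average should land near $n/2$.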
Even when the cop is permitted to idle, she cannot expect to catch the drunk in time bounded by a function of the diameter of $G$. (3) Let $G$ be the incidence graph of a projective plane of order $n$. A projective plane $P$ of order $n$ is a collection of objects called "points" and sets of points called "lines" satisfying the following conditions:
1. Two points determine a unique line.
2. Two lines intersect in a unique point.
3. Every line consists of exactly n + 1 distinct points.
4. Every point lies on exactly n + 1 distinct lines.
Projective planes of order $n$ are known to exist for $n = p^a$ for any prime number $p$ and positive integer $a$ [13]. The incidence graph $G$ of $P$ is therefore a graph with $2(n^2 + n + 1)$ vertices, with adjacency relation $u \sim v$ if $u$ is a point in $P$ and $v$ is a line that goes through $u$, or vice versa. Such graphs have bounded diameter but unbounded expected capture time:
Lemma. $G$ contains no cycle of length less than 6.
Proof. Note that $G$ has no odd cycles by the independence of the set of points (and respectively, set of lines). Now assume for sake of contradiction that $G$ contains a cycle of length 4. Then there are two points $p_1, p_2$ and two lines $\ell_1, \ell_2$ such that $p_1, \ell_1, p_2, \ell_2, p_1$ forms a cycle. But this contradicts condition (2), since $\ell_1$ and $\ell_2$ must intersect in $p_1$ as well as $p_2$.
Lemma. $G$ is regular of degree $r = n + 1$.
Proof. By conditions (3) and (4).
Lemma. The expected capture time on $G$ is at least $r$.
Proof. When the cop gets to distance 2 from the drunk, the drunk has only one bad move out of $r$; the rest keep him at distance at least 2. (Similarly, if the cop gets to distance 1, bypassing ever being at distance 2, the drunk still has only one bad move out of $r$, the rest of which keep him at distance 1.) Hence the cop's expected capture time cannot be any lower than $r$ (the expected number of independent Bernoulli trials, each with success probability $1/r$, until success is achieved).
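The structural facts about the incidence graph can be checked on the smallest case. The sketch below builds the incidence graph of the Fano plane (the projective plane of order 2) and computes its degrees, diameter and girth by breadth-first search. The list `FANO_LINES` is one standard labeling of the seven lines, and the function names are illustrative; nothing here is taken from the original text beyond the definition of the incidence graph.

```python
from collections import deque

# One standard labeling of the 7 lines of the Fano plane on points 0..6.
FANO_LINES = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
              (1, 4, 6), (2, 3, 6), (2, 4, 5)]

def incidence_graph(lines, num_points):
    """Bipartite graph joining point ('p', i) to line ('l', j) when i lies on line j."""
    adj = {("p", i): [] for i in range(num_points)}
    for j, line in enumerate(lines):
        adj[("l", j)] = [("p", i) for i in line]
        for i in line:
            adj[("p", i)].append(("l", j))
    return adj

def bfs(adj, root):
    """Return distances from root and the shortest closed walk found through root."""
    dist = {root: 0}
    parent = {root: None}
    shortest = float("inf")
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
            elif parent[u] != w:
                # A non-tree edge closes a cycle of length at most this value.
                shortest = min(shortest, dist[u] + dist[w] + 1)
    return dist, shortest

def summarize(adj):
    """Return (set of vertex degrees, diameter, girth) of a connected simple graph."""
    results = [bfs(adj, v) for v in adj]
    degrees = {len(nbrs) for nbrs in adj.values()}
    diameter = max(max(dist.values()) for dist, _ in results)
    girth = min(cycle for _, cycle in results)
    return degrees, diameter, girth

if __name__ == "__main__":
    print(summarize(incidence_graph(FANO_LINES, 7)))
```

For order 2 the graph has $2(2^2 + 2 + 1) = 14$ vertices, and taking the minimum over all BFS roots gives the exact girth.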
On the other hand, it is not hard to verify that on any regular graph, the greedy cop (who minimizes her distance to the drunk at each move) wins in expected time at most linear in $n$.
If $G$ is regular of degree $r$, its diameter cannot exceed $\frac{3n-r-3}{r+1}$ [11]. Since the drunk will step toward the cop with probability at least $1/r$ at each move, resulting (after her response) in a decrease of 2 in their distance, the expected capture time is bounded by $r \cdot \mathrm{diam}(G)/2 < 3n/2$.
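The final inequality can be spelled out in one line. This is a worked restatement of the bound above, using only the diameter estimate quoted from [11] and the observation that $3n - r - 3 < 3n$:

```latex
\[
  \frac{r \cdot \mathrm{diam}(G)}{2}
  \;\le\; \frac{r}{2} \cdot \frac{3n - r - 3}{r + 1}
  \;<\; \frac{r}{r + 1} \cdot \frac{3n}{2}
  \;<\; \frac{3n}{2}.
\]
```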
The linear bound also holds on trees. To see this, we proceed by induction on the size of the tree, $n$. When $n = 2$, the capture time is clearly less than $n$ (since the drunk will run into the cop on his first move). Now suppose that on any tree with $t < n$ vertices, the expected capture time is at most $t$, and let $T$ be a tree on $n$ vertices, rooted at $c_0$ (the cop's initial position). For all descendants $v$ of $c_0$, let $T_v$ be the subtree of $T$ consisting of $v$ and all of its descendants. So the game begins on $T = T_{c_0}$, and after the first move, since the drunk cannot get "behind" the cop without being caught, the game is being played on $T_{c_1}$, where $c_1$ is the cop's position after one step. (Note that by the greedy strategy, $c_1$ is the unique neighbor of $c_0$ which is on the path from $c_0$ to $r_1$, the drunk's position after he takes his first step.) Since $|V(T_{c_1})| \le |V(T_{c_0})| - 1 = n - 1$, by the induction hypothesis the game takes no more than expected time $n - 1$ on $T_{c_1}$ and therefore no more than $n$ on $T$.
For general $G$, one can guarantee only that at a given point in time the drunk will step toward the cop with probability at least $1/\Delta$, where $\Delta$ is the maximum degree of $G$, giving a bound of order $n^2$ for the greedy cop.
That may appear to be a gross overestimate, especially in light of the special cases discussed above, but a graph with many high-degree vertices can still have large diameter. For example, consider the following graph. The "ladder" in this graph consists of two copies of the path $P_{n/4}$ with each pair of corresponding vertices connected by an edge. The "basement" consists of a complete bipartite graph, $K_{n/4,n/4}$. We begin with the drunk inside the basement, and the cop on the far end of the ladder. While the drunk is meandering inside the basement, the cop, staying true to her goal of minimizing the distance between her and the drunk at each step, is alternating between the two paths. Note that we assume she makes the foolish choice when she is presented with several options by her algorithm. It takes the drunk $n/4$ moves on average to leave the basement, and each time this occurs, the cop will decrease the distance by 2 by traveling along her current path. Therefore the capture will require an average of about $(n/4)^2/2$ steps.

Intuition
As noted in the example of the "ladder to the basement" graph in Section 2, a foolish greedy cop can be foiled by her desire to "retarget" too often. That is, since she updates the target vertex (to which she is trying to minimize her distance) at each step, she is made indecisive by an indecisive drunk. One natural solution to this problem would be to walk directly toward the robber's initial position in the basement for several steps before retargeting. Continuing in this way, the cop makes steady progress, ultimately catching the drunk in time less than n.
In general, if a cop and drunk begin at distance $d$ on a graph, and the cop proceeds by retargeting every four steps, then by Lemma 3.4 below, it would take an expected $4(4n^{2/3})(d - 3)$ moves to get down to distance less than four. Since $d$ can be as large as $n - 1$, this would not suffice to yield our promised bound of $n + o(n)$, so the cop must first do something else to get her distance to the drunk down without spending too much time doing so; hence the following four-stage strategy. We will refer to the progress made by the cop in the first two stages as "gross progress," and in the last two stages as "fine progress." In order to prove the bounds claimed above, it will be beneficial to have a few lemmas.

Gross Progress
Suppose that the drunk starts on vertex u and the cop starts at v. As noted in the set-up of the previous section, in the first stage of the cop's strategy, she is concerned only with getting to u (even if this may not decrease her distance from the drunk at the end of the stage).
Clearly the time this takes is equal to $T_1 = d(v, u) \le \mathrm{diam}(G)$. We would like to get a bound on $E[D_1]$, the expected distance between the cop and the drunk at the end of this stage. For that, the following lemma will prove quite useful.
Proof. Let $p_t(x, y)$ be the probability that a random walk that starts at vertex $x$ will be at vertex $y$ in exactly $t$ steps. The Varopoulos-Carne bound [12], as formulated in [9], says
$$p_t(x, y) \le \sqrt{e}\,\sqrt{\frac{\deg(y)}{\deg(x)}}\,\exp\left(-\frac{d(x, y)^2}{2t}\right),$$
where $d(x, y)$ is the graph distance between the two vertices. Therefore, if we consider the random walk $x_0, x_1, \dots, x_t$ on a graph of size $n$ and let $c \in \mathbb{R}$ be any constant, we have the following bound as a corollary of Varopoulos-Carne:
$$P\left(d(x_0, x_t) \ge c\sqrt{t}\right) < \sqrt{e}\, n^{3/2} \exp\left(-\frac{c^2}{2}\right).$$
Letting $c = \sqrt{1 + 5\log n}$ therefore yields that $P\left(d(x_0, x_t) \ge \sqrt{1 + 5\log n}\,\sqrt{t}\right) < \frac{1}{n}$.
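The tail bound at the end of the proof can be sanity-checked numerically. The sketch below rests on two assumptions that are not confirmed by the text: that $\log$ is the natural logarithm, and that the corollary has the form $\sqrt{e}\,n^{3/2}e^{-c^2/2}$ (the function `carne_tail`). It evaluates that expression at $c = \sqrt{1 + 5\log n}$ and also estimates, on a cycle, how often a simple random walk ends at distance at least $c\sqrt{t}$ from its start. All function names are illustrative.

```python
import math
import random

def vc_constant(n):
    """The constant c = sqrt(1 + 5 log n), with log assumed to be natural."""
    return math.sqrt(1 + 5 * math.log(n))

def carne_tail(n, c):
    """Assumed form of the corollary: sqrt(e) * n^(3/2) * exp(-c^2 / 2)."""
    return math.sqrt(math.e) * n ** 1.5 * math.exp(-c * c / 2)

def cycle_distance(pos, n):
    """Graph distance on the n-cycle between vertex 0 and vertex pos mod n."""
    return min(pos % n, (-pos) % n)

def exceed_fraction(n, t, trials, rng):
    """Fraction of t-step walks on the n-cycle ending at distance >= c * sqrt(t)."""
    threshold = vc_constant(n) * math.sqrt(t)
    exceed = 0
    for _ in range(trials):
        pos = 0
        for _ in range(t):
            pos += rng.choice((-1, 1))
        if cycle_distance(pos, n) >= threshold:
            exceed += 1
    return exceed / trials

if __name__ == "__main__":
    n, t = 200, 100
    print("closed form:", carne_tail(n, vc_constant(n)), "vs 1/n =", 1 / n)
    print("empirical:", exceed_fraction(n, t, 2000, random.Random(1)))
```

The empirical fraction is typically far below $1/n$, which is in line with the remark that the bound is not tight.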
This bound is not tight, but it will be good enough to give us the $o(n)$ bound we seek.
Recall that $D_1$ is the distance between the two players at the end of Stage 1. Note that this is equivalent to the distance between the drunk's initial position and his position at the end of Stage 1.
Now we are done with the "gross progress" that the cop makes in Stages 1 and 2. Note that the total expected time to complete these two stages is bounded by

Fine Progress
At the conclusion of Stage 2, the cop's approach changes. Now she retargets every 4 moves.
We make this notion precise in the following manner.
For each integer $j \ge 1$ let $x_j, y_j$ be the drunk's and cop's positions, respectively, at time $j$ (with it being the drunk's turn to move). Then in Stage 3, while $d(x_j, y_{j-1}) \ge 4$, for all $j$ of the form $4i + 1$ for some $i \ge 0$, the cop chooses $x_{4i+1}$ as her target and proceeds along a geodesic toward that target for the next four steps. Consequently, the cop's target changes every 4 moves, so that for each integer $i \ge 0$, she has target $x_{4i+1}$ at times $4i + 1$, $4i + 2$, $4i + 3$, and $4i + 4$. If at time $j = 4i + 1$, $d(x_j, y_{j-1}) < 4$, Stage 3 terminates and the cop's strategy moves into Stage 4, which will be described after the following lemma.
Lemma 3.4. Let $x_0, x_1, \dots$ be a simple random walk on a connected graph with $n$ vertices, and let $s = 4n^{2/3}$. Then $P(d(x_0, x_4) < 4) \ge 1/s$.
Before we prove this lemma, note that we could not get away with looking at the first three steps of a random walk. That is, we could not get a useful bound for $P(d(x_0, x_3) < 3)$.
Consider the following example: we have a graph $G$ with a vertex $x_0$. Let $A_k$ be the set of vertices at distance $k$ from $x_0$. Suppose that $G$ looks like Figure 2. We now return to the proof of Lemma 3.4.
Proof. We proceed by assuming a graph $G$ and a vertex $x_0 \in V(G)$ are such that there is a random walk $\{x_0, x_1, \dots\}$ with the property $P(d(x_0, x_4) < 4) < 1/s$, and we shall derive a contradiction.
Let $A_k$ be the set of vertices at distance $k$ from $x_0$, and let $a_k = |A_k|$ for all $k$. We adapt the terms in-degree and out-degree to mean the following: for $v \in A_k$, the in-degree of $v$ is $|N(v) \cap A_{k-1}|$ and the out-degree of $v$ is $|N(v) \cap A_{k+1}|$. We will use the notation $p_G$ for the quantity under investigation, $p_G = P(d(x_0, x_4) < 4)$, and for a vertex $v \in V(G)$, we define $p_k(v)$ to be the quantity $P(d(x_0, x_4) < 4 \mid x_k = v)$. Note that $p_0(x_0) = p_G$ and $p_k(v) = 1$ if $v \in A_j$ for some $j < k$. Finally, we call any step by the random walker that guarantees $d(x_0, x_4) < 4$ a "stall." We will break this proof into several statements.
Claim 1. Let $G'$ be the graph defined by removing all edges between $x_0$ and all but one vertex,
Proof. Since $p_G < 1/s$, there must exist a vertex $v \in A_1$ with $p_1(v) < 1/s$. Choose $x_1$ such that $p_1(x_1) = \min_{v \in A_1} p_1(v)$ and define $G'$ as in the statement of the claim. Note that $A_k$ and with all edges removed except those that are between a vertex in $A_{k-1}$ and a vertex in $A_k$ for $k \in [4]$.
Proof. Let $\hat{G}$ be the induced subgraph of $G'$ on the vertices $V(G') - \bigcup_{k>4} A_k$. Then since $P(x_t \in A_k) = 0$ when $t \le 4$ and $k \ge 5$ (so in particular, $p_{\hat{G}}$ and $p_{G'}$ depend only on the first four steps of a random walk originating at $x_0$), we have that $p_{\hat{G}} = p_{G'}$.
Now let $G''$ be derived from $\hat{G}$ by removing all edges except for those that are between $A_{k-1}$ and $A_k$.
(In particular, this means that for all $k \in [4]$, for all for all vertices $v$ with neighbors $w$ such that $d(x_0, v) = d(x_0, w)$ and does not change $p_k(v)$ for all vertices $v$ with no such neighbors. Since we have that $p_{G''} \le p_{\hat{G}}$.
In view of Claims 1 and 2 above, we may assume that $G$ has the following properties: $N_G(x_0) = \{x_1\}$, the only edges in $G$ are between $A_{k-1}$ and $A_k$ for $k \in [4]$, and $A_k = \emptyset$ for all $k > 4$.
is the harmonic mean of the $d_i$). Consequently, $d/a_2 > s$. Thus the average out-degree from $A_2$ is greater than $s - 1$, which implies that there are more than $s(s - 1)$ edges between $A_2$ and $A_3$.
Proof. Define $e_B$ to be the number of edges with one endpoint in $B$ and the other in $A_3$. Note that $c \le a_3 < n - a_2 \le n - 4n^{2/3}$ and consequently the number of edges with one endpoint in $A_2$ and the other in $C$ is less than $n^{4/3} - 4n$. Since more than half of the out-edges of each vertex in $B$ terminate in a vertex in $C$, this says that $e_B < 2n^{4/3} - 8n$.
Now assume, for sake of contradiction, that $b \ge \frac{1}{2} a_2$. Then $P(x_2 \in B) > 1/2$ so we have are the degrees of the vertices in $B$. For each vertex in $B$, The average out-degree from $B$ is $\frac{f}{b} - 1$ and so we get . This is a contradiction since $2n^{4/3} - 8n < 4n^{4/3} - 2n^{2/3}$.
Proof. If $x_2 \in A_2$ then with probability greater than $1/2$, $x_2 \in A_2 \setminus B$. By definition, more than half of the out-edges of a vertex in $A_2 \setminus B$ terminate in $A_3 \setminus C$, and $x_3$ is chosen uniformly at random from the neighbors of $x_2$. This yields and therefore Therefore the probability of stalling at step 3 is greater than $\frac{1}{4} \cdot \frac{n^{1/3}}{n} = 1/s$, yielding a contradiction.
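The quantity bounded in Lemma 3.4 is easy to compute exactly on small graphs. The sketch below is a hedged illustration rather than part of the proof: it propagates the four-step distribution of a simple random walk and sums the mass on vertices at distance less than 4 from the start. The threshold $1/s$ with $s = 4n^{2/3}$ is read off from the last line of the proof, and the names `prob_distance_below_four` and `path_graph` are illustrative.

```python
from collections import deque

def distances_from(adj, root):
    """Breadth-first-search distances from root in a connected graph."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def prob_distance_below_four(adj, x0):
    """Exact P(d(x_0, x_4) < 4) for a simple random walk started at x0."""
    dist = distances_from(adj, x0)
    prob = {x0: 1.0}
    for _ in range(4):
        nxt = {}
        for v, p in prob.items():
            share = p / len(adj[v])  # uniform choice among neighbors
            for w in adj[v]:
                nxt[w] = nxt.get(w, 0.0) + share
        prob = nxt
    return sum(p for v, p in prob.items() if dist[v] < 4)

def path_graph(n):
    """Adjacency lists of the path on vertices 0..n-1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

if __name__ == "__main__":
    n = 10
    p = prob_distance_below_four(path_graph(n), 0)
    print("P(d(x0, x4) < 4) =", p, " 1/s =", 1 / (4 * n ** (2 / 3)))
```

On a path started at an endpoint, the walk reaches distance 4 only by stepping outward three more times after its forced first step, so the probability of ending at distance below 4 is $1 - 1/8 = 7/8$, comfortably above $1/s$.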
Let $j = 4i + 1$ be such that the game is in Stage 3 at time $j$, and let $x_j, y_j$ be the positions of the drunk and cop, respectively, after both have moved (so that it is the drunk's turn). By Lemma 3.4 we have that $d(x_j, x_{j-4}) < 4$ with probability at least $\frac{1}{4} n^{-2/3}$. Consequently, since the cop had $x_{j-4}$ as her target, we now have $d(y_j, x_j) < d(y_{j-4}, x_{j-4})$ (so the distance has decreased by at least 1) with probability at least $\frac{1}{4} n^{-2/3}$. Let $Y_i$ be a random variable which equals the decrease in distance between time $4(i - 1)$ and $4i$. $Y_i$ is 0 with probability at most $1 - 1/s$ and is $\ge 1$ with probability at least $1/s$.
all $i$). Let $S_n = X_1 + \cdots + X_n$, for all $n \in \mathbb{N}$. Consider the random process $\{X_i : i \in \mathbb{N}\}$ with the stopping rule that says the process terminates at time $\tau$ if $S_\tau = D_2 - 3$. By Wald's identity [14], $E[\tau] \cdot E[X_1] = E[S_\tau] = D_2 - 3$.
Stage 3 terminates when the distance between the cop and the drunk is less than four, and it is the cop's turn. In Stage 4, which terminates when the drunk is captured, the cop uses the greedy strategy, defined as follows. Suppose that the strategy enters Stage 4 at time $t$, during which time the drunk is at vertex $x_t$ and the cop is about to move from vertex $y_{t-1}$.
Then $d(x_t, y_{t-1}) \le 3$, and the cop moves such that $d(x_t, y_t) \le 2$. Now for any $r > t$, if the drunk moves such that $d(x_r, y_{r-1}) = 3$, the cop can choose $y_r$ to ensure that $d(x_r, y_r) = 2$.
For each $r$, with probability at least $1/\Delta$, the drunk moves "toward" the cop, i.e., such that $d(x_r, y_{r-1}) = 1$; if that happens, the cop can choose $y_r = x_r$, capturing the drunk. This takes at most $\Delta$ expected moves, so $E[T_4] \le \Delta$, where $T_4$ is the time spent in Stage 4.
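Stages 3 and 4 as described above can be sketched as a small simulator. This is an illustrative sketch under stated assumptions, not the authors' implementation: it covers only the retargeting and greedy stages (Stages 1 and 2 are omitted), breaks ties between equally good neighbors by taking the first one found, and recomputes breadth-first-search distances at every step for clarity rather than speed. The names `chase`, `step_toward`, `distances_from` and `path_graph` are hypothetical.

```python
import random
from collections import deque

def distances_from(adj, root):
    """Breadth-first-search distances from root in a connected graph."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def step_toward(adj, cop, target):
    """Move to a neighbor of cop that is closest to target (ties: first found)."""
    dist = distances_from(adj, target)
    return min(adj[cop], key=lambda w: dist[w])

def chase(adj, cop, drunk, rng, period=4):
    """Moves until capture: retarget every `period` moves, then pursue greedily."""
    moves = 0
    greedy = False
    target = drunk
    while True:
        if not greedy and moves % period == 0:
            # Retargeting time: switch to greedy for good once the drunk is
            # within distance period - 1, otherwise aim at his current vertex.
            if distances_from(adj, cop)[drunk] < period:
                greedy = True
            else:
                target = drunk
        if greedy:
            target = drunk  # Stage 4: retarget at every move
        moves += 1
        cop = step_toward(adj, cop, target)
        if cop == drunk:
            return moves
        drunk = rng.choice(adj[drunk])  # uniformly random step
        if drunk == cop:
            return moves

def path_graph(n):
    """Adjacency lists of the path on vertices 0..n-1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

if __name__ == "__main__":
    print(chase(path_graph(30), 0, 29, random.Random(0)))
```

On a path with the players at opposite ends, the cop advances one vertex per move and the drunk cannot get past her, so the capture time is at most $n - 1$; since the distance shrinks by at most 2 per move, it is also at least about $n/2$.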
Adding together our results about the expected time to complete each of the four stages yields the promised bound of $n + o(n)$ on the expected capture time.
The reader may, for instance, have noticed that in the "ladder to the basement" example of Section 2, we considered a cop who was not only greedy but also rather insistently foolish.
What about the greedy cop who makes distance-minimizing decisions at random? The "ladder to the basement" graph is no longer a problem for her (the expected capture time in this example is now less than $n$). Is it possible that the greedy algorithm with disputes settled by a random decision between choices is enough to guarantee time $n + o(n)$?
It is also possible that a deterministic greedy cop who breaks ties by considering her distance to vertices previously occupied by the drunk will capture in expected time at most n + o(n).
An alternative greedy strategy, suggested by Andrew Beveridge [1], concerns itself with minimizing the drunk's expected hitting time to the cop at every step. It would be interesting to see if this strategy also has expected capture time at most n + o(n).

Acknowledgments
This work has benefited from conversations at Microsoft Research, in Redmond, Washington, with Omer Angel, Ander Holroyd, Russ Lyons, Yuval Peres, and David Wilson.