Orthogonal basis for functions over a slice of the Boolean hypercube

We present an orthogonal basis for functions over a slice of the Boolean hypercube. Our basis is also an orthogonal basis of eigenvectors for the Johnson and Kneser graphs. As an application of our basis, we streamline Wimmer's proof of Friedgut's theorem for slices of the Boolean hypercube.


Introduction
Functions over the Boolean hypercube {0,1}^n are often studied using the tools of Fourier analysis (see O'Donnell's excellent recent monograph [19]). The crucial idea is to study functions from the point of view of the Fourier basis, an orthonormal basis of functions over the Boolean hypercube. In this work, we consider functions on a different domain, a slice of the Boolean hypercube $\binom{[n]}{k} = \{(x_1, \ldots, x_n) \in \{0,1\}^n : \sum_i x_i = k\}$; we always assume that k ≤ n/2. Such functions arise naturally in coding theory, in the context of constant-weight codes, and have recently started appearing in theoretical computer science as well. Here we provide an explicit orthogonal basis for the vector space of functions on a slice.
The slice has been studied in algebraic combinatorics under the name Johnson association scheme, and in spectral graph theory in relation to the Johnson and Kneser graphs. Our basis is the analog of the Fourier basis for the scheme, and it refines the decomposition induced by the primitive idempotents. Our basis is also an orthogonal basis for the eigenvectors of the Johnson and Kneser graphs, and any other graph belonging to the Bose-Mesner algebra of the Johnson association scheme. Such (weighted) graphs arise in Lovász's proof of the Erdős-Ko-Rado theorem [16], and in Wilson's proof [25] of a t-intersecting version of the theorem.
Despite the name, it is perhaps best to view the slice [n] k as the set of cosets of S k × S n−k inside S n . This point of view suggests "lifting" an orthogonal basis from the symmetric group to the slice. Following Bannai and Ito [1], the relevant representations of the symmetric group are those corresponding to partitions (n − d) + d for d ≤ k. Our basis arises from Young's orthogonal representation of the symmetric group. However, we present the basis and prove its properties without reference to the symmetric group at all. One feature that is inherited from the symmetric group is the lack of a canonical basis: our basis relies on the ordering of the coordinates.
Dunkl [2] showed that the space of functions over the slice [n] k can be identified with the space of multilinear polynomials in n variables x 1 , . . . , x n which are annihilated by the differential operator n i=1 ∂/∂x i ; the input variable x i is an indicator variable for the event that i belongs to the input set. Functions annihilated by this operator were termed harmonic by Dunkl [4]. Our basis forms an orthogonal basis for the space of harmonic multilinear polynomials for every exchangeable measure (a measure invariant under the action of S n ). As a consequence, we show how to lift a low-degree function from the slice [n] k to the Boolean cube (under an appropriate measure) while maintaining some of its properties such as expectation, variance and L 2 -norm.
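Harmonicity can be checked mechanically. The following sketch (the helper names are ours, not from the paper) represents a multilinear polynomial as a map from sets of variable indices to coefficients and applies the operator $\sum_{i=1}^n \partial/\partial x_i$:

```python
def d_sum(poly):
    """Apply the differential operator sum_i d/dx_i to a multilinear
    polynomial, stored as {frozenset of variable indices: coefficient}."""
    out = {}
    for mono, c in poly.items():
        for i in mono:  # d/dx_i removes x_i from monomials containing it
            key = mono - {i}
            out[key] = out.get(key, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def poly_mul(p, q):
    """Multiply two multilinear polynomials with disjoint variable supports."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            key = m1 | m2
            out[key] = out.get(key, 0) + c1 * c2
    return out

def diff(i, j):
    """The harmonic degree-1 polynomial x_i - x_j."""
    return {frozenset({i}): 1, frozenset({j}): -1}

# (x1 - x2)(x3 - x4) is annihilated by the operator, while x1*x2 is not.
f = poly_mul(diff(1, 2), diff(3, 4))
```

Applying `d_sum` to `f` returns the zero polynomial, while applying it to the single monomial x_1 x_2 does not.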
Wimmer [26] recently generalized a fundamental theorem of Friedgut [9] from the Boolean hypercube to the slice. Friedgut's theorem, sometimes known as Friedgut's junta theorem, states that a Boolean function on the Boolean hypercube with low total influence is close to a Boolean junta (a function depending on a small number of variables). Although Wimmer's main theorem is a statement about functions on the slice, Wimmer lifts the given function to the symmetric group, where most of his argument takes place, exploiting essential properties of Young's orthogonal representation. Eventually, a hypercontractive property of the slice (due to Lee and Yau [15]) is invoked to complete the proof. As an application of our basis, we give a streamlined version of Wimmer's proof in which our basis replaces the appeal to the symmetric group and to Young's orthogonal representation.
Note added in proof Since writing this paper, we have learned that the same basis has been constructed by Srinivasan [23] in a beautiful paper. Srinivasan in fact constructs an extended basis for the entire Boolean cube {0, 1} n , which he identifies with the canonical Gelfand-Tsetlin basis [24], and shows that it is orthogonal with respect to all exchangeable measures. However, he provides neither an explicit description of the basis elements, nor even a canonical indexing scheme for the basis elements. Instead, he gives a recursive algorithm that constructs the basis. We believe that both approaches have merit.
Related work Apart from Friedgut's theorem, several other classical results in Fourier analysis of Boolean functions have recently been generalized to the slice. O'Donnell and Wimmer [21,22] generalized the Kahn-Kalai-Linial theorem [12] to the slice, and deduced a robust version of the Kruskal-Katona theorem. Filmus [5] generalized the Friedgut-Kalai-Naor theorem [10] to the slice.
Filmus, Kindler, Mossel and Wimmer [6] and Filmus and Mossel [7] generalized the invariance principle [17] to the slice. The invariance principle on the slice compares the behavior of low-degree harmonic multilinear polynomials on a slice $\binom{[n]}{k}$ and on the Boolean hypercube {0,1}^n with respect to the corresponding product measure µ_{k/n}. If the harmonic multilinear polynomial f has degree d and unit variance, then the invariance principle states that for any Lipschitz functional ϕ, the expectation of ϕ(f) under the uniform distribution σ on the slice is close to its expectation under µ_{k/n}. The invariance principle can be used to lift results such as the Kindler-Safra theorem [13,14] and Majority is Stablest from the Boolean hypercube to the slice. Filmus and Mossel also give basis-free proofs for some of the results appearing in this paper. For example, they give a basis-free proof for the fact that the L^2-norm of a low-degree harmonic multilinear polynomial on the slice is similar to its L^2-norm on the Boolean cube under the corresponding product measure.
Synopsis We describe the space of harmonic multilinear polynomials in Section 2. Our basis is defined in Section 3, in which we also compute the norms of the basis elements. We show that our basis forms a basis for functions on the slice in Section 4, in which we also show how to lift low-degree functions from the slice to the entire hypercube, and explain why our basis is an orthogonal basis of eigenvectors for the Johnson and Kneser graphs. Section 5 and Section 6 are devoted to the proof of the Wimmer-Friedgut theorem.

Notation We use the notation [n] = {1, ..., n}. The cardinality of a set S is denoted |S|. If S ⊆ [n] and π ∈ S_n (the symmetric group on [n]) then S^π = {π(x) : x ∈ S}. We use the same notation in other similar circumstances. We compose permutations from right to left, so βα means apply α then β. We use the falling power notation $n^{\underline{k}} = n(n-1)\cdots(n-k+1)$ (the number of factors is k). For example, $\binom{n}{k} = n^{\underline{k}}/k!$. A function is Boolean if its values are in {0, 1}.
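The falling power and the binomial coefficient relate as stated; a trivial helper (ours) makes the convention concrete:

```python
from math import comb, factorial

def falling(n, k):
    """Falling power: n(n-1)...(n-k+1), a product of k factors."""
    result = 1
    for i in range(k):
        result *= n - i
    return result
```

For example, `falling(7, 3)` is 7 * 6 * 5 = 210, and dividing by 3! recovers the binomial coefficient C(7, 3) = 35.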

Harmonic multilinear polynomials
We construct our basis as a basis for the vector space of harmonic multilinear polynomials over x 1 , . . . , x n , a notion defined below. For simplicity, we only consider polynomials over R, but the framework works just as well over any field of characteristic zero.
We denote the vector space of harmonic multilinear polynomials over x 1 , . . . , x n by H n . The degree of a non-zero multilinear polynomial is the maximal number of variables in any monomial. We denote the subspace of H n consisting of polynomials of degree at most d by H n,d .
A polynomial has pure degree d if all its monomials have degree d. We denote the subspace of H_n consisting of polynomials of pure degree d by H'_{n,d}. The following lemma calculates the dimension of the vector space of harmonic multilinear polynomials of given degree.

Lemma 2.1. All polynomials in H_n have degree at most n/2. For d ≤ n/2, $\dim H_{n,d} = \binom{n}{d}$ and $\dim H'_{n,d} = \binom{n}{d} - \binom{n}{d-1}$, where $\binom{n}{-1} = 0$.

Proof. We start by proving the upper bound on the degree of polynomials in H_n. Let P ∈ H_n have degree deg P = d. The pure degree d part of P is also in H_n, and so we can assume without loss of generality that P has pure degree d. For any y = (y_1, ..., y_n), the univariate polynomial t ↦ P(t1 + y) (where 1 is the constant vector) doesn't depend on t, since its derivative with respect to t is $\sum_{i=1}^n (\partial P/\partial x_i)(t\mathbf{1} + y) = 0$. In particular, if M is any monomial in P with coefficient α ≠ 0 and y is the vector with y_i = -1 whenever x_i appears in M and y_i = 0 otherwise, then $P(y + \mathbf{1}) = P(y) = (-1)^d \alpha \neq 0$. Since y_i + 1 = 1 only for x_i not appearing in M, this shows that P must contain some monomial supported on variables not appearing in M. In particular, 2d ≤ n.

We proceed with the formula for H'_{n,d}; the formula for H_{n,d} easily follows. When d = 0, the formula clearly holds, so assume d ≥ 1. The vector space of all multilinear polynomials of pure degree d over x_1, ..., x_n has dimension $\binom{n}{d}$. Denote by cf(P, M) the coefficient of the monomial M in P. Harmonicity is the set of conditions $\sum_{i \notin M} \operatorname{cf}(P, x_i M) = 0$, one for each monomial M of degree d - 1. There are $\binom{n}{d-1}$ conditions, showing that $\dim H'_{n,d} \geq \binom{n}{d} - \binom{n}{d-1}$. In order to prove equality, we need to show that the conditions are linearly independent. We do this by showing that for each monomial N of degree d - 1 there is a polynomial of pure degree d satisfying all of the conditions except the one corresponding to N.

Definition 2.2. Denote by S_{n,d} the set of sequences of d distinct elements of [n]. For any two disjoint sequences A, B ∈ S_{n,d} we define $\chi_{A,B} = \prod_{i=1}^d (x_{a_i} - x_{b_i})$. The basis functions will be χ_{A,B} for appropriate A, B.
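The dimension count can be verified by brute force: build the matrix of harmonicity constraints on pure-degree-d multilinear polynomials and compute its rank over the rationals. A small sketch (the helper name is ours):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def harmonicity_rank(n, d):
    """Rank of the harmonicity constraints: one row per monomial M of
    degree d-1, with a 1 in the column of x_i * M for each i not in M."""
    cols = {S: j for j, S in enumerate(combinations(range(n), d))}
    rows = []
    for M in combinations(range(n), d - 1):
        row = [Fraction(0)] * len(cols)
        for i in set(range(n)) - set(M):
            row[cols[tuple(sorted(M + (i,)))]] = Fraction(1)
        rows.append(row)
    # Gaussian elimination over the rationals.
    rank, col = 0, 0
    while rank < len(rows) and col < len(cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                ratio = rows[r][col] / rows[rank][col]
                rows[r] = [a - ratio * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        col += 1
    return rank
```

For n = 5 and d = 2 the rank is C(5,1) = 5, so the kernel has dimension C(5,2) - C(5,1) = 5, as the lemma predicts.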
Definition 2.3. For d ≤ n/2, let A, B ∈ S_{n,d} be disjoint. We say that A is smaller than B, denoted A < B, if a_i < b_i for all i ∈ [d]. A sequence B ∈ S_n is a top set if B is increasing and A < B for some disjoint sequence A of the same length. The set of top sets of length d is denoted by B_{n,d}, and the set of all top sets is denoted by B_n.
The following lemma is mentioned without proof in [8].
We encode each sequence B ∈ S_{n,d} as a ±1 sequence β_0, ..., β_n as follows. We put β_0 = 1, and for j ∈ [n] we put β_j = -1 if j ∈ B and β_j = 1 otherwise. It is not hard to check that B is a top set iff all running sums of β are positive. Each sequence β is composed of d entries -1 and n - d + 1 entries 1. The probability that such a sequence has all running sums positive is given by the solution to Bertrand's ballot problem: it is $\frac{(n-d+1)-d}{(n-d+1)+d} = \frac{n+1-2d}{n+1}$. Therefore the total number of top sets of length d is
$$|B_{n,d}| = \frac{n+1-2d}{n+1}\binom{n+1}{d} = \binom{n}{d} - \binom{n}{d-1}.$$
We can now give Frankl and Graham's basis X_d, which appears in [8]. It remains to prove that X_d is linearly independent. For an increasing sequence S ∈ S_{n,d}, let Π(S) be the monomial $\prod_{i \in S} x_i$. Consider now the matrix representing X_d in the basis {Π(S) : S ∈ S_{n,d}} arranged in an order compatible with the partial order on S_{n,d}. The resulting matrix is in echelon form and so has full rank, showing that X_d is a linearly independent set.
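The count of top sets can be replayed by enumeration. The sketch below (our code) uses the condition b_i ≥ 2i, which is equivalent to the positive-running-sums criterion:

```python
from itertools import combinations
from math import comb

def is_top_set(B):
    """An increasing sequence B is a top set iff b_i >= 2i for all i
    (equivalently, all running sums of the +/-1 encoding are positive)."""
    return all(b >= 2 * (i + 1) for i, b in enumerate(B))

def count_top_sets(n, d):
    return sum(is_top_set(B) for B in combinations(range(1, n + 1), d))
```

For n = 6 this reproduces both the per-degree count C(n,d) - C(n,d-1) and, summing over d ≤ k, the telescoped total C(n,k).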
We comment that $\dim H'_{n,d} = \binom{n}{d} - \binom{n}{d-1}$ is the dimension of the irreducible representation of S_n corresponding to the partition (n - d, d), as easily calculated using the hook formula. This is not a coincidence: indeed, the set of standard Young tableaux of shape (n - d, d) is in bijection with B_{n,d} by identifying the top sets with the contents of the second row of each such tableau (the tableau can be completed uniquely).

Young's orthogonal basis
In this section we will construct an orthogonal basis for H n , and calculate the norms of the basis elements. Our basis will be orthogonal with respect to a wide class of measures.
The norm of f ∈ H_n is $\|f\| = \sqrt{\langle f, f \rangle}$.
We are now ready to define the basis.
For d ≤ n/2 and a top set B ∈ B_{n,d}, we define $\chi_B = \sum_{A < B} \chi_{A,B}$, where the sum ranges over all sequences A ∈ S_{n,d} disjoint from B with A < B. Young's orthogonal basis for H'_{n,d} is Y_{n,d} = {χ_B : B ∈ B_{n,d}}, and Y_n is the union of the sets Y_{n,d}. We stress that the sequences A in the definition of χ_B need not be increasing.
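The basis elements are easy to evaluate pointwise, and their orthogonality with respect to, say, the uniform measure on {0,1}^4 (an exchangeable measure) can be verified numerically. A sketch with our helper names:

```python
from fractions import Fraction
from itertools import permutations, product

def chi_B(B, n):
    """chi_B(x) = sum over sequences A disjoint from B with a_i < b_i
    of prod_i (x_{a_i} - x_{b_i}); indices are 1-based, x is a 0/1 tuple."""
    others = [i for i in range(1, n + 1) if i not in B]
    As = [A for A in permutations(others, len(B))
          if all(a < b for a, b in zip(A, B))]

    def f(x):
        total = 0
        for A in As:
            term = 1
            for a, b in zip(A, B):
                term *= x[a - 1] - x[b - 1]
            total += term
        return total
    return f

def inner(f, g, n):
    """<f, g> under the uniform measure on {0,1}^n."""
    pts = list(product((0, 1), repeat=n))
    return Fraction(sum(f(x) * g(x) for x in pts), len(pts))
```

For n = 4, the top sets of length at most 2 are (), (2), (3), (4), (2,4), (3,4), and the corresponding functions are pairwise orthogonal.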
The following theorem justifies the name "orthogonal basis".
Theorem 3.1. The set Y_n is an orthogonal basis for H_n with respect to any exchangeable measure. The set Y_{n,d} is an orthogonal basis for H'_{n,d} with respect to any exchangeable measure. In particular, the subspaces H'_{n,d} for different d are mutually orthogonal. Proof. Lemma 2.3 shows that each χ_B ∈ Y_{n,d} lies in H'_{n,d}. The technique used in proving the lemma shows that Y_{n,d} is a basis for H'_{n,d}, but this will also follow from Lemma 2.1 once we prove that the functions in Y_{n,d} are pairwise orthogonal. In fact, we will prove the following more general claim: if B_1, B_2 ∈ B_n and B_1 ≠ B_2 then χ_{B_1}, χ_{B_2} are orthogonal. This will complete the proof of the theorem.
Consider any B_1 ∈ B_{n,d_1} and B_2 ∈ B_{n,d_2} with B_1 ≠ B_2. Then
$$\langle \chi_{B_1}, \chi_{B_2} \rangle = \sum_{A_1 < B_1} \sum_{A_2 < B_2} \mathbb{E}[\chi_{A_1,B_1} \chi_{A_2,B_2}].$$
We call each of the terms χ_{A_1,B_1} χ_{A_2,B_2} appearing in this expression a quadratic product. We will construct a sign-flipping involution among the quadratic products, completing the proof. The involution is also allowed to have fixed points; in this case, the expectation of the corresponding quadratic product vanishes.

Consider a quadratic product χ_{A_1,B_1} χ_{A_2,B_2}. We can represent this quadratic product as a directed graph G on the vertex set A_1 ∪ B_1 ∪ A_2 ∪ B_2. For each factor x_i - x_j in the quadratic product, we draw an edge between i and j; all edges point in the direction of the larger vertex (the vertex having the larger index). We further annotate each edge with either 1 or 2, according to which of χ_{A_1,B_1}, χ_{A_2,B_2} it corresponds to. Every variable x_i appears in at most two factors, and so the total degree of each vertex is at most 2. Therefore the graph decomposes as an undirected graph into a disjoint union of paths and cycles. The annotations on the edges alternate on each connected component. Conversely, every directed graph G' in which edges point in the direction of the larger vertex, the total degree of each vertex is at most 2, and the annotations on the edges alternate in each connected component, is the graph corresponding to some quadratic product, which can be read off using the annotations on the edges.

Since B_1 ≠ B_2, some connected component must have a vertex with in-degree 1. Choose the connected component C satisfying this property having the largest vertex. We construct a sequence of intervals inside C, with the property that each of the endpoints x, y of each interval is either an endpoint of C, or is connected to the rest of C via a vertex z > x, y. Furthermore, each interval, other than possibly the last one, contains some vertex with in-degree 1. The sequence terminates with an interval containing an odd number of edges.
When C is a path, the first interval I_0 is the entire path. When C is a cycle with maximal vertex M, the first interval I_0 is the path obtained by removing M from C. Given an interval I_t with an even number of edges, we can break it into two (possibly empty) subintervals J_t and K_t, terminating at the maximal point M_t of I_t. Note that not both J_t, K_t can be empty since I_t contains some vertex with in-degree 1. If J_t is empty then we define I_{t+1} = K_t, which terminates the sequence. Similarly, if K_t is empty then we define I_{t+1} = J_t, which terminates the sequence. If both J_t, K_t are non-empty then at least one of them has a vertex with in-degree 1. We let I_{t+1} be the sub-interval among J_t, K_t with the larger maximal point.
Since the intervals decrease in size, the sequence eventually terminates at some interval I_t = v_1, ..., v_ℓ having an odd number of edges (so ℓ is even). We now consider two graphs obtained from G. The first graph G^π is obtained by applying the permutation π which maps v_i to v_{ℓ+1-i} and fixes all other vertices. The second graph G^r is obtained by detaching I_t from G, reversing it, and attaching it back to G; see Figure 1.
If we run the same construction on G^r then we get the same connected component C and the same sequence of intervals I_0, ..., I_t, and so (G^r)^r = G; that is, the mapping G → G^r is an involution.
The graphs G^π, G^r differ only in the direction of some edges: the edge between v_i and v_{i+1} in G^π has the same direction as the edge between v_{ℓ+1-i} and v_{ℓ-i} in G, which is the same as its direction in G^r. By construction, G^r (but not G^π) corresponds to some quadratic product, and the mapping G → G^r is a sign-flipping involution on the collection of all quadratic products, completing the proof.
In order to complete the picture, we need to evaluate the norms of the basis elements χ_B, which necessarily depend on the measure.

Theorem 3.2. Let B ∈ B_{n,d}, and let χ_d denote χ_{(2,4,\ldots,2d)}. With respect to any exchangeable measure, $\|\chi_B\|^2 = c_B \|\chi_d\|^2$, where $c_B = \prod_{i=1}^d \binom{b_i - 2i + 2}{2}$. In particular, with respect to the measure ν_p defined below, $\|\chi_B\|^2 = 2^d c_B (p(1-p))^d$.
Proof. We consider first the case in which the exchangeable measure is the measure ν_p for some p ∈ [0,1]. Under this measure, the variables x_1, ..., x_n are independent, with Pr[x_i = 1 - p] = p and Pr[x_i = -p] = 1 - p, so that E[x_i] = 0 and E[x_i^2] = p(1-p).

In the proof of Theorem 3.1 we associated a directed graph with each quadratic product χ_{A_1,B} χ_{A_2,B}: the vertices are A_1 ∪ A_2 ∪ B, and the edges point from a_{1,i} and a_{2,i} to b_i for each i ∈ [d], annotated by 1 or 2 according to whether they came from χ_{A_1,B} or from χ_{A_2,B}. Since each vertex appears at most twice, the graph decomposes as a disjoint union of paths and cycles. The edges point from A_1, A_2 to B, and so each vertex either has in-degree 0 (if it is in A_1 ∪ A_2) or in-degree 2 (if it is in B). Therefore the paths and cycles have the following forms, respectively:
$$\alpha_1 \to \beta_1 \leftarrow \alpha_2 \to \beta_2 \leftarrow \cdots \to \beta_\ell \leftarrow \alpha_{\ell+1}, \qquad \alpha_1 \to \beta_1 \leftarrow \alpha_2 \to \beta_2 \leftarrow \cdots \to \beta_\ell \leftarrow \alpha_1.$$
Here the α_i belong to A_1 ∪ A_2, and the β_i belong to B. The corresponding factors of χ_{A_1,B} χ_{A_2,B} are, respectively,
$$(x_{\alpha_1} - x_{\beta_1})(x_{\alpha_2} - x_{\beta_1}) \cdots (x_{\alpha_{\ell+1}} - x_{\beta_\ell}), \qquad (x_{\alpha_1} - x_{\beta_1})(x_{\alpha_2} - x_{\beta_1}) \cdots (x_{\alpha_1} - x_{\beta_\ell}).$$
We proceed to calculate the expectation of each of these factors under ν_p. The expectation of a monomial is zero unless each variable appears exactly twice, in which case the expectation is (p(1-p))^ℓ (since each monomial has total degree 2ℓ). In the case of a path, there is exactly one such monomial, namely $x_{\beta_1}^2 x_{\beta_2}^2 \cdots x_{\beta_\ell}^2$. In the case of a cycle, there are two such monomials: $x_{\beta_1}^2 \cdots x_{\beta_\ell}^2$ and $x_{\alpha_1}^2 \cdots x_{\alpha_\ell}^2$. Both monomials appear with unit coefficient. Notice that ℓ is the size of the subset of B appearing in the path or cycle. Hence the expectation of the entire quadratic product is $2^C (p(1-p))^d$, where C = C(A_1, A_2) is the number of cycles. In total, we get
$$\|\chi_B\|^2 = (p(1-p))^d \sum_{A_1, A_2 < B} 2^{C(A_1,A_2)}.$$
We proceed to show that
$$\sum_{A_1, A_2 < B} 2^{C(A_1,A_2)} = \prod_{i=1}^d (b_i - 2i + 2)(b_i - 2i + 1) = 2^d c_B.$$
The quantity on the right enumerates the sequences α_{1,1}, α_{2,1}, α_{1,2}, α_{2,2}, ..., α_{1,d}, α_{2,d} ∈ S_{n,2d} in which α_{1,i}, α_{2,i} ≤ b_i for all i ∈ [d], which we call legal sequences. We show how to map legal sequences into quadratic products in such a way that χ_{A_1,B} χ_{A_2,B} has exactly 2^{C(A_1,A_2)} preimages.

Let α_{1,1}, α_{2,1}, α_{1,2}, α_{2,2}, ..., α_{1,d}, α_{2,d} ∈ S_{n,2d} be a legal sequence. We construct a quadratic product χ_{A_1,B} χ_{A_2,B} alongside its associated directed graph. We maintain the following invariant: after having processed α_{1,i}, α_{2,i} (and so adding b_i to the graph), all vertices belonging to cycles have appeared in the sequence, and out of each path, exactly one vertex (from B) has not appeared previously in the sequence. Furthermore, the endpoints of each path do not belong to B (and so have no incoming edges) and have different annotations.
We start with the empty product and graph. At step i we process α_{1,i}, α_{2,i}. Suppose first that α_{1,i}, α_{2,i} ≠ b_i. If α_{1,i} ∉ B, then we set a_{1,i} = α_{1,i} and add an edge from α_{1,i} to b_i annotated 1. If α_{1,i} ∈ B then necessarily α_{1,i} < b_i, and so it appears in some component C. We locate the endpoint x of C whose incident edge is annotated 2, set a_{1,i} = x, and add an edge from x to b_i annotated 1. Note that the other endpoint of C is annotated 1. We do the same for α_{2,i}. It is routine to check that we have maintained the invariant.
Suppose next that one of α_{1,i}, α_{2,i} equals b_i, say α_{2,i} = b_i. We process α_{1,i} as in the preceding step. Let x be the other endpoint of the path containing b_i. We set a_{2,i} = x and connect x to b_i, completing the cycle. This completes the description of the mapping.
We proceed to describe the multivalued inverse mapping, from a quadratic product to a legal sequence. We process the quadratic product in d steps, updating the graph by removing each vertex mentioned in the legal sequence. We maintain the invariant that each original path remains a path, and each original cycle either remains a cycle or disappears after processing the largest vertex. Furthermore, each edge still points at the larger vertex, the annotations alternate, a vertex b_i not yet processed has two incoming edges, and other vertices have no incoming edges.
At step i, we process b_i. Let x_1, x_2 be the neighbors of b_i annotated 1, 2, respectively. Suppose first that x_1 ≠ x_2. We put α_{1,i} = x_1 and α_{2,i} = x_2, and remove the vertices x_1, x_2. The other neighbors of x_1, x_2, if any, are connected to b_i with edges pointing away from b_i, with annotations 2, 1, respectively. These neighbors had incoming edges and so are b_j, b_k for some j, k > i. It follows that the invariant is maintained. The case x_1 = x_2 corresponds to a cycle whose largest vertex is b_i. We either put α_{1,i} = x_1 and α_{2,i} = b_i, or α_{1,i} = b_i and α_{2,i} = x_2, deleting the entire cycle in both cases.
It is routine to check that the two mappings we have described are inverses. Furthermore, the multivalued mapping from quadratic products to legal sequences has valency 2^{C(A_1,A_2)} when processing χ_{A_1,B} χ_{A_2,B}. This completes the proof of the formula for $\sum_{A_1,A_2 < B} 2^{C(A_1,A_2)} = 2^d c_B$.
Having considered the measure ν_p, we consider a related measure µ_p. Under this measure the x_i are independent, with Pr[x_i = 0] = 1 - p and Pr[x_i = 1] = p. Since χ_B depends only on the differences x_i - x_j, shifting every coordinate by p maps µ_p to ν_p without changing χ_B, and so the formula for $\|\chi_B\|^2$ holds under µ_p as well.

Consider now a general exchangeable measure. Exchangeability implies that the expectation of a quadratic product depends only on the structure of its associated graph, and the same counting argument yields $\|\chi_B\|^2 = 2^d c_B N_d$ for some quantity N_d depending only on the measure and on d. In order to evaluate N_d, we consider B = (2, 4, ..., 2d). In this case, b_i = 2i and so c_B = 1. We conclude that $2^d N_d = \|\chi_d\|^2$, and the theorem follows.
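The norm formula under ν_p can be sanity-checked by exact enumeration. The sketch below (our code) compares E[χ_B^2], with ν_p realized as independent variables taking the value 1-p with probability p and -p otherwise, against 2^d c_B (p(1-p))^d, where c_B is the product of the binomial coefficients C(b_i - 2i + 2, 2) appearing in the counting argument above:

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb

def chi_B_value(B, x):
    """Evaluate chi_B = sum_{A < B} prod_i (x_{a_i} - x_{b_i}) at x (1-indexed)."""
    n = len(x)
    others = [i for i in range(1, n + 1) if i not in B]
    total = 0
    for A in permutations(others, len(B)):
        if all(a < b for a, b in zip(A, B)):
            term = 1
            for a, b in zip(A, B):
                term *= x[a - 1] - x[b - 1]
            total += term
    return total

def norm_sq_nu(B, n, p):
    """E[chi_B^2] under nu_p: independent x_i in {1-p, -p}, Pr[x_i = 1-p] = p."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=n):
        x = [1 - p if b else -p for b in bits]
        prob = Fraction(1)
        for b in bits:
            prob *= p if b else 1 - p
        total += prob * chi_B_value(B, x) ** 2
    return total

def predicted(B, p):
    d = len(B)
    c_B = 1
    for i, b in enumerate(B, start=1):
        c_B *= comb(b - 2 * i + 2, 2)
    return (p * (1 - p)) ** d * 2 ** d * c_B
```

For instance, with p = 1/3 the top set B = (3) gives 6p(1-p) = 4/3, and B = (2,4) gives 4p^2(1-p)^2.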
Having verified that Y n is a basis for H n , we can define the corresponding expansion. The following simple lemma gives standard properties of this expansion.
Lemma 3.3. Let f ∈ H_n and write $f = \sum_{B \in B_n} \hat f(B) \chi_B$. With respect to any exchangeable measure, $\mathbb{E}[f] = \hat f(\emptyset)$ and $\|f\|^2 = \sum_B \hat f(B)^2 \|\chi_B\|^2$, where ∅ is the empty sequence.

In particular, if f ∈ H_n then for every exchangeable measure we can write $\mathbb{V}[f] = \sum_{B \neq \emptyset} \hat f(B)^2 \|\chi_B\|^2$, where the coefficients $\hat f(B)$ depend only on f and not on the measure. Filmus and Mossel [7] give a basis-free proof of this important fact.
The familiar Fourier basis for the Boolean hypercube gives a simple criterion for a function to depend on a variable. The matching criterion in our case is also simple but not as powerful.

Slices of the Boolean hypercube
Harmonic multilinear polynomials appear naturally in the context of slices of the Boolean hypercube. For k ≤ n/2, the slice is $\binom{[n]}{k} = \{(x_1, \ldots, x_n) \in \{0,1\}^n : \sum_{i=1}^n x_i = k\}$.
We also identify $\binom{[n]}{k}$ with the set of subsets of [n] of cardinality k. We endow the slice $\binom{[n]}{k}$ with the uniform measure, which is clearly exchangeable.
A function over the slice is a function f : [n] k → R. Every function f : R n → R can be interpreted as a function on the slice in the natural way.
We proceed to show that Y_{n,0} ∪ ··· ∪ Y_{n,k} is an orthogonal basis for the slice $\binom{[n]}{k}$.

Theorem 4.1. Let n and k ≤ n/2 be integers, and put p = k/n. The set {χ_B : B ∈ B_{n,d} for some d ≤ k} is an orthogonal basis for the vector space of functions on the slice, and for B ∈ B_{n,d},
$$\|\chi_B\|^2 = c_B \cdot \frac{2^d \binom{n-2d}{k-d}}{\binom{n}{k}} \approx c_B (2p(1-p))^d.$$

Proof. We prove the formula and estimate for $\|\chi_B\|^2$ below. When d ≤ k, the formula shows that $\|\chi_B\|^2 \neq 0$ and so χ_B ≠ 0 as a function on the slice. Hence the set {χ_B : B ∈ B_{n,d} for some d ≤ k} consists of non-zero mutually orthogonal vectors. Lemma 2.2 shows that the number of vectors in this set is $\binom{n}{k}$, matching the dimension of the vector space of functions on the slice. Hence this set constitutes a basis for the vector space of functions on the slice.
In order to prove the formula for $\|\chi_B\|^2$, we need to compute $\|\chi_d\|^2 = \mathbb{E}_\sigma[\chi_d^2]$, where $\chi_d = \prod_{i=1}^d (x_{2i-1} - x_{2i})$. The quantity $\chi_d(S)^2$ vanishes unless S contains exactly one out of each pair {2i-1, 2i}, in which case it has value 1. Therefore
$$\|\chi_d\|^2 = \frac{2^d \binom{n-2d}{k-d}}{\binom{n}{k}}.$$
This yields the formula for $\|\chi_B\|^2$. We can estimate this expression as follows:
$$\frac{2^d \binom{n-2d}{k-d}}{\binom{n}{k}} = 2^d \, \frac{k^{\underline{d}} (n-k)^{\underline{d}}}{n^{\underline{2d}}} \approx (2p(1-p))^d,$$
implying the stated estimate.
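This slice computation can be replayed by direct enumeration (the helper below is ours):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def chi_d_sq_slice(n, k, d):
    """E[chi_d^2] over the uniform slice, where chi_d = prod (x_{2i-1} - x_{2i});
    the square equals 1 iff S contains exactly one of each pair {2i-1, 2i}."""
    hits = 0
    for S in combinations(range(1, n + 1), k):
        s = set(S)
        if all(len(s & {2 * i - 1, 2 * i}) == 1 for i in range(1, d + 1)):
            hits += 1
    return Fraction(hits, comb(n, k))
```

For n = 4, k = 2, d = 1 this gives 4/6 = 2/3, matching 2 * C(2,1) / C(4,2).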

The theorem shows that every function on $\binom{[n]}{k}$ can be represented uniquely as a harmonic multilinear polynomial of degree at most k (see Filmus and Mossel [7] for a basis-free proof). The quantity (2p(1-p))^d is the squared norm of χ_d under the µ_p measure (defined below) over the entire Boolean hypercube, and this allows us to lift low-degree functions from the slice to the entire Boolean hypercube while maintaining properties such as the expectation and variance: given a function f on the slice, represented as a harmonic multilinear polynomial of degree at most k, we let $\tilde f$ denote the same polynomial viewed as a function on {0,1}^n. We endow {0,1}^n with the measure µ_p given by $\mu_p(x_1, \ldots, x_n) = p^{\sum_i x_i} (1-p)^{n - \sum_i x_i}$.

Proof. The definition of $\tilde f$ directly implies that $\hat{\tilde f}(B) = \hat f(B)$ for all B ∈ B_n with |B| ≤ k, and furthermore $\hat{\tilde f}(B) = 0$ for |B| > k. In order to prove the estimate on the norms, we compute the squared norm of χ_ℓ with respect to µ_p: $\|\chi_\ell\|_{\mu_p}^2 = (2p(1-p))^\ell$. Denoting by σ the uniform measure on $\binom{[n]}{k}$, Lemma 3.3 and Theorem 4.1 imply that $\|f\|_\sigma^2 \approx \sum_B \hat f(B)^2 c_B (2p(1-p))^{|B|} = \|\tilde f\|_{\mu_p}^2$, implying the estimate on the norms. The estimate on the variance follows analogously.
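Since E[χ_B] vanishes for every nonempty B under any exchangeable measure (χ_B is orthogonal to χ_∅ = 1), the lift preserves the expectation exactly. A quick check with the sample function f = 2 + 3(x_1 - x_2) (our example, not from the paper):

```python
from fractions import Fraction
from itertools import combinations, product

def f(x):
    """The harmonic polynomial 2 + 3*(x_1 - x_2), evaluated at a 0/1 tuple."""
    return 2 + 3 * (x[0] - x[1])

def expect_slice(n, k):
    """Expectation under the uniform measure on the slice."""
    pts = [tuple(1 if i in S else 0 for i in range(n))
           for S in combinations(range(n), k)]
    return Fraction(sum(f(x) for x in pts), len(pts))

def expect_cube(n, p):
    """Expectation under the product measure mu_p on {0,1}^n."""
    total = Fraction(0)
    for x in product((0, 1), repeat=n):
        prob = Fraction(1)
        for xi in x:
            prob *= p if xi else 1 - p
        total += prob * f(x)
    return total
```

Both expectations equal the coefficient of χ_∅, here 2, regardless of k and p.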
Johnson association scheme The Johnson association scheme is an association scheme whose underlying set is $\binom{[n]}{k}$. Instead of describing the scheme itself, we equivalently describe its Bose-Mesner algebra.

Definition 4.2. Let n, k be integers such that k ≤ n/2. A square matrix M indexed by $\binom{[n]}{k}$ belongs to the Bose-Mesner algebra of the (n, k) Johnson association scheme if M_{S,T} depends only on |S ∩ T|.
While it is not immediately obvious, the Bose-Mesner algebra is indeed an algebra of matrices, that is, it is closed under multiplication. Furthermore, it is a commutative algebra, and so all matrices have common eigenspaces. In particular, the algebra is spanned by a basis of primitive idempotents J_0, ..., J_k. As the following lemma shows, these idempotents correspond to the bases Y_{n,0}, ..., Y_{n,k}.

Lemma 4.3. Let n, k be integers such that k ≤ n/2, and let M belong to the Bose-Mesner algebra of the (n, k) Johnson association scheme. The bases Y_{n,0}, ..., Y_{n,k} span the eigenspaces of M.
Proof. Since all matrices in the Bose-Mesner algebra have the same eigenspaces, it is enough to consider a particular matrix in the algebra which has k + 1 distinct eigenvalues. Let M be the matrix corresponding to the linear operator $f \mapsto \sum_{i < j} f^{(i\,j)}$, which applies every transposition to f and sums the results. More explicitly, it is not hard to calculate that $M(S, S) = \binom{k}{2} + \binom{n-k}{2}$, M(S, T) = 1 if |S ∩ T| = k - 1, and M(S, T) = 0 otherwise. In particular, M belongs to the Bose-Mesner algebra. Lemma 5.5, which we prove in Section 5, shows that for d ≤ k, the subspace spanned by Y_{n,d} is an eigenspace of M corresponding to the eigenvalue $\binom{n}{2} - d(n + 1 - d)$. All eigenvalues are distinct (since the maximum of the parabola d(n + 1 - d) is at d = (n + 1)/2), completing the proof.
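The eigenvalue claim can be verified on a small case. The sketch below (our helpers) applies the transposition-sum operator to χ_(2) = x_1 - x_2 on the slice with n = 4, k = 2, where the claimed eigenvalue is C(4,2) - 1·(4+1-1) = 2:

```python
from itertools import combinations

def apply_M(func, n, k):
    """(M f)(S) = sum over all transpositions (i j) of f(S with i, j swapped)."""
    out = {}
    for S in combinations(range(1, n + 1), k):
        s, total = set(S), 0
        for i in range(1, n + 1):
            for j in range(i + 1, n + 1):
                t = set(s)
                if (i in t) != (j in t):  # the transposition actually moves S
                    t ^= {i, j}
                total += func(frozenset(t))
        out[frozenset(S)] = total
    return out

def chi2(S):
    """chi_(2) = x_1 - x_2 as a function of the k-subset S."""
    return (1 in S) - (2 in S)
```

Running `apply_M(chi2, 4, 2)` returns 2 * chi2(S) for every 2-subset S of [4].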
As an immediate corollary, we deduce that Y_{n,d} is an orthogonal basis for the dth eigenspace of the (n, k) Johnson graph and the (n, k) Kneser graph, as well as of any other graph on $\binom{[n]}{k}$ in which the weight of an edge (S, T) depends only on |S ∩ T|. In the Johnson graph, two sets are connected if their intersection has size k - 1, and in the Kneser graph, two sets are connected if they are disjoint.
The slice $\binom{[n]}{k}$ can be identified with the set of cosets of S_k × S_{n-k} inside S_n. Bannai and Ito [1], following Dunkl [3,4], use this approach to determine the idempotents of the Johnson association scheme from the representations of the symmetric group S_n. They obtain the idempotent J_d from the representation corresponding to the partition (n - d, d). The basis Y_{n,d} can be derived from Young's orthogonal representation corresponding to the partition (n - d, d), but we do not develop this connection here; such a construction appears in Srinivasan [23].

Influences
Throughout this section, we fix some arbitrary exchangeable measure. All inner products and norms are with respect to this measure.
One of the most important quantities arising in the analysis of functions on the hypercube is the influence. In this classical case, influence is defined with respect to a single coordinate. In our case, the basic quantity is the influence of a pair of coordinates.
Given f ∈ H_n and distinct i, j ∈ [n], the function f^{(i j)} is obtained from f by switching x_i and x_j. The influence of the pair (i, j) is $\mathrm{Inf}_{i,j}[f] = \|f - f^{(i\,j)}\|^2$, and for m ≤ n we put $\mathrm{Inf}_m[f] = \sum_{i < j \leq m} \mathrm{Inf}_{i,j}[f]$. When m = n, we call the resulting quantity the total influence, denoted Inf[f].
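Pair influences on a small slice can be computed by enumeration; the sketch below (our code) takes the influence of a pair as the squared norm of f - f^{(i j)} under the uniform measure on the slice, a normalization that cancels from both sides of the triangle inequality with constant 9/2 discussed next:

```python
from fractions import Fraction
from itertools import product

def swap(x, i, j):
    """Switch coordinates i and j (0-indexed) of the tuple x."""
    y = list(x)
    y[i], y[j] = y[j], y[i]
    return tuple(y)

def influence(f, i, j, pts):
    """||f - f^{(i j)}||^2 under the uniform measure on the point set pts."""
    return Fraction(sum((f(x) - f(swap(x, i, j))) ** 2 for x in pts), len(pts))

# A sample function on the slice of {0,1}^4 with exactly two ones.
pts = [x for x in product((0, 1), repeat=4) if sum(x) == 2]
f = lambda x: x[0] * x[1] + 2 * x[2]
```

For this f, the pair (0,1) has zero influence (f is symmetric in its first two coordinates), while the pairs (0,2) and (1,2) both have influence 5/3, and the 9/2 bound holds comfortably.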
We start with a triangle inequality for influences (cf. [26, Lemma 5.4] for the Boolean case, in which the constant 9/2 can be improved to 3/2).

Lemma 5.1. Let f ∈ H_n. For distinct i, j, k ∈ [n] we have
$$\mathrm{Inf}_{i,j}[f] \leq \tfrac{9}{2}\left(\mathrm{Inf}_{i,k}[f] + \mathrm{Inf}_{j,k}[f]\right).$$

Proof. The Cauchy-Schwarz inequality implies that (a+b+c)^2 ≤ 3(a^2+b^2+c^2). Since (i j) = (i k)(j k)(i k), we obtain
$$\mathrm{Inf}_{i,j}[f] \leq 3\left(2\,\mathrm{Inf}_{i,k}[f] + \mathrm{Inf}_{j,k}[f]\right).$$
The lemma is obtained by averaging with the similar inequality in which the roles of i and j are exchanged.

As a consequence of the triangle inequality, we can identify a set of "important" coordinates for functions with low total influence (cf. [26, Proposition 5.3]). Our goal in the rest of this section is to give a formula for Inf_m[f]. Our treatment closely follows Wimmer [26]. We start with a formula for f^{(m m+1)}.
It might happen that B ∈ B_n but B^{(m m+1)} ∉ B_n, but in that case, the coefficient in front of $\hat f(B^{(m\,m+1)})$ vanishes.
Proof. For brevity, define π = (m m+1). We start by showing that if B ∈ B_n but B^π ∉ B_n then the coefficient in front of $\hat f(B^\pi)$ vanishes. Clearly, this case can happen only if m + 1 ∈ B, say b_{i+1} = m + 1. We conclude that m = 2i + 1, and indeed the corresponding coefficient vanishes in this case. This shows that the expressions appearing in the expansion of f^π are well-defined.
Consider some B ∈ B_n. If m, m+1 ∉ B then A < B iff A^π < B, and so χ_B^π = χ_B. If m, m+1 ∈ B, say b_i = m and b_{i+1} = m+1, then χ^π_{A,B} = χ_{A',B}, where A' is obtained from A by switching a_i and a_{i+1}. Again A < B iff A' < B, and so χ^π_B = χ_B as in the preceding case. Finally, consider some set B ∈ B_{n,d} such that B^π ∈ B_n, b_{i+1} = m and m + 1 ∉ B. The set B doesn't necessarily belong to B_n, and in that case we define χ_B = 0; under this convention, the formula $\chi_B = \sum_{A < B} \chi_{A,B}$ still holds (vacuously). We define a function φ which maps a sequence A < B to a sequence φ(A) < B^π so that equation (1) below holds:
$$\chi^\pi_{A,B} = \pm \chi_{\varphi(A), B^\pi}. \tag{1}$$
The function φ is given by φ(A) = A if m + 1 ∉ A, and φ(A) = (a_1, ..., a_i, m, a_{i+2}, ..., a_j, a_{i+1}, a_{j+2}, ..., a_d) if a_{j+1} = m + 1.
Since b i+1 = m and A < B, in the second case necessarily j > i. It is not hard to verify that indeed φ(A) < B π .
We proceed to verify equation (1). Suppose first that m + 1 ∉ A. Then χ^π_{A,B} = χ_{A,B^π}. Suppose next that a_{j+1} = m + 1. Then a direct computation with the factors involving x_m and x_{m+1} gives the claimed identity. This completes the proof of equation (1).
As a simple corollary, we obtain a version of Poincaré's inequality.
Lemma 5.6. For any f ∈ H_{n,d} we have $2n\,\mathbb{V}[f] \leq \mathrm{Inf}[f] \leq 2dn\,\mathbb{V}[f]$. The left inequality follows from |B|(n + 1 - |B|) ≥ n, and the right inequality from n + 1 - |B| ≤ n.
Friedgut's theorem can be seen as an "inverse theorem" corresponding to the easy fact that a Boolean function depending on d coordinates has total influence at most d. Since the total influence is bounded by the degree, Friedgut's theorem can also be seen as a strengthening of the result of Nisan and Szegedy [18] that a Boolean function of degree d depends on at most $d 2^{d-1}$ coordinates.
Wimmer [26] proved an analog of Friedgut's theorem for functions on a slice of the Boolean cube. His proof takes place mostly on the symmetric group, and uses properties of Young's orthogonal representation. We rephrase his proof in terms of Young's orthogonal basis for the slice.
The proof relies crucially on a hypercontractivity property due to Lee and Yau [15]. Before stating the property, we need to define the noise operator.
Definition 6.1. The Laplacian operator on functions f ∈ H_n is given by $Lf = \binom{n}{2}^{-1} \sum_{i < j} (f - f^{(i\,j)})$. The noise operator H_t is given by $H_t = e^{-tL}$.
The Laplacian corresponds to the Markov chain applying a random transposition (i j). Moreover, L = I - K, where K is the transition matrix of the Markov chain. We can expand the noise operator as
$$H_t = e^{t(K - I)} = e^{-t} \sum_{\ell=0}^\infty \frac{t^\ell}{\ell!} K^\ell.$$
In words, H t corresponds to applying P (t) many random transpositions (i j), where P (t) is the Poisson distribution with mean t.
Lemma 5.5 gives a formula for Lf and H_t f. The hypercontractivity result of Lee and Yau [15] gives, for all p < q, a value of t such that $\|H_t f\|_q \leq \|f\|_p$.
Proposition 6.2. Let n, k be integers such that 1 ≤ k ≤ n - 1. The log-Sobolev constant ρ of the Markov chain corresponding to the Laplacian L satisfies
$$\rho^{-1} = \Theta\left(n \log \frac{n^2}{k(n-k)}\right).$$
Proof. The first result is [15, Theorem 5]. Their parameter t is scaled by a fraction of n. Furthermore, their log-Sobolev constant is the reciprocal of ours. The second result is due to Gross [11].

For an appropriate choice of C, the second term is at most ε/2, completing the proof.