Phased Multi de Bruijn Sequences
Abstract
We introduce phased multi de Bruijn sequences, a generalization of de Bruijn sequences. A phased string is a string whose positions sequentially rotate through several alphabets; e.g., “0Ax1Ay1By0Az1Bx” rotates through alphabets $\Omega_0=\{0,1\}$, $\Omega_1=\{A,B\}$, and $\Omega_2=\{x,y,z\}$. We consider cyclic and linear phased strings in which all possible phased $k$-mers (phased strings of length $k$) occur with particular multiplicities, depending on their “phase” (the alphabet they start in). For example, consider the cycle (s)=(0Ax1Ay1By0Az0By1Bx0Ay0Bx1Bz1Ax0Bz1Az) in these alphabets. All possible phased 2-mers starting in phases 0, 1, and 2 respectively have multiplicities 3, 2, and 2; e.g., “0A” occurs three times, “Ax” occurs twice, and “z0” occurs twice (including the occurrence that wraps around the cycle). We determine parameters ($k$, number of phases, alphabet sizes, and multiplicities) for which this is possible. Then we count the total number of phased multi de Bruijn sequences for these parameters, both for cyclic and linear sequences. This extends classical de Bruijn sequences and multi de Bruijn sequences (our previous generalization of de Bruijn sequences in which all possible $k$-mers over one alphabet occur $m$ times each). Our method of counting the sequences uses a change of basis for the Laplacian matrix; this also gives a new proof for the number of classical de Bruijn sequences, as they are a special case of this framework.