## Friday, May 10, 2013

### Sequences without identical neighbours

Last April, in the middle of an Australian Informatics Olympiad camp, I and a number of colleagues ran an intense team event for the high school students at the camp, where we threw lots of mathematical and general knowledge puzzles at them rapid-fire. The students were grouped in teams of three, and each time was given two computers, an internet connection, and barely even ten person-minutes per problem.

I believe Luke Harrison and I came up with this particular problem. We realised about two hours before the competition was due to start that it was even more interesting than we'd previously thought.

"How many sequences of positive integers are there such that (1) no two consecutive integers are the same (i.e. the sequence is "valid"), and (2) they sum to $N$?"

This can be calculated in $O(N^2)$ time using straightforward dynamic programming:

int nVS[N+1];
// nVS[x]      := number of valid sequences summing to x
int nWays[N+1][N+1];
// nWays[x][y] := number of valid sequences summing to x
//   and ending on y
nWays = 1;
nVS = 1;
for (int x = 1; x <= N; x++) {
for (int y = 1; y <= x; y++) {
nWays[x][y] = nVS[x-y] - nWays[x-y][y];
nVS[x] += nWays[x][y];
}
}
return nVS[N];


These sequences are called Carlitz Compositions [Kn98], and the number of sequences summing to $N$ (OEIS A003242) can be computed an alternate way:

Let $N_X$ be the number of valid sequences summing to $X$, and let $N'_{X,Y}$ be the number of valid sequences summing to $X$ and ending with $Y$. We have:  \begin{align} N_X &= \sum_{i=0}^{X} N'_{X,i} \\ N'_{X,Y} &= N_{X-Y} - N'_{X-Y,Y} \end{align}  ...which are in fact the definitions we used for our above DP. But instead consider what happens when we recursively expand the second equation:  \begin{align} N'_{X,Y} &= N_{X-Y} - N'_{X-Y,Y} \\ &= N_{X-Y} - N_{X-2Y} + N'_{X-2Y,Y} \\ &= \sum_{i=1}^\infty (-1)^{i+1} N_{X-iY} \end{align}  (This can be explained informally by borrowing some ideas from inclusion-exclusion: the number of sequences ending with $1$ is just the number of length $X-1$ sequences, minus the number of length $X-2$ sequences to account for anything ending with $1,1$, plus the number of length $X-3$ sequences to account for the non-existent ones ending with $1,1,1$ we erroneously subtracted, etc.)

This leads to a slower (but intriguing) $O(N^2 \lg N)$ solution:

int nVS[N+1];
// nVS[x]      := number of valid sequences summing to x
nVS = 1;
for (int x = 1; x <= N; x++) {
nVS[x] = 0;
for (int y = 1; y <= x; y++) {
for (int i = 1; i*y <= x; i++) {
nVS[x] += (i%2 ? 1 : -1) * nVS[x - i*y];
}
}
}
return nVS[N];

The $O(N \lg N)$ term in the inner loop is derived from the harmonic series much like the Sieve of Eratosthenes's time complexity.

We can bring this down to $O(N^2)$ by observing that the net number of times nVS[x-$k$] is added to nVS[x] is constant for all $x \geq k$, and simply precalculating this before the main loop (resulting in OEIS A048272: the number of odd factors of $n$ minus the number of even factors).

As far as I can tell from the literature, a more elegant closed form does not exist: even the canonical generating function for this sequence involves an unwieldy infinite sum in its definition.

(N.B. To their credit, one out of five student teams solved it despite the time pressure. They did not, however, win the grand prize of a handful of chocolate frogs.)

References:
[Kn98] Knopfmacher, Arnold, and Helmut Prodinger. "On Carlitz compositions." European Journal of Combinatorics 19.5 (1998): 579-589.