September 5, 2020 2020 Topcoder Algorithm Round 4 Editorials
TCO20 round 4 editorial
The key observation is that if we have A+B taken seats, we will have RC-A-B seats that will remain empty. We can therefore look at the original problem as a problem in which we have to place the empty seats as barriers in a way that produces components of seats with correct size.
Without loss of generality, assume R <= C. If we have at least R empty seats available, we can construct a solution for any such pair (A, B) by using R empty seats to form a barrier that connects the top and bottom side of the cinema. Shown below: a 5×7 cinema divided this way into components of size 7 and 23, with ‘E’ denoting seats designated to be left empty.
..E.... ..E.... .E..... .E..... .E.....
A very simple way of constructing these seatings is to go through all seats in column major order, placing all the ‘A’s, then all the empty seats (at least R of them), and finally all the ‘B’s.
If we have fewer than R empty seats, we can no longer connect the top and bottom of the cinema. In this setting, regardless of which seats we designate as empty, there will always be one large component that will contain a majority of available seats. Then, the question is how big we can make the smaller component. (Or, more precisely, the total size of all other components – but it’s fairly easy to see we never want more than one of those.) And the answer is that the optimal solution is to cut off a triangular area in one corner:
...E... ..E.... .E..... E...... .......
Given E empty seats, this small component has size E(E-1)/2. A test case (A, B) with A <= B and RC-A-B = E < R is solvable iff A does not exceed this maximum value. As before, we have an easy way of constructing these solutions – this time by going along diagonals of the cinema. If we have fewer ‘A’s than the maximum size of the triangle, we will stop placing them sooner, but they will then be followed by a sufficient number of empty seats anyways. Below we show a partial solution in which we placed just 5 ‘A’s, followed by four empty seats.
AAAE... AAE.... EE..... ....... .......
Carefully implemented solutions may solve all test cases using the diagonal pattern only, but our tests have shown that doing it that way is more error-prone 🙂
The solution described below is polynomial in the number of digits of N, so it could also solve inputs much bigger than the ones that satisfy the problem’s constraints. The reason why such inputs were not allowed is actually to make the problem a bit harder: large inputs already tell you what type of solution you must look for.
There are various ways of counting the inversions efficiently, but all of the ones we are aware of are similar in the sense that we are grouping the inversions by the lengths of the numbers involved in it.
This comes quite naturally from the observation that numbers with the same number of digits maintain the same relative order when sorted as strings. Thus, each inversion must have the form “a number with more digits to the left of a number with fewer digits”.
Most of these groups of inversions are complete and they are easy to count. For example, if N has 7 digits, we have all the possible inversions of the form “a 6-digit number to the left of a 4-digit number”.
These inversions are quite easy to count. Let L and S be the number of digits in the longer and in the shorter number, respectively. If we fix the shorter number x, we know that the longer number can be precisely any number whose length-S prefix is less than x, and thus their count is (the number of such prefixes) times (the number of ways to fill in the remaining digits of the longer number). Summing this over all x is simply computing the sum of an arithmetic series.
The only complicated part are inversions that involve, on the left, a number with the same number of digits as N. To handle these easily, we will iterate over minimal prefixes that are guaranteed to be smaller than N, and count the number of inversions for each of those separately. For example, if N=2123, we will break the numbers with the same number of digits as N into the following groups: 1???, 20??, 210?, 211?, 2120, 2121, 2122, and 2123.
Thus, the general counting problem we now need to solve is the following one: given that the longer number has L digits and a given prefix P, and that the shorter number has S digits, how many such inversions are there? This is just a slightly more complicated version of the counting problem we have shown below. In the general case, there is a group of inversions that are implied by the shorter number having a bigger prefix (these can be counted directly by multiplying the number of options) and a group of inversions where both numbers share the prefix P (which leads to essentially the same arithmetic series as above when the common prefix is discarded, only for a non-empty prefix both numbers can now start with zeroes).
Challenge yourself: can you now solve this problem (modulo 10^9 + 7) with the constraint that N can have up to 100,000 digits?
The problem is asking us to find a Hamiltonian cycle in a rather large graph. As the problem most likely doesn’t have a polynomial-time solution for general graphs, we will need to find some special structure in this particular graph.
One related problem some of you might have already seen is the construction of de Bruijn sequences. In this problem, the goal is to construct a sequence of digits such that each possible contiguous subsequence of length K appears (at least) once. The optimal solution for this problem looks like a Hamilton cycle problem: the vertices are sequences of K digits, the edges correspond to appending a digit. The main trick to construct these sequences efficiently is to transform the problem into a closed Euler tour problem in a different graph. For these sequences, the new graph has sequences of K-1 digits as vertices. Edges still correspond to adding a digit, e.g., for K=5 you would have an edge from 1234 to 2347. Walking along this edge then produces the subsequence 12347 somewhere in the final sequence. In other words, what we did is we found a different way of representing the problem, only in this new way each edge corresponds to one of the sequences, and we can find a way to traverse each edge once in polynomial time.
Our solution to the BigJump problem will proceed in a similar way: we will find a new way to model this problem so that instead of a Hamiltonian cycle problem we will just need to solve a closed Euler tour problem.
Let’s begin by observing the edges of our graphs for N=8 and N=10:
x | 0 1 2 3 4 5 6 7 ------------------- A | 7 1 3 5 7 1 3 5 B | 0 2 4 6 0 2 4 6 x | 0 1 2 3 4 5 6 7 8 9 ----------------------- A | 9 1 3 5 7 9 1 3 5 7 B | 0 2 4 6 8 0 2 4 6 8
The first observation we can make now is one type of symmetry. Let N=2K. In our problem, one symmetric thing is that in the above tables the second half matches the first. In general, in our graph G the vertices x and x+K have the same outgoing edges: to 2x and to 2x-1 (all vertex numbers are to be considered modulo N, obviously).
The symmetry discovered above may lead us to the idea of building a smaller graph G’ with just K vertices: for each x we contract the original vertices x and x+K into a single vertex, and we only keep one pair of outgoing edges (instead of having each edge twice). This way, the vertices correspond to “current states” of the traversal: the vertex you are in now uniquely corresponds to the options where to go next.
What can we tell about G’?
We already know that each vertex has outdegree 2.
Let’s look at the edges of the new graphs for N=8 and N=10. (We use the labels 0 to K-1 for the new vertices.)
x | 0 1 2 3 ----------- A | 3 1 3 1 B | 0 2 0 2 x | 0 1 2 3 4 ------------- A | 4 1 3 0 2 B | 0 2 4 1 3
We can see that each vertex also has indegree 2 (just like it had in the original G). It’s easy to prove this in the general case, just note what we see when we read the list of edges of G’ from the left to the right.
We may also note that G’ is always strongly connected. First we can observe that we can go from vertex 1 everywhere: we can go from 1 to 2, from 2 to any vertex in [3,4], from those to any vertex in [5,8], and so on. Next, we can observe that if we start by going from 1 to x, the fact that all indegrees and outdegrees are the same means that if we then continue walking along unused edges arbitrarily, we must eventually reach 1.
Thus, G’ is Eulerian.
We can now take any Euler tour of G’ and convert it into a Hamilton cycle in G as follows: Let’s label each edge of G’ with its original destination in G. For our two examples, this means that the edges have labels as follows:
x | 0 1 2 3 ----------- A | 7 1 3 5 B | 0 2 4 6 x | 0 1 2 3 4 ------------- A | 9 1 3 5 7 B | 0 2 4 6 8
Then, the labels of edges read along the Euler tour give us a valid solution.
One final nice property of this problem is that it’s really easy to test your solution locally once you have it implemented. The number of valid inputs is low enough so you can iterate through all of them and check that your solution solves them all. This is especially helpful if you didn’t get all the way to the argument about the Euler tour and e.g. only discovered some heuristic way of connecting smaller cycles into larger ones. (As we now know, a correct implementation of such an approach has to always produce a valid solution. But from the point of view of a person solving this task it is perfectly plausible that their solution may sometimes get stuck with multiple disjoint cycles, and it’s nice to be able to verify that this actually never happens.)