August 1, 2020
2020 TCO Algorithm Round 3A Editorials
The full problem is solvable in constant time, but to make the implementation a bit more pleasant the constraints for the obstacle size were chosen so that O(obstacle perimeter) passed as well.
In order to solve the problem for small s, we could use breadth-first search. However, s can be large and thus we need to be more clever – we cannot afford to count the reachable cells one at a time. (Implementing the BFS solution can still be useful when stress-testing the solution, though.)
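A brute-force BFS along those lines might look like the sketch below. It assumes the setup suggested by the figure: 4-directional moves on an unbounded grid, a single rectangular obstacle, and at most s steps; the function name and the rectangle encoding are mine, not from the problem statement.

```python
from collections import deque

def reachable_cells(s, start, obstacle):
    """Count cells reachable from `start` in at most `s` 4-directional steps.

    `obstacle` = (x1, y1, x2, y2) is a blocked axis-aligned rectangle
    (a hypothetical encoding; the real input format may differ).
    """
    x1, y1, x2, y2 = obstacle

    def blocked(x, y):
        return x1 <= x <= x2 and y1 <= y <= y2

    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if dist[(x, y)] == s:
            continue  # no step budget left to move further
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (nx, ny) not in dist and not blocked(nx, ny):
                dist[(nx, ny)] = dist[(x, y)] + 1
                queue.append((nx, ny))
    return len(dist)
```

With the obstacle far away this reduces to the Manhattan ball of radius s, which has 2s^2 + 2s + 1 cells, so the brute force is easy to sanity-check before using it for stress tests.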
We can divide the visited cells into several categories, as shown using different numbers in the figure below:
.........4.....
........544....
...XXXXXX333...
..3XXXXXX3333..
.2222222222222.
1111111S1111111
.1111111111111.
..11111111111..
...111111111...
....1111111....
.....11111.....
......111......
.......1.......
...............
- The full rows in and below the row with S. Their total area is the sum of a simple arithmetic series (and turns out to be a square number).
- The full rows between S and the obstacle. Again, an arithmetic series.
- The sides of the obstacle. For each cell on the boundary of the obstacle we can compute the number of steps to reach it, and if it’s at most s, this cell and some additional ones away from the obstacle are reachable. (Yet another arithmetic series, because as we go up, the distance to the boundary cell increases by 1.)
- The corners to the top right and top left of the obstacle. If reachable at all, the reachable cells form an isosceles right triangle.
- The top of the obstacle. For the given constraints we can afford to process it in the same way as the sides: for each boundary cell we compute the shortest distance from the start (this time we have to consider going around the obstacle both from the left and from the right), and knowing that distance, we know how many cells upward from that boundary cell are also reachable.
(In the constant-time solution the top of the obstacle can be split into at most two arithmetic series.)
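As a quick check of the first bullet's claim (the helper name is mine), the full rows in and below the row with S have widths 2s+1, 2s-1, ..., 1, and this arithmetic series indeed sums to the square (s+1)^2:

```python
def area_of_full_rows_below(s):
    # The row d steps below S (d = 0 is S's own row) has width 2*(s - d) + 1.
    return sum(2 * (s - d) + 1 for d in range(s + 1))

# For every s this equals the closed form (s + 1) ** 2.
```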
We can start by realizing that the answer is always less than L. This is because Maru’s path plus the extra edge that connects its endpoints forms a cycle of length at most M+L. For Vlado’s solution to be different there has to be a cheaper cycle (of length at most M+L-1), and if we leave out one of its edges, say one of length x, we get that Vlado’s path has length at most M+(L-1-x).
Clearly we do not lose anything if we make Maru’s extra edge have length exactly L. As we want a difference of at least L/2, we need a graph in which Vlado’s cycle is as expensive as possible (ideally, M+L-1) while the most expensive edge on that cycle is still cheap enough. In particular, we must have x < L/2, because otherwise the difference V-M (which is at most L-1-x) becomes too small.
Probably the simplest construction that works for all inputs that match the given constraints is therefore:
- Build Maru’s cycle using n-1 edges of length 1 and one edge of length L. (We will make Maru’s path the path 0-1-…-(n-1) and set the edge 0-(n-1) to length L.)
- Add three edges of length approximately L/3 each to form a second, slightly cheaper cycle. One possibility for this is to add the edges 0-(n-2), 1-(n-1) and 2-(n-1). Then, Vlado’s cycle will go …-(n-3)-(n-2)-0-1-(n-1)-2-… and avoid using the expensive edge 0-(n-1).
Vlado’s cycle uses n-3 edges of length 1 (shared with Maru) and the three new edges instead of three edges of length 1,1,L. One of those three is then removed to form Vlado’s path.
- All other edges are set to L so that they do not interfere with our construction. (Cycles that include an edge of length L clearly cannot be optimal.)
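The construction above can be sketched as a weight-matrix builder. This is only an illustration under my own assumptions: the function name and return format are hypothetical, and I assume n is large enough (say n >= 7) that the three extra edges are distinct from each other and from Maru’s cycle edges.

```python
def build_graph(n, L):
    # Hypothetical output format: a symmetric n x n weight matrix.
    w = [[L] * n for _ in range(n)]      # default: length L, never useful
    for i in range(n - 1):               # Maru's path 0-1-...-(n-1), unit edges
        w[i][i + 1] = w[i + 1][i] = 1
    w[0][n - 1] = w[n - 1][0] = L        # the expensive closing edge
    a = L // 3                           # three edges of total length exactly L
    for u, v, length in ((0, n - 2, a), (1, n - 1, a), (2, n - 1, L - 2 * a)):
        w[u][v] = w[v][u] = length
    return w
```

With this split Maru’s cycle costs (n-1)+L, while Vlado’s cycle ...-(n-3)-(n-2)-0-1-(n-1)-2-... costs (n-3)+L, matching the accounting in the text: three edges of lengths 1, 1, L are traded for three edges of total length L.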
The problem statement defines the return value in terms of probability, but once we multiply it by 6^(P+N), it’s useful to realize that the return value is simply the number of different sequences of rolls that lead to success. Thus, we are just counting sequences and we can use integers everywhere, even in our thinking.
Instead of counting the sequences that give us a success we will count the complement: sequences that give us failure. At the end, we will then subtract their count from 6^(P+N).
One good observation is that there are too many states for a direct DP on states: with P positive dice we can get (P+5 choose 5) different outcomes, which evaluates to something like 3*10^11 for the maximal P. Thus, we need something more clever.
The first main idea is to use dynamic programming with the value rolled as the outer loop. For each v (from 6 down to 1) and for each p, n, we count the sequences of p positive and n negative rolls, each roll being between v and 6, inclusive, that give us failure. When going down from v+1 to v, we need to consider all valid possibilities (pv, nv) for how many positive and how many negative dice rolled exactly the value v. If v is greater than the target, all combinations are valid (such rolls have no bearing on success); if v is at most the target, we should only consider the combinations (pv, nv) where pv <= nv.
Thus, we get the following recurrence:
dp[v][p][n] = sum over all valid (pv, nv) of (p choose pv) * (n choose nv) * dp[v+1][p-pv][n-nv].
(We are counting failure sequences with p positive and n negative dice. We choose how many of them rolled exactly v, then we choose at which positions in the sequence those rolls appear – these are the binomial coefficients – and then we are left with p-pv positive and n-nv negative dice, each of which must have rolled something greater than v.)
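The recurrence can be prototyped directly. The sketch below assumes the failure condition as read off above, namely that for every value v <= target the number of positive dice showing v is at most the number of negative dice showing v; the function name and the `target` parameter are my own labels.

```python
from math import comb

def count_failures(P, N, target):
    # dp[p][n]: failure sequences of p positive and n negative rolls,
    # every roll being at least v (initially v = 7, so only p = n = 0 works).
    dp = [[0] * (N + 1) for _ in range(P + 1)]
    dp[0][0] = 1
    for v in range(6, 0, -1):
        ndp = [[0] * (N + 1) for _ in range(P + 1)]
        for p in range(P + 1):
            for n in range(N + 1):
                for pv in range(p + 1):        # positive dice showing exactly v
                    for nv in range(n + 1):    # negative dice showing exactly v
                        if v <= target and pv > nv:
                            continue           # this would already be a success
                        ndp[p][n] += comb(p, pv) * comb(n, nv) * dp[p - pv][n - nv]
        dp = ndp
    return dp[P][N]

# number of successful sequences = 6**(P + N) - count_failures(P, N, target)
```

For small P and N this can be cross-checked against full enumeration of all 6^(P+N) roll sequences, which doubles as a stress test.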
The above solution runs in O(P^2 N^2) and from the constraints it should be obvious that it’s an order of magnitude too slow. We need to save one dimension somewhere.
The way I did this last step is to break up the computation of the above recurrence in a clever way so that I can reuse parts of the computed values. Conceptually, this will correspond to choosing the positive dice that rolled v first. Formally, suppose we fixed p-pv (the number of positive dice that rolled at least v+1) and n (the number of negative dice that rolled at least v). Only now will we iterate over all possibilities for pv (the number of positive dice that rolled v). For each fixed pv, we want to take (p choose pv) times some sum and add it to dp[v][p][n]. Each of those sums has the following form:
sum over some nv of (n choose nv) * dp[v+1][p-pv][n-nv]
In each of those sums, nv is the only non-constant and it always ranges from some lower bound to n. (The lower bound is either the current pv or 0, depending on whether v <= target.) Thus, we can do the computation just once (by iterating nv downward from n and storing the running suffix sums) and then, for each pv, access the correct sum in constant time.
The above optimization improves the dynamic programming to O(P^2N + PN^2) and that should be good enough to get accepted.
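Under the same assumptions as the unoptimized recurrence, the optimization might look like this sketch: for each fixed a = p - pv we precompute the suffix sums over nv once and reuse them for every pv (the function name is mine).

```python
from math import comb

def count_failures_fast(P, N, target):
    dp = [[0] * (N + 1) for _ in range(P + 1)]
    dp[0][0] = 1
    for v in range(6, 0, -1):
        ndp = [[0] * (N + 1) for _ in range(P + 1)]
        for a in range(P + 1):        # a = p - pv: positive dice that rolled > v
            for n in range(N + 1):
                # suffix[k] = sum over nv in [k, n] of C(n, nv) * dp[a][n - nv]
                suffix = [0] * (n + 2)
                for nv in range(n, -1, -1):
                    suffix[nv] = suffix[nv + 1] + comb(n, nv) * dp[a][n - nv]
                for pv in range(P - a + 1):
                    lo = pv if v <= target else 0   # smallest allowed nv
                    if lo <= n:
                        ndp[a + pv][n] += comb(a + pv, pv) * suffix[lo]
        dp = ndp
    return dp[P][N]
```

The inner work per (v, a, n) is O(N) for the suffix sums plus O(P) for the pv loop, giving the O(P^2 N + P N^2) bound stated above.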