JOIN
 Match Editorial
SRM 312
Saturday, July 22, 2006

## Match summary

Division 1 coders faced a 250-pointer that didn't stop them for long, but that was definitely not the case with the 500-pointer and the 1050-pointer. Eventually fuwenjie enjoyed his first win thanks to 4 successful challenges!

As the coding proceeded, most of the coders managed to submit the 500-pointer, many managed to resubmit it, too. In the closing minutes several 1050's were in, although not for long. The challenge phase saw more than a half of 1050's and 500's go down, completely changing the standings. So did the systests. In the end fuwenjie was followed by Kawigi (another 4 successful challenges here). The only coder solving the 1050-pointer, rem, came 3rd. Surprisingly, targets could not book places on the top of the standings today.

Division 2 coders faced a geometry-oriented set, with both 500-pointer and 1050-pointer having geometric origin. That couldn't trick the duo of newcomers, dyatlov and zan, taking the top two spots from regular div2 competitors.

# The Problems

EsperantoNumbers
Used as: Division Two - Level One:
 Value 250 Submission Rate 474 / 517 (91.68%) Success Rate 362 / 474 (76.37%) High Score stupidcoder for 248.32 points (2 mins 20 secs) Average Score 201.90 (for 362 correct submissions)

This problem was pretty straightforward. The solution should carefully address several different cases: where x is 1 through 10, where x is 11 through 19, where x ends in 0, and the remaining x's. For an implementation of this approach, see jthread's solution.

ParallelepipedUnion
Used as: Division Two - Level Two:
 Value 500 Submission Rate 331 / 517 (64.02%) Success Rate 263 / 331 (79.46%) High Score pinc for 480.88 points (5 mins 42 secs) Average Score 315.75 (for 263 correct submissions)
Used as: Division One - Level One:
 Value 250 Submission Rate 399 / 406 (98.28%) Success Rate 380 / 399 (95.24%) High Score Eryx for 247.46 points (2 mins 53 secs) Average Score 205.77 (for 380 correct submissions)

There were two approaches available. First approach took advantage of the fact that coordinates were integer and not more than 100. We separate the entire space into unit cubes, and for each unit cube check if it belongs to at least one of the parallelepipeds. Then the volume of the union is the total number of such unit cubes. For an implementation of this approach, see bmerry's solution.

Second approach was based on the formula for the volume of a parallelepiped: the volume of a parallelepiped is a product of lengths of three edges going from any its vertex. When transformed to the xyz-form, it becomes V=(z2-z1)⋅(y2-y1)⋅(x2-x1). To compute the volume of the union, we first just add up the volumes of both parallelepipeds. In case these parallelepipeds do not intersect, we're already there. In case they do, we've counted the intersection volume twice and need to subtract it to get the correct answer. To find the intersection volume, we note that the intersection of two parallelepipeds is also a parallelepiped (possibly singular, like segment or rectangle), and thus the same formula can be applied. To understand why this is the case, simply realize that the parallelepiped is a set of points (x,y,z) satisfying the following constraints: x1≤x≤x2, y1≤y≤y2, z1≤z≤z2; suppose we have a second one of the form x3≤x≤x4, y3≤y≤y4, z3≤y≤z4 — one can deduce now that the intersection is specified by the inequalities max(x1,x3)≤x≤min(x2,x4), max(y1,y3)≤y≤min(y2,y4), max(z1,z3)≤z≤max(z2,z4), also following the same pattern. For an implementation of this approach, see Eryx's solution.

Used as: Division Two - Level Three:
 Value 1000 Submission Rate 25 / 517 (4.84%) Success Rate 7 / 25 (28.00%) High Score dyatlov for 716.37 points (19 mins 35 secs) Average Score 441.63 (for 7 correct submissions)

For all the geometric stuff required for this problem refer to lbackstrom's tutorial.

Let's follow the solution for this problem step-by-step. First, we need a function that checks whether two given points are mirror images of each other for the given cut. It can be easily seen what we can only consider cuts passing through the point of origin O(0,0) (because the pizza must be cut into equal halves), so each cut can be specified by any point on it, or, in other words, by its direction vector. Suppose the direction vector of the cut is V(x0,y0), and the points given are A(xA,yA) and B(xB,yB). What we need to check is whether the cut passes through the middle of AB and is perpendicular to AB. The middle point of AB is: C((xA+ xB)/2,(yA+yB)/2), and cut passing through it means vectors V and OC are collinear, i.e. their cross product is zero, i.e. x0(yA+yB)/2-y0(xA+ xB)/2=0, i.e. x0(yA+yB)-y0(xA+ xB)=0. The cut being perpendicular to AB means the dot product of the vectors V and AB is zero, i.e. x0(xB-xA)+y0(yB- yA)=0.

Now that we're past the geometric part of the problem (almost), it becomes simpler. Using the function described above, we construct a function that checks whether the given cut is beautiful. It just iterates over all toppings, and checks whether there exists a mirror image for them (in case the topping is right on the cut, it will be the mirror image for itself).

Now that we can check whether a cut is beautiful or not, all we need is to find all possible candidates for being beautiful. For example, pick any topping X that has at least one nonzero coordinate; it must have a mirror image. So we can iterate over all other toppings and try to consider them as a mirror image to X; we should also consider the case when it's the mirror image of itself. When we've fixed one pair of mirrored points, the cut is uniquely determined (it's the middle perpendicular to the corresponding segment), so we can check it for beauty. To simplify the formula for finding middle perpendicular, we can note that mirror images must be equidistant from the point of origin, and thus the direction vector for middle perpendicular to X(a,b)Y(c,d) has coordinates (a+c,b+d). The latter formula for finding the direction vector also works when X=Y.

This problem's constraints also allowed a more straighforward approach: just check all the possible direction vectors with both coordinates integer and not exceeding 1000 by absolute value. To understand why this is enough, reread the previous paragraph :)

See zan's solution for an implementation of the former approach.

It seems intuitive (and I can't find a counterexample) that the answer for this problem is always -1,0,1,2 or 4. I have an idea how to prove it, related to irrationality of eπi/n for integer n >= 3, but could not make the proof formal, nor could I find a solution that relies on this fact. The reader is invited to bridge this gap :)

AntarcticaPolice
Used as: Division One - Level Two:
 Value 500 Submission Rate 282 / 406 (69.46%) Success Rate 79 / 282 (28.01%) High Score gawry for 387.57 points (16 mins 19 secs) Average Score 252.98 (for 79 correct submissions)

The first observation we need is that if there's a town with no incoming roads, we definitely need a police station there. Let's make this more generic. We'll require some graph theory definitions here, as we'll consider Antarctica as a directed graph, and one-way roads as edges in it.

A directed graph is called strongly connected, if for every pair or vertices u and v there exist paths from u to v and from v to u. The strongly connected components of a directed graph are its maximal (one can't add any more vertices to it) strongly connected subgraphs. For example, consider a graph with 6 vertices A,B,C,D,E and F, and edges A-->B, B-->C, C-->D, D-->E, E-->F, F-->E, E-->D, B-->A. Then it has three strongly connected components: one with vertices {A,B}, vertex C itself, and one with vertices {D,E,F}. Strongly connected components have several useful properties, one of them being each cycle in the graph belongs entirely to one of the strongly connected components. An easy way to find strongly connected components is Floyd-Warshall's algorithm (see gladius' tutorial).

Let's separate all edges of the graph into two groups: internal edges are between vertices in the same strongly connected component, external edges are all others. Consider any strongly connected component in our graph without any incoming external edges. Let's call such components starter components. Then we must build at least one police station there, because these vertices are not reachable from outside. And if we must - why not build the cheapest one? To prove this, consider the case when we don't build the cheapest station X, but build some other station Y in this component. Let's flatten the station Y, and build X instead. X and Y are in the same strongly connected component, thus X is reachable from Y and vice versa, thus "reachability" didn't change, but the average cost decreased (or stayed the same if Y had the same cost as X). So it's always good to take the cheapest station in each starter component.

Moreover, if we build at least one police station in each starter component, then all towns are already reachable by police. The formal proof is left to the reader, but the general idea is as follows. Consider any town. If it's in a starter component, then it's reachable from the station we built there. If it's not in the starter component, then its component has an incoming external edge. If that edge originates in a starter component, then our vertex is reachable from the police station in that component. Continuing this "reverse" path on external edges, we'll eventually finish in a starter component, and thus complete the proof.

So, what do we have now? We've already built a police station, the cheapest one, in each starter component, and all towns are already reachable by police. In case we were required to minimize the total cost, that would be enough. But it may happen that we can build some extra stations to lower the average cost. More specifically, if there is a station that is cheaper that the current average cost, then building it is good for us. So what we do is take the cheapest station that's not already built and build it, and repeat this process until the cost of the cheapest station left is more than the current average cost. That completes the solution for this problem. The proof of this 'greedy' part is left to the reader, as this proof goes just along the lines of the proof above about taking the cheapest station in each starter component.

See xOberon's solution for an implementation of this approach.

Coders taking part in the match have found many other solutions for this problem, check the forum for some of those.

CheapestIsland
Used as: Division One - Level Three:
 Value 1050 Submission Rate 27 / 406 (6.65%) Success Rate 1 / 27 (3.70%) High Score rem for 447.47 points (50 mins 35 secs) Average Score 447.47 (for 1 correct submission)

Let's imagine if the field was no larger than 4x4. What we could do was to consider all the possible 216 subsets of the set of the cells, and check each for connectedness. This could be written as the following recursive function:

[01]int rec(x, field) {
[02]  if (x >= height * width) {
[03]    check the field for connectedness
[04]    if it's connected, return zero, else return infinity
[05]  } else {
[06]    set current_min to infinity
[07]    mark cell x as taken in field
[08]    if [cost of cell x + the return value of rec(x + 1, field)] is less than current_min, assign it to current_min
[09]    mark cell x as not taken in field
[10]    if the return value of rec(x + 1, field) is less than current_min, assign it to current_min
[11]    return current_min
[12]  }
[13]}

We could calculate the cost of the field after checking it for connectedness, but this 'accumulating' way is more convenient for the following. Now let's suppose it was 5x5 or even 6x6. How to improve the previous approach? By adding an early exit to the recursive function. Namely, if we can determine that there's no way to make the current field connected, we can safely return infinity. And this is the case when there exists a connected component in the processed part of the field that is not reaching the bottomline of the processed part. For example, we could return infinity at the following point (pluses are taken cells, minuses are not taken cells, dots are yet undecided cells):

--+++--
-++--++
---....
.......

Blue pluses show the connected component that is already not reaching the bottomline (highlighted with orange color).

This is a good improvement, but not yet enough to process 9x9 testcases. Let's look closer at the bottomline notion we introduced. Do we really need to pass the entire field into the recursive calls at the lines 8 and 10 of the above code? The answer is no. What we're interested in is what cells are taken on the bottomline and which of them are connected between each other. For example, the two following cases are really the same for us (the return value will be the same):

+++++--  +++----
+++--++  +-+--++
+-+....  +-+....
.......  .......

as in both cases the bottomline contains 4 pluses at the columns 0, 2, 5 and 6, 0 and 2 are connected, and 5 and 6 are connected. Considering this, let's rewrite our function the following way:

[01]int rec(x, bottomline) {
[02]  if (x >= height * width) {
[03]    if all the taken cells in the bottomline are connected, return zero, else return infinity
[04]  } else {
[05]    set current_min to infinity
[06]    mark cell x as taken in bottomline, correctly maintaing connectedness
[07]    if [cost of cell x + the return value of rec(x + 1, bottomline)] is less than current_min, assign it to current_min
[08]    mark cell x as not taken in the bottomline, correctly maintaining connectedness
[09]    if the above procedure didn't result in a connected component losing reach to the bottomline, then:
[10]      if the return value of rec(x + 1, bottomline) is less than current_min, assign it to current_min
[11]    else
[12]      if the component losing reach to the bottomline was the only one, and current_min is positive, set it to zero
[13]    return current_min
[14]  }
[15]}

bottomline pseudo-variable holds information about what cells are taken in the bottomline and their interconnections. Line 3 is the new way of checking if the field is connected: anything above the bottomline is connected because we've reached the current state, so we just check the bottomline. Line 9 is the early exit check, and line 13 is for finding the components that don't reach the bottom row of the grid.

Only one step is left between our solution and the correct solution. Can't you guess? We need to memoize the results of function rec! It appears that there are <200000 possible rec parameters for a 9x9 board. With some more thinking one can convert memoized solution to a DP solution, and the latter appears even easier in this case. A DP solution could be formulated in different terms, but it is essentially the same as the recursive approach with memoization, so one could choose the way he likes to implement more. Writing the state of that DP formally (that is, writing formally what 'rec' returns) and inventing a way to store the 'bottomline' are left as good exercises for the reader.

See rem's only passing solution as an example of another possible approach, moving row-by-row instead of moving cell-by-cell. It is slower, but could be optimized to pass, too.

By Petr
TopCoder Member