JOIN
 Match Editorial
SRM 191
Saturday, April 24, 2004

Match summary

Pity poor SnapDragon's thumbs. After brutalizing the first-time writer's problems in a mere 25 minutes, he had nothing to do but twiddle them for the next 65 minutes, before he was finally able to challenge yet another hapless victim. And how demoralizing must it have been for third-place finisher Klinck? First, he looks up from finishing all three problems in an astonishing 33 minutes, only to find that he was more than 30% slower than the leader. Then he was to watch as gepa passes him with three successful challenges. Klinck fought back with a challenge of his own in the waning minutes, but it wasn't enough.

In Division Two, it was a good day for newcomers, who took four of the top five spots, led by texel, HilbertRaum, and tenshiyuan. boets was able to partially defend the honor of TopCoder veterans with a respectable fourth-place finish.

# The Problems

BettingMoney
Used as: Division Two - Level One:
 Value 250 Submission Rate 159 / 167 (95.21%) Success Rate 154 / 159 (96.86%) High Score haowei for 246.40 points (3 mins 26 secs) Average Score 216.49 (for 154 correct submissions)

An easy problem to start the day. Most of the successful solutions looked essentially like this:

```  int netGain = 0;
for (int i = 0; i < amounts.length; i++)
if (i == finalResult)
netGain -= amounts[i] * centsPerDollar[i];
else
netGain += amounts[i] * 100;
return netGain;
```
A slightly shorter variation, but one that is easier to get wrong, is to pretend everybody lost their bets in the loop, and then adjust for those who actually won afterwards.
```  int netGain = 0;
for (int i = 0; i < amounts.length; i++)
netGain += amounts[i] * 100;
netGain -= amounts[finalResult] * 100;
netGain -= amounts[finalResult] * centsPerDollar[finalResult];
return netGain;
```
I've written the last three lines separately for clarity, but they can easily be combined into a single, long line.

With so few failures, it's hard to generalize any common mistakes, but the only mistake that appears to have been made twice is to covert the loss into dollars before eventually converting the total into cents.

VolumeGuess
Used as: Division Two - Level Two:
 Value 500 Submission Rate 97 / 167 (58.08%) Success Rate 58 / 97 (59.79%) High Score wilsong for 443.37 points (10 mins 25 secs) Average Score 292.50 (for 58 correct submissions)
Used as: Division One - Level One:
 Value 250 Submission Rate 129 / 146 (88.36%) Success Rate 103 / 129 (79.84%) High Score Larry for 247.14 points (3 mins 3 secs) Average Score 184.89 (for 103 correct submissions)

This was perhaps the most interesting problem of the match, because it admits such an amazing variety of solutions. At least five major families of algorithms were used successfully in the competition—and I don't pretend to have looked at every solution!

In the first variation, hordes of C++ programmers, including SnapDragon and gepa, were seduced by the availability of the next_permutation function. In this variation, you create a sequence of all the distinct volumes, where element k of the sequence stands for the volume of box k. The volume of the largest box does not appear anywhere in queries, so you add a large dummy value to the sequence to stand in for the missing volume. Then you loop through all the permutations of this sequence until you find one that satisfies all the queries, and return element ithBox of the sequence. Personally, I hate this solution, because O(n!) algorithms offend my sensibilities. Kids, do not try this at home!

The second variation was probably the least common, but a few coders such as centipede80 and Im2Good did a depth-first search through the queries. For query x,y,vol, you first try assigning the given volume to box x and then to box y, backtracking whenever a box is assigned incompatible volumes. You can prune the search space considerably by remembering that, if we assign the given volume to box x, then the other box must already have a larger volume, or must eventually be given a larger volume.

In the third variation, used by jms137 and texel, you take advantage of the fact that only the box with the minimum volume will have the same volume in all of its queries. Find that box and the associated volume. If it is the box numbered ithBox, then return the volume. Otherwise, remove all queries for that box and repeat.

In the fourth variation, exemplified by Wernie, you realize that, if the same volume appears in two different queries involving a given box, then that must be volume of that box. So, you can scan through the queries involving ithBox until you find a duplicate volume, and return that volume.

The fifth variation was the simplest and fastest of all, both fastest in running time and fastest to code. Speedster Larry used this approach, as did macro-happy Eryx. In this variation, you realize that if two queries involving the same box have different volumes, then the smaller of the two volumes cannot be the volume of the given box. Therefore, by process of elimination, the volume of a given box is the maximum volume that appears in any query involving that box. So, just look through all the queries involving ithBox, and keep track of the biggest volume.

CuboidJoin
Used as: Division Two - Level Three:
 Value 1000 Submission Rate 34 / 167 (20.36%) Success Rate 8 / 34 (23.53%) High Score texel for 775.06 points (16 mins 19 secs) Average Score 609.87 (for 8 correct submissions)

The first temptation in a problem like this is to loop through all the possible unit cubes and just count the ones that are inside one or more cuboids. But with a grid that is 10001-by-10001-by-10001, there are just too many unit cubes for this brute-force approach to be feasible.

The approach taken by most of the successful solutions (see, for example, HilbertRaum's) is to reduce the problem to a much-more-manageable 10-by-10-by-10 grid. There are at most 10 x-coordinates of interest—namely, those that appear in one or more of the cuboids. Similarly, there are at most 10 y-coordinates of interest and at most 10 z-coordinates of interest. We can collect all the coordinates of interest into three sets, sort them, and then brute-force our way through this much smaller grid, considering all possible regions formed by neighboring coordinates.

For example, suppose there are only two cuboids, one with 0 < x < 100, 15 < y < 115, and 22 < z < 122, and the other with 96 < x < 196, 15 < y < 115, and 7 < z < 122. The x-coordinates of interest are {0, 96, 100, 196}, the y-coordinates of interest are {15, 115}, and the z-coordinates of interest are {7, 22, 122}. Now there are only six regions that we have to check, beginning with the region defined by 0 < x < 96, 15 < y < 115, and 7 < z < 22.

For each region, we check if it falls inside one or more of the original cuboids, and, if so, add the region's volume to the total. Because of the way we choose the boundaries of the regions, we know that each region should be counted entirely or not at all—it is impossible for one of the regions to fall partly inside and partly outside some cuboid.

Two of the successful solutions, including texel's, were based instead on the inclusion-exclusion principle. You begin by adding all the volumes of the individual cuboids. But this overcounts those regions that appear in more than one cuboid. To correct for this error, you subtract all the intersections of two cuboids. But now you are undercounting those regions that appear in more than two cuboids, so you add back in all intersections of three cuboids. Continuing in this fashion, you subtract all intersections of four cuboids, and finally add the intersection (if any) of all five cuboids.

To calculate the intersection of several cuboids, you take the maximums of their left, bottom, and front coordinates, and the minimums of their right, top, and back coordinates. Then just verify that the resulting left, bottom, and front coordinates are less than the resulting right, top, and back coordinates.

PolicePair
Used as: Division One - Level Two:
 Value 500 Submission Rate 82 / 146 (56.16%) Success Rate 63 / 82 (76.83%) High Score SnapDragon for 452.91 points (9 mins 21 secs) Average Score 316.24 (for 63 correct submissions)

First, build a table of how many officers have skill s (call this numWith[s]). Then just brute-force it baby! Consider the possible sums of the skills of the two officers per squad car. These sums range from 2 to 2000. For each sum, calculate how many officers would be unassigned, and keep the sum that results in the fewest unassigned officers. At the end, just remember to divide the sum by 2 to get the average.

For each sum, consider all possible individual skill levels and determine how many of the officers with that skill level would remain unassigned. Officers of skill level s would be paired with officers of skill level sum-s. If numWith[s] <= numWith[sum-s], then all officers with skill level s would be assigned. If numWith[s] > numWith[sum-s], then numWith[s] - numWith[sum-s] would be left unassigned. There is a special case when s is half of sum. In that case, officers with skill level s would be paired with each other, so one officer would be left unassigned if numWith[s] is odd, and none if numWith[s] is even.

MagicianTour
Used as: Division One - Level Three:
 Value 900 Submission Rate 54 / 146 (36.99%) Success Rate 27 / 54 (50.00%) High Score SnapDragon for 839.93 points (7 mins 42 secs) Average Score 614.58 (for 27 correct submissions)

As Hard problems go, this one was pretty easy, as reflected by the unusually high submission rate. At least, it was easy if you are familiar with textbook algorithms such as Depth-First Search and Knapsack. Even so, SnapDragon's speed on this problem was awfully impressive.

This problem can be broken down into four steps, although the first three steps were done at the same time in most successful solutions.

1. Partition the cities into connected components, most likely using DFS, but perhaps using union-find if you happen to have that precoded.
2. Assign shows to the cities in each component. For each component, begin by assigning show 1 to an arbitrary city. Then assign show 2 to all of its neighbors, show 1 to all of its neighbors' neighbors, and so on. The constraints guarantee that such an assignment will always be possible. Furthermore, for each set of connected cities, the assignment will be unique, except for the possibility of switching all show 1 cities to show 2 cities, and vice versa. This step is most easily accomplished using DFS, probably the same DFS that determined the components to begin with.
3. For each component, calculate the difference between populations of the show 1 cities and the show 2 cities. Again, this is probably done during the same DFS as the first two steps. If the difference is D, then the contribution of this component to the overall difference is either D or -D. The latter difference can be achieved by switching the shows for all cities in the component simultaneously.
4. Finally, use a knapsack-like algorithm to make the overall difference between the populations of the show 1 cities and the show 2 cities as small as possible. Essentially, you want to choose either D or -D for each component in such a way that you minimize the absolute value of final sum. This is sometimes known as the "tug-of-war" problem (and in fact appeared as a TopCoder problem of the same name once upon a time), named for the problem of trying to create two tug-of-war teams that are as evenly balanced as possible. Most of the successful coders, like SnapDragon, used a minor variation of the classic dynamic-programming knapsack algorithm. Ryan took a slightly different approach, but his code for this step is particularly understandable.

Many of the unsuccessful submissions on this problem fell to timeout challenges. One common mistake was trying to solve the problem with a single DFS, without separating the cities into connected components.

By vorthys
TopCoder Member