Match Editorial

SRM 132
Saturday, February 1, 2003

Match summary

This afternoon's challenge was one of the wackiest in recent memory. Division 2 featured a couple of math problems that could lead to issues of double imprecision if solved the wrong way, and a simple dynamic programming problem about binary search trees. Division 1 seemed very simple at first glance, with almost everyone submitting the first two problems. And while the 450 ended up being as simple as it looked, the 300 timed out in a way that was not caught by the examples. As a result, almost everyone submitted the 300 immediately, only to find out during the challenge phase that the examples were less than exhaustive. This allowed InsaneParadox to get a whopping 11 challenges on one problem, probably a new TC record.

The Problems

Typing

Used as: Division-II, Level 1:

Value 250

Submission Rate 192 / 202 (95.50%)

Success Rate 116 / 192 (60.42%)

High Score anup for 249.33 points

To find the overall rate, all you have to do is take each individual rate times the associated probability, and add all that up. At the end, divide by 100, since probabilities are always percent/100.

HardIneq

Used as: Division-II, Level 2:

Value 550

Submission Rate 164 / 202 (81.19%)

Success Rate 101 / 164 (61.59%)

High Score anup for 544.28 points

This problem required a bit of trickery. The task given to you is very simple: Given a formula, find the smallest integer that satisfies it. However, the formula involves large exponents which produce numbers that are so large, they even overflow double, which goes up to around 10^300. So, to make it work, you have to use the log identities given to you as a note:

log(x^y) = y * log(x)
log(x*y) = log(x) + log(y)

This allows you to change
a*(n^k) < k^n
into:
log(a)+k*log(n) < n*log(k)
This allows you to evaluate the inequality for all of the values of a and k that could be inputs.
One thing that one always has to consider when dealing with doubles, is that they are inexact. This suggests that it might be better to replace our inequality with something like: log(a)+k*log(n) < n*log(k) - 1E-10, just to be safe. In fact, it turned out that different languages dealt with logs and doubles slightly differently, so C++ solutions required an epsilon, while Java did not.

BSTs

Used as: Division-II, Level 3:

Value 1000

Submission Rate 64 / 202 (31.68%)

Success Rate 54 / 64 (84.38%)

High Score BryanChen for 986.41 points

Used as: Division-I, Level 2:

Value 450

Submission Rate 90 / 118 (76.27%)

Success Rate 87 / 90 (96.67%)

High Score John Dethridge for 444.93 points

Reference Solution: John Dethridge

Trying to construct all of the BST's and then count them is an approach that would probably work, since there are at most 10 nodes, but it would probably take quite a long time to do. Instead, you should use a little bit of dynamic programming (or catalan number, more on that later). The first thing to notice is that the numbers on the nodes are of no consequence. Furthermore, we can pick the root node in such a way that there are anywhere i nodes in the left subtree, and n-i-1 nodes in the right subtree, where n is the total number of nodes. Since each subtree is itself a BST, we can count the number of BSTs with i nodes, and the number with n-i-1 nodes, and multiply those two numbers together to get the total number of BSTs with a given root. So, we can define a recurrence bst(n) = sum(i = 0 to n-1)(bst(i)*bst(n-i-1)), with a base case of bst(0) = 1. An iterative approach is the quickest way to solve this, which can be seen in John Dethridge's solution.

This is a fairly well-known problem, and it turns out that it can be solved with a simple closed form solution, called a Catalan number:C(2n,n)/(n+1), where C means stands for choose (C(m,n) = fact(m)/fact(n)/fact(m-n)). It turns out that the number of BSTs with n nodes is exactly equal to the nth Catalan number. For more on Catalan numbers, check out what MathWorld has to say about them.

BigFunc

Used as: Division-I, Level 1:

Value 300

Submission Rate 115 / 118 (88.97%)

Success Rate 55 / 115 (47.83%)

High Score RyanPai for 289.09 points

Reference Solution: WishingBone

This problem appears to be absolutely trivial. So trivial in fact, that one has to wonder about the 300 points assigned to it. Most people looked at it, and right away and coded it up exactly as given. This solution passed the examples, and a number of people submitted for scores of around 299. However, as a result this problem ended up with what was probably the lowest submission accuracy ever. If one spends a couple minutes to look at the recurrence, he or she would realize that it is far from obvious how many recurrences it will take. The moral of the story is that the given examples are not always exhaustive, even on the easy problems. Especially when it is relatively simple to try a few large numbers, as in this case, you should invest the minute or two it will take. That rant over, on the the problem analysis.

The naive implementation is very close to correct and can easily be fixed to work by using a technique called memoization. To do this, we simply have to cache the results of our recursion, and use the cache if we already know the result. WishingBone's implementation is a good example of how to do this. The idea is that, for a given value of m and n, f always evaluates to the same thing. So, there is no need to reevaluate the function each time. Instead, we save the returned value of f for each input m and n, and then if f gets called again, we just used the saved value. You can do the same thing with g.

Another way to solve this is iteratively, as seen in Maris's code. The trick to this approach is to realize that f(m,n) depends only on f(m-1,n) and g(m-1,n-1). Similarly g(m,n) depends only on f(m,n) and g(m,n-1). So, if we already know f(m-1,n) and g(m-1,n-1), it is easy to calculate f(m,n). You can do essentially the same thing with g. As a result, the problem can be solved easily with two nested for loops.

Though no one realized it during the contest, it turns out that f(m,n) = sum(i = 1 to n)i^m. For a proof of this, click here.

PinballLanes

Used as: Division-I, Level 3:

Value 1000

Submission Rate 25 / 118 (21.19%)

Success Rate 12 / 25 (48.00%)

High Score John Dethridge for 828.78 points

Reference Solution: John Dethridge

This is a fairly standard dynamic programming problem. There are 2^(number of lanes) possible configurations of lights being on or off. So, what we should do, is initialize an array with 2^lanes elements, and set them all to 0, except for the element representing all off. Then, for each time that the ball goes up and back down, we need to calculate the probabilities that the lights are in each possible configuration. To do this, we can simple look at all possible paths that the ball can take for each configuration. So, we have something that looks like this:

int state[1<<lanes];
state[0] = 1;//0 represents all off, and in general, the ith bit represents the ith light.
foreach(time){
    int nextState[1<<lanes];
    foreach(state s){
        foreach(lane i){
            foreach(lane j){
                nextState[s with up and down lanes toggled] += 
                  state[s]*(probability of rolling down lane j, given it rolled up lane i);
            }
        }
    }
    state = nextState;
}

The above code isn't complete. You still have to deal with all of the lights turning on, and then resetting. But it illustrates the basica idea behind the approach. You also have to determine the actual probabilities from the relative probabilities. John Dethridge's code is probably the shortest example of this approach.

By lbackstrom
TopCoder Member