Match Editorial

SRM 226
Wednesday, January 5, 2005

Match summary

In Division 2, Oleksiy took top honors with a commanding lead, while bluequark and wickedman took second and third. A special mention goes to wickedman for being the highest scoring newbie.

In Division 1, haha picked up his second SRM victory (his first was way back in SRM 187), with John Dethridge and natori coming in a close second and third.

The Division 1 coders saw a quirky problem set, wherein the hard problem had a 95% success rate, while the others were successfully solved by less than 40% of those who submitted a response. In fact, the total number of correct submissions was nearly equal for all three problems (44, 37, and 40, respectively).

The Problems

VLNString

Used as: Division Two - Level One:

Value 250

Submission Rate 149 / 170 (87.65%)

Success Rate 119 / 149 (79.87%)

High Score lingo for 248.68 points (2 mins 4 secs)

Average Score 212.18 (for 119 correct submissions)

The basic idea behind this problem is to work from left to right, building words, and stopping the build when you get to one or more spaces, then adding the first letter if it wasn't "and", "or", or "the". Trying to use a standard library split function could get you into trouble because of multiple consecutive spaces; a simple parsing approach is safer.

ExperimentalAnalyzer

Used as: Division Two - Level Two:

Value 600

Submission Rate 83 / 170 (48.82%)

Success Rate 63 / 83 (75.90%)

High Score nickel for 522.10 points (11 mins 19 secs)

Average Score 323.80 (for 63 correct submissions)

This problem is not too difficult to code, once the methodology is determined. A simple approach is, for each variable, determine the minimum and maximum value of that variable where the result is 0, and where the result is 1. If those two ranges are non-overlapping (note however, that all result 1 values could be less than the result 0 values, or vice versa), that variable is a predictor.

ManhattanMovement

Used as: Division Two - Level Three:

Value 800

Submission Rate 60 / 170 (35.29%)

Success Rate 14 / 60 (23.33%)

High Score Oleksiy for 669.55 points (13 mins 4 secs)

Average Score 514.14 (for 14 correct submissions)

Used as: Division One - Level One:

Value 250

Submission Rate 115 / 129 (89.15%)

Success Rate 44 / 115 (38.26%)

High Score haha for 245.78 points (3 mins 44 secs)

Average Score 201.56 (for 44 correct submissions)

This is a classic type of problem that can be solved very easily once you see the "ah-ha". In this case, the big "ah-ha" is that the answer is always a single vertical or horizontal line. There is never any advantage in changing direction. Need further convincing? If the road is vertical or horizontal, then only one direction is possible to travel towards the road, and it's trivial. For any diagonal road, consider the two direct paths (vertical and horizontal). If both direct paths are the same, the case is again trivial. In any other case, one or the other is shorter. Now, suppose you were to do some of your travel along the short path, then change directions. After travelling the short path, you're left at another point, with the same setup, whereby the same direction is still shorter. Thus, it always makes sense to continue on the short path.

So, calculate the distance to the line horizontally and vertically (be careful of cases where a or b is 0!) and return the shorter of the two. Also, it's important to cast all of the variables to double before calculating, otherwise an overflow can result. This is possibly the biggest gotcha that left percentages so low.

OneArmedBandit

Used as: Division One - Level Two:

Value 500

Submission Rate 103 / 129 (79.84%)

Success Rate 37 / 103 (35.92%)

High Score LuckyLibran for 438.05 points (11 mins 0 secs)

Average Score 310.45 (for 37 correct submissions)

In spite of a lot of numbers floating around, this problem is fairly straightforward. Knowing that expected outcome = probability * payout, we really have all the information we need to calculate our result.

First, we determine the expected outcome of each of the non-jackpot payout lines. This is simply the probability of each wheel landing on the appropriate symbol (or 1, if the symbol is a wildcard) multiplied by the payout. We add these together. If this is greater than or equal to 1, then the slot machine is a winner, even without the jackpot, and we return 0.

Finally, we determine the probability of the jackpot line coming up. If it's 0, then we return -1, since it doesn't matter what the jackpot is if we can never win it. Otherwise, we know that the expected outcome of the jackpot payoff must be sufficient to make our total expected outcome at least 1. So, solving the simple formula, the payout then needs to be at least (1 - non-jackpot expected payout) / jackpot percentage.

TestScores

Used as: Division One - Level Three:

Value 800

Submission Rate 42 / 129 (32.56%)

Success Rate 40 / 42 (95.24%)

High Score antimatter for 769.78 points (5 mins 40 secs)

Average Score 548.22 (for 40 correct submissions)

This is the type of problem where either you saw the solution and coded it, or you didn't. With a very high success rate among those who submitted a solution, obviously many saw the solution.

The first thing to realize is that calculating the mean is simple: sum the probabilities of each question being answered correctly; standard deviation is a little more subtle. But, for a very large set of data, knowing the probability of obtaining each raw score (which, in the worst case ranged from 0 to 50) suffices to calculate the standard deviation. The procedure to do this is pretty simple:

For each raw score value, multiply the probability times the square of the deviation from the mean
Add up each of the values obtained above.
Calculate the square root of that sum.

I'll leave it as an exercise to the reader to work out exactly why that works.

The final piece of the puzzle is simply calculating the probabilities of each raw score. Obviously, with 50 questions, there are 2^50 possible ways the questions could each be correct/incorrect, so a more efficient calculation is necessary.

A simple dynamic programming approach works well. If P(a, n) is the probability that a of the first n questions were answered correctly, and Q(a) is the probability of an arbitrary person answering question a correctly, then we can use the recurrence P(a, n) = P(a - 1, n - 1) * Q(a) + P(a, n - 1) * (1 - Q(a)), P(0, 0) = 1 to build our DP.

By timmac
TopCoder Member