Match Editorial
SRM 218
Thursday, November 4, 2004

Match summary

SRM 218 was full of interesting surprises in both divisions. The medium and hard problems in Division 1 proved to be major stumbling blocks for many coders, and many of the successful solutions took considerable time. The Division 1 easy problem, which was also the Division 2 medium, held at least one quirk that led to a high challenge and failure rate.

In Division 1, gepa turned in a superior performance to earn his first SRM victory. An extra measure of congratulations is deserved for offering up the lone successful solution to the hard problem. Hats off to a truly awesome round. nicka81 and Vedensky finished 2nd and 3rd, though by a considerable margin behind.

In Division 2, pako took first place honors by more than 100 points, followed by Javaholic and quake, with several others not far behind. Several competitors served up solutions to all three problems, but many of them had one or more fail.

A special pat on the back should go to glguy, finishing 6th, as the only newcomer to successfully solve all three problems (and one of nine overall)! Likewise to katiej76, finishing 11th, also as a newcomer, but dishing out a commanding six successful challenges on the 500 point problem. Wow! Though not newcomers, modenl and Vishwesh86 both also deserve commendation for eight and six successful challenges, respectively.

The Problems

AccessLevel
Used as: Division Two - Level One:
Value 250
Submission Rate 162 / 170 (95.29%)
Success Rate 156 / 162 (96.30%)
High Score csd98412 for 249.39 points (1 mins 24 secs)
Average Score 237.77 (for 156 correct submissions)

This problem was about as straightforward as it gets, with really no surprises. The most common approach was to start with an empty string, loop through the input array, and append characters to the string one at a time. The only problems with solutions seemed to stem from mishandling an empty array by not properly initializing the empty string, or from using > for the permission comparison instead of >=.
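The loop described above can be sketched as follows. The function name, parameter names, and the 'A'/'D' result characters are assumptions for illustration, not the actual problem signature:

```python
def access_level(rights, min_permission):
    """Grant access ('A') to each user whose permission level meets or
    exceeds the minimum; deny ('D') otherwise."""
    result = ""  # initializing here also handles an empty rights array
    for r in rights:
        # note >= rather than > -- the comparison that tripped up some coders
        result += "A" if r >= min_permission else "D"
    return result
```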

FolderSize
Used as: Division Two - Level Two:
Value 500
Submission Rate 152 / 170 (89.41%)
Success Rate 64 / 152 (42.11%)
High Score pako for 476.21 points (6 mins 24 secs)
Average Score 389.12 (for 64 correct submissions)
Used as: Division One - Level One:
Value 200
Submission Rate 116 / 120 (96.67%)
Success Rate 89 / 116 (76.72%)
High Score Im2Good for 198.12 points (2 mins 46 secs)
Average Score 183.64 (for 89 correct submissions)

This problem was a plentiful source of either joy or frustration, depending on which side of the challenge you sat. The basic algorithm was to initialize your array based upon folderCount, then loop through the files, and add the wasted space of each to the appropriate folder.

The trick is to make sure the wasted space of each file is properly calculated, and herein lay the subtle mistake of so many coders. In nearly every case that failed, there was something like:

waste = clustersize - filesize % clustersize
BUT... this makes a critical error in the case when a file exactly fills all of its clusters. Likewise, a size 0 file will not use any clusters at all, and will have no waste. Once you handle those special cases properly, the rest is trivial.
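One way to handle both special cases at once is to round the file size up to a whole number of clusters and subtract; a sketch (names are illustrative):

```python
def wasted_space(filesize, clustersize):
    # number of clusters actually used, rounded up; a size-0 file uses none
    clusters = (filesize + clustersize - 1) // clustersize
    # an exact fit yields clusters * clustersize == filesize, hence zero waste
    return clusters * clustersize - filesize
```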

Nonetheless, the lesson to learn is two-fold. No matter how easy a problem seems, it pays to think it through before hitting the [Submit] button. And that there was no example to illustrate this situation only reinforces the importance of thoughtful testing.

PermissionTree
Used as: Division Two - Level Three:
Value 1000
Submission Rate 38 / 170 (22.35%)
Success Rate 20 / 38 (52.63%)
High Score alena for 641.58 points (24 mins 18 secs)
Average Score 458.93 (for 20 correct submissions)

This problem, more than the others, demanded some measure of creativity in finding a good solution, but still worked out nicely under a number of different approaches.

One method is to search the tree and test each folder as being a possible home folder, and from there continue to go deeper until no child folders would be valid. While this is certainly a legitimate approach, it is probably not the most efficient.

Another method is to find all folders that a given user can access. Then, travelling the tree back up toward the root, we simply look for the deepest common ancestor of all such accessible folders. A simple implementation is to first assume the home folder will be -1. Then, search for a folder the user can access, and set that as the temporary home folder. Continuing through the remaining folders, for each one the user can access, set the temporary home folder to the common ancestor of itself and the newly found folder. After all folders have been examined, the result is the home folder for that user.
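The common-ancestor method above can be sketched as follows. The parent-array encoding (with the root as its own parent, as in the actual input) matches the problem's convention, but the function name and the boolean access list are assumptions for illustration:

```python
def home_folder(parent, can_access):
    """parent[i] is the parent of folder i (the root is its own parent);
    can_access[i] says whether the user can reach folder i.
    Returns the deepest common ancestor of all accessible folders, or -1."""
    def ancestors(node):
        chain = [node]
        while parent[node] != node:  # stop at the root, its own parent
            node = parent[node]
            chain.append(node)
        return chain

    home = -1
    for folder in range(len(parent)):
        if not can_access[folder]:
            continue
        if home == -1:
            home = folder  # first accessible folder found
        else:
            # walk up from the current home until we reach an ancestor
            # of the newly found folder
            above = set(ancestors(folder))
            while home not in above:
                home = parent[home]
    return home
```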

Some common mistakes that caught coders by surprise were returning a home folder that was not as deep as possible, and not properly determining if a user was in the user list for a folder. Also, some coders had trouble properly handling the parent-child relationship of the root node, which was defined as its own parent (so that the input would look consistent).

As of this writeup, the practice room already contained over a dozen good, varied solutions to this problem.

Reconstruct
Used as: Division One - Level Two:
Value 650
Submission Rate 52 / 121 (42.98%)
Success Rate 24 / 52 (46.15%)
High Score nicka81 for 531.93 points (14 mins 2 secs)
Average Score 335.33 (for 24 correct submissions)

In this problem, we sought to reconstruct an original set of Euclidean points in 3-space, given a complete set of information about their Euclidean distances from each other. We are given that the first point is (0, 0, 0) and the second is well-enough defined that we can return a unique set of points.

The trick to this problem is to use brute force to find possible locations of the second point, then recurse for the next point, and so on. Brute force runs in time here because each successive point faces an increasing number of constraints (there are more placed points to measure distances from), so the recursion avoids ballooning even with the brute force searching.

The failure condition in which no set can be reconstructed from the given data is determined when all possible points have been tried to no avail, or at any point along the way when the squared distance between two points is not a sum of three squares.
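The sum-of-three-squares check doubles as the brute-force candidate generator: enumerating every integer offset at a given squared distance yields the possible positions for the next point. A sketch, assuming integer coordinates as the editorial's three-squares remark implies (the function name is illustrative):

```python
import math

def points_at_squared_distance(d2):
    """Enumerate all integer offsets (x, y, z) with x^2 + y^2 + z^2 == d2.
    An empty result means d2 is not a sum of three squares, which is one
    of the failure conditions described above."""
    limit = math.isqrt(d2)
    found = []
    for x in range(-limit, limit + 1):
        for y in range(-limit, limit + 1):
            z2 = d2 - x * x - y * y
            if z2 < 0:
                continue
            z = math.isqrt(z2)
            if z * z == z2:  # z2 is a perfect square
                found.append((x, y, z))
                if z != 0:
                    found.append((x, y, -z))
    return found
```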

WinningProbability
Used as: Division One - Level Three:
Value 750
Submission Rate 8 / 121 (6.61%)
Success Rate 1 / 8 (12.50%)
High Score gepa for 281.82 points (68 mins 5 secs)
Average Score 281.82 (for 1 correct submission)

When only a single solution passes, that is usually an indication that a problem is far more difficult than its point value suggests. Ultimately, the biggest difficulty here was simply understanding what was required, though the first example did wonders to help clear this up. In particular, a weighted average over all possible values of p is what we are after.

So, how do we do that? If you caught the scent and think this problem has an odor of calculus about it, you are absolutely correct. But, what do we want to integrate? Well, actually we want two integrals... the first is of:
(Probability previous games turned out as they did) * (Probability of winning required games)
And the second is just (Probability previous games turned out as they did)

Each of these two probabilities can [in theory, at least] be represented by a polynomial in p (in particular, we would need to employ binomial probability polynomials here). So it seems like computing the definite integral symbolically should be possible, and should guarantee an accurate result. But when we examine the constraints and see that prevWins and prevLosses may each be up to 100, we should immediately realize the magnitude of the binomial coefficients we would need to calculate.

Instead, a numerical approach may make more sense, and is arguably simpler to code. To do this, we simply divide the interval from 0 to 1 into many very small intervals, and approximate the integral for each. This is where things get troublesome once more, though, as we are constrained by accuracy. In particular, integration by the trapezoid rule often fails due to precision errors, so instead Simpson's rule should be used.
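A sketch of this numerical approach follows. The p^w * (1-p)^l likelihood is as described above (its binomial coefficient cancels in the ratio); treating the required games as a simple win streak of length `needed` is an illustrative placeholder for the real series condition, and all names are assumptions:

```python
def simpson(f, a, b, n=10000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def expected_win_probability(prev_wins, prev_losses, needed):
    """Weighted average, over all p, of the chance of winning the next
    `needed` games in a row, weighting by how likely each p is to have
    produced the observed record."""
    likelihood = lambda p: p ** prev_wins * (1 - p) ** prev_losses
    numerator = simpson(lambda p: likelihood(p) * p ** needed, 0.0, 1.0)
    denominator = simpson(likelihood, 0.0, 1.0)
    return numerator / denominator
```

With no prior record, the result reduces to the plain average of p over [0, 1], which is a quick sanity check on the integrator.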

Once we calculate the two integrals, we simply divide to obtain our weighted average, and we thus have our final result.

By timmac
TopCoder Member