Match Editorial
2007 TopCoder Open
Round 3

Saturday, April 14, 2007

Match summary

The second round of the tournament was by no means easy. The easy problem was made considerably trickier than expected by the use of doubles in the input. The medium problem was fairly simple – once you understood what you had to do. Apparently the problem statement was not clear enough, so a clarifying broadcast had to be issued. Finally, the hard problem, while not conceptually difficult, required a lot of careful coding, and only a few coders had enough time to finish it. Petr won the round with a comfortable 300-point lead over second-place andrewzta and third-place tomek.

The Problems

Defects
Used as: Division One - Level One:
Value 250
Submission Rate 604 / 817 (73.93%)
Success Rate 365 / 604 (60.43%)
High Score x-ray for 242.02 points (5 mins 11 secs)
Average Score 161.76 (for 365 correct submissions)

In my opinion, the use of doubles in this problem's input only made it unnecessarily tricky. The idea of the solution would remain the same with integer coordinates; this way, many coders were needlessly bitten by various precision issues. After the match many coders remarked that the situation was uncomfortable. For example, the standard "rule of doubles" about 1e-9 error tolerance did not apply to the input values – they had to be treated as exact.

When approaching the task, we should first note that the rectangle itself is completely useless: we are only allowed to move along the boundary, and its shape doesn't matter. All we need are the distances between consecutive points.

To calculate the distances, we can (for example) for each given point P calculate the counter-clockwise distance d(P) from [0,0] to P. These values uniquely describe the given points, and sorting them is equivalent to sorting the points along the boundary.

Given two points P and Q, their distance along the boundary is the smaller of the two values |d(P)-d(Q)| and perimeter-|d(P)-d(Q)|.
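
In code, the distance formula might look like this (a minimal sketch; the function name and parameters are ours, not taken from any reference solution):

```cpp
#include <algorithm>
#include <cmath>

// d(P) and d(Q) are counter-clockwise distances from [0,0] measured along
// the boundary; perimeter is the total length of the boundary.
// The boundary distance between P and Q is the shorter of the two arcs.
double boundaryDist(double dP, double dQ, double perimeter) {
    double arc = std::fabs(dP - dQ);        // one of the two arcs
    return std::min(arc, perimeter - arc);  // the shorter arc wins
}
```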

After these preparations we are ready to look for the best point. Usually the thought process in similar situations is: Place the point somewhere and look at what happens if you move it slightly. You will probably be able to move it and improve the solution until your point hits some special location. All that remains is to identify the special locations, and then try them and pick the best one.

Using this approach we may come to this conclusion: Either the location of one of the given points is the optimal place, or there is an optimal place such that one of the given points is exactly on the opposite point of the rectangle.

Proof: Imagine that we stand in some place different from all the given points, and suppose that none of the given points is exactly in the opposite place. For A of the given points the shorter distance is clockwise, for B of them it is counter-clockwise. If A<B, moving by some small amount D clockwise increases the sum of distances by D(B-A) – and, symmetrically, moving by D counter-clockwise decreases it by D(B-A) – thus the place was not optimal. Similar reasoning may be applied for the case A>B.

The case A=B remains. In this case we may move by a small amount in either direction and the sum won't change at all. Pick any direction and start moving. Sooner or later one of the numbers A or B will change. This happens exactly when we pass one of the given points, or if a point is exactly on the opposite side. Q.e.d.

The theorem that we just proved can be improved upon (how?), but in our case it is not necessary. We have a set of at most 2N candidate places for the optimal location. For each of them compute the sum of distances, and pick the best one.
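The candidate search can be sketched as follows (an illustration only, with names of our own choosing; it assumes the unrolled positions d(P) have already been computed):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Boundary distance between two unrolled positions a, b on a boundary
// of the given perimeter (the shorter of the two arcs).
double arcDist(double a, double b, double perimeter) {
    double arc = std::fabs(a - b);
    return std::min(arc, perimeter - arc);
}

// d contains the unrolled positions d(P) of the given points.
// Candidates: each given point, and the point exactly opposite to it.
// Returns the minimal achievable sum of distances over all candidates.
double bestSum(const std::vector<double>& d, double perimeter) {
    std::vector<double> cand;
    for (std::size_t i = 0; i < d.size(); ++i) {
        cand.push_back(d[i]);
        cand.push_back(std::fmod(d[i] + perimeter / 2, perimeter)); // opposite
    }
    double best = 1e18;
    for (std::size_t i = 0; i < cand.size(); ++i) {
        double sum = 0;
        for (std::size_t j = 0; j < d.size(); ++j)
            sum += arcDist(cand[i], d[j], perimeter);
        best = std::min(best, sum);
    }
    return best;
}
```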

See tomek's solution for a clean implementation.

WordSplit
Used as: Division One - Level Two:
Value 500
Submission Rate 544 / 817 (66.59%)
Success Rate 274 / 544 (50.37%)
High Score Krzysan for 488.67 points (4 mins 20 secs)
Average Score 329.86 (for 274 correct submissions)

Suppose that we decide where to make the first split. We are left with a suffix of the original string, and we have to decide how to split it so that the entire set of pieces will be optimal.

We claim that in this case it is enough to solve the original task for our suffix.

Why is it so? Clearly we need to split it into as few pieces as possible. What remains is to show that the resulting sequence of pieces will be the lexicographically smallest one.

Let P be the first part of our original string, and let S be the suffix. Suppose that we have two arbitrary ways to split S into an optimal number of pieces. Let the corresponding sorted piece sequences be B = (B[1], ..., B[n]) and C = (C[1], ..., C[n]), respectively, and suppose that (B[1], ..., B[n]) < (C[1], ..., C[n]). To construct the sequence of pieces for the original string, we now only need to insert P into its appropriate position in each of the sequences. We need to show that after we insert P, the sequence B will still be smaller than the sequence C.

Let i be the first position where B and C differ. We know that B[i]<C[i]. We just need to check three cases. If P≤B[i], we may insert it into B and C at the same position less than or equal to i, and the new sequences will differ in position i+1. Similarly, if C[i]≤P, the sequences will still differ in position i, as P may be inserted after this position. In the remaining case when B[i]<P<C[i], after inserting P we have: B[i] remains the same, and C[i] changes to P. Clearly, in all cases we still have B < C.

Thus the task is now pretty simple: To compute the optimal partitioning for a given string, try all valid places where the first piece ends, and for each of them take the optimal solution for the rest of the string. In this way for each length of the first piece you will get one candidate. Pick the best candidate.
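The recurrence can be illustrated with a memoized sketch. The actual validity rule for a piece comes from the problem statement, which is not reproduced here; below we substitute a hypothetical rule (a piece is valid iff it is at most 2 characters long) purely to make the sketch self-contained:

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical validity rule – a stand-in for the real one from the
// problem statement, chosen only to make this sketch runnable.
bool valid(const std::string& piece) {
    return piece.size() <= 2;
}

// For the suffix s[from..] return the minimal number of pieces together
// with the lexicographically smallest sorted sequence of those pieces.
typedef std::pair<int, std::vector<std::string> > Result;

Result solve(const std::string& s, int from, std::map<int, Result>& memo) {
    int n = s.size();
    if (from == n) return Result(0, std::vector<std::string>());
    std::map<int, Result>::iterator it = memo.find(from);
    if (it != memo.end()) return it->second;
    Result best(n + 1, std::vector<std::string>());   // sentinel: unsplittable
    for (int to = from + 1; to <= n; ++to) {
        std::string piece = s.substr(from, to - from);
        if (!valid(piece)) continue;
        Result rest = solve(s, to, memo);
        if (rest.first > n) continue;                 // suffix unsplittable
        Result cand(rest.first + 1, rest.second);
        // insert the new piece at its sorted position
        cand.second.insert(std::upper_bound(cand.second.begin(),
                                            cand.second.end(), piece),
                           piece);
        best = std::min(best, cand);  // fewer pieces first, then operator<
    }
    return memo[from] = best;
}
```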

Recursion with memoization – or, equivalently, a linear dynamic programming – was probably the simplest way to implement this task. As the time limit was not an issue with this approach, you could go for convenience. For example, C++ coders could use the power of STL – do you know what operator< does if the operands are vectors? See my solution for a reasonably clean implementation of this approach.
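To answer the rhetorical question: operator< on two vectors performs an element-wise lexicographic comparison – exactly the order we need on sorted piece sequences. A tiny demonstration:

```cpp
#include <string>
#include <vector>

// operator< on vectors compares element by element, lexicographically,
// much like strings do – a vector that is a proper prefix of another
// compares as smaller.
bool lexSmaller(const std::vector<std::string>& a,
                const std::vector<std::string>& b) {
    return a < b;
}
```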

Neaten
Used as: Division One - Level Three:
Value 1000
Submission Rate 45 / 817 (5.51%)
Success Rate 9 / 45 (20.00%)
High Score Petr for 691.88 points (21 mins 2 secs)
Average Score 493.76 (for 9 correct submissions)

We will show a solution that doesn't use any fancy BigInteger stuff. We don't need too many operations on large numbers, and those we will need will be simple enough.

We will start by normalizing the input number. We will convert it to a vector of digits, with say 120 decimal places. To do this, start by locating the decimal point; if it is not present, add it at the end. Append zeroes to the right end of the input number until it has enough decimal places. Check whether the number is less than 10^-k; if yes, return "0" and exit. Finally, delete unnecessary leading zeroes.
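The padding and zero-stripping steps might be sketched as follows (names are ours; the comparison against 10^-k is omitted here, since it needs the rest of the machinery):

```cpp
#include <string>

// Normalize a decimal string: ensure a decimal point is present, pad to a
// fixed number of decimal places, and strip redundant leading zeroes
// (always keeping at least one digit before the decimal point).
std::string normalize(std::string s, int decimals) {
    if (s.find('.') == std::string::npos) s += '.';   // add the decimal point
    while (s.size() - s.find('.') - 1 < (std::string::size_type)decimals)
        s += '0';                                     // pad decimal places
    while (s.size() > 1 && s[0] == '0' && s[1] != '.')
        s.erase(0, 1);                                // strip leading zeroes
    return s;
}
```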

Now we have to find the interval of valid approximations. For the absolute error the interval is (number - 10^-k, number + 10^-k). For the relative error the interval is (number*(1 - 10^-k), number*(1 + 10^-k)). Clearly for number ≥ 1 the relative error gives us the larger interval, and for number < 1 the absolute error is better.

In either case, let L and U be the lower and upper bound of the interval of valid approximations. Clearly both L and U have a finite decimal expansion and they have less than 120 decimal digits. Thus we can compute and store them exactly. All we need to compute them is to implement addition and subtraction for large decimals – e.g., for the relative error we can compute U by adding number and (number shifted k digits to the right).

Now we need to find the "most round" number in the interval (L,U). This can be done simply by manipulating the digits of L and U. We will show it on a simple example:

L = 13.035935623512
U = 13.036253252352

Suppose that we fix the number of digits left in the resulting approximation. Now we need to check whether there is such a number in the interval (L,U). The number has to be larger than L. We can obtain the smallest such number by taking L, removing its last digits to get the correct length, and then adding 1 at the last remaining position (propagating carries if necessary). If this candidate lies within (L,U), it is the approximation we wanted. If not, we know that an approximation of this length is not possible.

For example, consider the above case and 8 digits.

L_truncated = 13.035935
candidate   = 13.035936
candidate   < U

Thus we found an approximation with 8 digits. If we now consider 6 digits only, we get the following approximation:

L_truncated = 13.0359
candidate   = 13.0360
candidate   < U

Thus we found an approximation with 6 digits. This is clearly not an optimal approximation, as it ends with a zero. However, this does not matter: in such a case we will always discover the same approximation when we restrict ourselves to fewer (in our case, 5) digits.
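On aligned digit strings (decimal points removed, equal lengths), the whole check fits in a few lines. A sketch, assuming L and U have already been brought to this common form:

```cpp
#include <string>

// L and U are aligned digit strings of equal length (decimal point removed).
// Try to build a candidate that keeps only the given number of leading
// digits: truncate L, add 1 at the last kept position (propagating carries),
// and pad with zeroes back to full length.
// Returns the candidate, or "" if no such number lies in the open (L, U).
std::string candidate(const std::string& L, const std::string& U, int digits) {
    std::string c = L.substr(0, digits);
    int pos = digits - 1;
    while (pos >= 0 && c[pos] == '9') c[pos--] = '0';  // propagate the carry
    if (pos < 0) return "";                            // overflowed – give up
    c[pos]++;
    c.append(L.size() - digits, '0');                  // pad to full length
    // equal-length digit strings compare correctly as plain strings
    return (L < c && c < U) ? c : "";
}
```

Note that because all strings have equal length and contain only digits, ordinary string comparison coincides with numeric comparison – the same trick the big-integer remark at the end of this write-up relies on.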

And we are almost done. As the numbers are reasonably small, all we need to do is check all possible numbers of digits and, for each of them, check whether it leads to a valid approximation. Note that we are limited by the position of the decimal point: we cannot drop the digits that are to its left. The solution is reasonably simple to implement – we only need addition, subtraction and comparison, and as we fixed the number of decimal places, these operations look exactly the same as with big integers.

You can find my solution implementing this approach in the practice room.

By misof
TopCoder Member