JOIN
 Match Editorial
SRM 335
Wednesday, January 17, 2007

## Match summary

Single Round Match 335 saw a fairly high number of coders face a seemingly innocent -- but tough -- set of problems. In both divisions there were many failed submissions in more than one problem. Luckily, many failures arose during the challenge phase, which was a great deal of fun and resulted in big points for successful challengers.

In division 1, Petr took the lead at the end of the match, despite his resubmission of the medium, thanks to the high score in both the easy and the hard problems and +200 during the challenge phase. nika was second with solid performances on the problems and a n extra +150 due to challenges. kalinov came in third because he "just" got +50 on challenges.

In division 2, three newcomers attained yellow ranking by monopolizing the podium -- although first, second and third all finished within 75 points of each other, there was a margin of more than 150 point between them and the fourth place finisher. Suprisingly, the three top finishers all got the problems and exactly one succesful challenge each. First place was taken by MasterZerg with the fastest submission of the hard. leoveanu_unu came in second and jekica took third.

# The Problems

Palindromize
Used as: Division Two - Level One:
 Value 250 Submission Rate 453 / 570 (79.47%) Success Rate 371 / 453 (81.90%) High Score sidag011235 for 249.11 points (1 mins 42 secs) Average Score 185.65 (for 371 correct submissions)
This problem was a little tougher than the usual Division 2 easy problem, but the examples were very useful in helping to solve it. To start off, you have the string s and assume that the best answer adds exactly k characters. Those k characters have to be the first k characters of s reversed because, after adding them, we need it to become a palindrome. We can try every possible k until we find a palindrome. For an implementation on this idea see leoveanu_unu's code.

Since this means the last size(s)-k characters of s must form a palindrome, it was also possible to find for the largest such palindrome (a postfix of s) and then complete it properly.

Multifactorial
Used as: Division Two - Level Two:
 Value 500 Submission Rate 392 / 570 (68.77%) Success Rate 87 / 392 (22.19%) High Score jekica for 492.71 points (3 mins 28 secs) Average Score 354.88 (for 87 correct submissions)
Used as: Division One - Level One:
 Value 250 Submission Rate 480 / 487 (98.56%) Success Rate 288 / 480 (60.00%) High Score Petr for 248.99 points (1 mins 49 secs) Average Score 215.79 (for 288 correct submissions)
This problem was about having a safe way to detect overflow in multiplication -- no matter what the input was, the overflow or the result would appear with at most 20 factors multiplied, so runtime was no issue. Since the limit was close to the upper bound of a 64 bit integer, the detection method had to be a little trickier than a simple greater than comparison.

The selection of the method is a matter of personal choice, but my opinion is that the best approach is to multiply and then check if the result is correct with a division and a modulus comparison. Since division and modulus never overflow, this check is completely safe. Another way of looking at this is to see if the maximum possible value divided by one of the factors is less than the other factor. Both approaches are equally safe and are fairly similar. To see the code for this, check out the implementation Petr that was used to get the fastest submission.

It was also safe to use large integer arithmetic. This was especially useful for Java coders. For an example of this see jekica's code.

Using floating point allows a greater upper bound than integers, but it's tricky because of the loss of precision. Suprisingly, 80 bit floating points have enough precision to solve the problem, according to wefgef's solution.

In a language without long doubles, you could use a mix of integer calculation for the cases without overflow or with overflow by a little margin and a regular double to detect larger margin overflows, as kalinov did.

MinimumVariancePartition
Used as: Division Two - Level Three:
 Value 1000 Submission Rate 68 / 570 (11.93%) Success Rate 32 / 68 (47.06%) High Score MasterZerg for 787.65 points (15 mins 39 secs) Average Score 581.91 (for 32 correct submissions)
This problem had one important first step after which, if you had enough experience with DP, it was pretty straightforward. Many coders realized, without the need of a formal proof, that it was always optimal to sort the sequence and then cut that it into pieces instead of taking subsets. I won't give a formal proof here, but you can try doing it by taking an optimal solution that couldn't come from cutting the sorted sequence and take it to one that could with little steps that never increase the sum of variances.

Once you grasped this idea, the problem is pretty classical: Among all the ways of cutting this, take the one that minimizes a certain function (in this case, the sum of variances). Formally, we have a function f(i,k) that represent the least sum of variances that can be obtained from the sequence that starts at index i, splitting it into k or less pieces (since it's always better or equal to use more pieces, there is no need to worry about making it exactly k, though it was pretty simple to do so). So, if A is the sorted array of mixed samples:
• f(i,k) = 0 if i >= length of A
• f(i,k) = infinite if k <= 0 and i < length of A
• f(i,k) = min{ f(j,k-1)+var(i,j) | i < j <= n }
where var is the function that calculates the variance of the partition that starts at index i and ends at index j-1, inclusive. From the above formal definition to a memoized implementation is just a little step.

Check out diam1's solution for an example of the iterative version.

ExpensiveTravel
Used as: Division One - Level Two:
 Value 550 Submission Rate 254 / 487 (52.16%) Success Rate 105 / 254 (41.34%) High Score krijgertje for 488.89 points (10 mins 18 secs) Average Score 294.60 (for 105 correct submissions)
This problem sounded strange when read too fast, but the extra 50 points should have warned coders to approach it with care. Many Division 1 coders realized that this was a shortest path problem right away, but the exact way to do it was not that obvious. There were 2 possible approaches to it.

Modified dijkstra
In this approach the basic idea is that if a node (cell) is on the path you are taking, you always want to get there in the first interval that is possible. Among two ways of doing it during the same interval, you always want the one that has less cost consumed for that interval. From this reasoning, you could code a dijkstra algorithm (even the O(n2) version, since there are at most 2500 nodes) where at each node you keep the best pair <interval,consumed_cost> found so far. As you can easily see, the invariant for the algorithm is mantained with this kind of cost (furthermore, you can use any cost that has a total ordering of preference). When examining edges out of the current closing node, you must check to see if it can be traversed during this interval, or if it can be done by waiting until the next interval and then tranversing the edge

For a clean implementation of this approach, see the solution by bmerry.

BFS + dijkstra to calculate edges
This idea was longer to code, but much more straightforward from the problem statement. Simpy run a dijkstra from each cell to see which cells are reachable during an interval, add an edge to them, then do a BFS in the resulting graph. The only trick here was that you needed a O(m*log(n)) dijkstra implementation in order to get in on time, or to realize that only cells with manhattan distance less than 9 were reachable to avoid getting an overall execution time of roughly O(n3), which was actually 180*n2. Both cases, however, had a danger of timing out (actually, some naive implementations of a n2 dijkstra may work if you maintain a list with the currently open nodes instead of iterating all of them when looking for the one to close next).

Check out misof's implementation on this idea.

RandomFights
Used as: Division One - Level Three:
 Value 950 Submission Rate 110 / 487 (22.59%) Success Rate 78 / 110 (70.91%) High Score Petr for 896.71 points (7 mins 0 secs) Average Score 518.43 (for 78 correct submissions)
This problem had two important parts. The first was to reduce the formula from all possible pairings (n!) to all possible pairs (n2). Since O(n2) was too high in this case, we needed to play with the formula a little to find a faster way to calculate its result.

Precision was also an important issue. In order to get it correct, you needed to do almost every calculation on integers (long double alone was also not enough).

First part: We need to calculate the expected result for any possible pairing. Since every possible pair is in the same situation in the formula, then the correct result must be proportional to the average result for a pair. In fact, it is the sum of the results of all pairs divided by n (this is the same as the average result multiplied by n). This intuitive idea can be formalized by looking at the formula for the expected result and associating to each pair its number of appearances in the sum (it's easy to see that it's the same number for all pairs).

Second part: We need the average value of a fight, and iterating through every possible pair is not feasible because of the size of n. If we fix one member of A (let's call it a) we'll see that it contributes some positive points -- because of the fights with all members of B with less strength -- as well as some negative ones -- because of the fights with members of B with greater strength. If B is sorted, these two sets form a prefix and a postfix of the B sequence, respectively. We are getting close, now we need to calculate the sum of each. Both are of the form
`(a-x1)2+(a-x2)2...+(a-xk)2`
for some X = {x1,...,xk} each time (X contained in B), so we can rewrite the formula as:
`a2*k+sumOfSquares(X)-2*a*sumOf(X).`
By calculating the accumulated sum and the accumulated sum of squares of B we can calculate the sum or the sum of the squares of any contiguous subsequence given its start and end indices, including any prefix or postfix. Since we can calculate those indices with binary search, we get an overall complexity of n*log(n), which is small enough to fit on time. A clear C++ implementation of this idea follows:
```double expectedNrOfPoints(vector  AA, vector  BB, int n) {
vector<long long> A=parse(AA,n), B=parse(BB,n); //parse generates the input of size n
sort(B.begin(),B.end());
vector<long long> SB(n+1,0);  //SB2[i]=sum of B[x] up to i-1
vector<long long> SB2(n+1,0); //SB2[i]=sum of B[x]2 up to i-1
long long r=0;
for(int i=1;i<=n;++i) SB[i]=SB[i-1]+B[i-1], SB2[i]=SB2[i-1]+B[i-1]*B[i-1];
for(int i=0;i<n;++i) {
long long l=lower_bound(B.begin(),B.end(),A[i])-B.begin();
long long g=lower_bound(B.begin(),B.end(),A[i]+1)-B.begin();
long long won=A[i]*A[i]*l+SB2[l]-A[i]*SB[l]*2; //sum of all fights won by member i of A
long long lost=A[i]*A[i]*(n-g)+SB2[n]-SB2[g]-A[i]*(SB[n]-SB[g])*2; //sum of all fights lost by Ai
r+=won-lost; //we do everything with 64 bit integer arithmetic
}
return double(r)/n; //only floating point operation
}
```

By soul-net
TopCoder Member