JOIN
 Match Editorial
SRM 348
Wednesday, May 9, 2007

## Match summary

The turnout for this match -- 839 competitors -- was the lowest since January, however that couldn't stop it from being very exciting, as usual. Division 1 was won by Petr, with a 200+ point gap over second place marek.cygan and third place tomek.

Relatively standard easy and medium problems led to many coders being able to approach the hard, and to very tight standings at the top after the coding phase. However, as most of the hards failed during either the challenge or the system test, only 12 coders got the hard right, and only 8 of them got all three problems. The challenge phase was quite eventful as some of the competitors tried to improve their ranking, but the most notable change was tomek losing his (ultimately) second place due to an unsuccessful challenge.

Division 2 was won by newcomer FHead, jumping well into division 1 for the next match. He was followed by karthik123 and macs.

# The Problems

OptimalList
Used as: Division Two - Level One:
 Value 250 Submission Rate 401 / 434 (92.40%) Success Rate 361 / 401 (90.02%) High Score Samlik for 247.72 points (2 mins 44 secs) Average Score 200.23 (for 361 correct submissions)

This problem is strongly tied with the notion of the Cartesian coordinate system. That is, no matter how complicated the sequence of instructions is, the location of the grandma's house is uniquely determined by two integer numbers, called coordinates — how many blocks should Billy walk horizontally (east or west) and how many blocks should he walk vertically (north or south). Let's call the first number dx, and the second number dy.

We'll make dx positive if Billy needs to walk east and negative if he needs to walk west (i.e., dx=-3 means he needs to go 3 blocks west); positive dy will correspond to walking north. To calculate dx then we need just to subtract the number of 'W's in the input string from the number of 'E's in it, dy will be the number of 'N's minus the number of 'S's.

Once we've determined the coordinates of grandma's house, it becomes easier to find the shortest path: we know the moves we should do horizontally (if dx is positive, we need to walk only east, and not west, or else the path would not be shortest; a similar argument applies in other cases) and vertically, so we're practically there. The only thing left to do is to find the alphabetically first sequence of moves — but that's straightforward: 'E's go before 'N's and 'S's, 'W's go after 'N's and 'S's.

See dplass' solution for an example of implementing this approach.

This approach can also be rephrased as the following greedy one: while the current instructions string contains both 'W's and 'E's, remove one of each, do the same for 'N's and 'S's. Sort the result and return as an answer. For an implementation, see m4risU's code.

LostParentheses
Used as: Division Two - Level Two:
 Value 500 Submission Rate 270 / 434 (62.21%) Success Rate 188 / 270 (69.63%) High Score mi6091 for 486.14 points (4 mins 49 secs) Average Score 310.43 (for 188 correct submissions)
Used as: Division One - Level One:
 Value 250 Submission Rate 389 / 401 (97.01%) Success Rate 344 / 389 (88.43%) High Score gevak for 248.75 points (2 mins 1 secs) Average Score 197.36 (for 344 correct submissions)

This problem had several possible correct approaches. Arguably the easiest of those was the following: the numbers before the first minus sign should be added to the result, and the numbers after should be subtracted, as coded by Quelloquialism. To understand why this approach works, we need to prove two facts: that this result is achievable, and that it is minimal possible. We can achieve that number by putting brackets between consecutive minuses, as follows: 3+5+7-(2+3+10)-(99+57)-(22)-(2)-(3+4) — that way all the numbers in the brackets are subtracted. We can't reduce the result even further, as the numbers before the first minus get added to the final result no matter how we put the brackets.

If you've solved a lot of similar 'add-brackets' problems before, you could easily skip this easy solution and go for a dynamic programming one. That is, let's find the minimal possible and maximal possible values for every substring of the input string that starts and ends with a number. To find the minimum for a particular substring, we iterate over all possibilities of which operator will be evaluated last. If that operator is a '+', we need to take the minimum possible values of the strings to the left of it and to the right of it, and add them. If it's a '-', we need to take minimum on the left and maximum on the right. The maximum is computed similarly. We can evaluate the minima and maxima in the order of increasing substring length to be able to use previously computed values, or we can implement the memoizing approach, like marek.cygan did. For more information on dynamic programming, see the tutorial.

In fact, this problem allowed even some seemingly incorrect approaches to pass. For example, you could make a mistake in the DP and not compute the maxima, assuming the minimum can always be obtained by adding or subtracting two minima, and still succeed because the method of putting brackets that was discussed above when proving the greedy algorithm can indeed be obtained by considering only minima.

IncreasingSubsequences
Used as: Division Two - Level Three:
 Value 1000 Submission Rate 47 / 434 (10.83%) Success Rate 21 / 47 (44.68%) High Score macs for 784.26 points (15 mins 50 secs) Average Score 572.72 (for 21 correct submissions)

To solve this problem, one needed to think about how to tell if a given increasing subsequence is maximal. The answer is not so complicated, and is obtained by trying to unerase any other number: an increasing subsequence is maximal, if all the numbers erased before its first number are not less than it, all numbers erased after its last number are not greater than it, and all numbers erased between any two its consecutive numbers are either not greater than the first of those two, or not less than the second.

One could then test all increasing subsequences, or even try to write some backtracking solution, but both these methods will probably time out, because the answer can be as high as about 86 million (try the sequence 3,2,1,6,5,4,9,8,7,...).

But the nature of the problem — the fact that the maximality test can be done separately for each segment between the two consecutive numbers of the sequence — leads us to a dynamic programming solution. Let's define am[i] as the number of maximal increasing subsequences of the first i elements of our sequence that end with exactly i-th element. To compute am[i], we first check if the i-th element can start a maximal subsequence, by checking that no element before is less than it. If it can, we initialize am[i] with one, otherwise with zero. We then check all the possible positions j of the previous element of the subsequence, and if i-th element can follow j-th (i.e., the former is greater than the latter and no element can be inserted in the middle), we increase am[i] by am[j]. To compute the final answer, we sum all the am's for the positions that can end a maximal subsequence — that is, for which no following number is greater.

See FHead's solution for an implementation of this approach.

RailwayTickets
Used as: Division One - Level Two:
 Value 500 Submission Rate 249 / 401 (62.09%) Success Rate 122 / 249 (49.00%) High Score abikbaev for 475.20 points (6 mins 33 secs) Average Score 313.73 (for 122 correct submissions)

The first thing to understand in this problem is that we should not be overconcerned with assigning seats to passengers. The only thing that really matters is that, at any time, there should be no more than seats than passengers inside the train. To prove that, we'll simply allow every entering passenger that has his request accepted to sit at any unoccupied seat. Obviously this strategy will fail only if there are more than seats than passengers traveling simultaneously.

So, long story short, we have a set of segments on a line and need to choose a subset such that every point is covered by at most seats segments. In case seats is equal to 1, this problem is called an "Activity Selection Problem", and its solution is described in the tutorial on greedy algorithms. Try to read and understand the first part of that article (until "BioScore" caption) before continuing with this editorial.

How to adapt that solution to work with a train with multiple seats? It turns out that the same greedy approach works: we sort the passengers in the increasing order of their exit (and not enter!) time, and then process them in the obtained order and assign any passenger a seat while it's possible. Why does it work? The proof is also similar to the one described in the above tutorial. Assume that at some point we take passenger i, but not taking her could lead us to a better solution — and the seat we give to passenger i is given to passenger j instead. Let's consider the first such moment, i.e., let's assume that all passengers that we placed earlier retain their seats. But since passenger i has exit time not later than passenger j and all other remaining passengers, we can replace passenger j with passenger i without affecting the passengers placed later, and we don't affect the passengers placed earlier by the choice of i, so we've obtained a correct solution of the same size but still containing passenger i. So there can be no error in following our greedy algorithm.

The easiest way to check if adding the next passenger is possible, is to keep the number of occupied seats for each unit segment of time (the segment from time x to time x+1). For an implementation of this approach, see RAVEman's solution.

This problem also allows for a totally different greedy approach. At first, we will accept all the requests, and try to simulate the train's travel. At any moment of time, we will remember the exit times of the passengers that are on the train. When some passenger enters the train, we will add his exit time to the list. When he exits, we remove it. The only possible problem is that after some passenger enters we might get more than seats passengers onboard — that's when we need to reject some request. But whose request we will reject? After some thinking we can deduce that we should reject the request with the latest exit time of those currently onboard; the formal proof of this fact is left to the reader. A solution implementing this approach will also be quite simple, like that of abikbaev.

The main difficulty in this problem is that it also has a lot of incorrect greedy approaches. For example, sorting by the starting time or choosing the shortest trip are incorrect. To avoid falling into that trap, one needed to be either lucky or capable of proving her solution. This is really a situation where one can't be too careful.

NormalizingTrees
Used as: Division One - Level Three:
 Value 1000 Submission Rate 55 / 401 (13.72%) Success Rate 12 / 55 (21.82%) High Score Petr for 770.47 points (16 mins 34 secs) Average Score 549.84 (for 12 correct submissions)

The first, straightforward step in the solution of this problem is to construct the tree (the list of adjacent vertices for each vertex) given the input array.

Then, let's try all possibilities for the root vertex. Assuming we've chosen the root, we need to renumber all the vertices in such a way that the resulting vector is smallest lexicographically.

First, the root should obviously be labeled 0, as it is the only possibility for the resulting vector to have -1 at the 0th position. Then, all the root's children will correspond to zeroes in the resulting vector, and thus we need to label them with 1 to s in some order — it is the only possibility for the resulting vector to have zeroes at next s positions, in all other cases some of these numbers will be positive. Using the same argument we find that we need to label the children of vertex 1 with next consecutive numbers, then children of vertex 2, etc. The only freedom left is choosing the order of children at each step.

Consider the children of vertex 0. Their zeroes will be followed by the ones of the children of vertex 1, and then by larger numbers. Thus, we need the number of ones to be as big as possible, choosing the vertex with the greatest number of children as the first child. Or, in other words, we need to sort the children of vertex 0 in the decreasing order of the number of their children. But what to do if several of those numbers are equals? Then we need to check the number of children of the first child, etc.

All this sorting seems too complicated. But it turns out that we can formulate the sorting criterion in a simple way. Assume our function to find the lexicographically smallest description is called calc. Let's sort the children in the lexicographical order of the descriptions obtained by calling calc recursively for them, but with a slight modification: if one of such descriptions is a starting segment of the other, the former should be treated to be greater than the latter, and usually (for example, like vectors are compared in C++ by default) it's the other way around. To achieve that modification, we can append infinity to every description, then our sorting becomes just the sorting of vectors.

It's not at all obvious why the above sorting works, but we'll omit the formal proof because of its length and tediousness. To prove that for yourself, try to understand step-by-step how the sequence for the parent is obtained from the sequences for the children.

tomek's solution is an excellent implementation of the above approach, refer to it for the details.

By Petr
TopCoder Member