Match Editorial

SRM 281
Thursday, January 5, 2006

Match summary

SRM 281 proved to be very challenging for most competitors and sported low success rates on all but one problem. Congratulations to Petr for coming out on top of division 1, despite having to resubmit the hard problem. andrewzta and misof followed closely in second and third.

Coders in division 2 faced tricky easy and medium problems, but a relatively easy 900-pointer which more than 150 competitors got right. P_YegreS_P won the match with all three successful submissions and 4 challenges for a rating boost of almost 200 points and a chance to participate in division 1 for the first time.

The Problems

RunLengthEncoding

Used as: Division Two - Level One:

Value 250

Submission Rate 357 / 396 (90.15%)

Success Rate 186 / 357 (52.10%)

High Score jakub for 245.31 points (3 mins 56 secs)

Average Score 187.56 (for 186 correct submissions)

As illustrated by the examples, there's a little more to this problem than it may seem at first glance: the decoded string can get very long (much longer than we can fit into memory) so it will not suffice to just reconstruct the string and then check if the result is too long. To avoid this, we can observe that the result will surely be too long if any three consecutive characters in the input are decimal digits. After we've made sure that this isn't the case, the result can become no longer than about 1600 characters, which will easily fit into memory. The rest of the problem is string manipulation.

Alternatively, the problem can be solved in one pass using Horner's method of parsing integers from strings left to right. It starts from zero and, as each new digit is processed, multiplies the number by 10 and adds the new digit. This way we can detect large integers early enough and bail out. Here's how it could have looked in C++:

   string decode( string text )
   {
      string ret;
      int rep = 0;
      for ( int i=0; i<(int)text.length(); ++i ) {
         if ( isdigit( text[i] ) ) {
            // new digit, continue building integer
            rep = rep*10 + ( text[i] - '0' );
            if ( rep > 50 )
               return "TOO LONG";
         }
         else {
            // letter, append rep occurences to the return
            if ( rep < 1 ) rep = 1;
            if ( ret.length() + rep > 50 )
               return "TOO LONG";
            
            ret += string( rep, text[i] );
            rep = 0;
         }            
      }

      return ret;
   }

IntegerGenerator

Used as: Division Two - Level Two:

Value 500

Submission Rate 291 / 396 (73.48%)

Success Rate 52 / 291 (17.87%)

High Score jianqinf for 425.21 points (12 mins 22 secs)

Average Score 280.75 (for 52 correct submissions)

Used as: Division One - Level One:

Value 250

Submission Rate 333 / 345 (96.52%)

Success Rate 166 / 333 (49.85%)

High Score bmerry for 239.33 points (6 mins 3 secs)

Average Score 184.51 (for 166 correct submissions)

The constraints were set up so that an approach which repeatedly increases the current number until it runs into a valid number was sure to time out (such an algorithm can take about 90 billion steps in the worst case). A good rule of thumb which helps avoid failure by timeout is to always test your solution on large cases, regardless of the point value of the problem.

A much more efficient algorithm parallels how we increase a decimal number (on paper, say): we add 1 to the last digit and if it was 9, set it to 0, carry 1 to the left and repeat. The same algorithm can be generalized and applied here: change the last digit to the next smallest allowed digit. If the last digit was already the largest possible, instead set it to the smallest allowed and proceed with the second last digit etc.

Another approach coders tried was (if allowed contains B digits) to interpret the input as a base-B integer (which fits into a 64-bit integer type), increase it by one and then convert back. There were many tricks to this, such as leading zeros in the base-B representaion. Most successful submissions used the first algorithm.

One more common mistake was failing to notice that, while the input number was at most 10 digits long, the output could be 11 digits. Examples 4 and 6 in the problem statement served as hints for this case.

BinarySearchable

Used as: Division Two - Level Three:

Value 900

Submission Rate 200 / 396 (50.51%)

Success Rate 162 / 200 (81.00%)

High Score fengjun for 888.25 points (3 mins 16 secs)

Average Score 733.65 (for 162 correct submissions)

The pseudo-code in the problem statement can be directly converted into a dynamic programming solution. For each element X and pair of indices a and b, calculate f(X, a, b), a boolean function which returns true if X would be found in the subsequence of the original sequence represented by indices a and b. To calculate f(X, a, b) for some X, a and b, we try all possible choices of pivots and recursively evaluate f, with the new a and b depending on X and the pivot. If X would be found for all choices of pivots, then f(X, a, b) is true. Caching the results yields a O(n⁴) algorithm, which will easily do.

Most (if not all) competitors who successfully solved the problem recognized a much simpler (and more efficient) solution. It is easy to prove that an element X is not binary searchable if and only if there is an element before X larger than it or if there is an element after X smaller than it. What follows is a simple O(n²) algorithm which can even be improved to O(n) with a little preprocessing.

   public int howMany( int[] sequence )
   {
      int ret = 0;
      for ( int i=0; i<sequence.length; ++i ) {
         boolean ok = true;
         for ( int j=0; j<i && ok; ++j )
            if ( sequence[j] > sequence[i] )
               ok = false;

         for ( int j=i+1; j<sequence.length && ok; ++j ) 
            if ( sequence[j] < sequence[i] )
               ok = false;

         if ( ok )
            ++ret;
      }
      
      return ret;
   }

BallBouncing

Used as: Division One - Level Two:

Value 600

Submission Rate 87 / 345 (25.22%)

Success Rate 35 / 87 (40.23%)

High Score KOTEHOK for 461.15 points (16 mins 40 secs)

Average Score 305.65 (for 35 correct submissions)

With a board of size at most 1000000x1000000, a straightforward simulation (step by step) would time out so a different approach was required. One important observation is that, given the current position and direction of the ball, it is possible to determine where it will bounce next (position and direction) in O(1) time using arithmetic. If we're able to determine a reasonable upper bound on the number of bounces then a simulation bounce by bounce will be fast enough. One such upper bound is 8 million bounces, because there are about 4 million boundary points in the worst case and the ball can enter each boundary point from two directions. As it turns out, 4 million bounces was also good enough (see the analysis by members in the forum for this match). Additionally, by the pigeonhole principle, if the ball does that many bounces then it has surely visited some state twice and thus run into a cycle, so we can return -1.

One implementation detail was checking whether the ball will fall into a hole before the next bounce. It was intended that this check be done in O(1) time rather than in O(holes), although coders got away with the latter. An important observation is that the expression row-col is constant for all squares on a single "slash" diagonal (line with slope +1). Similarly, all squares on a "backslash" diagonal (line with slope -1) have constant row+col. By preprocessing the holes (keeping track of which diagonals have holes on them) it is possible to do the check in O(1) time during the simulation. See Petr's solution for a clean implementation.

Equidistance

Used as: Division One - Level Three:

Value 1000

Submission Rate 68 / 345 (19.71%)

Success Rate 23 / 68 (33.82%)

High Score Egor for 872.30 points (11 mins 12 secs)

Average Score 610.28 (for 23 correct submissions)

The division 1 hard problem was quite mathy and required a lot of intuition to solve. I'll try and go about describing how to solve the problem and prove claims along the way. To simplify things, let's make a few definitions: we are given a sequence of integers a₁, a₂ … a_n, sorted in ascending order (sorting is the first step in the solution which, unfortunately, some coders missed). What we are to find is a sequence δ₁, δ₂ … δ_n (where δ_i is how much we move item i; the difference between its positions after and before) such that a_i+δ_i - (a_i-1+δ_i-1) equals some constant D for all positive i and the sum of all abs(δ_i) is minimal.

The first osbservation is that there exists an optimal arrangement (arrangement with the smallest possible total effort) in which at least one of the items does not move. This can be proven by contradiction: assume that all optimal arrangements have no items which don't move. But then, if we take any such optimal arrangement, we can move all of the items left by one or right by one without increasing the cost. If we continue moving the items in that direction, we will eventually run into an arrangement in which at least one of the items is fixed (and the effort is no greater than that of the starting optimal arrangement).

One wrong assumption some coders made is that an optimal arrangement exists in which at least two of the items remain fixed. While this would be true if the items' positions could be real numbers, it isn't in this problem.

The next part of the solution is to show that for a fixed item a_f (δ_f=0) and a chosen D, the positions of all the other items are uniquely determined. This should be obvious and the actual formula to calculate δ_i is δ_i = a_f + (i-f)D - a_i.

What's left is to analyze how total effort changes with D (when one item is fixed). As we said, total effort is the sum of abs(δ_i) for each i. But each of these component functions is, according to the formula from the previous paragraph, a "spike" (it would be a line if it weren't for the absolute value) with slope abs(i-f). Since this function is convex and the sum (actually, any linear combination with positive coefficients) of convex functions is also a convex function, the sum is also unimodal (decreases, reaches minimum, then increases again) and we can use ternary search to find the minimum. A number of successful submissions used this approach, for example those by misof and John Dethridge.

An alternate solution is to go even deeper into analyzing the function and prove that the minimum must be somewhere around (ceil or floor) the minima (spikes) of the component functions (if real numbers were allowed, then the total minimum would be exactly at the minimum of one of the component functions). The formal proof for this is similar to what we used in the first observation. This solution takes O(n³) time. Petr and andrewzta are among those who used this approach during the contest. Here is the code which implements this approach:

   long effort( int[] initial, int fixed, long D )
   {
      if ( D < 1 ) D = 1;
      
      long ret = 0;
      for ( int k=0; k<initial.length; ++k )
         ret += Math.abs( initial[fixed] + (k-fixed)*D - initial[k] );
      return ret;
   }
   
   public long minimumEffort( int[] initial )
   {
      Arrays.sort( initial );
      long best = Long.MAX_VALUE;

      int n = initial.length;
      for ( int i=0; i<n; ++i )
         for ( int j=0; j<n; ++j ) {
            double D = (double)( (long)initial[j] - initial[i] ) / (j-i);

            best = Math.min( best, effort( initial, i, (long)Math.floor(D) ) );
            best = Math.min( best, effort( initial, i, (long)Math.ceil(D) ) );
         }

      return best;
   }
}

By lovro
TopCoder Member