Match Editorial

SRM 402
Saturday, May 24, 2008

Match summary

Being conflicted with IPSC, the first SRM after TCO08 still attracted 1301 coders.

In Division 1 competitors faced a rather difficult problem set. Being the only coder who got 1000+ points, Petr won the first place, followed by rem, the only coder who solved the hard problem except Petr. The third place went to Crush, thanks to his 7 successful challenges on the medium.

Division 2 was won by a newcomer SergeiRogulenko, with fast submissions on all three problems. He could have still won the match even without all his 5 successful challenges! He was followed by eraserhd and Landrew.

The Problems

WordAbbreviation

Used as: Division Two - Level One:

Value 250

Submission Rate 583 / 753 (77.42%)

Success Rate 482 / 583 (82.68%)

High Score IlyaPonamarev for 247.69 points (2 mins 44 secs)

Average Score 183.33 (for 482 correct submissions)

This is a rather straightforward problem, which could be solved by following what the problem statement said. Here is a complete solution in Java.

int n = words.length;
String[] ans = new String[n];
for(int i = 0; i < n; i++)
  for(int L = 1; L >= words[i].length(); L++) {
    ans[i] = words[i].substring(0, L);
    boolean ok = true;
    for(int j = 0; j < n; j++)
      if(i != j && words[j].startsWith(ans[i]))
        ok = false;
    if(ok) break;
  }
return ans;

ConsecutiveNumbers

Used as: Division Two - Level Two:

Value 500

Submission Rate 594 / 753 (78.88%)

Success Rate 146 / 594 (24.58%)

High Score darthvadar for 474.43 points (6 mins 40 secs)

Average Score 367.58 (for 146 correct submissions)

Though the solution behind this problem is rather easy, the problem was unfortunately considered to be ambiguous by many coders. It would be more clear to say "Your little son was EXPECTED to write...", and to ask for "what are the all possible numbers your son might have erased, if he did everything correctly", and finally "return an empty int[] if your son DEFINITELY did something wrong".

Sort all the integers and check adjacent ones. If the integer appears more than once, your son must have made a mistake, so return an empty int[]. If the difference between a pair of adjacent integers is larger than 2, your son must have made a mistake, again.

Now the difference between two adjacent integers is at most 2. But be careful! There should be AT MOST one pair with difference equal to 2. So immediately return an empty int[] if you see two.

Now, if there is a pair with difference 2, the son certainly have erased the number in the middle of this pair. Otherwise, there are two variants - he could erase the numbers min-1 or max+1 (where min and max are the minimum and the maximum of input numbers). One more tricky thing is to treat the case of min=1 properly (son wrote only positive integers and couldn't write 0). Here is a complete solution in Java.

Arrays.sort(a);
for (int i = 0; i < a.length-1; i++)
  if (a[i] == a[i+1] || a[i+1] > a[i] + 2)
    return new int[] {};

int c = 0;
for (int i = 0; i < a.length-1; i++)
  if (a[i+1] == a[i] + 2)
    c++;
if (c > 1) return new int[] {};

for (int i = 0; i < a.length-1; i++)
  if (a[i+1] == a[i] + 2)
    return new int[] {a[i] + 1};

if (a[0] == 1)
  return new int[] {a[a.length-1] + 1};

return new int[] {a[0] - 1, a[a.length-1] + 1};

RandomSort

Used as: Division Two - Level Three:

Value 1000

Submission Rate 142 / 753 (18.86%)

Success Rate 58 / 142 (40.85%)

High Score alopha for 927.49 points (8 mins 4 secs)

Average Score 586.53 (for 58 correct submissions)

Used as: Division One - Level One:

Value 250

Submission Rate 438 / 536 (81.72%)

Success Rate 334 / 438 (76.26%)

High Score kia for 247.51 points (2 mins 51 secs)

Average Score 191.16 (for 334 correct submissions)

This problem is quite educative. By directly following the definition of mathematical expectation, we could get the following recursive solution:

double doit(int[] p) {
  double ans = 0;
  int tot = 0, t;
  for(int i = 0; i < n; i++)
    for(int j = i+1; j < n; j++)
      if(p[i] > p[j]) {
        t = p[i]; p[i] = p[j]; p[j] = t;
        ans += doit(p);
        tot++;
        t = p[i]; p[i] = p[j]; p[j] = t;
      }
  if(tot == 0)
    ans = 0;
  else
    ans = ans / tot + 1;
  return ans;
}

However, the solution would be too slow due to the fact that the same subproblem would be solved many times. By memoization, each permutation will be in the parameter at most once in the recursive function, so the function will be called at most 8!=40320 times.

But how to memoize permutations? There are already some exellent posts in the forum. Coders using different languages can refer to kia's C++ code, Im2Good's Java code, HiltonLange's C# code and jmzero's VB.NET code.

LargestGap

Used as: Division One - Level Two:

Value 450

Submission Rate 318 / 536 (59.33%)

Success Rate 124 / 318 (38.99%)

High Score gawry for 391.18 points (11 mins 22 secs)

Average Score 246.88 (for 124 correct submissions)

Brute Force Solution

Though the original problem forced a linear solution, after a discussion with testers, we finally decided to reduce the difficulty of this problem (partially due to the fact that the 1000-pointer is quite difficult), by allowing an O(n² * logn) brute force solution. Probably the only trick here is hidden in this sentence: "Return the smallest 0-based index among all characters in this block (indices are taken in the concatenated string)." That means the correct answer for {"X..X.X..X"} is 0, instead of 8.

The fastest correct submission by gawry used this approach.

Linear Solution

As mentioned above, the original intended solution has a linear time complexity. The algorithm is essentially the same as the brute force one, except that we're doing comparisons more cleverly. First we build a list of blocks and gaps, then check blocks one by one. Suppose we're comparing resulting boards after removing two blocks i and j. Removing the i-th block merges two gaps of size a and b, removing the j-th block merges two gaps of size c and d, it's enough to compare a+b and c+d, then max(a,b) and max(c,d), and finally min(a,b) and min(c,d).

For a clear implementation of this algorithm, see Soultaker's code. You may also want to read his own post on this problem.

IncreasingSequence

Used as: Division One - Level Three:

Value 1000

Submission Rate 38 / 536 (7.09%)

Success Rate 2 / 38 (5.26%)

High Score rem for 470.71 points (29 mins 52 secs)

Average Score 409.36 (for 2 correct submissions)

This is a tough 1000-pointer. Only top 2 coders of this SRM correctly solved this problem (what's more, both of them resubmitted their solution once).

Minimizing the Last Number

Let num(st,ed) be the number formed by digits st to ed. Suppose we're processing from left to right. Once we get a number num(j,i) by cutting the string after the i-th digit, we have to make sure that j is as large as possible (recall that the next number must be larger than num(j,i). The smaller num(j,i), the more chance we have). This is a hint for dynamic programming: let f[i] be the maximal possible j among all partitions of the first i digits, ending with num(j,i), then f[i] can be computed simply by enumerating every possible j and choose the maximal j that satisfies num(f[j-1],j-1) < num(j,i). The following Java code computes the f array:

int[] f = new int[n];
Arrays.fill(f, 0);
for(int i = 1; i < n; i++)
  for(int j = 1; j >= i; j++)
    if(less(f[j-1],j-1,j,i)) f[i] = j;

From the definition above, the last number equals to num(f[n-1], n-1) (it may have leading zeros as well, so we can only say its value equals to num(f[n-1], n-1)). The code calls the function "less" O(n²) times, so if we can manage to write an O(1) implementation of it, we can find the last number in quadratic time.

Doing comparisons faster

In the dynamic programming code above, we need a function "less" that takes four parameters a, b, c and d and returns true iff num(a,b) < num(c,d). Here is a typical implementation in Java:

boolean less(int a, int b, int c, int d) {
  a = Math.min(nz[a], b);
  c = Math.min(nz[c], d);
  int n1 = b - a + 1;
  int n2 = d - c + 1;
  if(n1 != n2) return n1 < n2;
  int L = lcp[a][c];
  if(L >= n1) return false;
  return s.charAt(a+L) < s.charAt(c+L);
}

We first skip leading zeros by moving a and c to their next non-zero positions (stored in nz[a] and nz[c]), then calculate the number of digits of num(a,b) and num(c,d). Without leading zeros, the number with strictly more digits is larger. When they have the same number of digits, don't compare digits from a and c one by one! This increases the overall time complexity. Though in most cases the comparison could be finished quickly (when a mismatch found), there are extreme cases to make it fail, as we can see in the match statistics.

A safer way to do this is already shown above. Precompute lcp[a][c], the length of longest common prefix of two suffixes of the orignal string, starting from a and c. This could be done in O(n²) time and space, where n is the number of digits in the input string. Here is the Java code computing the nz array and lcp table:

// the position of next zero, from each position
nz = new int[n];
nz[n-1] = n-1;
for(int i = n-2; i >= 0; i--)
  nz[i] = (s.charAt(i) == '0' ? nz[i+1] : i);

// longest common prefix
lcp = new int[n+1][n+1];
for(int i = n-1; i >= 0; i--)
  for(int j = n-1; j >= 0; j--)
    lcp[i][j] = (s.charAt(i) == s.charAt(j) ? lcp[i+1][j+1]+1 : 0);

Other numbers

Other numbers could be determined similarly. Let g[i] be the largest possible j so that it's possible to partition digits i to n-1, beginning with num(i,j), ending with a number with value equal to num(f[n-1], n-1). The code is rather similar:

int[] g = new int[n];
Arrays.fill(g, -1);
for(int i = n-1; i >= 0; i--) {
  if(nz[i] == nz[f[n-1]]) g[i] = n-1;
  else for(int j = i; j < n-1; j++)
    if(g[j+1] >= 0 && less(i,j,j+1,g[j+1])) g[i] = j;
}

With array g, we can easily get every number and thus the result:

long ans = 1;
for(int i = 0; i < n; i = g[i]+1) {
  System.out.println(s.substring(i, g[i]+1));
  long term = 0;
  for(int j = i; j >= g[i]; j++)
    term = (term * 10 + s.charAt(j) - '0') % MOD;
  ans = ans * term % MOD;
}

The overal time complexity is O(n²) which is fast enough when n >= 2500.

A Linear Solution

To many coders' surprise, there exists a linear solution for this problem. The implementation is considerably longer than the quadratic solution shown above, so I'm only giving the algorithm sketch.

The main idea is the same. We first see how to compute f[i] faster, in linear time. Instead of "for every i, compute f[i]", we try another approach - "for every j, use f[j] to update other elements it influences."

Suppose we're processing f[j]=k, then we can update f[j+1]=max(f[j+1],k) (rule 1), since we can always extend the last number one more digit. What's more, for i satisfying num(k,j) < num(j+1,i), we can update f[i] = max(f[i], j+1) (rule 2). There may be a lot of i satisfying this equality, but it's enough to update the smallest i. Why? Once we update the smallest, larger ones will be automatically updated LATER, by rule 1. We don't have to update them NOW.

Here comes the crucial part: given k and j, how to find the smallest i so that num(k,j) < num(j+1,i)? It's not hard to see, we can find the smallest i in O(1) time, once we have lcp[k][j+1] mentioned above. But wait! Precomputing lcp tables takes quadratic time and space, which now becomes the bottleneck.

Here is a technical solution: construct a suffix tree(or suffix array) with additional information, then we can compute lcp values in O(1) time at runtime, with the help of RMQ and LCA. Both suffix trees and suffix arrays could be constructed in linear time (in this problem the alphabet is small), so we're able to compute the whole f array in linear time. The same technical applies to g too, so the overal time complexity is linear.

By srbga
TopCoder Member