October 5, 2013

SRM 593

vexorianvexorian

DURATION

25 min

Match Summary

Round Overview
Discuss this match

This problem set was brought to you by tozangezan and rng_58 (div1 easy and div1 hard). Division 1 coders experienced an exciting match with a tricky starting problem. This 250-points ad hoc/graph theory hybrid problem was attempted by the majority of coders, but barely 40% of all the coders in division 1 could solve it correctly. Congratulations to K.A.D.R for solving this tricky problem without mistakes in 5 and a half. The medium was slightly less challenging than usual to compensate, but still required some clever analysis. tourist reminded us why he is tourist by solving it in less than 6 minutes. The hard problem was solved correctly by 4 coders, of which Petr was the fastest. Just solving the 3 problems was not enough, however. This was a match with a very fruitful challenge phase, in which Petr scored 225.0 points to secure the top division place. Followed by Egor in a very close second position. The third and fourth places, RAVEman and Apsara also had very high scores. Congratulations also to division 2 winner: skies457.

RaiseThisBarn

Problem Details

Used as: Division Two - Level One:


Value	250
Submission Rate	1255 / 1338 (93.80%)
Success Rate	1102 / 1255 (87.81%)
High Score	gaurab_sust for 248.91 points (1 mins 53 secs)
Average Score	201.77 (for 1102 correct submissions)

As the statement says, we need to count how many values i from 1 to n-1, inclusive can cut the string in such a way that the number of cows (‘c’ characters) is the same in both sub-strings (The one between indexes 0 and i-1, inclusive and the one between indexes i and n-1, inclusive (0-indexed)

There are at most n-2 candidate values for i. We can just try each of them and count the valid ones. To count the valid ones, you need to count the number of characters in each of the two substrings and compare them. You can do it manually like in the following Java solution:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
public int calc(String str) {
  int n = str.length(), res = 0;

  // for each candidate i:
  for (int i = 1; i <= n - 1; i++) {
    int c1 = 0;
    // count in first substring:
    for (int j = 0; j < i; j++) {
      if (str.charAt(j) == 'c') {
        c1++;
      }
    }
    int c2 = 0;
    // count in second substring:
    for (int j = i; j < n; j++) {
      if (str.charAt(j) == 'c') {
        c2++;
      }
    }
    if (c1 == c2) {
      // if the counts are equal, i is valid:
      res++;
    }

  }
  return res;
}

Or you can use your language’s features and library functions to simplify the counting code. Like in the following c++ and python solutions:

1
2
3
4
5
6
7
8
9
10
11
12
int calc(string str) {
  int c = 0;
  for (int i = 1; i < str.size(); i++) {
    // cut the string in two substrings:
    string a = str.substr(0, i), b = str.substr(i);
    // we can use the std::count function:
    if (count(a.begin(), a.end(), 'c') == count(b.begin(), b.end(), 'c')) {
      c++;
    }
  }
  return c;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
class RaiseThisBarn:
  def calc(self, s):
  def valid(i): #True
if is the string valid ?
  return s[: i].count('c') == s[i: ].count('c') # compare the counts of 'c'
# in both substrings
# add "1"
for each valid i in the range:
  return sum(valid(i) for i in xrange(1, len(s)))

# we could even inline the valid(i)
function inside the call to sum, but this
# is not an obfuscation contest

Submission speed is quite important in these problems that are more implementation-focused. The better you can simplify your code and know how to use your language’s tools the faster your submit your code.

WolfDelaymaster

Problem Details

Used as: Division Two - Level Two:


Value	500
Submission Rate	1082 / 1338 (80.87%)
Success Rate	606 / 1082 (56.01%)
High Score	o___o for 492.44 points (3 mins 31 secs)
Average Score	346.50 (for 606 correct submissions)

We can see this one as a standard recursive problem or as a parsing problem.

Recursion

Let us redefine a valid string as a concatenation between a valid string and a string that follows the pattern: ww…woo…oll…lff…f . The most complicated valid example case is:

1
** wolfwwoollffwwwooolllfff ** wwwwoooollllffff

“wolfwwoollffwwwooolllfff” is a valid string and “wwwwoooollllffff” follows the repeated pattern.

Let us define `f(s)` as a function that returns true if and only if the string s is valid. We know that it must end with a string that follows the repeated pattern. It is easy to generate a string of `4r` characters that follows the pattern, where r is positive. (For example, for `r=2`: wwoollff. For each `r <= |s|`, we can determine if the string ends with a repeated pattern string. Then we just need to check if the first `|s| - 4r` characters of the string make a valid string. This is the same as calling `f(s’)`, where `s’` is the string that is composed of the first `|s| - 4r` characters of the string. If there is a repeated pattern AND the remaining string is also valid then the complete string is also valid.

We need a base case, eventually the string will be a single repeated pattern, which will make the remaining string the empty string. We can consider the empty string valid.

Note that each instance of `f(s)` calls at most one other instance of `f(s’)` in which `s’` is strictly smaller. E.g: If the string finishes with ‘wolf’ it cannot finish with ‘wwoollff’. We don’t need memoization to make it run in time and it wouldn’t really be dynamic programming to fill it iteratively.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class WolfDelaymaster:
  def check(self, s):

  #
function returns w(r times) o(r times) l(r times) f(r times):
  def wolf(r):
  return ''.join(ch * r
    for ch in "wolf")

if s == '':
  return "VALID"

r = 1
while 4 * r <= len(s):
  # s[-4 * r: ] equals the last 4 * r characters in the string
if wolf(r) == s[-4 * r: ]:
  return self.check(s[: -4 * r]) #recurse
# s[-4 * r: ] equals the other characters
r += 1

#
if we didn 't find any repeated pattern, it is invalid:
return "INVALID"

String parsing

We can also try to just parse the string following the rules until we find a mistake.

We know that the string must start with w. If that is not the case, then the string is invalid.
Count the number of initial ws, let it be r. The following letters must be r ‘o’ characters, then r ‘l’ characters and r ‘f’ characters.
After finding the f characters, restart over, the next character must be w, count the number of w and repeat until we find the end of string until a valid f character.

The code is more complicated but it works:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
string check(string str) {
  int i = 0;
  int len = str.length();
  while (i < len) {
    if (str[i] != 'w') {
      return "INVALID";
    }
    int r = 0;
    // Count the number of times w repeats as we move the pointer (i):
    while ((i < len) && (str[i] == 'w')) {
      r++;
      i++;
    }
    // repeat for o, l and f: (in that order):
    char chars[3] = {
      'o',
      'l',
      'f'
    };
    for (char ch: chars) {
      // repeat r times:
      for (int k = 0; k < r; k++) {
        // next character must a) exist b) be equal to ch
        if ((i >= len) || (str[i] != ch)) {
          return "INVALID";
        }
        // move pointer:
        i++;
      }
    }
  }
  return "VALID";
}

MayTheBestPetWin

Problem Details

Used as: Division Two - Level Three:


Value	1000
Submission Rate	51 / 1338 (3.81%)
Success Rate	6 / 51 (11.76%)
High Score	aksam5800 for 621.51 points (25 mins 44 secs)
Average Score	524.80 (for 6 correct submissions)

Used as: Division One - Level Two:


Value	450
Submission Rate	235 / 760 (30.92%)
Success Rate	185 / 235 (78.72%)
High Score	tourist for 426.02 points (6 mins 48 secs)
Average Score	272.82 (for 185 correct submissions)

The largest possible difference

Let us say we have chosen the teams S and T, note that they must be complements. So we only need to choose team T, all the pets that don’t belong to team T can be assumed to belong to S. Given a selection of T, what is the largest possible difference of the time the pets will take?

Begin by assuming team T wins. Then what is the maximum possible difference in this case? Since we want T to win, then the time needed by T should be less than or equal to the time needed by S. Since we want the difference to be as large as possible, then we want the time that T takes to be as small as possible. The minimum time a pet can do is `A[i]`. The minimum time needed by T should be: `sum_(i in T)(A[i])`. Likewise, since the time needed by S should be as large as possible, the maximum time needed by S will be: `sum_(i !in T)(B[i])`. The maximum difference when we select T and it wins is: `sum_(i !in T)(B[i]) - sum_(i in T)(A[i])` .

That is in case T wins, it is possible that T loses, and maybe the maximum possible difference requires that condition. When T loses, the maximum difference is: `sum_(i in T)(B[i]) - sum_(i !in T)(A[i])`.
In total, the maximum possible difference will be: `max( sum_(i !in T)(B[i]) - sum_(i in T)(A[i]), sum_(i in T)(B[i]) - sum_(i !in T)(A[i]) )`. Note that if one of the values is negative, the other will be necessarily positive, therefore the maximum is always non-negative.

Simplifying it

Let us name things up:

`A(T) = sum_(i in T)(A[i])`
`A(S) = sum_(i !in T)(A[i])`
`B(T) = sum_(i in T)(B[i])`
`B(S) = sum_(i !in T)(B[i])`

The maximum difference is: `max(B(T) - A(S), B(S) - A(T))`. Note that `S` is the complement of `T`. Let `tA = sum_(i=0)^(i<n)A[i]`, be the total sum of `A[i]` values and `tB = sum_(i=0)^(i<n)B[i]` is the total sum of `B[i]` values.

`A(S) = tA - A(T)`
`B(S) = tB - B(T)`

The result becomes `max(B(T) - (tA - A(T)), tB - B(T) - A(T))` `=` `max(B(T) + A(T) - tA, tB - B(T) - A(T))`. Note that tA and tB are constant and do not depend on how we construct T. Both values inside the max() function depend solely on `A(T)` and `B(T)` , in fact, we only need to find `s = A(T) + B(T) = sum_(i in T)(A[i] + B[i])` , the sum of all `A[i]` and `B[i]` of the pets in the selected team T.

Dynamic programming

The value of s determines the maximum possible difference between the two teams after we pick a value of T. Let us build T by adding some elements to it: If we add pet #i, then s will increase by `A[i] + B[i]`, else s will stay the same. Initially, `s=0`. As we decide to add pets or not to the team, s will increase. The order in which we add elements to the set does not change the result. So we can decide in decreasing order of the index of the pet.

Let `f(t, s)` be the function that returns the minimum possible worst difference, if we have already decided whether to add the pets with index >t. The current sum is s:

Base case: `(t = 0)`, this means we have already made a decision for each of the pets. There is nothing else we can do, the s value will not change. The result is: `max(s - tA, tB - s)`.
Else we just need to decide what to do with pet #`(t-1)`:
- If we add the pet, s increases by `A[t-1]+B[t-1]` and we proceed to decide for the remaining `(t-1)` pets. The result will be equal to whatever result we find by calling: `f(t-1, s + A[t-1] + B[t-1])`.
- If we do not add the pet, s stays the same and we continue with the remaining pets: `f(t-1)`
Overall result for this case is to pick the minimum between those two options.
The solution to the problem is `f(n, 0)`, we need to pick pets from all the first n elements and the initial s is zero.

This is a good time to find how large can s get. Let `M` be the maximum value for `A` and `B`. (10000 per the problem’s constraints). If all values were M then s will be `2nM -= O(nM)`. In practice this means 50*2*10000 = 1000000, because n is at most 50. The number of pairs `(t,s)` is `O(n n M)`, because t is `O(n)`. There are at most 51*1000001 = 51000051 states for the function and it is acyclic (because t always drops). Each state requires `O(1)` operations besides calling other states. If we use dynamic programming so that each state is evaluated at most once, the algorithm will be: `O(n n M)` in time

This recurrence relation is similar to the one we use for the subset sum problem (In fact, it is possible to convert the problem to the subset sum problem. For each s, ask if it is possible to find that s (an instance of the subset sum problem) and remember the s that yields the best result and is possible).

Memory

The memory usage is an issue. If we use a straight-forward implementation we will need `O(n n M)` memory for the dynamic programming. Note that an array of `(n+1) times n times (M+1) = 51000051` integer values will need: 51000051*4 bytes (because each integer needs 4 bytes) which translates to around 195 MB. Above the 64 MB memory limit. There are some workarounds. The first one would be to use the subset sum idea, since we only need to know if a certain value of s is possible, we can use just 1 byte or even 1 bit for each entry in the table. Another workaround is to notice that when calculating `f(t, s)`, we only need to remember the values for `f(t-1, s)`. So in fact we only need two tables of size `n M` at a given time (One to store the results of `f(t-1, s)` and one to calculate the results for `f(t,s)`). This ensures we need only `O(n M)` memory.

Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
int calc(vector < int > A, vector < int > B) {
  const int MAX = 1000000; //Maximum value of s: 50 * 10000 * 2
  int n = A.size();
  int tA = accumulate(A.begin(), A.end(), 0);
  int tB = accumulate(B.begin(), B.end(), 0);

  vector < int > last(MAX + 1);

  //base case:
  for (int s = 0; s <= MAX; s++) {
    last[s] = max(s - tA, tB - s);
  }

  vector < int > curr(MAX + 1); //avoid allocating memory in each step (halves the exec. time)
  for (int t = 1; t <= n; t++) {

    // last[s] holds f(t-1, s)
    // curr[s]  will hold f(t, s)
    for (int s = 0; s <= MAX; s++) {
      // don't add 
      int res = last[s];
      // add A[t-1]
      if (s + A[t - 1] + B[t - 1] <= MAX) { //don't want/need to exceed the MAX
        res = std::min(res, last[s + A[t - 1] + B[t - 1]]);
      }
      curr[s] = res;
    }
    // move the contents of curr to last:
    swap(last, curr);
    /* Side note for c++ programmers: Because of std::move, doing swap
      is much faster than doing last = curr, for example.
    
     In other languages this should be of little importance because
     you'd use references to arrays/lists. */
  }
  // last[0] now holds f(n, 0):
  return last[0];
}

HexagonalBoard

Problem Details

Used as: Division One - Level One:


Value	250
Submission Rate	691 / 760 (90.92%)
Success Rate	308 / 691 (44.57%)
High Score	K.A.D.R for 241.31 points (5 mins 25 secs)
Average Score	161.72 (for 308 correct submissions)

Some simple special cases

Empty board

When there are no marked hexagons, we don’t need to paint anything and we should return 0.

A single hexagon

We obviously need only one color.

Disconnected hexagons

Generalize the last case into a case in which none of the marked hexagons touch each other:

593d12501

The result is 1 because the hexagons are not adjacent so we can use the same color in them all without issue. How can we determine if this is case?

Connectivity

It will be helpful to know how to translate the input grid so that we can tell which hexagons are adjacent to other hexagons. The input string array/vector/list is closer to a square grid than an hexagon grid, so we need some translating to do.

593d12502

For a cell `(i,j)`, cells: `(i-1,j)`, `(i-1,j+1)`, `(i,j-1)`, `(i,j+1)`, `(i+1,j-1)` and `(i+1,j)` will be adjacent to it.

Hexagon tiles and color

The more complicated cases require us to color the hexagons using a good strategy. The key to this problem is to notice a key property of the hexagon grid. Assume we had to color all the hexagons in the grid. How can we use the least colors?

593d12503

If we repeat this pattern we can actually fill any grid of `N times N` hexagons using only 3 colors. This is a known property of hexagon tilings.

The answer is at most 3

Imagine that for any input, you always paint the whole grid using the previous strategy (at most 3 colors). Then you undo painting the cells that were not initially marked to be painted. Now all the cells that you wanted to paint are painted and the rest are not and you used only 3 colors. From this we can conclude the result will never be greater than 3.

Two or Three?

After we discard the trivial cases described above (return 0 or 1). Then we are left with some hexagons that are adjacent and have to be painted. The result is not larger than 3, but it could be 2. We need to differentiate between cases that can be painted using 2 colors and those that can’t. If a test case cannot be painted using only 2 colors, then we can be certain the result is 3.

Is it possible to paint using only two colors? This is where some theory comes handy. Consider the hexagons that need to be painted as vertexes in a graph. Two vertexes are share an edge if the corresponding hexagons share an edge. We can color this graph using only two colors if and only if the graph is bipartite. We can test if a graph is bipartite in `O(|V| + |E|)` time, where `|V|` is the number of vertexes and `|E|` the number of edges using a Depth-first search as described in the wiki page. The number of hexagons is `O(n times n)` and there are at most 6 edges per hexagon. So this DFS will be `O(n times n)`. All that we need to do now is to implement this special DFS and handle the special cases.

Code

If there are 0 pain-table hexagons, return 0. If no two hexagons share an edge, return 1. If the graph is bipartite, return 2. Else return 3.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
int color[50][50];
vector < string > board;

int n, result;

// A DFS to test if the graph is bipartite, c is the color we must use
// for cell (x,y):    
void dfsBipartiteColor(int x, int y, int c) {
  // If we got to paint the cell:
  if ((board[x][y] == 'X') && (color[x][y] == -1)) {
    // Color it:
    color[x][y] = c;
    // Special case: We have foudn that there is at least one X:
    result = std::max(result, 1);

    // Try the adjacent hexagons:
    for (int nx = max(0, x - 1); nx <= min(n - 1, x + 1); nx++) {
      for (int ny = max(0, y - 1); ny <= min(n - 1, y + 1); ny++) {
        // The hexagon is adjacent and has an X:
        if ((nx - x != ny - y) && (board[nx][ny] == 'X')) {
          // continue the DFS, negate the color:
          dfsBipartiteColor(nx, ny, !c);
          // Special case: We now know there are two adjacent X:
          result = std::max(result, 2);
          // If the color is not consistent, the graph is not bipartite: 
          if (color[nx][ny] == c) {
            result = 3;
          }
        }
      }
    }
  }
}
int minColors(vector < string > board) {
  // Initialize color table with -1:
  memset(color, -1, sizeof(color));
  result = 0;
  this - > board = board;
  n = board.size();
  // Do the DFS for each cell:
  for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
      dfsBipartiteColor(i, j, 0);
    }
  }

  return result;
}

WolfDelaymasterHard

Problem Details

Used as: Division One - Level Three:


Value	1000
Submission Rate	8 / 760 (1.05%)
Success Rate	4 / 8 (50.00%)
High Score	Petr for 626.79 points (25 mins 21 secs)
Average Score	544.68 (for 4 correct submissions)

.d11000550 { width: 550px; max-width: 95%; min-width: 50%; }

Standard dynamic programming

Let us start by naming a very standard dynamic programming approach that will be too slow but we can optimize it after some effort.

`f(j)` is a solution that counts the number of valid ways to fill the prefix of length `2j`. We are using `2j` because all valid strings must be have even length (Any repeated pattern ww…wo…w has even length and the concatenation of many even length strings will also have even length). Thus we only care about prefixes of even length.

Base case: `f(0)` : The prefix of length-0 is empty and we can consider it to be valid - it is the concatenation of 0 valid strings. The number of valid ways to fill it is 1.
In another case, in order for the string to be valid the string of `2n` characters must finish with a repeated pattern (ww…wo…w). The starting index of this pattern can be any `2i` for `(0 <= i < j)`. For each of those integers `i` there is exactly one way to make the substring at range `[2i, 2j)` (using interval notation throughout the explanation) follows the repeated pattern: Fill the first `(j - i)` character positions with ‘w’ and the remaining ones with ‘o’. Let us define a function `“Pattern”(i,j)` that will return true if the pattern can be filled in range `[2i,2j)`. For each `i` such that: `“Pattern”(i,j)` is true, then we can use that pattern and fill the prefix of `2i` characters in any of the `f(i)` possible ways. For any `i` such that `“Pattern”(i,j)` is true: Add `f(i)` to the result.

We can implement this recurrence relation iteratively and we would have a `O(n^2)` algorithm. `(n <= 2000000)`, so we need to optimize this idea.

If only there was an interval

We would like to optimize the following:

1
2
3
4
5
6
7
8
9
f[0] = 0
for j = 1 to n2 {
  f[j] = 0
  for i = 0 to j - 1 {
    if (Pattern(i, j)) {
      f[j] = f[j] + f[i];
    }
  }
}

If the values of `i` for which `“Pattern”(i,j)` is true were easy to predict, we could optimize this idea greatly. Wouldn’t it be great if those values of `i` belonged to some interval? If there exists a interval `[a,b)` such that : `(“Pattern”(i,j) iff (i in [a,b)) )`. Then `f(j) = sum_(i=a)^(i<b)f(i)`. If we can find the values `a` and `b` then calculating this sum is as simple as: `F[b] - F[a]` where `F[j] = sum_(i=0)^(i<j)f(j)`, and `F[]` can be calculated by accumulating the values of `f()` as we find them.

In search for that interval

What if that was true? We should try to demonstrate that it is true. Try various possible cases and verify if it is true or not and try to find rules. It will turn out that it isn’t true, the valid values of `i` don’t always belong to a continuous interval. Things don’t end there. We can get creative. Currently the condition is `“Pattern”(i,j)`, which requires that the first half contains no 'o’s and the second half contains no 'w’s. This condition includes a variety of cases, perhaps if we limit it we can find a condition that does fit an interval?

It is true: If we limited the condition to a sub-condition `“PatternWQ”(i,j)` that required the second-half of the substring to contain only ‘?’ then the values of `i` do follow an interval.
Another restricted version of the condition is `“PatternW”(i,j)` such that the substring contains at least one ‘w’ in the first half, no 'w’s in the second half and no 'o’s in the first half:

1
** ?? w ?? w ?? ?? w ?? ?? ?? ? w ?? ?? ?? ?? ? ** ?? oo ?? o ?? o ?? o ?? o ? oo ?? ? o ? o ? o ?

This will account for almost all valid cases except those in which the first half of the string contains only ‘?’. We will focus on these two ways to do the pattern. Let us first prove that the `“PatternW”(i,j)` property has a valid interval for `i`.

PatternW proof

We begin with a value of `j`. The strings that follow `“PatternW”(i,j)` must contain at least one ‘w’, this will add an upper bound. Let maxW be the maximum position < j that contains a ‘w’:

59310001

`i` cannot be too high. For example, the substring at `[10, 2j)` does not contain any 'w’s, so we require `(i < 5)`. Formally: `(2i <= “maxW”)` or `(i <= “maxW”/2)`.

It is not evident at first, but this maxW also implies a lower bound. If `i` was too small, then the ‘w’ at maxW would belong to the second half of the string:

59310002

The substring begins at `2i`, the first half of the string has length `(j-i)`. This first half shouldn’t contain maxW: So we have: `(“maxW” < 2i + (j-i))` which is equivalent to: `(i >= “maxW” - j + 1)`. This is a lower bound for `i`.

We currently know: `(“maxW” - j + 1 <= i < “maxW”/2 + 1)`. But this condition is not sufficient. In the example we can tell that `i=2` would be invalid (Because the substring will start with ‘o’). The current conditions only consider maxW, we need more information. The first half shouldn’t contain any ‘o’. To ensure this, we have to use maxO, the maximum position of an ‘o’ This ‘o’ position must be strictly smaller than maxW, because 'o’s above maxW are those that can appear in the second half.

59310003

If maxO exists, 2*i should be strictly greater than maxO. `(i > “maxO” / 2)` .
If `i` is too large, the 'o’s can disrupt in a different way. This time the ‘o’ characters to the right of maxW may get into the first half of the substring. To deal with these, we need minO, the minimum index greater than maxW to contain an ‘o’:

59310004

We want `i` not to be so large that the o at minO is inside the first half of the substring. This means: `(2i + (j - i) < “minO”)` which is equivalent to: `(i < minO - j)`.

We have found two conditions each with an upper and lower bound: `(“maxW” - j + 1 <= i < “maxW”/2 + 1)` and `(“maxO” / 2 < i < minO - j)`. These bounds consider all the ways in which the string could be invalid (as long as it still contains w characters). The intersection of these intervals is an interval `[a(j), b(j) )`. In order to find `a(j)` and `b(j)`, we need: maxW, maxO and minO. It is possible to find those values for a given j by using binary search (e.g: What is the maximum x such that at least one of the cells to its right contains ‘w’? ). It is also possible to find all the relevant values for all n values of `j` with a single `O(n)` loop.

The remaining case

That interval accounts only for values of i such that the substring is valid and contains at least one ‘w’. What if the string contains no “w” characters? Its first half is: ??..?.

Remember that if the case was the opposite, if the second half was ??..?, then this case is also an interval. While that’s not what we are looking for it can be helpful if we flip the logic. Instead of finding the values of `i` for a given `j`, try the opposite: For a given `i`, what values of `j` are valid?

Define `“PatternQO”(i,j)` as a function that determines if the first `(j-i)` characters starting at `2i` are all “?” and there are no “w” characters in the substring `[i, j)`. For a given `i`, what are the valid values of `j` ? A very strong constraint is that the substring cannot contain any “w”, so let minW be the minimum index greater than or equal to i that contains a 'w" :

59310005

If in this example, we picked `j = 7`, the substring `[i, j)` would contain a w. This adds a constraint: `2j <= “minW”`.
Another issue is that we do not want ‘o’ characters in the first half of the string. Let minO be the minimum position greater than or equal to i to contain o:

59310006

If j was too large, the first half of the substring would contain minO and it would be invalid. We need: `2i + (j - i) <= “minO”` which becomes: `j <= “minO” - i`.

Those are two upper bounds for `j` and they are the only conditions necessary and sufficient for `“PatternQO”(i,j)`, which means that we do not only have an interval , the lower bound to the interval is i. So there exists a integer `c(i)` such that all integers in `[i, c(i))` are valid j values for `“PatternQO”(i,j)`. To find `c(i)` we can first find minO and minW. We have also shown that it should be possible to binary search for `c(i)` for each i.

How to build the dynamic programming

We now need to combine the two ideas into a single solution that builds `f(j)` iteratively. We know `a(j)` and `b(j)`, so we know that we need to sum all value `f(i)` for `i in `[a(j), b(j))`. However, this would only count the cases where the pattern starting at i includes a ‘w’. In order to account for the other cases, we have a property that works in reverse.

Assume we just found `f(j)` , then `c(j)` means that all future results `(j+1, j+2, … c(j)-1)` need to increase by `f(j)`. We can set a variable bonus that says how much we need to increase the following values of `f(j)` we find. At time `j` we increase bonus by `f(j)` and at time `c(j)`, we need to decrease bonus by `f(j)`. We can set an array `“addBonus”[j]` that says how much we should add to the bonus at time `j`.

Code - c++ binary search

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
int binarySearch(int lo, int hi, function < bool(int) > P) {
  // returns the largest integer x in range [lo, hi) such that P(x) is true
  // or lo if P(x) is false for all of them.

  while (lo + 1 < hi) {
    int ha = (lo + hi) / 2;
    if (P(ha)) {
      lo = ha;
    } else {
      hi = ha;
    }
  }
  return lo;
}

static
const int MAX = 2000000;
static
const int MOD = 1000000007;
int dp[MAX / 2 + 1];
int dpSum[MAX / 2 + 2];
int co[MAX + 1];
int cw[MAX + 1];
int c[MAX / 2 + 1];
int a[MAX / 2 + 1];
int b[MAX / 2 + 1];

int addBonus[MAX / 2 + 1];

// We already generated the string s and n is its length:
int countWords(int n,
  const string & s) {

  // Accumulate the number of 'o' and 'w' characters so that we can later
  // count the number of those characters in a given interval in O(1) time:
  co[0] = 0;
  cw[0] = 0;
  for (int i = 1; i <= n; i++) {
    co[i] = ((s[i - 1] == 'o') + co[i - 1]);
    cw[i] = ((s[i - 1] == 'w') + cw[i - 1]);
  }

  //----------------------------------------------------------------------
  // Find c[i]:
  for (int i = 0; i < n / 2; i++) {
    // Create a function kind(i), we use lambdas so that the new functions
    // have access to local variables:

    // Is substring [i, j) such that:
    // - The first half of the substring is ???..?
    // - The second half contains no 'w' (The whole substring contains no 'w') 
    auto kind = [ = ](int j) - > bool {
      int w = cw[2 * j] - cw[i * 2];
      int o2 = co[i * 2 + (j - i)] - co[i * 2];
      return (w == 0 && o2 == 0);
    };
    // [i, i) contains no 'w' (is empty) so lo = i
    // n/2 + 1 is an upper bound for what we look for:
    c[i] = binarySearch(i, n / 2 + 1, kind);
  }

  //-----------------------------------------------------------------------
  // Find a[j], b[j]:
  for (int j = 1; j <= n / 2; j++) {
    // *** Find the maximum index maxW such that (maxW < j) and (s[maxW] == 'w'): 

    // hasW(i) returns true if and only if interval [i, 2*j) contains a 'w':
    auto hasW = [ & ](int i) - > bool {
      return (cw[2 * j] - cw[i] > 0);
    };
    // We know that [j*2, j*2) is empty, so it doesn't contain w. hi = j*2
    // We use -1, as a way to say that if it doesn't find any w, return -1
    int maxW = binarySearch(-1, j * 2, hasW);

    // Default value for interval [a, b), an empty invalid one:
    a[j] = 2 * n, b[j] = -1;

    if (maxW != -1) {
      int bW = maxW / 2 + 1;
      // aW = min i: 2*j - (j - i) > maxW
      //             2*j -  j + i  > maxW
      //                        i  > maxW - j
      //                        i >= maxW - j + 1
      int aW = std::max(0, maxW - j + 1);

      // maxO = max index i between [2*aW, 2*bW) such that [i, 2*bW] contains o..w
      // hasO(i) is true if [i, maxW) contains at leat one 'o'                
      auto hasO = [ & ](int i) - > bool {
        return (co[maxW] - co[i] > 0);
      };
      // [maxW, maxW) is empty and does not contain o.
      // lo = 2*aw - 1, so that it returns that in case for no index in range
      // hasO(i) is true:
      int maxO = binarySearch(2 * aW - 1, maxW, hasO);
      // The aO limit:
      // min i: i > maxO/2
      int aO = (maxO + 2) / 2;

      // ** find minO : The minimum value such that: 2j>minO>maxW, s[minO] = 'o'
      // hasNoO returns true if [i, maxW] contains no 'o' characters:
      auto hasNoO = [ & ](int i) - > bool {
        return (co[maxW] - co[i] == 0);
      };
      // Should not exceed 2*j, we know that it must be greater than maxW
      int minO = binarySearch(maxW + 1, 2 * j, hasNoO) + 1;
      // 2*i + (j - i) < minO
      //             i < minO - j
      int bO = minO - j;
      // Finally, make the official interval [a, b) :
      a[j] = std::max(aO, aW);
      b[j] = std::min(bO, bW);
    }
  }

  //-----------------------------------------------------------------------
  // The dynamic programming part, which uses a bonus to add/remove to the future
  dp[0] = 1;
  memset(addBonus, 0, sizeof(addBonus));
  int bonus = 0;
  if (c[0] > 0) {
    bonus = (bonus + 1) % MOD;
    addBonus[c[0]] = (MOD - 1) % MOD;
  }
  dpSum[0] = 0;
  dpSum[1] = dp[0];

  for (int j = 1; j <= n / 2; j++) {
    // Add the bonus
    dp[j] = bonus;
    if (a[j] < b[j]) {
      // add the sum of all dp[k] such that (a[j] <= k < b[j] ):
      dp[j] = (dp[j] + (dpSum[b[j]] - dpSum[a[j]] + MOD) % MOD) % MOD;
    }

    // accumalted sum:
    dpSum[j + 1] = (dpSum[j] + dp[j]) % MOD;

    // update bonus:
    bonus = (bonus + addBonus[j]) % MOD;
    if ((j < n / 2) && (c[j] > j)) {
      bonus = (bonus + dp[j]) % MOD;
      addBonus[c[j]] = (addBonus[c[j]] + MOD - dp[j]) % MOD;
    }
  }

  return dp[n / 2];
}

Code - Java linear

The judge solution instead does some tricks to ensure `O(1)` complexity:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
public int MOD = 1000000007;
public char s[];

public int func(int N) {
  int i;
  int a[] = new int[N + 1], b[] = new int[N + 1]; // contains w, [a[i], b[i]] -> i
  int lastw = -1, lasto = -1, lastow = -1, lastwo = -1;

  for (i = 0; i <= 2 * N; i++) {
    if (i % 2 == 0) {
      a[i / 2] = -1;
      if (lastw != -1) {
        int minlen = Math.max((i - lastw + 1) / 2, 1);
        int maxlen = Math.min(i - lastw - 1, (i - lastow - 1) / 2);
        if (lastwo != -1) minlen = Math.max(minlen, i - lastwo);
        if (minlen <= maxlen) {
          a[i / 2] = i / 2 - maxlen;
          b[i / 2] = i / 2 - minlen;
        }
      }
    }
    if (i != 2 * N && s[i] == 'w') {
      lastw = i;
      lastow = lasto;
      lastwo = -1;
    }
    if (i != 2 * N && s[i] == 'o') {
      lasto = i;
      if (lastwo == -1) lastwo = i;
    }
  }

  int c[] = new int[N + 1]; // doesn't contain w, i -> [i+1, c[i]]
  int nextw = 2 * N, nexto = 2 * N;

  for (i = 2 * N; i >= 0; i--) {
    if (i != 2 * N && s[i] == 'w') nextw = i;
    if (i != 2 * N && s[i] == 'o') nexto = i;
    if (i % 2 == 0) {
      int len = Math.min((nextw - i) / 2, nexto - i);
      c[i / 2] = i / 2 + len;
    }
  }

  int dp[] = new int[N + 5], d[] = new int[N + 5], sum[] = new int[N + 5];
  int dsum = 0;
  dp[0] = 1;
  for (i = 0; i <= N; i++) {
    if (a[i] != -1) dp[i] = ((dp[i] + sum[b[i] + 1]) % MOD - sum[a[i]] + MOD) % MOD;
    dsum = (dsum + d[i]) % MOD;
    dp[i] = (dp[i] + dsum) % MOD;
    sum[i + 1] = (sum[i] + dp[i]) % MOD;
    d[i + 1] = (d[i + 1] + dp[i]) % MOD;
    d[c[i] + 1] = (d[c[i] + 1] - dp[i] + MOD) % MOD;
  }

  return dp[N];
}

public int countWords(int N, int wlen, int w0, int wmul, int wadd, int olen, int o0, int omul, int oadd) {
  int i;
  s = new char[N];
  for (i = 0; i < N; i++) s[i] = '?';
  for (i = 0; i < wlen; i++) {
    s[w0] = 'w';
    w0 = (int)(((long) w0 * wmul + wadd) % N);
  }
  for (i = 0; i < olen; i++) {
    s[o0] = 'o';
    o0 = (int)(((long) o0 * omul + oadd) % N);
  }
  int ans = func(N / 2);
  return ans;
}

Chat on Discord

October 5, 2013

SRM 593

DURATION

categories

Tags

share

Join Topcoder Challenges

Match Summary

RaiseThisBarn

WolfDelaymaster

Recursion

String parsing

MayTheBestPetWin

The largest possible difference

Simplifying it

Dynamic programming

Memory

Code

HexagonalBoard

Some simple special cases

Empty board

A single hexagon

Disconnected hexagons

Connectivity

Hexagon tiles and color

The answer is at most 3

Two or Three?

Code

WolfDelaymasterHard

Standard dynamic programming

If only there was an interval

In search for that interval

PatternW proof

The remaining case

How to build the dynamic programming

Code - c++ binary search

Code - Java linear