September 12, 2022 Single Round Match 837 Editorials


The problem is small enough to be solvable by brute force: try all reasonable options where the stairs start and where they end, for each of them check whether none of the given cubes are outside the staircase (too high), and of all those options pick the smallest one.

Of course, if we think a bit more, we then have to write less code.

Observation 1: the last column that has some cubes in the input will be the last column of the optimal staircase:

Proof: If you build some additional columns of cubes to the right of that one, I can just discard them and have a better solution than you (fewer extra cubes and still a valid staircase).

Observation 2: Now that we know where the staircase ends, we want to make it as small as possible.

Proof: Obvious. A bigger staircase completely contains a smaller one. If already the smaller one covers the entire input, the bigger one clearly needs more cubes.

Observation 3: Each given column of cubes gives us some lower bound for the size of the pyramid. The maximum of all those lower bounds is the smallest size that works.

Example: If we already have 10 cubes in the rightmost column, the staircase must obviously have size at least 10. As we go to the left, each step we take adds 1 to the requirement. For example, if the stack of 10 cubes is in the third column from the right, the staircase must be at least a size 12 to contain this entire stack. (The next column will have 11 and the last one will have 12 cubes.)

public int reconstruct(int[] columns) {
    int N = columns.length;
    while (columns[N-1] == 0) --N; // ignore empty columns on the right

    int minSize = 1;
    for (int n=0; n<N; ++n) {
        if (columns[n] == 0) continue;
        minSize = Math.max( minSize, columns[n]+(N-1-n) );

    int cubesNeeded = 0;
    for (int c=1; c<=minSize; ++c) cubesNeeded += c;
    for (int x : columns) cubesNeeded -= x;
    return cubesNeeded;


This problem teaches a standard combinatorial trick: sometimes it’s easier to count the complement.

Let’s start with a simple question: in base B, how many positive integers have exactly D digits?

The first of those digits must be non-zero, so for that digit we have B-1 options. Each of the remaining D-1 digits can be arbitrary. Thus, the total count of such numbers is T = (B-1) * B^(D-1).

(From the constraints we know that this value is below 10^9.)

As there are T ways to select a D-digit number, there are allPairs = T^2 different questions of our desired form “X + Y = ?”.

Instead of counting the ones that do have at least one carry, we’ll do the opposite: we’ll count those that don’t have any carries, and we’ll subtract that count from the total.

How do we count these pairs (X, Y)? There are D pairs of digits. Each of those pairs must have the property that it does not produce a carry – in other words, at each position the corresponding digits of X and Y sum up to at most B-1.

If the sum of the two digits is S, we have exactly S+1 options what the two digits are: the digit in X can have any value between 0 and S, inclusive, and the digit in Y is then determined (as S minus the digit in X). As S can be anything between 0 and B-1, inclusive, this gives us a total of otherPair = 1+2+…+B = B*(B+1)/2 ways to select one pair of digits.

The only pair of digits where the math differs are the leading digits of X and Y. This is because there we have an extra constraint: both have to be non-zero. We can easily count that this leaves us with firstPair = 1+2+…+(B-2) = (B-2)*(B-1)/2 ways to select these two digits.

Thus, the total number of questions that contain at least one carry can be computed as allPairs – firstPair * otherPair^(D-1).

public long count(int B, int D) {
    long oneNumber = B-1;
    for (int d=1; d<D; ++d) oneNumber *= B;

    long allPairs = oneNumber * oneNumber;
    long firstPair = B-1; firstPair *= B-2; firstPair /= 2;
    long otherPair = B+1; otherPair *= B;   otherPair /= 2;

    long noCarryPairs = firstPair;
    for (int d=1; d<D; ++d) noCarryPairs *= otherPair;

    return allPairs - noCarryPairs;


This problem allows us to practice our design of greedy algorithms. 

There are many ways how to place the blocks into the staircase. Are some of them always better than others? How can we tell?

We can start by imagining a valid staircase painted over some arbitrary placement of blocks. We can now push all the blocks as far to the right as we can (without them leaving the inside of the staircase). As we are pushing the blocks towards an area where the staircase is higher, they can never hit the ceiling. Thus, the rightmost block will eventually reach the right end of the staircase, the next block will eventually touch the rightmost one, and so on. We will end up with a configuration in which all the consecutive blocks touch and the last one is on the right end of the staircase.

We have just shown the following claim: There is always an optimal solution of the form we just described.

How does this help us? Well, now we know that it is enough to find the best solution among the ones that have the more restricted form. Once we do that, we can be sure that we have the globally optimal solution.

We already know that if we fix the order of the blocks, their positions are now forced. But what order of the blocks should we choose? Is there always one order that works the best?

Whenever we ask such questions, a good way to answer them is to look at two consecutive elements of the sequence (in our case, two adjacent blocks) and ask the question when exactly we should swap them.

In this problem it turns out that the answer is simple: whenever you have a taller block to the left of a less tall block, it’s always a good idea to swap them. (The proof should be obvious from a picture. If you prefer being more formal, think about how to compute the final size of the staircase from a given sequence of blocks – or read this section below and then return here and try to complete the proof.)

Once we have the optimal order of blocks, all that remains is to compute the minimum size of the staircase that contains them all. Clearly, for our configuration (blocks shifted as far right as possible) a pyramid contains a block if and only if it contains its top left corner. Thus, the top left corner of each block gives us one lower bound on the staircase size – exactly as in the div2 easy problem.

public long build(int[] W, int[] H) {
    // sort according to height, small to big
    int N = W.length;
    for (int a=0; a<N; ++a) for (int b=a+1; b<N; ++b) if (H[a] > H[b]) {
        int tmp;
        tmp = W[a]; W[a] = W[b]; W[b] = tmp;
        tmp = H[a]; H[a] = H[b]; H[b] = tmp;
    long columnFromRight = 0;
    long minSize = 0;
    for (int n=N-1; n>=0; --n) {
        columnFromRight += W[n];
        minSize = Math.max( minSize, H[n] + columnFromRight - 1 );
    long cubes = minSize * (minSize + 1) / 2;
    for (int n=0; n<N; ++n) cubes -= (1L * H[n]) * W[n];
    return cubes;


Here the first key observation is that the order of insertions does not matter. For each bucket we only care about the total number of insertions that hit it.

The second key observation is that whenever we pick a bucket to be backed up K > 0 times, it is always best to schedule the backups as regularly as possible. E.g., if we have a bucket that will receive 37 elements, the optimal solution is to do 12 inserts, backup, 12 inserts, another backup, and 13 inserts. (The numbers 12, 12, 13 can be shuffled.)

Formal statement: Whenever a bucket is backed up K > 0 times, the total number of steps during all L insertions is minimized if no two blocks of inserts between backups differ by more than 1.

Proof: If we have X inserts in one block and then X+2 in another block, we do 1+2+…+X steps in the first and 1+2+…+X+(X+1)+(X+2) steps in the second block. Moving one insert from the second block to the first one clearly decreases the total number of steps. (The same is clearly true if the second block is even bigger.)

Corollary: The condition “no two parts differ by more than 1” actually always determines the block sizes exactly (up to order): The size of each block will be L/(K+1), rounded either up or down to an integer. If L isn’t divisible by (K+1), the number of blocks where we round up will be exactly L mod (K+1).

With these observations we can now do dynamic programming: for each b and t, we want to compute dp[b][t] = the minimum number of steps in which we can process the first b buckets if we have t backup tapes available.

Each of these values can be computed in O(t) time by trying out all possibilities for how many tapes are used on the last bucket in the range. Once we fix the number of tapes, we know the optimal split for that bucket and we can compute the total number of steps for it in constant time. Thus, we can find the optimal solutions for all H buckets and all numbers of tapes between 0 and T in O(HT^2) time.

long ceilinglog2(long x) {
    long answer = 0;
    while ((1L << answer) < x) ++answer;
    return answer;

long triangle(long n) {
    return n*(n+1) / 2;

long bucketCost(long items, long tapes) {
    long total = 0;
    if (tapes == 0) {
        total = triangle(items);
    } else {
        long smallPiece = items / (tapes+1);
        long largePiece = smallPiece + 1;
        long largeCount = items % (tapes+1);
        long smallCount = (tapes+1) - largeCount;
        total += smallCount * triangle(smallPiece);
        total += largeCount * triangle(largePiece);
        total += 2 * items * ceilinglog2(items);
    return total;

public long optimize(int H, int T, int N, int L, int[] W, int seed) {
    int sumW = 0;
    for (int w : W) sumW += w;
    int[] who = new int[sumW];
    for (int where=0, i=0; i<H; ++i) 
        for (int j=0; j<W[i]; ++j) who[where++] = i;

    int[] bkt = new int[N];
    int[] cnt = new int[N];

    long state = seed;
    for (int n=0; n<N; ++n) {
        state = (state * 1103515245 + 12345) % (1L << 31);
        bkt[n] = who[ (int)(state % sumW) ];
        state = (state * 1103515245 + 12345) % (1L << 31);
        cnt[n] = (int)(1 + (state % L));

    long[] totals = new long[H];
    for (int n=0; n<N; ++n) totals[ bkt[n] ] += cnt[n];

    long[][] dp = new long[H+1][T+1];
    // what's the best solution for the first h buckets using t tapes?
    for (int h=0; h<=H; ++h) for (int t=0; t<=T; ++t) {
        if (h == 0) {
            dp[h][t] = 0;
        dp[h][t] = bucketCost( totals[h-1], 0 ) + dp[h-1][t];
        for (int u=1; u<=t; ++u) {
            dp[h][t] = Math.min( dp[h][t], 
                bucketCost( totals[h-1], u ) + dp[h-1][t-u] );

    long answer = dp[H][0];
    for (int t=1; t<=T; ++t) answer = Math.min(answer, dp[H][t]);

    return answer;


We’ll start by stating a well-known result: When given a collection of strings we want to concatenate so that the result is maximal, we can do this greedily by sorting the collection with a custom comparator and then concatenating them in the sorted order.

Note that standard “less than” for strings is not exactly the comparator we need. Sure, when the two strings you have are “dad” and “mom”, we have “dad” < “mom” and it’s better to put “mom” first. But the rule no longer works as soon as both strings start the same. For example, “ba” < “baa”, but if we put “baa” first, we get the suboptimal concatenation “baaba” instead of the optimal “babaa”.

The correct comparator: X should be used before Y if and only if X+Y > Y+X.

If you don’t know this result yet, we recommend that you now spend some time convincing yourself that this is actually a valid complete partial order – in particular, that this new way of comparing strings is transitive.

Armed with this observation, we can now approach the given problem. We are interested in two bits of information:

  • whether there’s a way to cut the claimed string into a valid sequence of pieces that IS sorted according to the above rule
  • whether there’s such a way for which the valid sequence of pieces ISN’T sorted according to the above rule

The four possible combinations correspond to the four answers we are supposed to give.

We can determine these answers using exponential dynamic programming. Imagine that we are cutting the string into pieces one piece at a time. The current state at any moment can be described by the following:

  • the length of the prefix of “claim” we already cut into pieces
  • the subset of upperBounds that was already assigned to the pieces we produced
  • the length of the last piece (so that we know what it is and can compare it to the next one when we make the next cut)

A useful observation is to realize that upperBound[x] <= 10^7 implies that each piece will have at most 5 characters.

int getNumber(String piece) {
    int answer = 0;
    int currentWays = 1;
    for (int i=1; i<piece.length(); ++i) {
        currentWays *= 26;
        answer += currentWays;
    for (int i=0; i<piece.length(); ++i) {
        char letter = piece.charAt(i);
        for (char x='a'; x<letter; ++x) answer += currentWays;
        currentWays /= 26;
    return answer;

public String analyze(int[] upperBounds, String claim) {
    int N = upperBounds.length;
    int L = claim.length();
    if (L > 5*N) return "lie"; // too long

    // for each piece of the claim, precompute its number for quick bound checks
    int[][] pieceNumber = new int[L][6];
    for (int l=0; l<L; ++l) for (int i=1; i<=5 && l+i <= L; ++i) {
        pieceNumber[l][i] = getNumber( claim.substring(l, l+i) );

    // for each pair of consecutive pieces,
    // precompute whether they are in the correct order for max concatenation
    boolean[][][] canFollow = new boolean[L][6][6];
    for (int l=0; l<L; ++l) for (int i=1; i<=5 && l+i <= L; ++i) 
    for (int j=1; j<=5 && l+i+j <= L; ++j) {
        String current = claim.substring(l, l+i+j);
        String flipped = claim.substring(l+i, l+i+j) + claim.substring(l, l+i);
        canFollow[l][i][j] = (current.compareTo(flipped) >= 0);

    boolean[][][] reachableWell = new boolean[L+1][1<<N][6];
    boolean[][][] reachableBadly = new boolean[L+1][1<<N][6];

    // initialize the first piece of the string
    for (int l=1; l<=5 && l<=L; ++l) for (int n=0; n<N; ++n)
    if (pieceNumber[0][l] < upperBounds[n]) reachableWell[l][1<<n][l] = true;

    // do the forward dp for the rest
    for (int l=1; l<=L; ++l) for (int sub=0; sub<(1<<N); ++sub) 
    for (int last=1; last<=5 && last<=l; ++last) {
        if (!reachableWell[l][sub][last] && !reachableBadly[l][sub][last])
        for (int n=0; n<N; ++n) {
            if ((sub & 1<<n) != 0) continue;
            int nsub = sub | (1<<n);
            for (int nxtl=1; nxtl<=5 && l+nxtl<=L; ++nxtl) {
                if (pieceNumber[l][nxtl] >= upperBounds[n]) break;
                if (reachableWell[l][sub][last] 
                    && canFollow[l-last][last][nxtl])
                       reachableWell[l+nxtl][nsub][nxtl] = true;
                if (reachableWell[l][sub][last]
                    && !canFollow[l-last][last][nxtl])
                       reachableBadly[l+nxtl][nsub][nxtl] = true;
                if (reachableBadly[l][sub][last])
                       reachableBadly[l+nxtl][nsub][nxtl] = true;

    boolean canWell = false, canBadly = false;
    for (int l=1; l<=5 && l<=L; ++l) {
        canWell = canWell | reachableWell[L][(1<<N)-1][l];
        canBadly = canBadly | reachableBadly[L][(1<<N)-1][l];

    if (!canWell && !canBadly) return "lie";
    if ( canWell && !canBadly) return "good";
    if (!canWell &&  canBadly) return "bad";
    return "unknown";


categories & Tags


Sign up for the Topcoder Monthly Customer Newsletter

Thank you

Your information has been successfully received

You will be redirected in 10 seconds