Match Editorial

MATCH EDITORIAL LINKS:

Room 1 Review	Archive
Rookie Review	Discuss this match
Problem Set	Want to write a feature?
Lessons Learned

SRM 91
May 22, 2002

Match summary

SRM 91's problems were brought to you by axchma. In Division-II, there was a simple numeric problem, a problem involving computing TopCoder winnings, and a state machine problem. In Division-I shared the TopCoder-based problem, as well as a simulation problem and a combinatorical problem.

In Division-I there was much less emphasis than usual on data. The Level 2 and Level 3 problems each took only an int as input and returned an int, and their solutions did not require any sophisticated data structures. This is a pretty rare occurrence in Division-I. In retrospect, the Level 2 and Level 3 problems seem to share a lot of common themes; I wonder if this was intentional.

Division-II had to deal with more data as input, but the solutions to Division-II problems really did not require any sophisticated data structures at all, either, as can be seen in the individual problem analyses.

Summary

Problem		Submissions	Success Rate	Average Points	High Score
Division-I	Level 1	227 of 232 (97.8%)	204 of 227 (89.9%)	201.29	ZorbaTHut with 239.34
	Level 2	142 of 232 (61.2%)	119 of 142 (83.8%)	358.25	eduar09 with 536.13
	Level 3	18 of 232 (7.76%)	11 of 18 (61.1%)	547.00	kalmes with 762.35
Division-II	Level 1	393 of 412 (95.4%)	327 of 393 (83.2%)	228.65	PongsriBunny with 248.17
	Level 2	343 of 412 (83.3%)	251 of 343 (73.2%)	334.10	ddwsdl with 473.85
	Level 3	182 of 412 (44.2%)	58 of 182 (31.9%)	525.43	Ninjak with 822.95

The Problems

The problems have been sorted by ascending difficulty, as I perceive it.

Division-II, Level 1: Perfect

This problem was a test of whether one knew how to compute factors of some number n. As or after the factors are computed, their sum can easily be used to determine whether n was deficient, perfect, or abundant. The solution only requires an int for iteration and an int for accumulation.

Factoring numbers is a hard problem, and there are many advanced factorization algorithms. However, we are given a pretty small input constraint (2 <= n <= 50000), meaning that even the most basic algorithm will be sufficient.

The most basic consists of iterating from i = 1 to floor(n / 2). If n is divisible by i (that is, n % i == 0), add i to our running sum of factors. Then all that is needed are a few simple conditional statements to compare this sum to n and return the appropriate string. This can be made more efficient by noting that in most cases two divisors can be found at once, and one only needs to iterate up to floor(sqrt(n)). If n % i = 0, then i and n / i are factors of n. However, if one is using this method one must be careful not to count sqrt(n) twice if n is a perfect square.

Division-I, Level 1 / Division-II, Level 2: ChallengePhase

Here we have a new entry in the line of self-referencing TopCoder problems. However, unlike previous problems of this nature, we are not helping TopCoder compute prizes, we are helping ourselves compute prizes.

The solution is rather direct. For each score of myScore - 50, myScore, and myScore + 50, compute your expected prize winnings (we shall call them plose, pbase, and pwin, respectively). To do this, count the number of scores that are greater then the score we are evaluating. If the count is greater than three, the prize is zero, otherwise the prize is the (2-count)^th entry in the prize table. Once plose, pbase, and pwin are computed, simply return (pwin - pbase) - (pbase - plose) = pwin - 2 * pbase + plose. Again, no data structures are required, only some accumulators and iterators.

Division-I, Level 2: StarCraft

This problem presents some basic rules for turn-based warfare between a set of identical zealots and a set of identical zerglings. The solution requires an efficient implementation of a simulation of these rules for a variable number of zerglings and a fixed number of zealots, locating the smallest number of zerglings that can manage to kill off all the zealots.

This problem is one of the most deceptive problem statements I've ever seen. It provides some very verbose rules, some of which are very similar to each other. The rules for how zealots attack zerglings are really exactly the same as those for how zerglings attack zealots, yet they are described quite differently. One might note that there is a difference in that there is an upper limit to the number of times that a zealot can be attacked in a single round. However, one can maintain symmetry here; the same upper limit can be thought of as applicable to zerglings, it's just that a zergling will always die long before that upper limit is ever reached.

The discussion of how each side picks its targets is slightly misleading as well. The verbose rules can be reduced to the following:

both sets of units are organized as a queue
the attacking units attack one by one
a defending unit is attackable if it still alive and has not been attacked 10 times this round
when a unit attacks, it deals 1 or 2 damage (2 in the case of an attacking zergling) to the first attackable defending unit
if no attackable defending units remain, the round is over

Once one clears up the rules in this manner, the implementation is simple.

Now that we have solved the simulation problem, it's time to address efficiency. The minimum number of zerglings will be 4, and the maximum number will be 2587, as one can glean by looking at the examples provided in the problem statement. However, iterating from 4 to 2587 and running a complete simulation with that many zerglings will possibly take too long (depending on how efficiently one implements the solution).

What we have here is a searching problem. We have a well-defined range (4..2587), and we even have a function (our simulation) that can tell us if the result we are searching for is less than or not less than any particular value. With this in mind, it becomes obvious that we want to use a binary search.

The reason that a binary search is appropriate is that we have a function that can narrow our search space by half at each step. If we try some value z for the number of zerglings, and find from our simulation that this value is too low, we know that z + 1 must be the new lower bound. Otherwise, we know that z is the new upper bound. When we get the lower bound and upper bound to meet, we have narrowed z down to the only minimum value that passes the simulation, which is our answer.

If we had used our simple linear search, we would have had to perform up to 2584 simulations, which might take on order of minutes to complete for most solutions. If we use binary search, we halve our search space at each step, thus reducing the number of simulations to ceil(log₂ 2584) = 12 steps. Binary search is a simple but powerful tool in situations where it is appropriate.

In summary, the only two difficulties of this problem are implementing an efficient search, and reading and understanding the problem statement. Not that I am faulting the problem writer on this count; the obfuscation was most likely intentional, to make the problem more challenging (and it's fun to write problems like that). Problem statements like this are a pretty common occurrence in ACM challenges (although StarCraft references may be few and far between).

Division-II, Level 3: Rumba

The Rumba problem basically provides a set of rules, describing a sort of state machine. All that is necessary to solve this problem is to correctly construct this state machine and evaluate it. The construction can actually be implicit. The state machine can be represented as a "dance graph" of reachable states. We simply need to traverse this graph, using either a depth-first or breadth-first traversal.

The dance graph does not ever need to be explicitly constructed; the structure of this graph can be implicit in how we call our functions. Each vertex of the dance graph consists of a combination of our location in the move sequence and our position (open, close, or fan). Directed edges between vertices represent a valid transition from one state to another. We also have four initial vertices, representing the initial states (undefined location in the move sequence, as the dance has yet to begin, and one of each of the possible positions, since we can start in any position).

We can represent the current state by our position in the move sequence and by the set of positions that we can possibly be in at the current point in the routine. That is, we can have three boolean variables, representing open, close, and fan. Since initially the position doesn't matter, any position is possible, so all three variables are initialized to true.

We then iterate through the sequence of moves. For each move, we first determine whether or not a position is possible which makes the move valid. For instance, if the fan variable is false, it means a move if "HOCKEY STICK" is invalid. If the next move is valid, we then modify our three variables based on what the possible outcomes are. This information is all conveniently provided at the beginning of the problem statement, where the list of valid positions for commencing and ending each move is given.

This solution is equivalent to a breadth-first search through the dance graph. The representation of the graph is implicit, but the frontier is represented by the three boolean variables. The computation of new values for these variables at each step is equivalent to enumerating all the vertices that are reachable in one step from vertices in the frontier. A depth-first traversal could also be done in a similar manner, most likely involving a recursive function operating on a single value representing an actual position rather than a set of variables giving a set of possible positions.

Division-I, Level 3: DVD

The Level 3 problem is the sort of problem that few people can solve in a challenge situation, but in retrospect turns out to be not quite as difficult as it seemed before. Only 18 out of 232 even submitted a solution for this problem, and unfortunately I was not one of them. However, Room 1 submissions tend to be quite edifying, and this match proved to be no exception.

Algorithm

NDBronson's solution is an elegant combination of a binary search with a recurrence relation (and it happens to be commented as well, for some strange reason). The binary search uses a predicate (call it isPossible), which takes as input a number n and the number of sheets and returns true if it is possible to form all the numbers from 1 through n with those sheets. The binary search is basically the same as that we might have used for the StarCraft problem, with isPossible as the predicate rather than the simulation.

The problem is now reduced to solving isPossible. To do this we iterate through each digit (0..9) and count how many times it appears in the numbers (1..n). If this number is greater than twice the number of sheets for any digit, we return false. Otherwise, if all the digits pass, we return true. The problem is now reduced to a new problem: counting the number of occurrences of a digit in a sequence of numbers.

This is where our recurrence relation comes in. As NDBronson did, we will call the function that implements the recurrence relation occurrences, and it will take as input a digit (which we will simply call digit) and the upper bound of the sequence (which we will call max). It will return the number of times digit appears in the numbers between 1 and max, inclusive.

The first step is to count the number of times the digit occurs in the ones place. This is obtained by integer division of max by 10. However, if the ones digit of max itself is greater than or equal to the digit we are counting, we need to add an additional one to the count.

Next we count the number of times the digit occurs in any location but the ones place. This is where the recurrence occurs. There is a slight trick here, however. Intuitively we will want to count the occurrences of digit in 1..(max / 10) and multiply by 10, because there our 10 digits we can append to each of the numbers in the smaller sequence to get numbers in our current sequence. However, for max / 10 itself, there may be fewer than 10 such numbers. Therefore we call occurrences with the same value for the digit, but with max / 10 - 1 (integer division) as the new upper bound. We multiply this by 10 and add it to our previous result.

We then count the number of times digit occurs in (10 * (max / 10))..max. We do this by counting how many times it occurs in max / 10 (a simple iteration over the digits, counting the ones that match) and multiplying this count by one plus the ones digit of max (e.g. max % 10 + 1). The reason we add the 1 is that we are counting digits in 0..d, not 1..d.

We then add all these values we've accumulated and return the sum. This sum is the number of occurrences of digit in 1..max.

Analysis

We will presume that the answer to our solution is between 0 and MAX_INT, which is 2³¹-1. Since with each iteration of the binary search we reduce the search space by half, there will be at most 31 or so calls to isPossible. This means that the runtime of isPossible (that is, occurrences) is the overwhelming factor in runtime here.

The recurrence in occurrences consists of iterating through 10 possible digit values, reducing the input number by 1 / 10. So we know the depth of the recursion is at most log₁₀ max, and there is no branching as occurrences only calls itself once during its lifetime. Therefore occurrences is O(log n).

I think it's reasonable to state that the upper bound of the search space is O(n). Therefore, by combining binary search, which is O(log n), with occurrences, which is also O(log n), the overall runtime complexity of this algorithm is the product, which is O(log² n). That's pretty fast.

By Logan
TopCoder Member

Member Comments

Logan,

I think you do a *very* good job in analyzing the problems in such a way that even I can understand them. However, I think you over-analyzed the Rumba problem. There is no need to build a state machine nor walk any graphs. All you need to do is step through the steps and evaluate whether the opening position is a possible outcome of the prior step's closing position (or recursively prior on the backward step).

Pops

Another very nice analysis, thanks a lot. However, some comments:

Div-II easy problem: I'd say the real most basic iteration is for ( i=1; i<n; ++i ), not to the floor of n/2.

One thing that should be mentioned for the binary search is that the functions are monotone, otherwise we couldn't apply it.

Div-I hard problem: It suffices to check the 1-digit, since this is always the first we run out of. No need to check the other digits.

pochmann

Hi Logan,

Once again, a very nice problem set analysis. I'm a little confused on how your state machine implementation of the div. 2 hard would look though (codewise). If you have time, could you possibly post a solution using this state machine idea?

Steven