Round Overview
Coders from both divisions had to face a tricky problem set in which more than half of them submitted at least one wrong solution to one of the problems. Nonetheless, in division 1 there were still many coders who solved all three problems correctly, so overall speed and challenge skills were necessary to stand out from the rest and reach the important top places in the scoreboard. Burunduk1 took first place thanks mostly to consistently great performances in all the problems and a successful challenge. Almost 200 points behind, dzhulgakov and LayCurse got the second and third places, respectively.
In division 2 only six coders were able to solve all three problems correctly. Among them, newcomer ysyshtc stood out thanks to speed and also two successful challenges. In a close second place came rohan_rock, who jumped to green rating and also got very close to division 1. Finally, another newcomer, yixi, took third place.
Used as: Division Two - Level One:
| Value | 250 |
| Submission Rate | 832 / 972 (85.60%) |
| Success Rate | 615 / 832 (73.92%) |
| High Score | jsrvh for 249.79 points (0 mins 50 secs) |
| Average Score | 182.49 (for 615 correct submissions) |
Let us begin by parsing the problem statement. We can note that, no matter whether the store is open or not, the next step is always to move to the next store, and it always takes travelTime seconds. Because of the circular layout, after visiting the n-th store we go back to the first store. The statement says the process ends once it is no longer possible to make a purchase, which happens when all the stores are past their closing time. The maximum closing time in this problem is 1000000. Since every move between stores takes travelTime seconds and travelTime is at least 1, we need to simulate at most 1000000 moves. We can conclude that a simple simulation is fast enough for the 2 second time limit.
public int makePurchases(int[] openTime, int[] closeTime, int travelTime) {
    int n = openTime.length, end = 0;
    // At what time to end the loop? It is equal to the largest closing
    // time in the input. Alternatively, we could just use 1000000.
    for (int i = 0; i < n; i++) {
        end = Math.max(end, closeTime[i]);
    }
    // In order to avoid counting purchases twice, we need to track from
    // which stores purchases have already been made:
    boolean[] purchased = new boolean[n];
    // t : the current time, c : the number of purchases made so far
    int t = 0, c = 0;
    while (t <= end) {
        // A run from store 0 to store n-1:
        for (int i = 0; i < n; i++) {
            // If it is possible to make a purchase, make it.
            if (!purchased[i] && openTime[i] <= t && t <= closeTime[i]) {
                c++;
                purchased[i] = true;
            }
            // Increase time by travelTime.
            t += travelTime;
        }
    }
    return c;
}
With the previous solution, the loop has to run until the largest closeTime, advancing travelTime seconds per step. There is an alternate solution in which the loop only has to run once through the stores, computing for each store the first visit that falls inside its opening window.
int makePurchases(vector<int> openTime, vector<int> closeTime, int travelTime) {
    int size = openTime.size();
    int time_close_to_open_time = 0;
    int purchase_counter = 0;
    for (int counter = 0; counter < size; counter++) {
        time_close_to_open_time = counter * travelTime;
        if (time_close_to_open_time >= openTime[counter] && time_close_to_open_time <= closeTime[counter]) {
            // A purchase already happens in the first round.
            purchase_counter++;
        } else {
            // Otherwise, find the earliest later visit to this store that is not
            // before openTime; visits repeat every size * travelTime seconds.
            if ((openTime[counter] - counter * travelTime) % (size * travelTime)) {
                time_close_to_open_time = openTime[counter] + ((size * travelTime) - ((openTime[counter] - counter * travelTime) % (size * travelTime)));
            } else {
                // openTime coincides with one of the visit times at this store
                // (or precedes the very first visit); the earliest candidate is:
                time_close_to_open_time = max(openTime[counter], counter * travelTime);
            }
            if (time_close_to_open_time >= openTime[counter] && time_close_to_open_time <= closeTime[counter]) {
                purchase_counter++;
            }
        }
    }
    return purchase_counter;
}
Used as: Division Two - Level Two:
| Value | 500 |
| Submission Rate | 434 / 972 (44.65%) |
| Success Rate | 51 / 434 (11.75%) |
| High Score | everything. for 453.34 points (9 mins 18 secs) |
| Average Score | 282.29 (for 51 correct submissions) |
Used as: Division One - Level One:
| Value | 250 |
| Submission Rate | 597 / 654 (91.28%) |
| Success Rate | 297 / 597 (49.75%) |
| High Score | shackler for 247.12 points (3 mins 4 secs) |
| Average Score | 167.37 (for 297 correct submissions) |
This problem was perhaps too tricky and it was very easy to write a wrong solution that would still pass the example cases. Taking extra care during the analysis was vital to accomplishing a good score. We can begin by dividing it into a pair of similar problems: can a correct answer be “The chicken”? And can a correct answer be “The egg”? If we knew how to answer each of these sub-problems, we could find the solution by simply checking whether there is a unique correct answer, ambiguity, or no answer at all. Both sub-problems are also very similar; the only difference is that in one (n - eggCount) people supposedly told the correct answer, and in the other eggCount people did.
Both sub-problems can then be solved by answering the following query: “Is there a valid scenario in which there are n people, x of them ended up giving you the correct answer, lieCount of them were told the false answer, and liarCount intentionally gave you an answer opposite to the one they were originally given?”. Note that the liars and the people that were lied to can intersect, and that is indeed the core of the problem. Assuming that the intersection has a preset size y, these y people actually gave you the correct answer even though they are intentionally lying to you. There is another group of people that also gave you the correct answer, but did so honestly: the people that were not lied to and did not lie intentionally. In terms of set theory, those people are the complement of the union between the liars and the people that were lied to. Speaking of set theory, perhaps a Venn diagram can make everything clearer:
[Figure: Venn diagram of the LIED-TO and LIARS sets, with y people in the intersection, a people only in LIED-TO, b people only in LIARS, and c people outside both sets.]
In the graphic, c + y is equal to x (the total number of people that told you the correct answer) and a + b is equal to n - x (the total number of people that told you the wrong answer). Since we know the number of elements in LIED-TO (lieCount) and the number of people in the intersection is assumed to be y, we can conclude that a = lieCount - y. Similarly, b = liarCount - y. Since all the other quantities are known, c = n - a - b - y. Knowing a, b, c and y we can verify: if (a + b) is equal to (n - x) and (c + y) is equal to x, then the scenario is possible. There is an implicit condition, an important one that was missed by many during the match: we do not want y to be so small that the union a + b + y (which equals lieCount + liarCount - y) is larger than n; this case can be detected by making sure that c is not negative. Also note that since the numbers in the input can only be as large as 1000000, we can simply iterate through all the possible values of y until one is found for which the previous conditions hold.
We can then code a solution:
// Is the following scenario possible?:
//
// n people.
// x people told the correct answer.
// lieCount people were lied to.
// liarCount people lied intentionally.
//
boolean check(int n, int x, int lieCount, int liarCount) {
    // Iterate through all the possible values of the intersection y:
    for (int y = 0; y <= lieCount && y <= liarCount; y++) {
        int a = lieCount - y;
        int b = liarCount - y;
        int c = n - a - b - y;
        // Do all the required conditions hold?
        if (a + b == n - x && c + y == x && c >= 0) {
            return true;
        }
    }
    return false;
}

public String theTruth(int n, int eggCount, int lieCount, int liarCount) {
    // Call the sub-routine, first assuming that the egg is the correct answer:
    boolean egg = check(n, eggCount, lieCount, liarCount);
    // Then assuming that the chicken is the correct answer:
    boolean chicken = check(n, n - eggCount, lieCount, liarCount);
    if (egg && chicken) {
        return "Ambiguous";
    } else if (egg) {
        return "The egg";
    } else if (chicken) {
        return "The chicken";
    } else {
        return "The oracle is a lie";
    }
}
There is no need to iterate over all possible values. It suffices to check the minimum and maximum (the range) and also the parity. It is important to be careful when picking the minimum and maximum values though, because otherwise it may be possible to allow cases in which the size of the union between liars and people that were lied to is greater than n.
Sample code using range and parity check:
boolean egg = false, chicken = false;
// Minimum and maximum possible number of people who say that the egg is the right answer:
int low = (int) Math.abs((n - lieCount) - liarCount);
int high = n - (int) Math.abs(lieCount - liarCount);
// The egg is the right answer:
if (eggCount >= low && eggCount <= high && eggCount % 2 == low % 2)
    egg = true;
// The chicken is the right answer (the range is reversed by subtracting from n):
if (eggCount >= (n - high) && eggCount <= (n - low) && eggCount % 2 == (n - high) % 2)
    chicken = true;
Used as: Division Two - Level Three:
| Value | 900 |
| Submission Rate | 236 / 972 (24.28%) |
| Success Rate | 22 / 236 (9.32%) |
| High Score | jyliu27 for 808.01 points (9 mins 48 secs) |
| Average Score | 560.49 (for 22 correct submissions) |
It is usually the case that solving an easier version of a problem first can provide insights for solving the harder version. If every user had exactly one job, this would become a common problem with a known solution, but let us pretend the solution is not known and derive it anyway. Since no matter which order is picked to process the jobs, the number of jobs n is always the same, minimizing the average waiting time (TOTAL_SUM_OF_WAITING_TIMES / n) is the same as minimizing the total sum of waiting times. Now pretend that we knew the durations { D1, D2, D3, … DN } of each of the jobs, in the same order 1…N as the one used in the scheduling. Then the total sum of waiting times is:
D1
+ D1 + D2
+ D1 + D2 + D3
...
D1 + D2 + D3 + ... + DN
-------------------------------------
n*D1 + (n-1)*D2 + (n-2)*D3 + ... + DN
In other words, the owner of the job that requires D1 seconds will have to wait exactly D1 seconds. The owner of the second job will have to wait D1 + D2 seconds, and so on. We can note that of all the durations, D1 has the most weight in the total sum. It appears that picking the smallest duration for the first job will always be beneficial. This can be shown by contradiction: assume that the minimum duration d was put in some position other than the first one; then a larger duration (d+x) will be in the first position, which gives the total sum of waiting times:
n*(d+x) + (n-1)*D2 + ... (n-k)*d + ... (n-(n-1))*DN
It can be shown that such a total sum of waiting times is strictly larger than the sum of waiting times in which the first and k-th durations are swapped:
n*d + (n-1)*D2 + ... (n-k)*(d+x) + ... (n-(n-1))*DN
n*(d+x) + (n-1)*D2 + ... (n-k)*d + ... (n-(n-1))*DN > n*d + (n-1)*D2 + ... (n-k)*(d+x) + ... (n-(n-1))*DN
n*(d+x) + (n-k)*d > n*d + (n-k)*(d+x)
n*d + n*x + n*d - k*d > n*d + n*d + n*x - k*d - k*x
0 > -k * x
Similarly, it can be shown that once the first element in the schedule has been decided, the second element must have the second-smallest duration, and so on, until we conclude that the optimal way to schedule the jobs is the one in which the durations are sorted in ascending order.
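To make the claim concrete, here is a small illustrative sketch (not part of any contest solution): for the one-job-per-owner version, sorting the durations in ascending order and accumulating prefix sums yields the minimum possible total waiting time.
// Illustrative sketch for the one-job-per-owner case: the minimum total
// waiting time is obtained by processing the shortest jobs first.
long minimumTotalWaitingTime(int[] durations) {
    int[] d = durations.clone();
    java.util.Arrays.sort(d);          // shortest job first
    long total = 0, finishTime = 0;
    for (int x : d) {
        finishTime += x;               // the time at which this job finishes
        total += finishTime;           // its owner waits exactly finishTime seconds
    }
    return total;
}
For example, the durations {3, 1, 2} processed in the given order yield waits of 3, 4 and 6 (total 13), while the sorted order {1, 2, 3} yields waits of 1, 3 and 6 (total 10).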
How does the knowledge that we must always pick the shortest job first help with the more complex problem? Both versions of the problem are actually closer than they seem. It appears convenient to process all the jobs from a user continuously, without processing jobs from another user before they finish. Is it ever necessary to do the opposite, that is, to interrupt the processing of a user’s jobs with a job from another user? The answer is no. To prove it, imagine that all jobs from each user are contiguous in a schedule, and proceed to move a job from one of the users X so that it is placed between two jobs from another user Y. Assume that at first the jobs of user X were scheduled to run after Y’s. If X has more than one job, the total waiting time for user X is not going to change, but the waiting time for user Y will increase, and thus the average waiting time increases as well. If X had only one job, the move would decrease X’s waiting time at the expense of Y’s waiting time; but note that it does not matter where X’s job is placed, Y’s waiting time will always change to the same value as long as X’s job is placed before the last of Y’s jobs, and placing X’s job completely before all of Y’s reduces X’s waiting time the most.
Now that we have shown that the optimal answer will always have all of the jobs from each user in a contiguous sequence, we can pretend that the jobs from each user add up to form a single larger job that is unique to that user. It is now necessary to order these larger jobs so that the average waiting time is minimized, which is exactly the same problem as the easier version we have already solved. So, in fact, placing the users with the smaller total job duration first will yield an optimal answer.
There is still the lexicographically-first tie breaker to handle. In order to get the lexicographically-first solution, always list the jobs from each user in ascending order of index. Then, in case two users have the same total duration, place the one with the smaller first job index first.
public int[] schedule(int[] duration, String[] user) {
    int n = duration.length, c = 0;
    int[] res = new int[n];
    boolean[] used = new boolean[n];
    // In each iteration, pick the available job whose user has the
    // lowest total duration.
    while (c < n) {
        int pick = 0;
        long lowestDuration = Long.MAX_VALUE;
        for (int i = 0; i < n; i++) {
            if (!used[i]) {
                long dr = 0;
                // Calculate the total duration of all the jobs owned
                // by this job's user:
                for (int j = 0; j < n; j++) {
                    if (user[i].equals(user[j])) {
                        dr += duration[j];
                    }
                }
                // Note that due to the order in which jobs are visited,
                // it is not necessary to explicitly tie-break and pick the
                // job with the smaller index in case the total durations
                // are the same.
                if (dr < lowestDuration) {
                    pick = i;
                    lowestDuration = dr;
                }
            }
        }
        // Process all the jobs from the picked user next.
        // Note that due to the order in which jobs are processed in this loop,
        // it is not necessary to explicitly sort the jobs from this user.
        for (int i = 0; i < n; i++) {
            if (user[i].equals(user[pick])) {
                res[c++] = i;
                used[i] = true;
            }
        }
    }
    return res;
}
BatchSystemRoulette
Problem Details
Used as: Division One - Level Two:
| Value | 500 |
| Submission Rate | 240 / 654 (36.70%) |
| Success Rate | 136 / 240 (56.67%) |
| High Score | ltaravilse for 481.06 points (5 mins 40 secs) |
| Average Score | 287.50 (for 136 correct submissions) |
This problem actually requires us to solve the division 2 900, BatchSystem, first. Once we know the rule that optimal schedules must follow, we can proceed to solve the rest of the problem. Assume that all jobs are grouped by user. For each user X, there are three kinds of users to care about:
Users with a smaller total duration. The optimality rule makes it mandatory for the jobs from these users to always come before the jobs from our user X. So we can just add their total duration to the expected waiting time of each of user X’s jobs.
Users with a larger total duration. Similarly, the optimality rule forces the jobs from these users to always be processed after user X’s. So, their durations will never be added to the expected waiting times of X’s jobs.
Users with exactly the same total duration as user X: The computer may pick any permutation of these users with equal probability. If user X’s total duration is D, let us call the total number of users with that total duration equaln. The computer may pick any of the equaln possible positions for user X (among those users that have duration D), and this choice is uniform. The expected waiting time induced by users with the same total duration is then: (0 + D + 2*D + … + (equaln-1)*D) / equaln, which is the same as: ( ( equaln*(equaln-1) / 2 ) * D ) / equaln = D*(equaln-1) / 2.
So every user will have to wait an expected D*(equaln - 1) / 2, plus the sum of the total durations that are lower than the user’s, before any of his/her jobs are executed. It remains to calculate the expected waiting time for each of the user’s jobs. For this sub-problem, we can ignore all of the other users’ jobs (as their times are already considered by the previous formula) and focus only on the jobs from the same user. The relative order of the jobs of the same user can be any possible permutation, and each permutation is picked with the same probability. This sub-problem is simpler than it may seem: for a fixed job X, there are t-1 other jobs that belong to the user. For each of the t-1 other jobs, there is a 1/2 probability that the job will be before X in the permutation and a 1/2 probability it will be after X. Therefore the expected time at which job X will start is sum_of( (1/2) * duration[j] ) over all the other jobs j. The total waiting time for job X can be calculated by adding X’s duration to the previous result. So the result would be: sum_of( (1/2) * duration[j] ) + duration[x], which can also be written as: sum_of(duration[j]) / 2 + duration[x], or: (total duration of the user’s jobs - duration[x]) / 2 + duration[x].
Adding up all of these values, we can determine the total expected waiting time for each job. A C++ implementation follows:
vector<double> expectedFinishTimes(vector<int> duration, vector<string> user) {
    int n = duration.size();
    // Find users and their total durations:
    map<string, long long> userDuration;
    for (int i = 0; i < n; i++) {
        userDuration[user[i]] += duration[i];
    }
    vector<double> res(n);
    for (int i = 0; i < n; i++) {
        int eqn = 0;
        double before = 0;
        // For each pair (user, total duration):
        // * count the number of users with the same
        //   total duration (eqn).
        // * Accumulate the total duration from users
        //   with smaller total duration (before).
        for (map<string, long long>::iterator it = userDuration.begin(); it != userDuration.end(); it++) {
            if (it->second < userDuration[user[i]]) {
                before += it->second;
            } else if (it->second == userDuration[user[i]]) {
                eqn++;
            }
        }
        // The time this job will have to wait is:
        // before :
        //     The waiting time caused by users with lower total durations.
        // + (eqn - 1) * userDuration[user[i]] / 2.0 :
        //     The expected waiting time caused by users with the same total duration.
        // + (userDuration[user[i]] - duration[i]) / 2.0 :
        //     The expected waiting time caused by jobs from the same user.
        // + duration[i] :
        //     The job's own duration.
        res[i] = before +
                 (eqn - 1) * userDuration[user[i]] / 2.0 +
                 (userDuration[user[i]] - duration[i]) / 2.0 +
                 duration[i];
    }
    return res;
}
Used as: Division One - Level Three:
| Value | 900 |
| Submission Rate | 45 / 654 (6.88%) |
| Success Rate | 21 / 45 (46.67%) |
| High Score | Burunduk1 for 702.75 points (16 mins 1 secs) |
| Average Score | 475.55 (for 21 correct submissions) |
There were a couple of different solutions to this problem. Most people used a dynamic programming solution during the match, but in this case I’ll explain a solution that is arguably easier to come up with, though it perhaps requires more coding time. Let us first divide the problem into two parts: 1) decide the route for the printers, and 2) decide what ticket number to print on each printer. These sub-problems are not independent from each other: if, in the route, a printer X is very far from the start location, it may be more convenient to print on it a number that is closer to its starting number. For every possible route of printers, we have to calculate the assignment of numbers to printers that minimizes the printing time.
Printer route
This is where the problem becomes easier. Something to note is that, since we are minimizing the time it takes to print all the tickets, it is always convenient to start the printing program right after the first arrival at a printer. So, it is never necessary to visit the same printer twice except to reach a new printer. Printer currentPrinter will always be visited at time 0, so we can change the problem to finding a sequence of printers other than currentPrinter that determines the order in which printers are visited. There will be a number of printers to the left of currentPrinter and a number of printers to the right of currentPrinter. Note that among the left printers, the ones closer to currentPrinter will always have to appear earlier in the sequence than the ones farther away, and the same happens with the right printers. We then have the sequences left = { currentPrinter-1, currentPrinter-2, … } and right = { currentPrinter+1, currentPrinter+2, … }. The total sequence will be of size |left| + |right| and have left and right as subsequences. This is the same as picking |left| cells out of a row of ( |left| + |right| ) cells, and then inserting the contents of the left sequence in the picked cells and the contents of right in the remaining ones. We can conclude that the total number of ways to visit the printers is Combinations(|left| + |right|, |left|). The maximum number of printers is 16; in the worst case, currentPrinter will be right in the middle of the street, so there will be 8 printers to the right and 7 printers to the left. The maximum number of ways to visit the printers is therefore C(15,7) = C(15,8) = 6435, which is a relatively low number. This suggests that brute force through all the possible orderings is possible, provided we can solve the other sub-problem with a fast enough algorithm.
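As a rough illustration of the enumeration (a sketch only: pos[] for the printer positions along the street, cur for the index of currentPrinter, and travel time equal to walked distance are assumed names and assumptions, not the actual contest interface), every route is an interleaving of the left and right sequences, which can be generated with a bitmask that tells at each step whether the next printer is taken from the left or from the right:
// Sketch: enumerate every interleaving of the left and right printer sequences
// and compute the first-arrival time at each printer for that route.
// Assumed names: pos[] = printer positions (sorted along the street),
// cur = index of the starting printer, travel time = distance walked.
void enumerateRoutes(int[] pos, int cur) {
    int L = cur, R = pos.length - cur - 1;
    for (int mask = 0; mask < (1 << (L + R)); mask++) {
        if (Integer.bitCount(mask) != L) continue;   // exactly |left| steps go left
        long[] arrival = new long[pos.length];       // arrival[cur] stays 0
        int position = pos[cur], nextLeft = cur - 1, nextRight = cur + 1;
        long t = 0;
        for (int b = 0; b < L + R; b++) {
            // The next unvisited printer, either on the left or on the right:
            int target = ((mask >> b) & 1) == 1 ? nextLeft-- : nextRight++;
            t += Math.abs(pos[target] - position);   // walk there, possibly passing visited printers
            position = pos[target];
            arrival[target] = t;
        }
        // arrival[] is the input to the ticket-assignment sub-problem for this route.
    }
}
With at most 16 printers this filters at most C(15,7) = 6435 valid masks out of 2^15, so the enumeration itself is negligible.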
Assigning tickets to printers
Assume that the previous sub-problem was already solved, so for each printer we know the time of its first visit, which is when installing the program on it becomes possible. We need to assign tickets in a way that minimizes the time to wait before all jobs are finished. Note that, thanks to the first program mentioned in the statement, printing can be done in parallel, and in fact the total time in seconds to print a ticket with number X starting from number Y on a printer is just abs(X-Y)+1. So the time necessary to print the j-th number on printer i is (abs(wantedValue[j]-startingValue[i])+1); however, the total time also requires us to add the time that was necessary to reach the printer: (arrivalTime[i] + abs(wantedValue[j]-startingValue[i]) + 1). Let us call this value requiredTime[i][j], the required time to print ticket j using printer i.
The problem becomes picking a ticket number for each printer such that the maximum requiredTime among all of the used pairs of printers and tickets is minimized. A useful simplification is to ask: “Is it possible to find a valid assignment of tickets such that none of the required times is greater than X?”. In this problem, an assignment is valid if each ticket is assigned to exactly one printer. The time condition makes assigning certain tickets to certain printers impossible (as their requiredTime may exceed X). This sort of assignment problem, in which every object of one kind must be matched to an object of another kind subject to restrictions on which pairs may be matched, is the well-known problem of finding the maximum matching in a bipartite graph; since printers and tickets are different kinds of nodes, the graph is bipartite. If the maximum matching is equal to the number of tickets, then an assignment in which no required time exceeds X is possible. If the maximum matching is less than the number of tickets, then such an assignment is not possible.
Since it is possible to answer the question “Is it possible to find a valid assignment of tickets such that none of the required times is greater than X?”, we can note that if no assignment exists for a value X, then none exists for smaller values of X either. Thanks to this, we can binary search through the values of X until we find the minimum value of X for which an assignment exists in which no time exceeds X, which solves the small optimization problem.
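A sketch of how these two pieces fit together (hedged: requiredTime[i][j] is assumed to have been filled for the current route, tryAssign/augment/bestForRoute are illustrative names, and Kuhn’s augmenting-path algorithm is just one way to compute the matching):
// Can every ticket be printed within 'limit' seconds, matching each ticket
// to a distinct printer? Standard augmenting-path (Kuhn's) bipartite matching.
boolean tryAssign(long[][] requiredTime, long limit) {
    int printers = requiredTime.length, tickets = requiredTime[0].length;
    int[] printerOfTicket = new int[tickets];
    java.util.Arrays.fill(printerOfTicket, -1);      // -1 : ticket not matched yet
    int matched = 0;
    for (int i = 0; i < printers && matched < tickets; i++) {
        if (augment(i, requiredTime, limit, printerOfTicket, new boolean[tickets])) {
            matched++;
        }
    }
    return matched == tickets;                       // every ticket got a printer
}

boolean augment(int i, long[][] requiredTime, long limit, int[] printerOfTicket, boolean[] seen) {
    for (int j = 0; j < requiredTime[i].length; j++) {
        // Printer i may take ticket j only if it fits within the limit:
        if (requiredTime[i][j] <= limit && !seen[j]) {
            seen[j] = true;
            if (printerOfTicket[j] == -1
                    || augment(printerOfTicket[j], requiredTime, limit, printerOfTicket, seen)) {
                printerOfTicket[j] = i;
                return true;
            }
        }
    }
    return false;
}

// Binary search for the smallest feasible X for one fixed route.
// upperBound is assumed to be a value for which an assignment is known to
// exist (for example, the largest requiredTime of this route).
long bestForRoute(long[][] requiredTime, long upperBound) {
    long lo = 0, hi = upperBound, best = upperBound;
    while (lo <= hi) {
        long mid = lo + (hi - lo) / 2;
        if (tryAssign(requiredTime, mid)) { best = mid; hi = mid - 1; } else { lo = mid + 1; }
    }
    return best;
}
The overall answer is then the minimum of bestForRoute over all printer orders produced by the route enumeration.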
A summary of our solution so far:
Iterate through all the possible ways to visit the printers.
For each printer visit order, calculate requiredTime[i][j] for each pairing between a printer and a ticket.
Perform a binary search to find the minimum value of X such that:
A matching that covers every ticket exists in the bipartite graph that has an edge for each pair (i, j) with requiredTime[i][j] <= X.
We can tell that the number of printer sequences is O(Combinations(n, n/2)), the binary search takes O(log(maximum result)) iterations and the bipartite matching can be implemented in O(n^3) time, for a total of O(Combinations(n, n/2) * log(maximum result) * n^3). This may at first appear large, but with an appropriate matching algorithm it is fast enough to run below the time limit; for example, Combinations(n, n/2) is, for practical purposes, Combinations(15,7). Some optimizations are possible but not necessary: for example, we can perform the binary search on top of the brute force, so that whenever the brute force hits a travel distance larger than X it stops, which reduces the number of operations.
You may take a look at pieguy’s solution for an implementation of this idea.