August 26, 2010

SRM 480

mystic_tcmystic_tc

DURATION

25 min

Match Summary

Thank you for participating in this NSA-sponsored TSA-themed SRM. I am not a representative of TopCoder Security Agency but that’s not important. This 9PM SRM attracted 1293 contestants, which is an unusually big number for this time slot. The problems looked easy and approachable, so the average number of solutions submitted / contestant was pretty high, above 1.6 in both divisions. However, the examples were carefully designed to be weak (which sounds weird) and this resulted in lively Challenge Phase and deadly System Test. The second and third places in both divisions were determined during the Challenge Phase, the pursuit for these ranks was intense during the contest. Finally, around 80% of Division 1 coders and around 85% of Division 2 coders managed to have at least one correct submission.

Both divisions had only one contestant that managed to solve all three problems, both of them became champions in their respective divisions. In Division 2, we have chenhsi and SAPikachu as the third and second place winners, respectively, both obtaining their respective positions with the help of at least 200 points gained from the Challenge Phase. In the first place is bupjae who showed remarkable performance by solving all the three problems, a feat that no other Division 2 coder managed to accomplish during the contest. In Division 1, msg555 and slex managed to snatch the third and second position due to their strong performance in the Challenge Phase. However, the champion Petr stands alone at the top as the only person who managed to solve the intimidating 1100 pointer during the contest, allowing him to solve all three problems of this match. TopCoder Security Agency is grateful for otherwise they could never decrypt the messages that the enemy is sending.

Cryptography

Problem Details

Used as: Division Two - Level One:


Value	250
Submission Rate	709 / 762 (93.04%)
Success Rate	674 / 709 (95.06%)
High Score	Mimisi for 249.97 points (0 mins 17 secs)
Average Score	224.74 (for 674 correct submissions)

Brute Force

This problem can be solved straighforwardly by trying to increase each of the numbers and counting the product. Since the maximum product is guaranteed to be at most 2^62, we will not need to worry that any of these calculations will overflow. See arciu’s solution.

Math

This solution increases the lowest number among the available numbers and claims that the resulting product has the maximum value. This can be proven fairly easily:
Let X = A1 * A2 * A3 * … * AN, where Ai is the i-th number.
If we choose to increase the value of Aj, then the resulting product is:
A1 * A2 * … * (Aj + 1) * … * AN = X + A1 * A2 * … * A(j-1) * A(j+1) … * AN.
Therefore, in order to maximize the value of A1 * A2 * … * A(j-1) * A(j+1) * … * AN, we should have as small Aj as possible.
See kit1980’s solution for an in-contest implementation.

Remarks

The original problem had no 2^62 limit and instead asked for the maximum value modulo 1,000,000,009. This allows only the second solution to pass, which would make an extra-hard D2-Easy Problem. Nobody likes that, hence the simplified problem was used.

InternetSecurity

Problem Details

Used as: Division Two - Level Two:


Value	500
Submission Rate	493 / 762 (64.70%)
Success Rate	182 / 493 (36.92%)
High Score	chenhsi for 458.80 points (8 mins 40 secs)
Average Score	296.15 (for 182 correct submissions)

Used as: Division One - Level One:


Value	250
Submission Rate	520 / 531 (97.93%)
Success Rate	378 / 520 (72.69%)
High Score	daizhy for 247.53 points (2 mins 50 secs)
Average Score	195.15 (for 378 correct submissions)

This is an implementation problem, the goal is to implement the algorithm described in the problem statement as effortlessly as possible. The algorithm itself looks something like this:

1
2
3
4
5
6
7
(A) read the list of keywords
for each website.
  (B) x = newDangerousWebsite() //it returns the id of a new dangerous website or -1
while (x != -1)
  mark x as dangerous(C) mark all of x 's keywords as dangerous
x = newDangerousWebsite()
return the answer

(A)Input parsing

Parsing single space separated list of strings is a pretty standard task presented in SRMs. Most coders probably have this algorithm available either in their bags-of-tricks or provided by the language. Should you not, it is highly encouraged that you do so, since by having such algorithm pre-implemented, you can add several points to the scores of your upcoming contests.

(B) and ©: counting the number of dangerous keywords of a website and marking keywords as dangerous

This is the part that the algorithm dedicates most to. Both can be easily implemented with the set data structure, where all dangerous keywords are stored. For (B), we simply iterate over all website keywords generated by (A) and check whether the set contains the keyword, in which case we increase the number of dangerous keywords for the current website by 1. In the end we will have the number of dangerous keywords that this website contains and if it is at least threshold, we can return it as the return value of newDangerousWebsite(). For ©, we simply add the new keywords to the set, which is one of the basic operations with a set.

wata implements the ideas above in his in-contest solution.

Remarks

Except for www.topcoder.com, the website names are fictional. If you decide to open any of them in your browser, do so at your own risk :-).

SignalIntelligence

Problem Details

Used as: Division Two - Level Three:


Value	900
Submission Rate	43 / 762 (5.64%)
Success Rate	4 / 43 (9.30%)
High Score	jjosi for 627.44 points (20 mins 43 secs)
Average Score	510.71 (for 4 correct submissions)

Simplifying the problem

Written in the problem statement:

In order to allow unique decryption, the following property must hold after all numbers from the signal are allocated. For every two integers from the input signal, there must be at least one ‘0’ character between the sequences of '1’s that represent these integers.

We claim that we can remove this restriction by increasing each number by 1, solving the simplified problem, and then returning the answer minus 1. This is true since we can treat each number as several '1’s followed by a single ‘0’ that does not overlap with each other. We have to decrease the answer by 1 since the last number will have a single trailing ‘0’ as described above.

In all discussions below, we will assume that the problem without the restriction above is to be solved.

Partial Solution

Let us call the numbers as N1, N2, N3 … . Suppose we must preserve the numbers’ order, that is, for i < j, the representative of Ni must be located to the left of the representative of Nj. In other words, suppose we are given the order of numbers in which they must appear in the encrypted string. We can then greedily find the shortest length string that encrypts this sequence of numbers by allocating the numbers one by one, starting from the first number. We allocate each number to the leftmost available place in each step. Pseudocode-wise, the algorithm looks like this:

1
2
3
4
5
6
7
8
9
function partialsolution(int[] N) //N[i] is the i-th number
int nextavailableposition = 1 //we can initially place to this position
int lastallocatedposition = 1
for i = index of each element of N
//we allocate N[i]
lastallocatedposition = nextavailableposition + N[i] - 1
//we update nextavailableposition. Remember that it must be a power of 2.
while (nextavailableposition <= lastallocatedposition) nextavailableposition *= 2
answer = lastallocatedposition

Solution

This problem can be solved greedily with the following powerful observation.

Let X1, X2, X3, …, XN be the numbers. We must allocate them in the encrypted string in some order. Suppose that Xi is the last number allocated number in this order. Then, it is optimal to allocate the rest of the numbers starting from the smallest one and in increasing order. That is, if Xi is the last number allocated, then the smallest possible length of the encrypted string is partialsolution(sortascending(X1,X2,…,X(i-1),X(i+1),…,XN) + Xi).

We’ll see that this is true by contradiction. Suppose there exist j and k (both are not equal to i), j < k such that Xj > Xk and the representative of Xj is to the left of the representative of Xk. The proof will be complete if we show that we can swap the locations of the representatives of these two numbers, since we can then apply this until all numbers except Xi are sorted. We suppose that Xj is allocated at position 2^Pj and Xk is allocated at position 2^Pk. Since Xk < Xj, it follows that Xk can be placed in the former position of Xj. Now, since the length of Xj must be less than 2^Pk (since otherwise Xk cannot be originally allocated at 2^Pk), we can place Xj in the former position of Xk without affecting the position 2^(Pk+1). Hence, the proof is complete.

Reference

The algorithm is surprisingly concise (which explains the ‘900’ points). jjosi implements the ideas presented above very clearly and his solution is a recommended reference.

Remarks

I am surprised by the low number of correct submissions for this problem. This problem was going to be used as Easy problem in Division 1 and I am now glad that we have refrained from making such decision. Otherwise I expect that coders in Division 1 would have a memorable experience of another SRM with less than 50% of its contestants having a positive score.

NetworkSecurity

Problem Details

Used as: Division One - Level Two:


Value	450
Success Rate	176 / 359 (49.03%)
High Score	Petr for 425.24 points (6 mins 56 secs)
Average Score	270.99 (for 176 correct submissions)

Observation

Theorem 1: In an optimal solution, there are no data gates put at any Client->Client data cable.

To see this, suppose that there is a data gate on some Client->Client cable, for example, Ci->Cj. Let’s check a data path Cx1,Cx2…Cxi,Cxj,…Cxn,Sy (C are clients and S is server). Suppose there is another data path that connects Cxn to Sy other than Cxn->Sy. Then, we extend the current data path from Cx1 to Sy by using this data path between Cxn and Sy. Let’s apply this operation as many times as needed until there is no other data path from Cxn to Sy (which will happen at some point since the network is acyclic). Now, since there exist no other data path from Cxn to Sy, the cable Cxn->Sy must have a data gate installed on it. Hence, we can remove the data gate on Cxi->Cxj and the network will still remain secure. Hence, the observation is proven.

Preliminary Solution

Consult the following pseudocode.

1
2
3
4
5
6
7
8
9
10
unmark all clients
//clientFind() picks a client that only has outgoing cables to marked clients, -1 otherwise.
client = clientFind()
while (client != -1)
  for each server S such that there exist a cable client - > S
If there doesn 't exist another client C such that we can reach C from
client and there exist a cable C - > S
install a data gate on the cable client - > S
mark client
client = clientFind()

To see why this work, we will first see that:

Theorem 2: If a data gate is installed on a cable by the algorithm above, then there will be a data gate on this cable in any optimal solution.

When we’re going to install a gate on cable client->S by the algorithm above, from the if condition in the algorithm it follows that this cable is the unique path from client to S. Therefore we must install a data gate on it since otherwise the network will definitely be insecure. The theorem is proven.

Finally, we now need to prove the following:

The network produced by the algorithm above is secure.

We can establish the following properties by induction on the number of clients that are marked:

If C is a marked client and S is a server such that there exists a data path C->S, then one of these data paths contains a cable with data gate installed on it.
If client is chosen from clientFind() and C is a client such that there is a data path client->C, then C is already marked.

The details are straightforward and are left to the reader.

Implementation

The intended solution is surprisingly short. We iterate over all clients C. For each server S such that there’s a cable C->S, we check whether there exist another data path from C to S not through this cable. If there isn’t, we install a data gate on this cable (by simply increasing the answer by 1). The proof of correctness follows from the previous section. Refer to rng_58’s extra-elegant solution.

Remarks

Due to the weak examples, many submissions on this problem failed. Fortunately, the main purpose of this problem was to allow coders to solve this problem quickly and concentrate on the very challenging 1100 pointer, and it served this purpose very well. There were a lot of quick submissions on this problem.

StringDecryption

Problem Details

Used as: Division One - Level Three:


Value	1100
Submission Rate	1 / 531 (0.19%)
Success Rate	1 / 1 (100.00%)
High Score	Petr for 552.71 points (37 mins 11 secs)
Average Score	552.71 (for 1 correct submission)

Simplified version

We will first discuss the algorithm to solve the problem where the procedure is applied only once as opposed to twice in the original problem. It is very encouraged that the readers solve this problem first and seek for an N^2 algorithm which works. This will greatly help in understanding the solution for the original problem.

We will name each pair of integer and character (that was used to encrypt the message) as i_pair and c_pair. So for example if “1111” is encrypted to “41”, 4 is the i_pair part and 1 is the c_pair part.

Suppose that X is the resulting string after one encryption. The decryption procedure basically consists of breaking X into several pieces and interpreting each of these pieces as i_pair + c_pair. Suppose we go through the string X from left to right, selected several i_pair + c_pair pieces and the last currently selected c_pair is at position A within X.

Definition 1

DP[A] is the number of ways to select pieces in such way that the last c_pair is at position A.

Note: the positions in X are 0-based, we explicitly append a sentinel (non-digit) character to the beginning of X and assume that DP[0] = 1.

The algorithm looks like this:

    Algorithm 2 (str2 is the resulting string after one encryption step)
    For i = 0 to length(X)
      for j = i+2 to length(X)
        //we try to extend to j
        if (str2[j] != str2[i] && str2[i+1] != '0') 
          DP[j] += DP[i]

Refer to Definition 1 if you fail to understand some part of this. We try to make the digits on positions i+1 to j-1 the next i_pair and the digit at position j the next c_pair. We will not overcount, since no leading zeroes are allowed and no identical consecutive c_pair is allowed (which makes every choice we make distinct).

Solution

We are now ready to brace ourselves to the heart of this 1100 pointer. Don’t worry, the algorithm is not difficult, it’s only hard to come up with. Let’s denote the original string as STR0, the string after one encryption step as STR1, and the resulting string (i.e., the string after two encryption steps) as STR2.

As previously, we will go through STR2 from left to right, and try to break it into i_pair + c_pair pieces, thus obtaining some content for STR1 string. At the same time, we will also try to break STR1 string into i_pair + c_pair pieces as well.

Suppose we go through STR2 from left to right, selected several pieces and the last c_pair is located at position A. Additionally, suppose that we select pieces within STR1 generated according to the selection made within STR2. Obviously, there are several (maybe none yet) complete pieces within STR1 and there can also be one piece that is not yet completed. Let B be the c_pair part of the last complete piece of STR1, and let C be a boolean indicating whether or not there is a piece in STR1 which is not yet complete.

Definition 3

For a resulting string STR2, we define DP[A][B][C] as the number of ways to select pieces in STR2 (and in STR1 at the same time) so that A is the position of the last c_pair in STR2, B is the last selected c_pair value in STR1 and C is a boolean indicating whether there’s a non-completed piece in STR1 right now.

The idea is most easily grasped with an example.

STRING : 1124512344
INDEX : 0123456789

Let us refer to the main iteration of Algorithm 2. We add an additional loop for B and C (Definition 3). Let’s suppose i = 2 and j = 5 (Algorithm 2). As described just after Algorithm 2, we’re going to have i_pair = 45 and c_pair = ‘1’ in STR2. So the idea is, we’re going to add 45 copies of '1’s to STR1. Among these 45 copies of '1’s, at most one of them can be a c_pair for STR0 (since we’re not allowed to have several identical consecutive values of c_pair). Obviously, it is not possible to select a c_pair within them if B = ‘1’, since otherwise we will have a consecutive identical c_pair which violates the “maximum-set” condition from the statement. Otherwise. if B is not equal to ‘1’, we can always choose any of the 43 '1’s (all '1’s except the first and the last one) as c_pair and leave the rest of trailing '1’s for the next c_pair. We can choose the first ‘1’ as c_pair only if there is already some trailing numbers before this in STR1 (that is, if C = TRUE). We can also choose the last ‘1’ as c_pair, and there won’t be any trailing number left, so C will become FALSE after this. Don’t forget that we also have the option not to turn any of these characters into c_pair.

We are now ready to present you a pseudocode.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
for int i = -1 to N - 1
for int j = i + 2 to N - 1
//we need to ensure that no consecutive c_pair appear in STR2
if (code[j] != code[i])
  //we let X be the number of copies of code[j] that we're going to add
  int X = getnumber(i + 1, j - 1)
for (int B = each digit)
  for (int C = {
      TRUE,
      FALSE
    })
    //we don't turn any of these X copies into c_pair
    dp[j][B][TRUE] += dp[i][B][C]
//if B == code[j] we can't pick any of them as c_pair
if (B == code[j]) continue;
//pick the first one as c_pair
if (C == TRUE) dp[j][code[j]][TRUE] += dp[i][B][C]
//pick the last one as c_pair
dp[j][code[j]][FALSE] += dp[i][B][C]
//pick the middle ones
dp[j][code[j]][TRUE] += dp[i][B][C] * (X - 2)

There are obviously a lot of cases that this algorithm doesn’t cover. Fortunately, all of them are easy to see once you see the idea of how the algorithm above works. They are: 1) handling that we must have no leading zero, and 2) special case when X = 1.

For the full implementation details, I suggest you to take a look at the only submission made during the contest, and it’s made by Petr.

Remarks

The original problem statement is posted in the forums and we decided that it’s very difficult. Hence, to make this more approachable, we have removed the alphabets and added a lot of strong example test cases. Our hope was that once a solution passes the examples, it will pass the System Test as well. Actually, even after these strategic moves, we were still not sure that anyone would actually get this problem correct. Petr may have proven that we were worrying too much, but perhaps not too much since without him this problem would remain unsolved during the contest.

Final Remarks

As a final remark, I hope you enjoyed the contest and the editorial at least as much as I enjoyed them.

Chat on Discord

August 26, 2010

SRM 480

DURATION

categories

Tags

share

Join Topcoder Challenges

Match Summary

Cryptography

InternetSecurity

SignalIntelligence

NetworkSecurity

StringDecryption