# PoisonedWine Marathon Round Solution

## Introduction

The last marathon round for this years TCO featured an interesting non-standard problem, with a time limit of 60 seconds (a first for a TCO marathon round, at least in the last couple of years) and no visualizer. Another particularity was that due to the huge randomness of an individual seed, and due to the fact that only 50 test cases were used for the provisional standing, the leaderboard during the contest was almost meaningless. In a lot of ways, this contest was more about each competitor competing against himself, since we were in the dark about the true ranking of others. The final leaderboard can change quite a bit compared to the provisional one.

## Test case analysis

First, let’s take a look at the distribution of test cases.

I will use **B** to mean the number of poisoned bottles (numPoison).

I will use **S** to mean the number of test strips (testStrips).

I will use **R** to mean the number of test rounds (testRounds).

I will use **N** to mean the number of bottles (numBottles).

I will call the cases with B <= S **sparse**, and the cases with B > S **dense**.

I will also use the terms seed and test case interchangeably to mean the same thing.

It turns out that due to absolute scoring, performance on the sparse cases had a lot more weight than performance on the dense ones.

Specifically, for the first *10000* seeds, *2252* are sparse, and *7748* are dense.

However, these 2252 seeds contribute over 75% of the final score, and the rest of 7748 dense cases contribute only 25%, meaning that a 1% improvement on the sparse test cases was 3x more valuable than a 1% improvement on the dense test cases.

Due to this distribution, and due to the fact that it was easier to improve on the sparse test cases, I focused most on my energy on improving the sparse seeds.

## Algorithms Overview

I had 2 distinct algorithms for **dense** and **sparse**, which intersected when B was close to S.

The algorithm for the dense seeds was *dynamic programming (DP)*, optimized with ternary search. It usually ran in well under one second. For dense seeds, it’s not feasible to identify exactly the poisoned bottles since we have less strips than the number of poisoned bottles, and we lose one strip for every positive test.

The algorithm for the sparse seeds consisted of 2 phases, one of which used the DP algorithm above and another one which used a classic algorithm. For sparse seeds, we could conceivably find out exactly the poisoned bottles and score a perfect 1.0 score. For example, for 1000 bottles, 10 strips, 1 round and one poisoned bottle we can always score a 1.0 score.

For some sparse seeds, with B close to S, the DP algorithm was better than the 2-phase algorithm. I used simulations to decide between applying the DP algorithm or the 2-phase algorithm.

I will now describe each element of my solution.

## DP algorithm

Let’s make the following assumption about our algorithm: we will only test disjoint intervals of bottles. Therefore, in a round, we will test u intervals of total size len. After this round, we will have on round less (obviously) and we will have to test len bottles less. If there only was a way to know the number of strips we are left with, we could see the making of a DP recursive algorithm with the meaning:

dp [ n ][ s ][ r ] = expected number of **free** bottles we can detect assuming there are n bottles left to test, we have s strips left and there are r rounds remaining (including the current one).

Then the expected number of bottles we can save is dp[ N ][ S ][ R ].

How do we compute this? We make the assumption that during a round, we will test out m bottles (out of the n), and that we will use u out of the s strips we have to do that. We will then divide the m bottles we want to test uniformly among the u strips. For simplicity, let’s assume that m is divisible by u, and let’s call m / u = x the batch size. For the actual implementation, we will vary u and x and we will compute m as x * u.

Let’s suppose that we test u strips of x bottles each. What are the possible outcomes?

We need one final assumption: the probability of a random bottle to be poisoned is always B / N, independent of the actual testing results. Now the probability of an interval of x bottles having at least one poisoned bottle is equivalent to 1 – [the probability of there being no poisoned bottle amongst the x]. The probability of a bottle not being poisoned is 1 – [probability of a bottle being poisoned] = 1 – B / N. Since we assume that these probabilities are independent, [the probability of no bottles being poisoned amongst x bottles] = (1 – B / N) * (1 – B / N) … (1 – B / N) for x times = (1 – B / N) ^ x. We can pre-compute this probability for every x, since B / N is constant for a seed.

Ok, so we know that the probability for one interval out of the u tested of being not poisoned or poisoned. Due the fact that we assume probabilities are independent, this is the same for every interval! Now we have a Bernoulli Distribution with u samples: https://en.wikipedia.org/wiki/Bernoulli_distribution

What are the possible outcomes? There are 2 ^ u of them, since each interval tested can turn out to be poisoned or not. However, the main insight is that we don’t care about the exact subset of tested intervals which turn out to be poisoned, we only care about the number of them. We can precompute this number with a simple DP: prob_not_poisoned[ s ][ u ][ x ] = What is the probability of exactly u intervals not being poisoned out of s tested, if the length of each interval is x and the probability for any bottle to be poisoned is B / N ? This DP, for a given x, is a classic problem and is left as an exercise to the reader 🙂

Let’s take an example with a specific outcome:

In the above picture, we use u=4 strips out of our remaining ones to test out intervals of size x (x is irrelevant here). We are left with n – 4 * x bottles we need to test in remaining rounds (the green box).

After testing the 4 intervals (u1, u2, u3, u4), the black ones came out positive (at least one poisoned bottle, and the gray ones came negative). Then we can add the u2 and u4 to the set of free (non-poisoned bottles). Our next state is dp [ n – 4 * x ][ s – 2 ][ r – 1], since we lost two strips (the 2 black intervals). We saved 2 * x bottles (the gray intervals). The above is just one of the possible 2 ^ 4 = 16 outcomes for testing 4 intervals. The probability for this outcome is prob_not_poisoned[ 4 ][ 2 ][ x ].

Now we have all the tools in order to compute our initial DP: For a given number of bottles n, with s strips and r rounds remaining, we try to use exactly u disjoint intervals, with 1 <= u <= s, each having length x, with 1 <= x <= n / u. We vary u and x over the allowed interval.

Then, for each possible number of not poisoned intervals safe:

- The probability of this outcome is prob_not_poisoned[ s ][ u ][ x ]
- The expected outcome (expected number of free bottles) is prob_not_poisoned[ s ][ u ][ x ] * (dp[ n – x * u ][ s – (u – safe) ][ r – 1 ] + safe * x)

Let’s unpack the last expression. If exactly safe out of the u tested intervals turn out to be not poisoned, then it means that u – safe intervals had at least one poisoned bottle, which means that we lose u – safe strips. For the next round we are left with s – (u – safe) strips then. Similarly, if we tested x * u bottles this round, and we never retest a bottle, we are left with n – x * u bottles to test for the remaining rounds. The expected number of bottles we can save then is dp[ n – x * u ][ s – (u – safe) ][ r – 1 ] + safe * x (the answer for the subproblem + the number of non poisoned bottles we detected this round).

Finally, we multiply this with the probability of obtaining this outcome, prob_not_poisoned[ s ][ u ][ x ].

To summarize, the pseudocode for this DP algorithm looks like that:

// Assume N=nrBottles, S=testStrips, R=testRounds, B=numPoison to be global variables Precompute prob_not_poisoned table. Fill the dp table with zeros. double ComputeExpectedValue(double n, double s, double n, double u, double x) { double result = 0.0; for (int safe = 0; safe <= u; ++safe) { double prob = prob_not_poisoned[u][safe][x]; double ev = dp[n-u*x][s-(u-safe)][r-1] + safe*x; result += prov * ev; } return result; } // Main DP algorithm for (int n = 1; i <= N; ++n) { for (int s = 1; s <= S; ++s) { for (int r = 1; r <= R; ++r) { if (S - s >= B) { // no more poisoned bottles remaining dp[n][r][s] = n; } for (int u = 1; u <= s; ++u) { for (int x = 1; x * u <= n; ++x) { double cur = ComputeExpectedValue(n,s,r,u,x); if (cur > dp[n][s][r]) { dp[n][s][r] = cur; best_move[n][s][r] = (u, x); } } } } } }

We now know the move we should make at the very beginning, best_move[n][s][r] = (how many strips to use, the length of an interval to test). We know send the test over the library method, and update the number of remaining strips to s – number_of_positive_tests. Rinse and repeat for the remaining of the rounds.

A couple of final things:

- For all rounds but the last, u = s, meaning that it’s optimal to always use all of the available strips except for the last round
- We can improve the above algorithm by noting that for a given u, we don’t need to try all x between 1 and n / u. If we vary x, the function we get is unimodal, eg it will increase to a point and then it will decrease. Meaning that instead of linear search over x, we can do a ternary search in the interval [1, n / u]. The intuition is that for a small value of x, we are too conservative, meaning the chance of an interval being poisoned is going to be low, but so is the potential of marking bottles as free. For a large value of x, there is a high chance of all tested intervals being poisoned, which gets us no information and loses us strips.
- Applying the 2 points above, we get the a complexity of O(N * S * R * S * log N), since we take u = s and for x we only need to test logN values.
- Finally, during the ‘live’ run (when we actually query the TC library), we adapt the probability of a bottle being poisoned from B / N to the expected number based on the number of poisoned intervals.

## 2-phase algorithm

Phew, this was long description! 🙂 In case you’re still with me, the **2-phase** algorithm is considerably easier to explain, even though it brought a significantly higher number of points. The wonders of absolute scoring 🙂

Our aim for sparse tests is simple: First we would like to isolate each of the B poisoned bottles to one interval, and then test those intervals independently. For the first phase, we are going to use a number of rounds r to try to isolate each of the poisoned bottle to one interval. Since the intervals are disjoint, we could then use the remaining number of rounds to shrink those possible ‘poisoned intervals’. For example, for B=11, our aim for the first phase would be to obtain exactly 11 disjoint intervals which tested positive. Clearly, each of them would have exactly one poisoned bottle. This only works if S >= B, and that’s the reason for categorizing between dense and sparse seeds.

For the first phase, assuming we allocate r rounds to it, we want to partition the initial N bottles in as many disjoint intervals as possible, in order to increase the chance that every one of the B poisoned bottles is in it’s own interval. If strips weren’t destroyed after positive tests this would have been easy, eg with r rounds of testing and s strips we could just split all numBottles in r * s disjoint intervals of equal length. However, with the constraints of the problem, we can just use our precomputed DP algorithm to generate the partition for us! The first move for this phase would just be in dp[ N ][ S ][ r ], where we vary r as discussed.

We are now left with the second phase.

If we are detected less than B – 1 poisoned intervals, this means what at least one poisoned interval contains more than one poisoned bottle, which is not ideal. At this point we can either redo the sieving algorithm on the remaining poisoned intervals (by joining them all up into one interval, and repeating the sieving algorithm) or we can proceed with trying to save as many bottles as possible with the intervals we have. I decide between the two options by using simulations, which will described later.

The more interesting scenario is when we get exactly B poisoned intervals after the sieving phase. We are then left with testStrips – B strips (since we lost exactly B) and with testRounds – r (since we used r rounds for the sieving phase).

Now, we can apply a powerful algorithm individually on each interval, since we know exactly one bottle is poisoned.

The above picture is an example of what we are shooting if B=3. We isolated exactly 3 poisoned intervals, which means each of them has exactly one poisoned bottle and we can proceed with our B=1 algorithm in each of them in parallel.

## B=1 algorithm

Assume N bottles, one round, S strips and one poisoned bottle Write all N bottles in binary, starting with 0 eg for N = 8, we would write 000, 001, 010, 011, 100, 101, 110, 111 We need exactly ceil(logN) strips to always detect the poisoned bottle. Strip i, used in test i will test out all bottles which have the i-th bit set to 1 in their binary notation. Eg for the case above we would have the following tests: Strip 0 (test 0) : {1, 3, 5, 7} Strip 1 (test 1) : {2, 3, 6, 7} Strip 2 (test 2) : {4, 5, 6, 7} Let’s suppose that test 0 and test 2 comes out positive, and test 1 comes out negative. Then it must be that the poisoned bottle had a 1 in the 0-th and 2-th bit, and it had a 0 in the 1st bit. It’s easy to see how this works for a general N.

Using this algorithm, with s strips we can reduce an interval containing exactly one poisoned bottle by a factor of 2 ^ s in a single round. If there is more than one poisoned bottle remaining (if the length of the interval is greater than 2 ^ s) and we have rounds remaining, rinse and repeat the algorithm. Also, note that depending on where the poisoned bottle is, we can lose any number of strips. More precisely, if the poisoned bottle has index b, we lose a number of strips equal to the number of 1 bits in b.

However, if we have more than 1 round (let’s say r rounds) we can actually reduce the interval by a factor of (r + 1) ^ s, since we can note that each strips can be destroyed (test out positive) in round 1, 2, .. r or not at all (let’s say r + 1 if the strips doesn’t get destroyed). Since each strip is independent, we should be able to distinguish amongst (r + 1) * (r + 1) … * (r + 1) (s times) bottles. This omission (even though I encountered this algorithm while researching the problem) cost me quite a few points I believe. I provide some reference material at the end of the post to help with understanding these tricky algorithms.

Finally, let’s detail how we used the above **B=1** algorithm with more than one interval.

Let’s suppose we are left with 3 intervals containing exactly one poisoned bottle, of size 2000, 1000 and 500, and with 3 strips, after the sieving phase. Let’s suppose for simplicity we have one round remaining.

How do we allocate strips between the intervals?

Since allocating s strips to an interval reduces its size by a factor of 2^s (*in terms of the size of poisoned part*), we would like to allocate strips to intervals in order to minimize **sum of all intervals I of |I| / s(I), where s(I) is strips allocated to the interval I**., which is equivalent to minimizing the length of the poisoned intervals after one round.

In our example, we would allocate 2 strips to the 2000-length interval, 1 strip to the 1000-length, and 0 strips to the 500-length. After doing one round of the B=1 algorithm on each of those intervals, we would be left with poisoned intervals of length 2000 / (2 ^ 2), 1000 / (2 ^ 1), 500 / (2 ^ 0) = 500, 500, 500 . We started out with 2000 + 1000 + 500 = 3500 potential poisoned bottles and after one round we are left with 500 + 500 + 500 = 1500, which is less than half. Not bad, and this is the reason why this algorithm (operating on interval with exactly one poisoned bottle) is so effective. Also, depending on where the poisoned bottles are, we can be left with any number of strips, where each strip has an expected probability of 0.5 of not being destroyed.

The above picture details the partition, where the red interval has length 2000, the cyan one length 1000 and the blue one has length 500. We can distinguish amongst each block of 500 bottles for each interval.

## Simulations

Finally, let’s see how we decide between the various stages of the algorithms. Since we have a time limit of 60 seconds, and since for a given seed it’s easy to generate valid assignments of B poisoned bottles amongst N bottles (by just generating random combinations on N choose B), we could do the following:

- Generate T valid samples for given numBottles, testStrips, testRounds, numPoison, using the algorithm above
- Simulate each algorithm discussed above for each of the samples, as well as combinations (using different algorithms for a specific phase in the 2-phase algorithm for example). Compute the expected score for each combination of algorithms as the mean score over all T valid samples (this is very similar to how it was done for TCO Round 2 AbstractWars).
- Choose the combination with the maximum expected score, and use this combination for the actual live run (meaning when we actually call the TC library to run the tests, instead of our Simulation library)

This helped out quite a bit for sparse seeds where the DP algorithm was better, and also for seeds where it was better to continue with the poisoned intervals after the sieving phase even though some contained more than one poisoned bottle. I decided to use **T=1000** samples to test out the algorithm in order to be able to test out more combinations without fearing Time Limit Exceeded. Also, using more samples (10000) barely increase the score.

## Closing words

What I described above is by no means the only possible solution, and the Post your approach thread has a lot of insightful answers, many touching on information theory. Many competitors also used the DP algorithm and the 2^s trick, some using the (r + 1) ^ s trick as well. I invite you to read that thread. There also multiple nuances which were not described above (it turns out that for B=2, with one round remaining we can also do slightly better by employing 2-disjunct matrices).

Thanks to charles.lobo for kindly providing a template for this blog post, and to gorbunov for suggesting I write this blog post.

**Reference materials:**

- Explanation for the B=1 case: https://medium.com/i-math/a-king-1000-bottles-of-wine-10-prisoners-and-a-drop-of-poison-2dd1959a8dd2
- Explanation for the B=2 case: https://www.youtube.com/watch?v=ashnUa2U16A
- Explanation for the B=1 case, with more than one round: https://puzzling.stackexchange.com/questions/12923/nine-bottles-of-wine-one-poisoned-two-mice-two-tests