March 27, 2018 On Marathon Match 99

Marathon match 99 featured a task co-written by timmac and Nickolas. The question was to guess optimal tactics to play slot machines in a casino. The specifics of the task was that the slot machines are biased, ie. some symbols appear on their wheels more frequently than the others. This implied that a player was supposed to select the most profitable machine and play on it for the rest of time.
You can read the full problem description here.

How to solve it?

The main question, as it turned out, is how to find the most profitable slot machine? Once you found it, and spent preferably the smallest amount of time on this search, you can just play this machine for the rest of the time.
The following is a purely theoretical view on the task, as I wrote no single line of code during or after the match, on the subject.
Let us assume that we can gain some profit out of playing time-consuming version of the play. If this is false, we need to detect this situation early and use a different strategy that will be discussed later. First thing to note is that the size of each wheel is not known initially. However, the amount of possible wheel sizes is not big: the sizes of wheels lie in range [10..30], which means that there are 30-10+1=21 possible wheel sizes. Also note that all three wheels will have the same size in any given seed. Daiver19 proposed on the post-match discussion forum to calculate probabilities for each of the 21 possible wheel sizes. He suggests to calculate the probabilities of wheel #1 being of each of the sizes in range [10..30]. Do the same for wheel #2 and #3. Then, in order to determine an actual most probable size of all three wheels, take into account the fact that they all must be of exactly the same length. So the probability of the wheels being of size exactly X is p[wheel #1 is of size X] by p[wheel #2 is of size X] by p[wheel #3 is of size X]. Take the size that has maximum probability as a reference. If this maximum probability is too low, then perhaps we ran too little simulations and should continue gathering information about the machines.
How do we calculate and update the probabilities of a wheel #N having size of X? One possible solution would be to notice that there are at most 7^3=343 possible triplets of letters that can appear vertically on each wheel. Each triplet can appear either zero or several times during the observation on a wheel. Thus it would be natural to introduce a map (implemented as an array) from triplet ID to the amount of times it was appearing on a wheel #N. Store this statistics for each wheel. Then, for each wheel try to merge the triplets observed on that wheel. This is an interesting subtask.
How do we merge the triplets from the same wheel? One way to do it is to use the fact that we cannot do too much “long” play, as the time will run out. The expected amount of plays for each slot machine is 10-20. Thus, we will end up with up to ~20 triplets for each wheel. I suppose that it is possible to implement the join procedure using backtracking with cuts then. Most of the branches will not be evaluated, either because the next triplet does not match the ending of the previous chain, or because the chain becomes too large or too short.
In theory, we will end up with very few feasible backtracking branches, and will be able to evenly split the 100% probability between all of them.
Let us take an example. Suppose that for wheel #1 the following triplets were observed:
BAC, DAB, ACD, ABB. If I did not mess things up, there is just a single way of joining them into a loop. Do it as a mind exercise.
We have the following map of triplets for wheel #1:

ID	Triplet	Repeats
149+07+2=51	BAC	1
349+07+1=148	DAB	1
049+27+3=17	ACD	1
049+17+1=8	ABB	1

All the other 339 records have 0 repeats and are omitted.
Here I used A=0,B=1,C=2, etc., and triplet is just a heptimal (base 7) number.
Then we have the following table for lengths of wheel #1:

Length	Wheel #1 chain	Wheel #1 probability
6	BACDAB	100%

Here I have chosen the length of 6 on purpose to simplify things. However, in practice length 6 will not be possible, because 6 is less than the minimum allowed length of 10.
So now we know how to determine the most probable lengths of wheels. So far so good. How can it help us in calculating the best slot machine?
When we do know the length of all wheels, and in turn also know the most probable content of wheels, we can calculate the expected amount of money won on a machine. To do so, first calculate the probability that symbol A appears on wheel #1. Do the same for symbols B, C, D, E, F, G. Do the same for wheel #2 and #3.
Now we have a table:

Symbol	Wheel #1 probability	Wheel #2 probability	Wheel #3 probability
A	2/6
B	2/6
C	1/6
D	1/6
E	0
F	0
G	0

Now multiply the probabilities for A symbol to appear on all the 3 wheels simultaneously. We will get some %. Multiply these % by the coins won by triple A = 1,000.Calculate the expected winning for symbol B the same way. Do so for the other symbols. Then just sum up the expected wins fy all 7 possible triple-same-symbol. This is your expected winning on a machine.
The rest of the solution is just repeating the procedure above for all the given machines, stopping earlier and selecting the best machine. Last step is to use this best machine for the rest of the time, assuming it gives expected profit bigger than 1.
A tricky case is when we do not have enough time and/or coins to check all the machines and gather their statistics. Just use fast play in this case, by gathering simplified statistics on each machine and skipping less profitable machines. Daiver19 suggests to use Bayes theorem here (and actually in the “slow” game as well).

Issues during the match

This time the competitors were not provided with the official local tester also known as the visualizer. However, during the match it was decided by admins to allow competitors to share their own local testers, and muhspeed did so. Please note however that unless officially announced, like it was the case this time, competitors should not share the code, even if it is not the solution’s code.
During the match I assumed that the absence of the local tester was done on purpose, to stimulate competitors to write their own and thus practice programming. This implementation of the local tester is easier for some and harder for other topcoders, as it requires some precision, patience, and time. Anyway, developing a non-trivial solution without the local tester seems hardly possible to me, as the debugging on the server via console output once per 30 minutes is not feasible.

Next match

The next match is a special one, numbered 100. It is scheduled to start on April 18, 21:00 UTC-4. Those veterans who were here back when Topcoder started running Marathons know that MM1 was not actually the first of the chain. There were Beta Marathon and Intel Multi-threading series starting from 2005. As Topcoder problem statements tend to count everything starting from zero, let us call this Beta match zero-th.
Do not miss the round-numbered match!