Single Round Match 801 Editorials
HouseNumbering
The constraints are small, so we can afford to generate all the house numbers, take each of them apart into individual digits, and keep track of the counts. In order to extract digits from a number, we can repeatedly look at its remainder modulo 10 (which is the last digit) and then divide it by 10 (which removes the last digit).
public int[] prepareDigits(int firstHouse, int numberOfHouses) {
    int[] answer = new int[10];
    for (int i=0; i<numberOfHouses; ++i) {
        int currentHouse = firstHouse + 2*i;
        while (currentHouse != 0) {
            int lastDigit = currentHouse % 10;
            ++answer[lastDigit];
            currentHouse /= 10;
        }
    }
    return answer;
}
If you found this problem easy, try solving it for much bigger constraints. There are efficient solutions even for firstHouse, numberOfHouses <= 10^15. (If you don’t know how to deal with this harder task, start with an easier case where firstHouse = 1 and observe patterns.)
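If you want to experiment with the harder version, the classic building block is counting how many times a digit appears among 1..N in O(log N) time by examining one decimal position at a time. Below is a sketch of that counter (the class and method names are mine). Restricting the count to the house numbers themselves -- an arithmetic progression with step 2 -- additionally requires a digit DP, which I leave as the exercise above suggests.

```java
public class DigitCount {
    // Counts occurrences of digit d in the decimal representations of 1..n.
    static long countDigit(long n, int d) {
        long count = 0;
        for (long p = 1; p <= n; p *= 10) {
            long high = n / (p * 10);   // digits above position p
            long cur = (n / p) % 10;    // the digit at position p
            long low = n % p;           // digits below position p
            if (d == 0) {
                if (high == 0) continue;    // a 0 here would be a leading zero
                count += (high - 1) * p;
                count += (cur > 0) ? p : low + 1;
            } else {
                count += high * p;
                if (cur > d) count += p;
                else if (cur == d) count += low + 1;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // digit 1 appears in 1, 10, 11 (twice), 12, 13
        System.out.println(countDigit(13, 1)); // prints 6
    }
}
```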
CreateMixture
The problem gives us a lot of freedom; for example, it allows us to use only three inputs to the mixing machine. This is almost always a bad idea. E.g., if we mix one input with chemical X and two with water, we will get a mixture with 333.3333(3) promille of chemical X, and that probably won’t help us produce the desired concentrations that have an integer number of promille. As a fraction, the concentration we just produced is 1/3, while the ones we want have 10, 100 or 1000 in the denominator.
The output concentration is always (sum of inputs) divided by (number of inputs). If we want to have 1000 in the final denominator, always using all ten inputs seems like a good idea. Let’s see whether we can solve the problem this way. Below, all concentrations are given in promille (and all of them are integers).
Suppose we use all ten inputs. What mixtures can we get in the first use of the machine? If we set n inputs to chemical X and 10-n to water, we will get the concentration (n*1000 + (10-n)*0) / 10 = n*100. That is, we can produce the new concentrations 100, 200, …, 900.
For example, if we mix the inputs 1000, 1000, 1000, 0, 0, 0, 0, 0, 0, 0, the sum of all inputs is 3000 and thus their average is 300.
What happens if we now use one of these new mixtures along with water as the inputs? For example, if we use 100, 100, 100 and seven times 0, we will get the concentration 300 / 10 = 30. Just like we could produce 100, 200, …, 900 from 1000 and 0, we can now produce 10, 20, …, 90 from 100 and 0.
How can we produce something with concentration 470? Remember that we want to use all ten inputs. If we want the average to be 470, we need to have the sum of all concentrations = 4700. Can we do that already? Sure we can: four inputs will be pure chemical X with concentration 1000, one input will be 700 and the other five will be water. That is 4000 + 700 + 0 = 4700.
The above observation tells us how to make any concentration of the form ab0 (i.e., 10, 20, …, 100, 110, …, 990): in the first step produce b00 by using b*(concentration 1000) + water, in the second step produce ab0 by using a*(concentration 1000) + 1*(concentration b00) + water.
And now we only need to do the same one more time: if we have a general concentration abc, we can first produce concentration c00, then use that to produce bc0, and then use that to produce abc.
For example, here is a complete recipe that produces the concentration 947. (Every other concentration can be produced in exactly the same way.)
Make concentration 700 by combining inputs with concentrations 1000, 1000, 1000, 1000, 1000, 1000, 1000, 0, 0, 0.
Make concentration 470 by combining inputs with concentrations 1000, 1000, 1000, 1000, 700, 0, 0, 0, 0, 0.
Make concentration 947 by combining inputs with concentrations 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 470.
Thus, there is always a solution that produces the desired mixture in (at most) three steps.
public int[] mix(int concentration) {
    if (concentration == 0 || concentration == 1000) return new int[0];
    int d3 = concentration % 10; concentration /= 10;
    int d2 = concentration % 10; concentration /= 10;
    int d1 = concentration % 10; concentration /= 10;
    // Inputs 0-9 describe step 1, inputs 10-19 step 2, inputs 20-29 step 3.
    // 0 = water, 1 = chemical X, 2 = output of step 1, 3 = output of step 2.
    int[] answer = new int[30]; // filled with 0 (water) by default
    for (int i=0; i<d3; ++i) answer[i] = 1;      // step 1 produces d3*100
    for (int i=0; i<d2; ++i) answer[i+10] = 1;
    for (int i=0; i<d1; ++i) answer[i+20] = 1;
    answer[19] = 2;  // step 2 produces d2*100 + d3*10
    answer[29] = 3;  // step 3 produces d1*100 + d2*10 + d3
    return answer;
}
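As a sanity check, we can replay the recipes this solution produces. The simulation below (class and method names are mine; it just re-implements the mixing rule "output = average of the ten inputs") verifies that mix() yields the requested concentration for every value from 1 to 999.

```java
public class MixCheck {
    // The solution above, copied here so the check is self-contained.
    static int[] mix(int concentration) {
        if (concentration == 0 || concentration == 1000) return new int[0];
        int d3 = concentration % 10; concentration /= 10;
        int d2 = concentration % 10; concentration /= 10;
        int d1 = concentration % 10; concentration /= 10;
        int[] answer = new int[30];
        for (int i = 0; i < d3; ++i) answer[i] = 1;
        for (int i = 0; i < d2; ++i) answer[i + 10] = 1;
        for (int i = 0; i < d1; ++i) answer[i + 20] = 1;
        answer[19] = 2;
        answer[29] = 3;
        return answer;
    }

    // Replays the recipe. Codes: 0 = water, 1 = chemical X,
    // 2 = output of step 1, 3 = output of step 2.
    static int simulate(int[] answer) {
        int[] value = new int[5]; // value[2..4] = outputs of steps 1..3
        value[1] = 1000;
        for (int step = 0; step < 3; ++step) {
            int sum = 0;
            for (int i = 0; i < 10; ++i) sum += value[answer[10 * step + i]];
            value[step + 2] = sum / 10; // average of the ten inputs
        }
        return value[4];
    }

    public static void main(String[] args) {
        for (int c = 1; c < 1000; ++c)
            if (simulate(mix(c)) != c) throw new AssertionError("failed for " + c);
        System.out.println("all concentrations 1..999 verified");
    }
}
```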
SecondDiameters
We will start by creating O(n^2) records of the form (points i and j have square distance d(i,j)).
We will put all of these records into buckets according to their square distance, and we will sort these buckets. Doing so takes O(n^2 log n) time.
For each point i, we can now find the square of its second diameter as follows:
For each bucket, in descending order of square distance: examine the records in the bucket until you either exhaust the bucket or find a record that does not involve point i.
Stop as soon as you have found the second bucket that contains a record not involving point i; its square distance is the answer for point i.
Clearly, for each point we will look at n+1 or fewer records: we see at most two records that don’t involve point i (we stop at the second one), and there are only n-1 records that do involve point i. Thus, in O(n) time we will determine the second diameter for S[i], and therefore in O(n^2) time we are done with the whole task.
(Lazier implementations work too, e.g., you could just sort all the records in a single array. I chose the above description and implementation because it’s easier to argue about its time complexity.)
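For completeness, here is a sketch of the bucket-based approach (identifiers are mine, and I use a TreeMap in descending key order instead of sorting an array; building it still takes O(n^2 log n)). Following the procedure above, it returns for each point i the square distance of the second bucket that contains a pair avoiding i; if no such bucket exists for some i, the sketch leaves a 0 there.

```java
import java.util.*;

public class SecondDiameters {
    static long[] secondDiameters(long[] xs, long[] ys) {
        int n = xs.length;
        // Bucket all pairs by their square distance, largest distance first.
        TreeMap<Long, List<int[]>> buckets = new TreeMap<>(Collections.reverseOrder());
        for (int i = 0; i < n; ++i)
            for (int j = i + 1; j < n; ++j) {
                long dx = xs[i] - xs[j], dy = ys[i] - ys[j];
                buckets.computeIfAbsent(dx * dx + dy * dy, k -> new ArrayList<>())
                       .add(new int[]{i, j});
            }
        long[] answer = new long[n];
        for (int i = 0; i < n; ++i) {
            int found = 0;
            outer:
            for (Map.Entry<Long, List<int[]>> e : buckets.entrySet())
                for (int[] rec : e.getValue())
                    if (rec[0] != i && rec[1] != i) {
                        if (++found == 2) { answer[i] = e.getKey(); break outer; }
                        break; // one hit per bucket is enough, go to the next bucket
                    }
        }
        return answer;
    }

    public static void main(String[] args) {
        long[] xs = {0, 1, 3, 6}, ys = {0, 0, 0, 0};
        System.out.println(Arrays.toString(secondDiameters(xs, ys))); // prints [9, 9, 25, 4]
    }
}
```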
EquivalentStrings
Given a string, consider a directed graph where the individual characters are nodes (labeled with their digits), and each pair of characters that must keep their relative order is connected by an edge from the earlier one to the later one. Valid ways to reorder the string correspond to valid topological orders of the graph. (*)
The lexicographically smallest topological order is always unique. (When choosing the next node, we can never have a tie because nodes with the same digit cannot swap places.) Thus, we can find a canonical representation for each equivalence class of strings by constructing the lexicographically smallest topological order in the above graph. Two strings are equivalent iff their canonical representations are the same.
When constructing the topological order of a string of length n, we don’t need all Theta(n^2) edges. All we need are the shortest ones: for each node with digit d, it is enough to have incoming edges from at most three other nodes: the last occurrence of d-1, the last occurrence of d, and the last occurrence of d+1 to the left of this node. This way we can compute the canonical representation of a string in linear time.
(*) It should be obvious that valid ways to reorder the string are a subset of valid topological orders. To have a formally correct proof of this claim, we also need to show that any two valid topological orders can be transformed into each other by a series of allowed swaps -- i.e., that there cannot be multiple groups of topological orders that are mutually unreachable. To show this, take the lexicographically smallest topological order and any other valid order. Let x be the first node from the lex min order that is out of place in the other order. All prerequisites of x are already placed in the common prefix of the two orders, so we can freely shift x to the left using a series of allowed swaps until it reaches its proper place. Repeating this argument gives us the proof we needed.
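The whole canonicalization can be sketched as follows (identifiers are mine). For readability I use a priority queue, which gives O(n log n); replacing it with ten per-digit buckets yields the linear time mentioned above.

```java
import java.util.*;

public class Canonical {
    // Lexicographically smallest topological order of the graph described above.
    static String canonical(String s) {
        int n = s.length();
        List<List<Integer>> after = new ArrayList<>(); // edges u -> nodes after u
        for (int i = 0; i < n; ++i) after.add(new ArrayList<>());
        int[] indeg = new int[n];
        int[] last = new int[10];
        Arrays.fill(last, -1);
        for (int i = 0; i < n; ++i) {
            int d = s.charAt(i) - '0';
            // edges from the last occurrences of d-1, d, d+1 to the left of i
            for (int e = Math.max(0, d - 1); e <= Math.min(9, d + 1); ++e)
                if (last[e] >= 0) { after.get(last[e]).add(i); ++indeg[i]; }
            last[d] = i;
        }
        PriorityQueue<Integer> available =
            new PriorityQueue<>(Comparator.comparingInt(s::charAt));
        for (int i = 0; i < n; ++i) if (indeg[i] == 0) available.add(i);
        StringBuilder out = new StringBuilder();
        while (!available.isEmpty()) {
            int u = available.poll(); // ties between equal digits cannot occur
            out.append(s.charAt(u));
            for (int v : after.get(u)) if (--indeg[v] == 0) available.add(v);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 1 and 0 may slide left past 3, but 0 may not pass 1
        System.out.println(canonical("310")); // prints 103
    }
}
```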
PolylineCover
We will always use a rectangle with dimensions 200000 x 100000. This rectangle has exactly the maximum allowed area. Below we will show constructively that it can cover any valid polyline.
Suppose the rectangle is placed with vertices at (0,0), (100000,0), (100000,200000) and (0,200000). We’ll call the line x=50000 the major axis of our rectangle.
Let’s mark two points X, Y on our polyline. If its length L is at most 100,000, we’ll mark its endpoints. Otherwise, we will mark the two points that are (L-100000)/2 from each end. E.g., for the maximum length polyline with L=200,000 we will mark the points that are at distance 50,000 (measured along the polyline) from each end.
We will place the rectangle so that the segment XY lies on its major axis and the centers of the rectangle and of segment XY coincide. (Note that the Euclidean length of XY is at most 100,000, as it cannot exceed the distance from X to Y measured along the polyline.) We claim that the entire polyline is now covered. This is because (when measured along the polyline) each point on the polyline has distance at most 50,000 from X or from Y, and thus its actual Euclidean distance from X or Y is also at most 50,000. The two disks of radius 50,000 centered at X and Y lie entirely inside the rectangle, so all such points are covered.
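The only computational step is locating the marked points X and Y by walking along the polyline. A sketch (identifiers are mine):

```java
public class MarkPoints {
    // Returns the point at arc length t along the polyline (px[i], py[i]).
    static double[] pointAt(double[] px, double[] py, double t) {
        for (int i = 0; i + 1 < px.length; ++i) {
            double dx = px[i + 1] - px[i], dy = py[i + 1] - py[i];
            double len = Math.hypot(dx, dy);
            if (t <= len) {
                double f = (len == 0) ? 0 : t / len;
                return new double[]{px[i] + f * dx, py[i] + f * dy};
            }
            t -= len; // skip this segment entirely
        }
        return new double[]{px[px.length - 1], py[py.length - 1]};
    }

    // The two marked points X and Y from the argument above:
    // the endpoints if L <= 100000, otherwise (L-100000)/2 from each end.
    static double[][] markPoints(double[] px, double[] py) {
        double total = 0;
        for (int i = 0; i + 1 < px.length; ++i)
            total += Math.hypot(px[i + 1] - px[i], py[i + 1] - py[i]);
        double offset = Math.max(0, (total - 100000) / 2);
        return new double[][]{pointAt(px, py, offset), pointAt(px, py, total - offset)};
    }

    public static void main(String[] args) {
        // A straight polyline of length 150000: X and Y are 25000 from each end.
        double[][] xy = markPoints(new double[]{0, 150000}, new double[]{0, 0});
        System.out.println(xy[0][0] + " " + xy[1][0]); // prints 25000.0 125000.0
    }
}
```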