# January 5, 2023 Single Round Match 843 Editorials

## AveragesOfThree

The problem becomes simpler if instead of the given array B we look at an array C in which each C[x] is 3*B[x].

When the values in B are the averages of three consecutive elements, the values in C are their sums.

Then:

- C[0] is the sum of 0 + 0 + A[2], hence A[2] = C[0].
- C[1] is the sum of 0 + A[2] + A[3], hence A[3] = C[1] – A[2].
- C[2] is the sum of A[2] + A[3] + A[4], hence A[4] = C[2] – A[3] – A[2].
- … and so on.

Each of the following values in C is the sum of two values we already know and one value we don’t, and so we can easily determine the unknown one.

```
public int[] restore(int[] B) {
int N = B.length;
int[] A = new int[N+2];
A[0] = 0;
A[1] = 0;
for (int n=0; n<N; ++n) A[n+2] = 3*B[n] - A[n+1] - A[n];
return A;
}
```

## DateCorrector2023

This problem only requires careful implementation of the given rules. For each format we need to do the following:

- check whether the input has this format
- parse the date
- check whether it is a valid date in 2022
- if yes, format the new date

Regular expressions are a great way for checks of the type we need here. Using one is much more convenient than doing the check manually.

An easy way to parse the date is to split the given string using the correct separator, and then to convert each of the pieces into an integer.

When checking the validity of a date, don’t forget that neither the day nor the month can be zero. (Years 2022 and 2023 are not leap years, so there is no special case related to February 29th in this problem.)

Fixing the date can simply be done by using a string search and replace command in your language to replace the substring “2022” with “2023”. An alternate good way to do this step is to use your language’s string printing/formatting routines – e.g., each language has a way to format a number to a given length, adding leading zeros as needed.

Below is an example implementation that tries to avoid copy and paste. For each format we store the separator used and the order in which day, month, year appear. Then we use those to construct the appropriate regular expression that matches the format, and once the token matches it, we can extract the date, check it, and if it’s good, we use string.replace() to fix the year.

```
public String fix(String token) {
int[] daysOfMonth = { 0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
String[] separators = {"-", "/", "[.]"};
int[][] orders = { {2,1,0}, {1,0,2}, {0,1,2} };
String digit = "[0-9]";
String[] digitPatterns = { digit+"{2}", digit+"{2}", digit+"{4}" };
String[] datePatterns = new String[3];
for (int t=0; t<3; ++t) {
datePatterns[t] = digitPatterns[ orders[t][0] ];
datePatterns[t] += separators[t];
datePatterns[t] += digitPatterns[ orders[t][1] ];
datePatterns[t] += separators[t];
datePatterns[t] += digitPatterns[ orders[t][2] ];
}
for (int t=0; t<3; ++t) {
if (token.matches( datePatterns[t] )) {
String[] pieces = token.split( separators[t] );
int[] values = new int[3];
for (int i=0; i<3; ++i) {
values[ orders[t][i] ] = Integer.parseInt( pieces[i] );
}
int dd=values[0], mm=values[1], yyyy=values[2];
if (yyyy != 2022) continue;
if (mm < 1 || mm > 12) continue;
if (dd < 1 || dd > daysOfMonth[mm]) continue;
return token.replace("2022", "2023");
}
}
return token;
}
```

## ChristmasSongTrauma

Let T be the total length of the tape and S the length of the shortest song on the tape. As soon as visitTime > T-S, Tommy is guaranteed to hear all songs and thus all entry times are equally bad. We will handle this case and below we will assume that visitTime <= T-S and thus during the optimal visit Tommy hears fewer than N songs.

We can now proceed as follows: Imagine any possible visit for Tommy. Suppose a song is playing when he enters. What would change if he instead entered exactly when that song started playing?

Clearly, in the new visit he would hear at most as many songs as in the original one: he would hear more of the first song (which does not add any new heard songs) and correspondingly less at the end (which may cause him to hear fewer songs, but clearly cannot make him hear more).

Hence, an optimal solution must be among the N possible visits that start exactly as one of the songs begins playing. We can check all these by brute force.

Now we know the optimal number of songs Tommy can hear. And not just that. If we extend the above thought process, we can easily find all optimal entrance times: for each song s1 such that it is optimal to enter as song s1 starts, the optimal entrance times during s1 form a prefix of s1. Why and which one? Let s2 be the last song Tommy will (partially) hear if he enters as song s1 starts. Then the latest time when he still can enter optimally during s1 is the entrance time for which song s2 will end exactly when he leaves. This entrance time still has the same number of heard songs as the original one, and any later entrance time during s1 will have at least one more song and thus it will no longer be optimal.

This gives us an O(N^2) solution. There are solutions with a better time complexity (think “sliding window”) but this was not necessary.

```
public double fewest(int[] playTime, int visitTime) {
int N = playTime.length;
int tapeLength = 0;
for (int t : playTime) tapeLength += t;
int shortest = playTime[0];
for (int t : playTime) shortest = Math.min( shortest, t );
if (visitTime > tapeLength - shortest) {
return 1.; // guaranteed to hear all songs
}
int bestCount = N+1, bestWays = 0;
for (int n=0; n<N; ++n) {
// evaluate and count the best solutions if he enters during song n
int count = 0, length = 0, where = n;
while (length < visitTime) {
count += 1;
length += playTime[where];
where = (where + 1) % N;
}
if (count > bestCount) continue;
int ways = Math.min( playTime[n], length-visitTime+1 );
if (count == bestCount) { bestWays += ways; continue; }
bestCount = count; bestWays = ways;
}
System.out.println(bestCount);
System.out.println(bestWays);
return 1. * bestWays / tapeLength;
}
```

A note on incorrect implementations:

Some solutions may have included a weaker version of the check we made at the start of this writeup: they answer 1 if visitTime >= T and do the rest of the solution if visitTime is smaller. This isn’t a problem on its own, the conclusion is still correct, but depending on the rest of your implementation the weaker check may leave you with an extra tricky case you’ll now need to handle later: with T-S < visitTime < T it is possible to hear N+1 song segments, but those entry times are still optimal as the first and last song segment belong to the same song.

For example, consider the following input: songs are {50, 100, 100}, so T = 250, and visitTime = 201. From the reasoning we made above we already know that Tommy is doomed to hear all three songs. However, if we just blindly implement the above solution, on this test case it will fail. Why? Consider what happens if Tommy enters as song #1 starts. He will hear 100 seconds of that song, then 100 seconds of song #2, and then one second of song #0. We can now get the same outcome if he enters up to 49 seconds later: he will hear a shorter part of song #1 at the beginning and then more of song #0 at the end. However, what happens if Tommy enters during the second half of song #1? He will hear the end of song #1, then the full songs #2 and #0, and then the beginning of another copy of song #1. Here it is easy to make the mistake to interpret this situation as him hearing four distinct songs (or to get an invalid memory access).

One possible easy way to implement a solution that doesn’t have the above issue is to use brute force for not just N but up to 2N starting times: not just the times when a song begins but also the entrance times that are visitTime-1 seconds before a song ends. These starting times divide the tape loop into blocks, and within each block all entry times have the same outcome as entry when the block begins.

## TallyMarksSystem

As we are allowed the operation plus, we never need to consider actual numbers written using tally marks. E.g., instead of “55511” we can always do “(5+5+5+1+1)”. This simplifies the implementation.

Going backwards from the N we are supposed to construct seems too slow. For instance, there are O(N) ways to write it as x+y, and iterating over all those and recursively finding best representations for x and y would give us an O(N^2) solution.

Still, from this idea we can keep the observation that an optimal solution for any N (other than the base cases 1 and 5) is either the sum or the product of two optimal solutions, both for values strictly smaller than N.

Now, a better approach is to realize that the optimal representation of N will be roughly logarithmic in N (e.g., as an upper bound we can write in base 2 or base 5), so the answers for all possible N will be small. As we construct all possible short expressions, they will frequently have different values. Our solution will therefore find the optimal expressions for all numbers between 1 and N, inclusive, essentially by induction on the number of symbols needed.

To start, we know that only two numbers have a 1-symbol representation: 1 and 5.

Suppose that for each x=1, 2, …, k we already know all numbers for which the optimal representation has x symbols. How can we now find the numbers for which the optimal representation has k+1 symbols?

Each such number must be either the sum or the product of two numbers p, q such that the optimal representations of p and q have a total of k+1 symbols. Using the already computed data we can easily iterate over all such pairs (p, q). For each of them we examine the sum p+q and the product pq, and if they are <=N and previously unknown, we remember the optimal solution for them and store them in the list for k+1 symbols.

```
public int construct(int N) {
int MAXN = 2_000_000;
int[] best = new int[MAXN+1];
for (int n=0; n<=MAXN; ++n) best[n] = 999;
best[0] = 0;
best[1] = 1;
best[5] = 1;
// Both dimensions of the array below are determined by running the code
// once and observing when every number is constructed.
// It’s possible to implement it more nicely but I didn’t want to deal
// with arraylists of arraylists, sue me :)
int[][] constructed = new int[21][1000000];
int[] cc = new int[21];
cc[0] = 0;
cc[1] = 2;
constructed[1][0] = 1;
constructed[1][1] = 5;
for (int cnt=2; cnt<=20; ++cnt) {
cc[cnt] = 0;
for (int a=1; a<cnt; ++a) {
int b = cnt-a;
if (b < a) continue;
for (int ii=0; ii<cc[a]; ++ii) for (int jj=0; jj<cc[b]; ++jj) {
int x = constructed[a][ii], y = constructed[b][jj];
if (x+y <= MAXN) {
int z = x+y;
if (best[z] == 999) {
best[z] = cnt;
constructed[cnt][cc[cnt]++] = z;
}
}
if (1L*x*y <= MAXN) {
int z = x*y;
if (best[z] == 999) {
best[z] = cnt;
constructed[cnt][cc[cnt]++] = z;
}
}
}
}
}
return best[N];
}
```

A note on incorrect solutions:

In order to speed up their solutions some people made the incorrect assumption that addition is useless beyond adding 1 or 5 to the current number. This is false. The smallest counterexample to the solution that only tries to create a new value y via multiplication and then tries y=x+1 and y=x+5 is the prime number y = 139. As it’s prime, there’s no good way to make it as a product of two smaller numbers. The numbers 138 and 134 each require 8 symbols, so the best we can get that way is an expression with 9 symbols. However, we can write 139 as 14 + 125, and we can make 14 using five symbols (“11 * 511” = two times seven) and 125 using three (“5 * 5 * 5”).

## Component

We start by handling the corner cases: for S = 1, 2, or N the answer is 1 as that component size cannot be avoided.

A key observation is that at any moment during the construction of the graph, the current state can be represented as a multiset of current sizes of components that are strictly smaller than S. We will build the graph incrementally and keep track of the state until we either create a component of size exactly S or reach a state in which each component is bigger.

In particular, observe that we do not need to keep track of bigger components because they do not matter – the vertices in them can no longer be a part of the component we seek. In other words, we can always treat the bigger components as a single component containing of all their vertices – doing so clearly doesn’t change the answer, and it reduces the number of states considerably.

Once we get here, we can estimate that the number of states of the above type cannot be that large (the number of all integer partitions grows exponentially but very slowly) and we can do a quick DP to verify this: the count of states for S=1 is 1, and the count for any other pair (N, S) can be determined by iterating over the number x of components of size S-1, and for each such x counting the states for N-x(S-1) vertices and S-1 as the upper bound on component size. If we do this, we’ll learn that for the given constraints there are never more than 15,000 states.

Now, we can order these states by the number of components they contain, and in this order (smallest to highest) we can do dynamic programming on them. For each state we are going to compute the probability of hitting the desired component size if we start in that particular state.

In order to process a state, consider all edges that currently lead between two distinct components. (Remember that all already-bad vertices are treated as a single too-large component for this purpose.) One of these edges will eventually be added as the next non-trivial update, and each of them is equally likely. Thus, we can calculate the probability for the current state simply by averaging the probability of “winning” after adding each of the available edges.

Hence, for each of the available edges we consider what happens. If adding it produces a component of size S, we immediately “win”. In all other cases we get a new state, and as it has fewer components, we already know the probability of “winning” from this state.

Of course, we can easily implement the above in a faster way: instead of treating each edge separately, we can treat each pair of components separately. E.g., if we have components of sizes P and Q, there are P*Q edges between them and adding any one of those will have the same effect: we get a new component of size P+Q instead.

Using the above approach, we can process a state in time that is roughly cubic in the number of components it contains: for each pair of components we try adding an edge between them and while doing so, we spend linear time to create and process the new state.

**misof**