Problem of the Week: On Graph Theory and Math

Recently the 2019 Humblefool Cup Preliminary Round was held and this week we’ll be taking a look at the Division 1, Level Two problem statement from the contest “DismantleTheTree”.

The summary of the problem statement is as follows –

Given a weighted tree, you can select any simple path provided all edges on the path have an equal weight W, and remove the edges of this path from the tree for a total cost of W.

Find the minimum cost to erase all edges of the tree.


Let’s approach this problem with a depth first search algorithm first, and then we’ll also check out a clever hashing algorithm that solves this problem too.

Our depth first search algorithm needs to account for the fact that each node can either be part of a simple path whose weight is some (W_i) or not. Let’s call a path with all its edges equal to some common W as a “good” path.

A node having multiple neighbours can potentially be a part of multiple good paths.

Our task is to remove the least number of good paths, and thus minimize the cost incurred by removing these good paths.

With this in mind, let’s construct our depth first search algorithm.

When we’re at a given node, let us recurse on the node’s children and obtain the answer for this subtree first, and then solve the problem with our current node in mind.

The idea behind solving the subtree of the child entirely is that we must only consider the cost to remove simple paths. If we solve a subtree, we may combine any simple path that begins at the root of this subtree to another simple path that ends at this subtree’s root’s parent (if their edges have the same weight) – forming another simple path and not recounting this. (This is a separate case that we’ll handle at the end of our DFS.) Similarly, if we cannot do this, then the two simple paths must be counted separately which is what our DFS would do anyway.

After solving all subtrees of a node’s children, we must look at all the edges of the current node, and this current node’s children themselves.

If our current node has X children, all with edge weights equal to Y, then the cost of removing these would ultimately equal ceil(X/2) * Y.

But for now, lets just simulate the pairing and handle any left-out edges later to see if they can be paired with the node’s parent.

If we simulate the pairing of equal weighted edges – that is, for an edge of weight X, we check if another edge of weight X is present and if so, increment our cost by a value of X and make sure to never use these two edges again – then we have calculated the cost of removing all good paths that pass through our current node.

But we do have a case left to be considered – edges of the current node that couldn’t be paired with any other edge of the current node.

Note that all the left-out edges will be unique, as if any two were equal our algorithm would have already paired them up by this time.

If we iterate over our left-out edges, we can compare them with the edge connecting the node with its parent, and if these two are equal then we can ignore the cost of this left-out edge as we would simply remove it along with its parent (thus reducing the cost); but otherwise, if the two edges are not equal in weight, then the left-out edge’s weight must be added to the cost.

Thus, calling this DFS routine on our tree’s root node will give us the cost to remove all longest good paths from our tree, minimizing our cost.

This idea was similar to that used by jeel_vaishnav whose submission is available over at – https://community.topcoder.com/stat?c=problem_solution&cr=40777828&rd=17426&pm=15336

But this wasn’t the fastest submission in the contest – that one was by Stonefeang whose algorithm was as follows –

  • Hash all edges by the pair (connected_node, edge_weight) as the key and the frequency as the value.
  • Define initially, the answer as the sum of the weights of all edges.
  • Iterate over the keys of the hashmap, and for each key subtract the product of (the edge weight and half the value of this key in the hashmap) from the answer.

To understand why this works, let’s take a look at a few simple examples and then you’ll see how this algorithm works even for larger and more complex trees.

Consider a simple chain of 4 nodes.

Node 1 is connected to node 2, node 2 is connected to node 3, and node 3 is connected to node 4.

Let all edges have weight W.

Thus we know the answer to remove all edges is also equal to W as we can simply select the diameter of this tree and remove it.

The hashmap stores the following: [(1, W):1, (2, W):2, (3, W):2, (4, W):1].

This is because there’s only one edge of weight W connected to node 1 and 4, but there are 2 edges of weight W connected to nodes 2 and 3.

Our algorithm will initialize ans = 3*W, and will now subtract 0, then W, then W, then 0 as it iterates over each element in the hashmap.

But there’s another way of interpreting this – consider the diagram below:

From left to right, lets label our nodes 1 to 4.

Note the red color extended to half the edges of node 2, and the green color extended to half the edges of node 3? Those denote HALF the edges of each node, which is what our answer is subtracting.

When we subtract them, we remove them from our answer – implying that we’re left with two half edges at the end, one connected to node 1, and the other connected to node 4, which add up to our answer (W/2 + W/2 = W)

With the above example, we know that by inserting another node at either end and connecting it with edges equal to W in the above tree, that our algorithm will still work in the same way.

So let’s try a different configuration.

Here too, note that node 3 has more neighbors, and yet the algorithm will subtract the yellow half-edges, and ultimately will only consist of all the black half-edges in the tree which sum up to W1 + W2.

Considering one last different configuration –

Note the new node at the right end of the tree connected with a weight W3 which appears nowhere else among its neighbors’ edges, the algorithm above does not subtract any half-edge belonging to this edge, and thus its cost is unchanged since it was initially added into the final answer.

Thus here too, the algorithm correctly describes the cost as W1+W2+W3.

Since we’ve shown that we can add nodes repeatedly to any other node without the algorithm causing any problems, we know that this algorithm will work for any tree constructed as each tree can be constructed in the manner described above.

Find the submission of Stonefeang over here at – https://community.topcoder.com/stat?c=problem_solution&cr=40066523&rd=17426&pm=15336