Match Editorial

SRM 210
Wednesday, September 1, 2004

Match summary

The problems of this match contained a variety of traditional programming challenge topics, but they shared a common theme of relevance to "real world" concerns in software development. Besides enhancing the validity of TopCoder competitions as an indicator of on-the-job performance, real world problems also net the problem writer a handsome bonus payment. Unfortunately, these problems have a somewhat tarnished reputation since they frequently involve tedious simulations of complex but uninteresting systems. Therefore, my goal was to create a problem set that was both relevant in the real world and fun to solve.

In Division 1, snewman struck early and quickly, turning in the fastest times on both the easy and medium problems. However, venco pulled into first place at the end of the coding phase with a blistering time of just under fifteen minutes on the hard problem. The standings continued to reshuffle during a lively challenge phase, as snewman gained 100 points from successful challenges while venco lost 50 points from an unsuccessful attempt. Impressively, sjelkjd managed to take down four solutions to the medium problem for 200 points, the most of any coder in this challenge phase. These feats were insufficient to stave off defending champion tomek, who came from behind with 100 points from three successful and one failed challenge to win the match. This places him in an exact tie with SnapDragon for the highest rating and sets the stage for the 2004 TopCoder Open.

In Division 2, Cmonkey, newcomer Sumudu, and bugloaf turned in the top three scores with solid performances during the coding phase. In all, twenty-two coders advanced to Division 1.

The writer's solution to all six problems are posted in the practice rooms.

The Problems

TitleString

Used as: Division Two - Level One:

Value 250

Submission Rate 197 / 216 (91.20%)

Success Rate 177 / 197 (89.85%)

High Score eleusive for 248.65 points (2 mins 6 secs)

Average Score 214.67 (for 177 correct submissions)

This may have been the easiest problem, but consider this: every modern word processing application has a feature that does the same thing, and chances are that every desktop computer you see has some sort of word processor software installed. Therefore, if you can solve this problem, you too can write code that is used by millions of people on a daily basis! This is the sort of thinking that was responsible for the dot-com bubble.

In the context of this problem, if a character is a letter, and if it is positioned after a space character or at the beginning of the string, it is the first character of a word and should be capitalized. The previous sentence describes a correct solution in its entirety, and can quickly be translated into code.

DrivingDirections

Used as: Division Two - Level Two:

Value 500

Submission Rate 167 / 216 (77.31%)

Success Rate 113 / 167 (67.66%)

High Score therealmoose for 475.48 points (6 mins 31 secs)

Average Score 314.47 (for 113 correct submissions)

Anybody who has used driving directions to find a house party in a complicated neighborhood should be familiar with this task. The key is to realize that left turns become right turns and vice-versa when traversing the path in the opposite direction. Therefore, if there are n directions, element i of the result (assuming array indices are zero-based) is the street name in element n - 1 - i of the original directions concatenated onto the reversed turn instruction of element n - i, or "Start on " if i is 0. Of course, in the real world, the situation gets more complicated with one-way streets, roads that change names, and the difficulty of typing in computer programs while driving a car.

TopographicalImage

Used as: Division Two - Level Three:

Value 1000

Submission Rate 32 / 216 (14.81%)

Success Rate 28 / 32 (87.50%)

High Score Minilek for 676.02 points (22 mins 1 secs)

Average Score 514.34 (for 28 correct submissions)

This problem manifests itself in many image analysis applications. For example, the task of counting bacteria in a microscope image could be formulated similarly to this problem.

A solution can be implemented directly from the definitions given in the problem statement. Let visited be a boolean array that tracks whether point has been assigned to a peak. The highest unvisited point M_i can be found by looping over all the points in the landscape, disregarding the visited points and keeping track of the highest one. The peak P_i can then be found with a recursive function that floods outward from M_i, stopping when it encounters a visited point or an uphill path. The pseudocode of this function follows:

int countPeakPoints(Point startPoint)
   mark startPoint as visited
   area = 1
   for each neighbor n of startPoint
      if n has not been visited and the path from startPoint goes downhill
         area = area + countPeakPoints(n)
return area

In the parlance of graph theory, this is known as a depth first traversal. The steps of finding the highest point and calling countPeakPoints on it should be repeated until all points have been visited.

IPConverter

Used as: Division One - Level One:

Value 250

Submission Rate 216 / 235 (91.91%)

Success Rate 179 / 216 (82.87%)

High Score snewman for 241.62 points (5 mins 19 secs)

Average Score 190.51 (for 179 correct submissions)

Who hasn't needed to dial their IP address into a telephone at some point? I certainly haven't. However, with the gradual convergence of the telephone system with the internet, it is conceivable that you'll have to punch in your IP address for some reason, and you'll want some way to disambiguate it.

Most correct solutions of this problem fell into two similar categories. Some coders wrote three nested loops to try all the positions of the periods, performing the validity checks on the numbers within the innermost loop. Others wrote a function that checked the first one, two, and three digits, then recursed on the remaining digit string.

HierarchicalTree

Used as: Division One - Level Two:

Value 500

Submission Rate 163 / 235 (69.36%)

Success Rate 67 / 163 (41.10%)

High Score snewman for 409.76 points (13 mins 59 secs)

Average Score 268.91 (for 67 correct submissions)

This problem was inspired by some code I wrote for TopCoder's very own website a couple years ago. If you look at the component catalog at TopCoder Software, you'll see a list of components organized into categories and subcategories. A category might have any number of subcategories, but each subcategory has exactly one parent, so it is easier to store the catalog as a table of category-parent pairs in a database. The backend has to pull the pairs off a database, solve exactly this problem, and pass the category tree to the web server for you to peruse.

This problem was not conceptually difficult, but implementing it and covering all the corner cases required close attention. A solid command of your language's class libraries was very helpful. An effective approach is to process the node-parent pairs one by one, building up a tree. As each pair is processed, ensure that root is not specified as the child of another node, and that a node does not acquire two different parents. This catches cycles within the root component, but separate disconnected components may still exist after all the pairs are processed. After performing the recursive traversal starting at the root node to determine the descendant counts, compare the number of descendants of root with the total number of nodes seen. If there are more nodes than descendants of root, some nodes must be disconnected, and the tree is invalid. See the writer solution for an concise implementation in Java.

NegativePhotoresist

Used as: Division One - Level Three:

Value 1000

Submission Rate 37 / 235 (15.74%)

Success Rate 36 / 37 (97.30%)

High Score venco for 800.21 points (14 mins 59 secs)

Average Score 597.44 (for 36 correct submissions)

This probably wouldn't be a viable method of manufacturing a circuit board, but the ability to cope with last-minute design changes is an unfortunate necessity in software development. In any case, this problem was a good excuse to perform a binary search on Floyd-Warshall. Many experienced coders begin salivating as soon as they see the words "all pairs shortest path" in close proximity because the well-known Floyd-Warshall algorithm efficiently calculates the length of the shortest path between all pairs of vertices in a graph and is refreshingly easy to implement. See these lecture notes for an in-depth discussion of how Floyd-Warshall works.

Tilting the mask by an angle theta has the effect of multiplying the y-coordinate of each pinhole by cos(theta). By computing the adjacency matrix (being careful to avoid overflowing a 32-bit integer by squaring 100000) with these transformed y-coordinates and then running Floyd-Warshall, the total shortest path length between all pairs of pinholes as a function of theta can be calculated. As theta increases from 0 to Pi / 2, the length of a connection between two pinholes never increases. Therefore, except for the degenerate case where all the connections are parallel to the x-axis (which was implicitly forbidden by the problem constraints), the sum of the shortest path lengths is a monotonically decreasing function of theta. This immediately suggests the use of a binary search to find the minimum value of theta for which the sum of the shortest path lengths is less than the limit.

By the_one_smiley
TopCoder Member