2006 TopCoder Open - Computer Programming Tournament

Secondhand Shopping: Efficient Component Reuse
05.04.06 3:20 PM

By dplass
TopCoder Member

Re-using components that are already in the catalog is part of the TopCoder Software development process and, indeed, integral to its business model. This lowers costs for both obvious and non-obvious reasons. The cost of a re-used component includes several variables: the actual cost of building the component, the time it takes to find the component so that it can be re-used, the time it takes to learn how to use and integrate it, and the maintenance time. A component that is not re-used incurs only the actual development and maintenance time.

Obviously, you lower costs by amortizing the design and development expenses (which include time, money and resources) over many projects. A less obvious way of reducing costs is that by using well-defined components, you establish a component "vocabulary" that TopCoder architects, members and even clients themselves can use. This vocabulary then promotes a higher efficiency when describing new projects using these components. This is akin to the ubiquitous use of Design Patterns in software design.
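To make the amortization argument concrete, here is a minimal sketch of the cost comparison. The class, the function names and all of the figures are hypothetical and chosen only for illustration; the sketch also assumes that a shared component is built and maintained once, while each project pays its own cost to find and learn it.

```python
from dataclasses import dataclass


@dataclass
class ComponentCosts:
    """Hypothetical cost figures, all in the same unit (say, person-hours)."""
    build: float     # designing and developing the component
    maintain: float  # maintaining one copy of the component
    find: float      # locating the component in the catalog
    learn: float     # learning how to use and integrate it


def cost_without_reuse(c: ComponentCosts, projects: int) -> float:
    # Every project builds and maintains its own copy.
    return projects * (c.build + c.maintain)


def cost_with_reuse(c: ComponentCosts, projects: int) -> float:
    # Assumption: the component is built and maintained once;
    # each project still pays to find it and to learn it.
    return c.build + c.maintain + projects * (c.find + c.learn)


# Illustrative numbers only, not taken from the talk.
costs = ComponentCosts(build=100, maintain=20, find=5, learn=15)
for n in (1, 2, 4):
    print(n, cost_without_reuse(costs, n), cost_with_reuse(costs, n))
```

With these made-up numbers, re-use is slightly more expensive for a single project, because of the time spent finding and learning the component, and pulls ahead from the second project onward.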

The challenge, however, is for the architect or designer to be able to identify, classify, and therefore find such common components, especially when he or she is not familiar with the entire catalog. The TopCoder catalog contains hundreds of useful components, ranging from front-end components (such as a calendar JSP tag) to database access mechanisms, as well as some domain-specific projects (such as a currency manipulation library). No one person could reasonably be expected to be intimately familiar with every component in the library.

In his presentation, TopCoder project manager scamp described a Component Based Software Engineering (CBSE) methodology and how a high "Reuse Maturity Level" can reduce costs. A component library needs to provide three services:

Publish the components - this also includes the specifications and code itself, and notifying users about the new or updated component.
Manage the components - which includes organizing them, publishing results of QA tests, and maintaining the versions of components.
Allow users to consume the components - this includes searching for (and finding!) components, and also the component specifications. This is significant because there may be a need to redevelop a component for a different operating system or implementation type. Users need to be able to request a new component if needed as well.
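The three services above could be sketched as a small in-memory library. This is only an illustration under my own assumptions: the class, method and field names are hypothetical and do not describe the TopCoder catalog or anything shown in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    version: str
    spec: str               # the component specification, as plain text
    qa_passed: bool = False


@dataclass
class ComponentLibrary:
    components: dict = field(default_factory=dict)    # name -> list of versions
    notifications: list = field(default_factory=list)
    requests: list = field(default_factory=list)

    # Publish: store the component and notify users about it.
    def publish(self, component: Component) -> None:
        self.components.setdefault(component.name, []).append(component)
        self.notifications.append(
            f"{component.name} {component.version} published")

    # Manage: record the QA result for one version of a component.
    def record_qa(self, name: str, version: str, passed: bool) -> None:
        for c in self.components.get(name, []):
            if c.version == version:
                c.qa_passed = passed

    # Consume: search names and specifications, returning the latest version.
    def search(self, term: str) -> list:
        term = term.lower()
        return [versions[-1] for name, versions in self.components.items()
                if term in name.lower() or term in versions[-1].spec.lower()]

    # Consume: let a user ask for a component that does not exist yet.
    def request(self, description: str) -> None:
        self.requests.append(description)


library = ComponentLibrary()
library.publish(Component("Calendar Tag", "1.0", "JSP tag that renders a calendar"))
library.publish(Component("Calendar Tag", "1.1", "JSP tag that renders a calendar"))
library.record_qa("Calendar Tag", "1.1", passed=True)
library.request("PDF export component")
print([(c.name, c.version) for c in library.search("calendar")])
```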

As I mentioned before, being able to quickly find the component you're looking for in a library is essential if you want to take advantage of the components that are in there. If you can reduce this time, and reduce the perceived effort of re-using an existing component compared to the perceived cost of building it from scratch, you can increase the re-use of previously built, tested and documented components. A point that I made during the talk was that this perceived effort may actually depend on ancillary documentation, such as tutorials and "recipe books," that users can rely on to decide whether or not to use the component and, more importantly, how to use it. If users can't figure out how to use a component, they won't use it, and they will waste time re-implementing something that's already available.

scamp showed a slide with a very interesting graphic depicting four regions of a user's knowledge of a component library:

1. the set of components in the library that the user knows
2. the set of components from the library that the user is familiar with
3. the set of components that the user believes are in the library, but that might not actually be there
4. the set of components actually in the library that the user has no knowledge of

The results that a search engine returns will include areas from each of the above regions. As a user becomes more familiar with the library, region 4 will shrink and regions 1 and 2 will grow. Region 3 is an interesting section as well, because the components that are not actually in the library represent an opportunity to expand the set of components in the library.
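The regions can be expressed with ordinary set operations. The following is a rough sketch under my own reading of the slide: the component names are invented, and I treat the first region as components the user knows well and the second as components the user is only familiar with.

```python
def unknown_components(library, known, familiar, believed):
    """Fourth region: in the library, but the user has no idea it exists."""
    return library - known - familiar - believed


def expansion_candidates(library, believed):
    """The part of the third region that is not actually in the library."""
    return believed - library


# Hypothetical catalog and user knowledge, for illustration only.
library = {"calendar tag", "database access", "currency library",
           "logging", "configuration"}
known = {"calendar tag"}            # region 1
familiar = {"database access"}      # region 2
believed = {"logging", "PDF export"}  # region 3

print(sorted(unknown_components(library, known, familiar, believed)))
print(sorted(expansion_candidates(library, believed)))
```

In this example the user is unaware of the currency library and the configuration component, and the missing "PDF export" component is a candidate for a new addition to the library.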

One way to determine how well a search engine is doing is to measure its precision and its accuracy. You strive to maximize both of them, without sacrificing either. Users who can't find what they're looking for will give up. On the other hand, it is sometimes hard for users to specify exactly what they're looking for, so the search can use two approaches to find components in the library: (1) searching within the actual components, such as source code, keywords, etc., and (2) searching extra metadata added to each component. This metadata includes submitter-defined categories and user feedback, which improve the quality of the search results.
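As a rough illustration, the sketch below searches both the content and the metadata of a tiny, made-up catalog and then scores the result. It assumes that "accuracy" here corresponds to what information retrieval usually calls recall (the fraction of relevant components that the search actually returns); the catalog entries, field names and function names are all hypothetical.

```python
def search(catalog, term):
    """Match the term against component content and submitter categories."""
    term = term.lower()
    results = []
    for name, entry in catalog.items():
        in_content = term in entry["content"].lower()
        in_metadata = any(term in tag.lower() for tag in entry["categories"])
        if in_content or in_metadata:
            results.append(name)
    return results


def precision_and_recall(returned, relevant):
    """Precision: share of returned items that are relevant.
    Recall: share of relevant items that were returned."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical catalog, for illustration only.
catalog = {
    "Calendar Tag": {
        "content": "JSP tag that renders a calendar",
        "categories": ["front-end", "date"],
    },
    "Date Utilities": {
        "content": "parsing and formatting helpers",
        "categories": ["date"],
    },
    "Currency Library": {
        "content": "currency conversion and update of exchange rates",
        "categories": ["finance"],
    },
}

returned = search(catalog, "date")
relevant = {"Calendar Tag", "Date Utilities"}
print(returned)
print(precision_and_recall(returned, relevant))
```

Here the metadata finds both date-related components even though neither mentions "date" in its content, while the naive substring match on content also returns the currency library (because of the word "update"). Every relevant component is found, but one of the three results is noise, which is the kind of trade-off the two measures are meant to expose.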

I thought the talk was very interesting, but sadly there were only a handful of TopCoder members there, probably because of a conflict with the Algorithms Semi-finals room 2. mess and some other folks from TopCoder participated (with me!) in peppering scamp with questions, feedback and other talking points. I think that everyone involved with TopCoder Software component design and development competitions should really try to learn more about CBSE and what a component library search needs in order to facilitate re-use. The CBSE model has other implications as well -- the "next next thing" will likely be Service Oriented Architecture (SOA), which takes components to the next level. In an SOA, components are re-used not only at design time but also at run-time, by being deployed on a network where they are available to a multitude of "clients."

Tomorrow there is another Developer Forum on SOA right after CDDC 2 -- I hope to see more of you all there.

Enjoy!
--DP