Data Management Algorithm
When mergers and acquisitions expand healthcare systems, identifying duplicate and erroneous information from combined sources can be difficult.
A healthcare company wanted to tackle this problem by developing a smarter, more automated way to recognize duplicate data for doctors, practices, and clinics in order to establish a single source of truth. This process, often referred to as Master Data Management (MDM), can be extremely manual and error-prone.
The healthcare company used an Analytics Starter Pack from Topcoder to bring true MDM to their business. After analyzing the problem, we created and launched two consecutive crowdsourcing challenges to arrive at a winning analytics solution.
Members of Topcoder's global network of talent were first tasked with creating an algorithm to identify duplicate and erroneous information. This challenge allowed members to compete without library or software constraints. We then followed up the first challenge with a data science ideation challenge that tested the top five algorithms from the previous contest against a variety of error types to determine their relative strengths. The winning solution worked in two stages. It first reduced the dimension of the problem, using a clever hashing technique to narrow the data to the subset of records most likely to contain duplicates. It then built a predictive model on text field data to identify duplicate records for human review.
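The two-stage idea behind the winning solution (hash records into small groups, then score pairs within each group) can be illustrated with a minimal sketch. The details of the actual hashing technique and predictive model are not given here, so everything in the example is an assumption made for illustration: the record fields, the blocking signature (first four characters of the name plus ZIP code), the function names, and the 0.8 threshold are hypothetical, and a plain string-similarity ratio stands in for the predictive model.

```python
import hashlib
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; the real data schema is not described in this article.
RECORDS = [
    {"id": 1, "name": "Northside Family Clinic", "address": "12 Main St", "zip": "60601"},
    {"id": 2, "name": "Northside Family Clinc", "address": "12 Main Street", "zip": "60601"},
    {"id": 3, "name": "Lakeview Dental", "address": "400 Lake Ave", "zip": "60601"},
    {"id": 4, "name": "Northside Family Clinic", "address": "98 Oak Rd", "zip": "73301"},
]


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def blocking_key(record):
    """Hash a coarse signature so that likely duplicates share a bucket.

    Assumed signature: first four characters of the normalized name plus ZIP code.
    """
    signature = normalize(record["name"])[:4] + "|" + record["zip"]
    return hashlib.sha1(signature.encode("utf-8")).hexdigest()


def similarity(a, b):
    """Text similarity in [0, 1]; a simple stand-in for a trained predictive model."""
    text_a = normalize(a["name"] + " " + a["address"])
    text_b = normalize(b["name"] + " " + b["address"])
    return SequenceMatcher(None, text_a, text_b).ratio()


def find_candidate_duplicates(records, threshold=0.8):
    """Return (id_a, id_b, score) for pairs in the same bucket scoring at or above threshold."""
    buckets = defaultdict(list)
    for record in records:
        buckets[blocking_key(record)].append(record)

    candidates = []
    for bucket in buckets.values():
        # Only pairs inside a bucket are compared, which avoids scoring every pair.
        for a, b in combinations(bucket, 2):
            score = similarity(a, b)
            if score >= threshold:
                candidates.append((a["id"], b["id"], score))
    return candidates


if __name__ == "__main__":
    for id_a, id_b, score in find_candidate_duplicates(RECORDS):
        print(f"records {id_a} and {id_b} look like duplicates (score {score:.2f})")
```

In this sketch, only records 1 and 2 land in the same bucket and are compared, so the misspelled clinic name is flagged for review while the other records are never scored against each other. A production approach would likely need a more forgiving blocking signature and a model trained on labeled duplicate pairs.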
The completed data management solution eliminates the need for human intervention in 96% of records, making mergers and acquisitions easier for the healthcare company and making its vision for MDM a reality.