Living Progress - Data to Drops - Python Test Harness

Key Information

The challenge is finished.

Challenge Overview

Project Overview

Millions of people around the world rely on water points for their daily existence. Too often the water points fail, and communities are left without the water they desperately need. A lack of basic information on these failures has made keeping water flowing a major challenge for governments and aid agencies. Across the globe, many communities have come to rely on public water access points. People come to these water points and open the tap or work the handpump to fill their containers. As long as the water is flowing, they carry it home and use it for drinking, cooking, cleaning, bathing, and more. The water from these points is a critical foundation for success, health, and prosperity.

Unfortunately, these water points systematically fail due to technical breakdowns, water scarcity, vandalism, and misuse. When these water points fail, the very foundation for community wellbeing fails. People have to resort to distant water sources, dirty water, or exorbitant prices. A better understanding of the causes of failure will allow NGOs and governments to avoid these failures, ensuring that water services last over time.

The recently launched Water Point Data Exchange (WPDx) has made significant progress in analyzing these failures and establishing a path forward to lasting services. WPDx consists of a data exchange standard and a central repository of compliant data. The water point data is aggregated from governments, NGOs, academia, and other sources and then standardized for integration into the central repository. This unprecedented library of information is already providing a foundation for improved research and effective policies to help keep water flowing. The major limitation of WPDx is the presence of several open text fields among the standardized attributes. These fields (such as water point status and water point type) allow for much-needed flexibility, but severely curtail analysis. This solution will provide secondary processing on the WPDx data to convert those open text values into meaningful categories that allow for analysis.

This challenge is part of the HPE Living Progress Challenge Blitz Program (Secure top placements in the leaderboard to grab additional cash prizes).

Competition Task Overview

The purpose of this challenge is to create a test tool that compares the results generated by submissions to another challenge against our own results, in order to determine the accuracy of each submission:

1. The command line application should open and read a spreadsheet, submissions.csv. The submissions.csv file will have 3 columns: submission id (6-digit Topcoder submission id), output file path, and accuracy. The filename should be configurable or passed as a command line parameter.

2. For each line in the submissions.csv file, the application should do the following:

2.1 Read the csv output file specified by the path. The csv will look exactly like this: https://drive.google.com/file/d/0ByjxTGykXQjAa1FCRV9BNEYyQjA/view?usp=sharing

2.2 Read the csv which contains our own data. This path should be configurable or passed as a command line parameter. The data will look exactly like this: https://drive.google.com/file/d/0ByjxTGykXQjAa1FCRV9BNEYyQjA/view?usp=sharing

2.3 Compare the results (the last column - Water Source Type) from the submission csv with our own csv. We’ll test the solutions with up to 10K records which are not provided to you in advance. The solutions will be evaluated for accuracy. The accuracy metric is fairly simple: Accuracy = # of correct results / # of total results

2.4 Record the accuracy statistics in the submissions.csv file. Accuracy is measured as the # of correct results / # of total results.

3. The app should provide summary and progress information on the console to verify that execution is proceeding.
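
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions that the task description does not state: rows in a submission csv and in our own csv are matched by position, each data csv has a single header row, values are compared as exact strings after trimming whitespace, rows missing from a submission count as incorrect, and any line in submissions.csv whose first cell is not numeric (for example a header) is left untouched. The function names are hypothetical. The file handling is written for current Python; under Python 2.7 the csv module expects files opened in binary mode ("rb" / "wb") instead.

```python
from __future__ import print_function

import csv
import io


def read_last_column(path):
    """Return the last column of every data row in a csv file."""
    # Assumption: one header row, UTF-8 text. On Python 2.7, open the
    # file with open(path, "rb") instead of io.open(..., newline="").
    with io.open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    return [row[-1].strip() for row in rows[1:] if row]


def accuracy(expected, predicted):
    """Accuracy = # of correct results / # of total results."""
    total = len(expected)
    if total == 0:
        return 0.0
    # zip stops at the shorter list, so rows missing from the
    # submission are never counted as correct.
    correct = sum(1 for e, p in zip(expected, predicted) if e == p)
    return float(correct) / total


def score_submissions(submissions_path, truth_path):
    """Fill in the accuracy column of submissions.csv and report progress."""
    expected = read_last_column(truth_path)
    with io.open(submissions_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))

    scored = 0
    for index, row in enumerate(rows, start=1):
        # Assumption: non-numeric first cell means header or blank line.
        if len(row) < 2 or not row[0].strip().isdigit():
            continue
        value = accuracy(expected, read_last_column(row[1]))
        row[2:] = ["%.4f" % value]  # write the result into the third column
        scored += 1
        print("[%d/%d] submission %s: accuracy %.4f"
              % (index, len(rows), row[0], value))

    with io.open(submissions_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    print("Scored %d submission(s) against %d reference rows."
          % (scored, len(expected)))
    return rows
```

Reading the reference csv once and reusing it for every submission keeps the per-submission work to a single file read. The sketch does not handle a missing or unreadable output file, which a real harness would need to decide how to score.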

Technology Overview

Linux
Python 2.7
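
Since the target is Python 2.7, the two configurable paths in the task description could be taken from the command line with argparse, which ships with the standard library from Python 2.7 onward. The flag names `--submissions` and `--truth` and the helper names below are hypothetical choices for illustration, not part of the challenge specification.

```python
import argparse


def build_parser():
    """Build a parser for the two file paths the harness needs."""
    parser = argparse.ArgumentParser(
        description="Score submission csv files against our own results.")
    # Hypothetical flag: path to the spreadsheet listing the submissions.
    parser.add_argument("--submissions", default="submissions.csv",
                        help="csv with submission id, output file path, accuracy")
    # Hypothetical flag: path to the csv containing our own data.
    parser.add_argument("--truth", required=True,
                        help="csv containing our own results")
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

With these names, an invocation might look like `python harness.py --submissions submissions.csv --truth our_results.csv`, where `harness.py` and `our_results.csv` are placeholder file names.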



Final Submission Guidelines

Submission Deliverables

1. Please submit all code required by the application in your submission.zip.
2. Document the build process for your code, including all dependencies (pip installs, etc.).

ELIGIBLE EVENTS:

2016 TopCoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

ID: 30054410