NASA and Planetary Resources Inc. are partnering to develop crowd-sourced software solutions that enhance the detection of Near-Earth Objects (NEOs) using NASA-funded data. We will do so by running multiple Marathon Matches. Your task in this contest is to write a problem statement for the first Marathon Match based on the available information. The main goal of the first Marathon Match is to create an algorithm that can reject false positives from a list of asteroid detections.
Asteroids pose both a possible threat and an opportunity for Earth: they could impact us, causing damage, or they could be mined for resources that help extend our ability to explore the universe.
Scientists find asteroids by taking several images of the same place in the sky and finding the star-like objects that move. Traditionally, asteroids and other moving bodies in the Solar System have been identified by acquiring images over several epochs and detecting changes between frames. With many telescopes scanning the sky during the time around the new moon, the data volumes prevent individual inspection of every image, and professional astronomers have no good way to verify every detection. Looking to the future, as large surveys grow ever larger, the ability to check the images autonomously and rapidly and to determine which objects are suitable for follow-up will be crucial.
The Catalina Sky Survey (CSS) uses a crowded-field galaxy photometry program that identifies the centroids of targets that are distinctly separate from other objects. This output is fed into a custom program that determines which sources move. However, false detections do occur, and the CSS operators have to reject these false positives manually. We would like to automate the rejection process as accurately as possible.
Additional background information is available at http://www.topcoder.com/asteroids/asteroiddatahunter/
Sample input data and detailed descriptions of it will be provided in the forums. We have 180GB of data available. The input to the algorithm will be:
- 4 raw images of the sky, captured roughly 10 minutes apart. The resolution of the images is 4110 x 4096 and they contain 16-bit values.
- Detection file associated with the 4 images. The file contains a list of image coordinates and additional information for the detected objects.
- Known object file that contains a listing of known objects near the standard field center, at the exact times of the 4 observations.
- Rejection file that contains the false positive detections. We want to reproduce these rejections using only the images, the detection file and the known object file.
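When drafting memory limits for the problem statement, it may help to work out how large the image input is. The sketch below uses only the figures given above (4 images, 4110 x 4096 pixels, 16-bit values). The idea that all four images might be passed as a single array with one 32-bit int per pixel is an assumption suggested by the proposed signature, not something the source confirms.

```cpp
#include <cstdint>

// Figures from the input description: 4 images of 4110 x 4096 pixels each.
constexpr std::int64_t kImageCount = 4;
constexpr std::int64_t kPixelsPerImage = 4110LL * 4096LL;             // 16,834,560
constexpr std::int64_t kTotalPixels = kImageCount * kPixelsPerImage;  // 67,338,240

// Total size in bytes of the pixel data for a given storage width per pixel.
// bytesPerPixel = 2 matches the raw 16-bit values;
// bytesPerPixel = 4 is the ASSUMED case of one 32-bit int per pixel.
constexpr std::int64_t totalBytes(std::int64_t bytesPerPixel) {
    return kTotalPixels * bytesPerPixel;
}
```

Under these assumptions one test case carries 134,676,480 bytes of raw 16-bit pixel data, or 269,352,960 bytes if each pixel is widened to a 32-bit int.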
Detections may be rejected because of common reasons such as faintness, bad image data, star bleed and detector artifacts. A document describing these common reasons will be posted in the forums.
While the majority of rejections will fall into one of these categories, this is not a comprehensive list of reasons for rejection. There will be other - potentially rare - rejection reasons.
A training package will be provided for the marathon match contestants in order to perform off-line training.
Each test case will consist of a set of 4 images with the associated detection file. The rejection file will be used for scoring.
The proposed scoring function is the Average Precision calculated on a sorted list of the detections returned by the algorithm. Detections to be rejected should be first in the returned list. You are welcome to propose different scoring functions.
The proposed class definition:
Parameters: vector <int>, vector <string>, vector <string>
Method signature: vector <int> reject(vector <int> imageData, vector <string> detectionData, vector <string> knownObjects)
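A minimal skeleton matching the proposed signature might look like the sketch below. The class name AsteroidRejector is hypothetical, since the source does not name the class. The sketch also assumes that each element of detectionData describes one detection and that the return value lists detection indices ordered from most to least likely to be rejected. It returns the detections in their original order, which serves only as a placeholder baseline.

```cpp
#include <numeric>
#include <string>
#include <vector>

// Hypothetical class name; only the reject signature comes from the proposal.
class AsteroidRejector {
public:
    // Returns detection indices, most likely rejections first.
    // ASSUMPTION: one detection per element of detectionData.
    std::vector<int> reject(std::vector<int> imageData,
                            std::vector<std::string> detectionData,
                            std::vector<std::string> knownObjects) {
        (void)imageData;     // unused in this placeholder baseline
        (void)knownObjects;  // unused in this placeholder baseline
        std::vector<int> order(detectionData.size());
        std::iota(order.begin(), order.end(), 0);  // 0, 1, 2, ... (original order)
        return order;
    }
};
```

A real solution would replace the placeholder ordering with a score per detection and sort the indices by that score.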
What to submit
Please submit a document containing one or more problem statements.
In addition to writing prospective problem statements, you may explain your decisions, offer alternate choices, and suggest further ways to improve the match.
This match is not rated.