Challenge Overview

Problem Statement

Konica-Minolta: Detecting Abnormality for Automated Quality Assurance - Overall Track

Overall Track Prize Distribution

1st place - $10,000

2nd place - $7,000

3rd place - $5,000

4th place - $3,000

5th place - $1,000

Subtrack Prize

Bonus - $1,000 per track winner not among the top-5 in the Overall Track.

Introduction

This contest aims to create new image recognition technology that detects abnormalities in a product for visual inspection purposes. Because abnormalities need to be detected against three levels of background images, there are 4 tracks in total: the level-1 track, level-2 track, level-3 track, and overall track. Please submit to the overall track contest. Scores for the other 3 tracks will be calculated automatically when you submit to this track.

Requirements to Win a Prize

In order to receive a prize, you must do all the following:

Achieve a score in the top 5, according to system test results calculated using the Contestant Test Data. The score must be higher than the baseline result (i.e., 500,000). See the "Data Description and Scoring" section below.
Create an algorithm that reads in 1,358 testing pairs, each consisting of a 390 * 390 real image and a 500 * 500 reference image, and outputs the corresponding 390 * 390 predicted masks in at most 24 hours, running on an Amazon Web Services m4.xlarge virtual machine. If your solution relies on a GPU, please request one when we contact you; it should then run on an Amazon Web Services p2.xlarge (Tesla K80 GPU) virtual machine.
Within 7 days from the announcement of the contest winners, submit:
- A complete report at least 2 pages long, outlining your final algorithm and explaining the logic behind it and the steps of its approach. More details appear in the "Report" section below.
- All code and scripts used in your training and final algorithm in 1 appropriately named file (or tar or zip archive). The file should include (1) the scripts and code for training and model creation; and (2) your final model, saved so that it can make predictions without being retrained. Prospective winners will be contacted to set up their training and final model in the appropriate AWS environment.
- The submission created on the AWS instance must have a single script file to run training and another script file to run and output the predicted mask images. As mentioned above, the script that outputs predicted mask images for the test images (1,358 images in total) must run within 24 hours.
- The submission created on AWS must work without any Internet connection. That is, if you need to download libraries from the Internet, please include them in your deliverables on AWS.

If you place in the top 5 or are a subtrack winner, but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.

Background

In the manufacturing industry, both quality assurance and production efficiency are required. To ensure quality, each item must be inspected for scratches, defects, and foreign substances: for example, scratches on the surface of resin parts, writing errors on labels, or dirt on fabrics. As the automation of mass production becomes a reality, quality assurance can no longer be handled by the human eye alone and needs automated technology to detect these problems. One method is to photograph the finished product and detect problems by image recognition. However, detecting problems or abnormalities from images is not simple. It is hard to keep the photographic environment exactly the same (differences in illumination, blur of the object, etc.), and the problem to be detected can be indefinite in form or very fine.

In this contest, you will be given a portion of these shots, which may have a certain type of abnormality (dirt) on them, together with reference images that have no abnormality (dirt).

The client would like the Topcoder community to come up with a good deep learning model or algorithm that can detect these abnormalities (dirt) automatically, which could lead to further improvements in the quality assurance process. Good luck!

Objective

All real images and dirt masks have the shape 390 * 390, while all reference images have the shape 500 * 500. Real images and dirt masks are cropped evenly around the center of the reference images. Please note that reference images and dirt images were shot separately, so the images are not identical pixel for pixel. 4,119 annotated images (each including a real image, a reference image, and a dirt mask) are provided as the training data for your development and validation.
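
The crop geometry described above can be sketched as follows. This is a minimal illustration, assuming that "cropped evenly around the center" means an equal margin of (500 - 390) / 2 = 55 pixels on every side; the helper names center_crop_box and center_crop are hypothetical and not part of any contest code. Because the two images were shot separately, the resulting window is only an approximate alignment.

```python
REF_SIZE = 500   # reference image side length, from the statement
REAL_SIZE = 390  # real image / mask side length, from the statement


def center_crop_box(ref_size=REF_SIZE, real_size=REAL_SIZE):
    """Return (left, top, right, bottom) of the centered window.

    Assumption: the margin is the same on all four sides.
    """
    margin = (ref_size - real_size) // 2  # 55 pixels for 500 -> 390
    return (margin, margin, margin + real_size, margin + real_size)


def center_crop(rows, box):
    """Crop a 2D list of pixel rows (rows[y][x]) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in rows[top:bottom]]


print(center_crop_box())  # (55, 55, 445, 445)
```

The box uses the (left, upper, right, lower) ordering, so the same tuple could be handed to an image library's crop routine if you load the .tif files with one.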

You are asked to build a model that takes a pair of a 390 * 390 real image and a 500 * 500 reference image as input and outputs the corresponding dirt mask. There are 3 levels of background images in which to detect dirt: level-1 has a solid background, level-2 has images with characters, and level-3 has natural photos as the background. We believe that level 1 is the easiest and level 3 is the hardest.

Data Description and Scoring

There are 3 sets of images: train, dev, and test. The train and dev data can be downloaded through Google Drive links. The test data will be available during the last 72 hours of the contest (i.e., after Feb 21, 11 PM EST) through this Google Drive link.

Please note that some images have no dirt at all. It is equally important to the client that your model or algorithm does not detect false dirt in images that have no dirt.

For reasons that cannot be disclosed, each image was processed with the same Gaussian filter on the B channel only of the RGB image. It is up to each contestant whether or not to use the B channel in the solution.

Also, the dirt on the images is not manually drawn "false" dirt. Images with dirt are created by combining actual dirt patterns with background images. This follows from the client's previous research, which found that manually drawn dirt does not produce a good algorithm or model. Ground truth mask images were created from these actual dirt patterns.

The number of images for each dirt pattern is almost evenly distributed. The train, dev, and test sets contain 60%, 20%, and 20% of the dirt patterns respectively. There is no overlap between the sets. The images without any dirt are also partitioned in this 60%-20%-20% way.

For this reason, your solution must not depend on these dirt patterns and should work about equally well on any new dirt. Moreover, the solution must not be tweaked specifically for the test data set that we will provide in the last 72 hours. Your solution created on AWS will be reviewed, and if it does not meet this requirement, you will not receive a prize, and it will be awarded to the contestant with the next best performance.

The statistics of these sets are summarized in the following table. In total, there are 6,882 images.

    +------+---------+---------+---------+-------+
    |      | level-1 | level-2 | level-3 | total |
    +------+---------+---------+---------+-------+
    |train |   751   |  1,495  |  1,873  | 4,119 |
    +------+---------+---------+---------+-------+
    |dev   |   245   |   510   |   650   | 1,405 |
    +------+---------+---------+---------+-------+
    |test  |   264   |   499   |   595   | 1,358 |
    +------+---------+---------+---------+-------+

The images are numbered from 1 to 6882. In each set, images are put into the corresponding folders: level-1, level-2, and level-3. For each image named $ID (e.g., 1), we will provide the following data:

$ID.tif (e.g., 1.tif). This is the real image and is always available.
$ID_ref.tif (e.g., 1_ref.tif). This is the reference image and is always available.
$ID_mask.tif (e.g., 1_mask.tif). This is the ground truth and is only available for train data. For dev and test data, it's hidden. Note that this is only for visualization purposes and does not need to be submitted.
$ID_mask.txt (e.g., 1_mask.txt). This is the plain text version of the ground truth and is only available for train data. For dev and test data, you are required to submit it. The conversion process is shown in the following Python script. Please be aware of the order of x and y.

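            from PIL import Image

            DIMENSION = 390  # masks are 390 * 390

            def convert_to_binary(truth_image_file, truth_txt_file):
                """Write the mask image as DIMENSION lines of DIMENSION 0/1 digits."""
                im = Image.open(truth_image_file)  # Can be many different formats.
                pix = im.load()
                with open(truth_txt_file, 'w') as out:
                    # Outer loop is x, inner loop is y: each text line is one image column.
                    for x in range(DIMENSION):  # range replaces Python 2's xrange
                        for y in range(DIMENSION):
                            out.write(str(int(pix[x, y] > 0)))
                        out.write('\n')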
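
To make the x/y order concrete, here is a small sketch that works on plain nested lists instead of image files. It assumes your predicted mask is held in the common row-major layout mask_rows[y][x]; the function names mask_to_lines and lines_to_mask are hypothetical. Because the script above loops over x on the outside, each text line holds one image column, so the text file is effectively the transpose of a row-major array.

```python
def mask_to_lines(mask_rows):
    """Convert a row-major mask (mask_rows[y][x]) to the text lines.

    Line number x of the output holds the pixels of column x, top to
    bottom, matching the outer-x / inner-y loop of the script above.
    """
    height = len(mask_rows)
    width = len(mask_rows[0])
    lines = []
    for x in range(width):
        digits = (str(int(mask_rows[y][x] > 0)) for y in range(height))
        lines.append(''.join(digits))
    return lines


def lines_to_mask(lines):
    """Inverse of mask_to_lines: rebuild the row-major 0/1 mask."""
    width = len(lines)
    height = len(lines[0])
    return [[int(lines[x][y]) for x in range(width)] for y in range(height)]


# A 2-row, 3-column toy mask: nonzero pixels at (x=1, y=0) and (x=2, y=1).
toy = [[0, 255, 0],
       [0, 0, 7]]
print(mask_to_lines(toy))  # ['00', '10', '01']
```

If you write the rows of a row-major array directly, without this transposition, the submitted mask will be mirrored across the diagonal relative to the ground truth.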

Please note that the reference pictures of the following IDs are wrong. We have ruled them out from evaluation. But please keep the _mask.txt files in your submission as placeholders.

        [340, 491, 817, 937, 1125,
         1751, 1858, 2091, 2378, 2385, 
         3030, 3050, 3326, 3333, 3435, 
         3761, 3905, 4505, 4620, 4661, 
         4750, 4839, 5020, 5160, 5425, 
         5481, 5597, 5607, 6100, 6163, 
         6349, 6460, 6555, 6578, 6683, 
         6760, 6877]
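
A placeholder file for one of these IDs could be produced along the following lines. This is only a sketch: the statement asks for the files to be kept but does not say what they should contain, so the all-zero content is an assumption, and the helper names write_blank_mask and write_placeholders are hypothetical.

```python
import os

DIMENSION = 390  # mask side length, from the statement


def write_blank_mask(path, dimension=DIMENSION):
    """Write a dimension x dimension mask file consisting only of '0'."""
    line = '0' * dimension + '\n'
    with open(path, 'w') as out:
        out.write(line * dimension)


def write_placeholders(ids, out_dir):
    """Create <id>_mask.txt for every ID in ids and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for image_id in ids:
        path = os.path.join(out_dir, '%d_mask.txt' % image_id)
        write_blank_mask(path)
        paths.append(path)
    return paths
```

For example, write_placeholders([340, 491], 'submission') would create submission/340_mask.txt and submission/491_mask.txt.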

Your example submissions will be evaluated against the train set. Your full submissions will be evaluated against the dev and test sets. The provisional test scores are based on the dev set, while the system test scores are based on the test set.

We are using macro-F1 as the evaluation metric. The final score is computed by the following formula.

    Final Score = 1000000.0 * macro-F1

macro-F1

The macro-F1 score is the average of the per-image F1 scores.

For each image, the F1 score is defined based on pixels. For each pixel in an image, there are 4 cases:

True Positive: Your prediction is white (1) and the truth is white (1).
False Positive: Your prediction is white (1) and the truth is black (0).
False Negative: Your prediction is black (0) and the truth is white (1).
True Negative: Your prediction is black (0) and the truth is black (0).

Let TP be the total number of True Positives, FP the total number of False Positives, FN the total number of False Negatives, and TN the total number of True Negatives.

The precision (P) is defined as TP / (TP + FP) and the recall (R) is defined as TP / (TP + FN). The F1 score is computed as 2 * P * R / (P + R). As a special case, when TP = 0, we define F1 to be 1 if FN = 0 and FP = 0 (i.e., there is actually no dirt in the image and none is predicted); otherwise, F1 is 0.
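
For local validation on the train set, the definitions above can be sketched in a few lines of Python. This is not the official scorer; it is a direct reading of the formulas, with masks assumed to be equally sized 2D lists of 0/1 values, and the names image_f1 and final_score are hypothetical.

```python
def image_f1(pred, truth):
    """Pixel-based F1 for one image; pred and truth are 2D 0/1 lists."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    if tp == 0:
        # Special case from the statement: a clean image predicted as
        # clean scores 1; any other TP = 0 outcome scores 0.
        return 1.0 if (fp == 0 and fn == 0) else 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def final_score(pairs):
    """pairs is a list of (pred, truth) masks; returns 1000000 * macro-F1."""
    f1_scores = [image_f1(pred, truth) for pred, truth in pairs]
    return 1000000.0 * sum(f1_scores) / len(f1_scores)
```

Note the consequence of the special case: a single false-positive pixel on a dirt-free image drops that image's F1 from 1 to 0, so clean images weigh heavily in the average.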

Implementation

During the contest, only your results will be submitted. You will submit code which implements only one function, getURL(). Your function will return a String corresponding to the URL of your answer (.zip).

This .zip file should include results (i.e., $ID_mask.txt) for all training, development, and testing data (6,882 in total). Each result is a .txt file containing a 390 * 390 binary mask matrix as described before. If some result files are missing, your submission will receive a score of -1. Note that, although you don't have the testing set at the beginning of the contest, you know its IDs and that the images are 390 * 390.

You may use different names for the .zip file, and any of the following structures should work. When we are evaluating the submission, we will iterate through the zip file, find every file ending with "_mask.txt", and extract the number before it. Please make sure there are no duplicated filenames if you have multiple folders in the .zip file.

    submission.zip               submission.zip                         submission.zip
    |-- 1_mask.txt               |-- train/level-1/1014_mask.txt           |-- level-1/1033_mask.txt
    |-- 2_mask.txt               |-- train/level-1/1016_mask.txt           |-- level-1/1070_mask.txt
    |-- ...                      |-- ...                                   |-- ...
    |-- 6882_mask.txt            |-- test/level-3/99_mask.txt              |-- level-3/999_mask.txt
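
Before uploading, you may want to check the archive yourself. The sketch below mimics the described evaluation step (find every member ending in "_mask.txt" and take the number before it) using Python's standard zipfile module. The exact parsing used by the real evaluator is not published, so the regular expression and the function names collect_mask_ids and missing_ids are assumptions.

```python
import re
import zipfile

# Assumed pattern: digits immediately before "_mask.txt" at the end of the name.
MASK_NAME = re.compile(r'(\d+)_mask\.txt$')


def collect_mask_ids(zip_path):
    """Map image ID -> archive member name; raise on duplicate IDs."""
    found = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            match = MASK_NAME.search(name)
            if match is None:
                continue  # not a mask file, e.g. a folder entry
            image_id = int(match.group(1))
            if image_id in found:
                raise ValueError('duplicate mask for ID %d: %s and %s'
                                 % (image_id, found[image_id], name))
            found[image_id] = name
    return found


def missing_ids(found, total=6882):
    """Return the IDs in 1..total that have no mask file."""
    return [i for i in range(1, total + 1) if i not in found]
```

An empty result from missing_ids(collect_mask_ids('submission.zip')) suggests that all 6,882 files are present exactly once.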

You may upload your .zip file to a cloud hosting service such as Dropbox which can provide a direct link to your .zip file. To create a direct sharing link in Dropbox, right click on the uploaded file and select share. You should be able to copy a link to this specific file which ends with the tag "?dl=0". This URL will point directly to your file if you change this tag to "?dl=1". You can then use this link in your getURL() function. You can share your result file in any other way, but make sure the link you provide opens the file stream directly.
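
In Python, a submission of this kind might look like the sketch below. The class and method names come from the "Definition" section of this statement; the helper to_direct_link and the share URL are hypothetical placeholders that you would replace with your own link.

```python
def to_direct_link(share_url):
    """Switch a share link ending in ?dl=0 to the ?dl=1 direct form."""
    if share_url.endswith('?dl=0'):
        return share_url[:-1] + '1'
    return share_url  # leave other links untouched


class KMDetectAbnormal:
    def getURL(self):
        # Hypothetical share link, for illustration only.
        share_url = 'https://www.dropbox.com/s/EXAMPLE/submission.zip?dl=0'
        return to_direct_link(share_url)
```

Opening the returned URL in a private browser window is a quick way to confirm that it starts a download rather than showing a preview page.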

Your complete code that generates these results will be tested at the end of the contest.

Report

The report should be a .doc/.docx or .pdf file that describes the details of your algorithms and methods. It should include, but is not limited to, the following points.

Your analysis of the given training data and any assumptions made based on the analysis.
How did you come up with the final model? What else did you consider or try on the way to your final solution?
What were the major challenges during your development? What else could you improve if you had more time and resources?
Detailed instructions on how to repeat your deployment on the AWS server, including installing dependencies, step-by-step running commands, and organized bash scripts.
Feedback on this challenge, if there is any.

General Notes

This match is rated.
If your solution includes licensed software (e.g. commercial software, open source software, etc), even just in the training stage, please ask in the forum. If the proposal gets approved, you must include the full license agreements with your submission. Include your licenses in a folder labeled "Licenses". Within the same folder, include a text file labeled "README" that explains the purpose of each licensed software package as it is used in your solution.
In this match you may use any programming language and libraries, including commercial solutions, provided Topcoder is able to run it free of any charge. You may also use open source languages and libraries, with the restrictions listed in the next section below. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see "Requirements to Win a Prize" section). Submissions will be deleted/destroyed after they are confirmed. Topcoder will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run by Topcoder free of cost, and with all necessary licenses pre-installed in your solution. Topcoder is not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.
You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by the client.
The usage of external resources (pre-built segmentation models, additional CT imagery, etc) is allowed as long as they are freely available.
Use the match forum to ask general questions or report problems, but please do not post comments and questions that reveal information about the problem itself or possible solution techniques.

Terms and NDA

This challenge will follow the standard Topcoder Terms and NDA below:

[1] Standard Terms for TopCoder Competitions v2.1 - https://www.topcoder.com/challenge-details/terms/detail/21193/

[2] Appirio NDA 2.0 - https://www.topcoder.com/challenge-details/terms/detail/21153/

Definition

Class:	KMDetectAbnormal
Method:	getURL
Parameters:
Returns:	String
Method signature:	String getURL()
(be sure your method is public)

Examples

"0"

Returns: "Seed: 0"

This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2020, TopCoder, Inc. All rights reserved.
