[72hrs] Device Image five-class classification challenge

Key Information

The challenge is finished.

Challenge Overview


  • The goal of this challenge is to build a multiclass image classifier.

  • The classifier should return precisely five probabilities:

    • Probability that ‘device A’ is in the image.

    • Probability that ‘device B’ is in the image.

    • Probability that ‘device C’ is in the image.

    • Probability that ‘device D’ is in the image.

    • Probability that none of the above devices is in the image.

  • In the last challenge, a classifier was created that returned three probabilities: 

    • Probability that ‘old device’ is in the image (called device A in this challenge)

    • Probability that ‘new device’ is in the image (called device B in this challenge)

    • Probability that neither new device nor old device is present in the image.

    This challenge adds two more devices (device C and device D) to the list.

  • Each returned probability should be a float in the range [0, 1].

  • The provided training data contains a mixture of images and videos. Participants can extract frames from the videos and use them for training.

  • The amount of training data is fairly small, so participants are encouraged to use data augmentation where needed, for example by generating similar images with altered lighting, brightness, skew, rotation etc., to ensure that the trained model generalizes well.

  • The paths to the training images, testing images and configuration should not be hard coded. They should either be passed as command line arguments or be configurable via a dedicated JSON config file (config.json).

  • The tool should be a command line tool, with separate commands to run the train, validate and test steps. 

  • The code should be implemented in Python 3.7 (or Python 3.6 if an important library is not available for Python 3.7).

  • An extra $50 will be awarded if the CoreML model (.mlmodel file) is also included in the submission. This bonus will be paid only if the submission places in the top 2. (For reference, read the section below - An Important Note on Preprocessing and output layer)
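The command line and configuration requirements above can be sketched as follows. This is only a minimal illustration using the standard library, not the structure of the previous winning submission: the option names `--data-dir` and `--model-path`, the matching config keys, and the rule that command line values override config.json are all assumptions made for the example.

```python
import argparse
import json
from pathlib import Path


def load_config(path):
    """Read settings from a JSON config file; a missing file yields no settings."""
    config_path = Path(path)
    if not config_path.is_file():
        return {}
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle)


def build_parser():
    """Create a parser with one sub-command per step: train, validate, test."""
    parser = argparse.ArgumentParser(description="Device image classifier")
    parser.add_argument("--config", default="config.json",
                        help="path to the JSON config file")
    # required=True is accepted from Python 3.7 onwards.
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in ("train", "validate", "test"):
        step = subparsers.add_parser(name, help=f"run the {name} step")
        # Hypothetical option names; a value given here overrides config.json.
        step.add_argument("--data-dir", help="folder with the input images")
        step.add_argument("--model-path", help="file the model is saved to or loaded from")
    return parser


def resolve_settings(argv=None):
    """Merge config file values with command line overrides."""
    args = build_parser().parse_args(argv)
    settings = load_config(args.config)
    for key in ("data_dir", "model_path"):
        value = getattr(args, key)
        if value is not None:
            settings[key] = value
    settings["command"] = args.command
    return settings
```

With a hypothetical entry script named classifier.py, a call such as `python classifier.py --config config.json train --data-dir data/train` would then run the train step on that folder without any path appearing in the code.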


Data access & description

  • The training dataset can be found in the Forum. 

  • The top winning submission of the last challenge can be found in the forum. It is strongly recommended that members go through the winning submission of the last challenge and ensure that the best practices of these submissions are incorporated in submissions to this challenge. 

  • Note - it is strongly recommended that the overall code structure of your submission is similar to that of the winning submission of the last challenge. Consider using that submission as your starting point.

Output Format

The output generated by the algorithm should be a CSV file containing, for every image file in the target test folder, the filename and its corresponding probabilities. The CSV file should preferably be named 'output.csv' and should have the headers:

filename, prob_device_A, prob_device_B, prob_device_C, prob_device_D, prob_no_device
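As a rough sketch of producing this file with Python's standard `csv` module: the function name `write_predictions`, the six-decimal formatting and the input shape (pairs of filename and five probabilities) are assumptions for illustration, and the header is written without spaces after the commas.

```python
import csv

HEADER = ["filename", "prob_device_A", "prob_device_B",
          "prob_device_C", "prob_device_D", "prob_no_device"]


def write_predictions(rows, output_path="output.csv"):
    """Write one CSV row per image: the filename followed by five probabilities."""
    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADER)
        for filename, probabilities in rows:
            if len(probabilities) != 5:
                raise ValueError(f"{filename}: expected 5 probabilities")
            if not all(0.0 <= p <= 1.0 for p in probabilities):
                raise ValueError(f"{filename}: probability outside [0, 1]")
            writer.writerow([filename] + [f"{p:.6f}" for p in probabilities])
```

Checking the count and range of the probabilities before writing catches a malformed row early instead of leaving it to be discovered during scoring.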


An Important Note on Preprocessing and output layer

After this challenge, we will take the model created as a result of this challenge, convert it into an Apple CoreML model, and integrate it into an iPhone Native App.

So it is important that any pre-processing that needs to be done on a test image is very clearly documented, such that after reading it, an iOS developer can easily implement that pre-processing in the iPhone app. In addition to the documentation, please include a document named ‘preprocessing_output_details’ detailing exactly the operations used in pre-processing, such as normalization, color scale changes, image resizing etc. Also include the line numbers of the pre-processing steps in your code, so that in case of confusion the iOS developer can quickly locate them.

In addition to preprocessing, add details about the output layer that is used in the neural network model. This is required because the CoreML model at times does not contain the final layer, and it needs to be implemented manually. The details of the output layer should also be added to the file preprocessing_output_details.
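To illustrate the level of detail that helps here, the sketch below shows a softmax output layer written out by hand. Softmax is only an assumption: the challenge does not prescribe an output layer, and a model with independent sigmoid outputs would need a different description. If your model does end in a softmax, a short reference implementation like this gives the iOS developer something concrete to port.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]
```

Five equal raw scores give five equal probabilities of 0.2, and the largest raw score always maps to the largest probability.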

Note - Ideally, the pre-processing used in the first-placed submission of the last challenge should be kept unchanged, because we already have a tested implementation of it. This is not a strict requirement, however, and participants are free to change the pre-processing if necessary.


Submission Evaluation

  • Important - Performance will be evaluated using a custom function built on the binary classification metric ROC AUC. We will compute the ROC AUC score for each device as a separate binary classification problem and then combine these scores by taking their weighted mean. The weights are not shared for now, so that participants do not optimise for any one of the binary classifications at the expense of the others.

  • The submission will be evaluated subjectively, but broadly the rating will be determined by the following:

    • 70-80% - model performance in terms of ROC AUC score. 

    • 10-20% - Code quality and issues related to the code

    • 10-20% - Documentation and Ease of deployment, following solely the deployment guide.
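For local validation, a weighted mean of per-class ROC AUC scores can be approximated as in the sketch below. It is not the organisers' metric: the actual function and weights are not published, so the one-vs-rest treatment of every class column and the `weights` argument are assumptions. The AUC itself is computed with the standard rank-sum (Mann-Whitney U) formulation, giving tied scores their average rank.

```python
def roc_auc(labels, scores):
    """Binary ROC AUC from the rank sum of the positive examples."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Group tied scores and give each the average of their 1-based ranks.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    positive_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_statistic = positive_rank_sum - positives * (positives + 1) / 2
    return u_statistic / (positives * negatives)


def weighted_mean_auc(true_classes, probabilities, weights):
    """Weighted mean of one-vs-rest AUCs, one per class column.

    The weights are hypothetical; the real ones are not disclosed.
    """
    total = 0.0
    for class_index, weight in enumerate(weights):
        labels = [1 if c == class_index else 0 for c in true_classes]
        scores = [row[class_index] for row in probabilities]
        total += weight * roc_auc(labels, scores)
    return total / sum(weights)
```

Because the real weights are unknown, trying a few different weightings on a held-out set is a reasonable way to check that no single class is dragging the score down.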

Final Submission Guidelines

What to Submit

  • The code

  • A PDF/Word/Markup format based report mentioning the techniques and algorithms used to achieve the goals.

  • A PDF/Word/Markup/Text file named ‘preprocessing_output_details’ detailing the preprocessing steps and the output layer as discussed in the section ‘An Important Note on Preprocessing and output layer’ above.

  • A README.md file detailing the deployment instructions, along with examples and steps to verify the various capabilities of the code.

  • Optional - screenshots or video to make it easier to quickly verify and test your code.

Technology Stack

  • Python

Reliability Rating and Bonus

For challenges that have a reliability bonus, the bonus depends on the reliability rating at the moment of registration for that project. A participant with no previous projects is considered to have no reliability rating, and therefore gets no bonus. Reliability bonus does not apply to Digital Run winnings. Since reliability rating is based on the past 15 projects, it can only have 15 discrete values.


Final Review:

Community Review Board


User Sign-Off

