Predict Consumables - Replacement Predictor Ideation Challenge

    Status: Cancelled (requirements infeasible)

    Challenge Overview

    Challenge Objectives
    In this challenge you will be provided with the training data, the data analysis done in the previous challenge, and test harness scripts. Based on these inputs, you have to:
    • Build and train a model
    • Make Predictions (True/False)
    Project Background

    Printing consumables are a significant cost and revenue driver in the printing business. Accurate predictions of when replacements will occur provide many opportunities for improvement, such as in component design, pricing, and service schedules. The project aims to predict near-term component replacements. Specifically, the goal is 80% or better accuracy in predicting all replacements that will occur in the next 2 weeks. The success of the project depends on achieving a false positive rate of less than 20% and a false negative rate of less than 20%.

    Technology Stack
    Python 3, or any language of your choice.

    Individual requirements

    Training Dataset: The dataset provided contains a unique ID followed by a number of columns with both numerical and non-numerical values. The last column is t/f (True/False), which is the target value that needs to be predicted. The training.csv file lists the ID numbers (which identify each row in the training dataset) and the t/f value to be predicted.

    Testing Dataset: The dataset is similar to the training dataset, but without the target_value (t/f).
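
    The dataset layout described above can be read with a short loader. This is a minimal sketch, not the provided test harness: it assumes a header row, the ID in the first column, and a last column holding the literal strings t or f. The function names and the sample column names (pages, model) are hypothetical.

```python
import csv
import io


def load_rows(handle, has_target=True):
    """Read a CSV with a header row (assumed layout: ID first, t/f label last)."""
    reader = csv.reader(handle)
    header = next(reader)
    ids, features, targets = [], [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        ids.append(row[0])
        if has_target:
            features.append(row[1:-1])
            targets.append(row[-1].strip().lower() == "t")
        else:
            features.append(row[1:])
    feature_names = header[1:-1] if has_target else header[1:]
    return ids, feature_names, features, targets


def is_numeric(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def numeric_columns(feature_names, features):
    """Names of columns whose every non-empty value parses as a float."""
    result = []
    for j, name in enumerate(feature_names):
        values = [row[j] for row in features if row[j] != ""]
        if values and all(is_numeric(v) for v in values):
            result.append(name)
    return result


# Hypothetical two-row sample standing in for the training file.
sample = io.StringIO("ID,pages,model,TargetValue\n1,120,A,t\n2,15,B,f\n")
ids, names, X, y = load_rows(sample)
```

    For the testing file, the same loader would be called with has_target=False, since that file has no label column.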

    1. Build and Train the Model - It is entirely up to you to figure out the best predictive analytics approach based on the inputs provided. All the characteristics should be backed by the data analysis done.

    2. Prediction requirements - Generate a csv file with the row identifier and the predicted t/f values.
    • 80% or better accuracy in predicting the target value.
    • False positive rate less than 20%, false negative rate less than 20%.
    You can use the test harness script provided to verify the results of your prediction.
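
    The three thresholds above can be checked locally before running the harness. The sketch below is an assumption about how the rates are defined, since the text does not spell them out: it uses the conventional definitions FP / (FP + TN) for the false positive rate and FN / (FN + TP) for the false negative rate. The function names are hypothetical and the provided test harness may compute things differently.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif not a and p:
            fp += 1
        elif not a and not p:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def evaluate(actual, predicted):
    """Accuracy, false positive rate, and false negative rate."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
    }


def meets_targets(metrics):
    """Accuracy of at least 80%, both error rates strictly below 20%."""
    return (
        metrics["accuracy"] >= 0.80
        and metrics["fpr"] < 0.20
        and metrics["fnr"] < 0.20
    )


# Made-up labels: one false negative and one false positive out of ten rows.
actual = [True] * 5 + [False] * 5
predicted = [True, True, True, True, False, False, False, False, False, True]
metrics = evaluate(actual, predicted)
```

    In this made-up example the accuracy is exactly 80%, but both error rates are exactly 20%, so the "less than 20%" conditions are not met.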

    What should be the prediction format?
    As part of this challenge, you should provide a testing_data.csv file as output that matches the format of training_data.csv. Please include the following header in your file:
    ID,  TargetValue.
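
    Writing the output file could look like the sketch below. It assumes the header is written without spaces around the comma and that predictions are encoded as the strings t and f, matching the training data; neither detail is confirmed by the text above. The function name write_predictions is hypothetical.

```python
import csv
import io


def write_predictions(handle, ids, predictions):
    """Write one row per ID with the predicted label as 't' or 'f'."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["ID", "TargetValue"])  # header named in the challenge text
    for row_id, pred in zip(ids, predictions):
        writer.writerow([row_id, "t" if pred else "f"])


# In-memory buffer used here instead of a real testing_data.csv file.
buffer = io.StringIO()
write_predictions(buffer, ["1", "2"], [True, False])
output_text = buffer.getvalue()
```

    To write a real file, the same function can be passed a handle from open("testing_data.csv", "w", newline="").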

    Deployment Guide

    Make sure you provide a README.md that covers how to run the script in any environment.

    Important Notes:
    Review will be highly subjective and done by the client. No appeals will be allowed. The solutions will be ranked in order of accuracy of the predictions on the testing set.  You can use the test harness script provided. Your solution must be flexible enough to accommodate new training and testing files. 
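
    One way to keep a solution flexible with respect to new training and testing files is to take the file paths as command-line arguments rather than hard-coding them. The following is only a sketch; the flag names --train, --test, and --output are hypothetical and not part of the challenge specification.

```python
import argparse


def build_parser():
    """Command-line interface with configurable input and output paths."""
    parser = argparse.ArgumentParser(
        description="Train on one CSV file and predict t/f for another."
    )
    parser.add_argument("--train", required=True, help="path to the training CSV")
    parser.add_argument("--test", required=True, help="path to the testing CSV")
    parser.add_argument(
        "--output",
        default="testing_data.csv",
        help="where to write the predictions",
    )
    return parser


# Parsing an explicit argument list, as a script would do with sys.argv[1:].
args = build_parser().parse_args(["--train", "new_train.csv", "--test", "new_test.csv"])
```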

    Final Submission Guidelines

    • Revised data analysis scripts with details on how to run them
    • Source code as a zip file
    • Documentation, including a list of features (columns) and their correlation with the label (target_value)
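
    For the documentation item, one simple way to report how a numerical feature relates to the label is the Pearson correlation with the label encoded as 0/1 (the point-biserial correlation). The challenge does not prescribe a correlation measure, so this is just one possible choice; the function names and sample values are hypothetical. Non-numerical columns would need a different measure or an encoding step first.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def label_correlation(values, labels):
    """Correlate a numeric feature with boolean labels encoded as 0/1."""
    xs = [float(v) for v in values]
    ys = [1.0 if label else 0.0 for label in labels]
    return pearson(xs, ys)


# Made-up feature values and labels for illustration only.
r = label_correlation([1, 2, 3, 4], [False, False, True, True])
```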

    Reliability Rating and Bonus

    For challenges that have a reliability bonus, the bonus depends on the reliability rating at the moment of registration for that project. A participant with no previous projects is considered to have no reliability rating, and therefore gets no bonus. Reliability bonus does not apply to Digital Run winnings. Since reliability rating is based on the past 15 projects, it can only have 15 discrete values.


    Final Review:

    Community Review Board


    User Sign-Off


    Review Scorecard