CFO Forecast Refinement Code Challenge

Key Information

The challenge is finished.

Challenge Overview


1st place - $3000

2nd place - $1500

3rd place - $800

4th place - $400

5th place - $250


If you are the first (according to your submission time) to beat the target MAPE, you will win a bonus prize of $100 per variable. For the bonus prize, each variable is evaluated separately. There are 15 variables in total (5 variable types × 3 products in the objective table).



Previously, we ran a code challenge (https://www.topcoder.com/challenges/30097811/?type=develop&noncache=true) and a feature analysis challenge (https://www.topcoder.com/challenges/30102688/?type=develop) related to this forecasting problem. In this challenge, we are looking for further refinements to improve model accuracy. Please read the previous challenge statements to understand the background as well as the difference between “derived features” and “original variables”, which is very important.

Challenge Objectives

We have provided a set of models as enumerated below:

  • 312171 - Random Forest, Ridge, LASSO

  • 312698 - Prophet, SARIMAX, Mean Value, Linear

  • 312701 - VARMAX (Vector Autoregression Moving-Average with Exogenous Regressors)

  • 311181 - Random Forest

To forecast the following key business metrics (target variables to forecast) for 3 products (Tortoise, Rabbit, Cheetah):


- MAPE is based on the real test data for 4 months only.


With this challenge, we would like to improve the accuracy of the forecasts and reduce the MAPE to at least below a given threshold, and ideally below the target (bonus prizes are available for beating the target). The thresholds and targets are specific to each business metric. In addition:

  • We would like to create forecasts that mirror the trends of the actuals (given the limited data set)

  • We would like to identify the important features that drive the above forecasts which we would use to explain the models and generate what-if analysis on our end.

Challenge Details

The models provided above can be used:

  • As a reference, to be modified as needed

  • Or utilize aspects of these models and/or 

  • Implement new models to augment the existing models/algorithms for the above 5 metrics.

  • Use a specific modelling technique for each of the 5 metrics being forecasted


Please consider and document the following aspects when updating/creating models:

  • Understand the trend, cyclical variations, seasonality and fluctuations according to the season, and other irregular or random variations.  Note: These forecasts are known to be seasonal.

  • Data Preparation: If a supervised machine learning approach is being used, pay close attention to transforming the time series into a supervised learning problem, including transforming the time series to be stationary.

  • Missing values - need to be appropriately handled.

  • Outlier Detection and elimination based on type of outlier (outlier classification) detected, as outliers can lead to misleading outputs

  • Ensure the input data is standardized and normalized when being utilized with the developed/modified models. This aspect needs to be accurate so as not to introduce errors

  • Feature Engineering - Pay close attention to feature engineering and explain the same in detail

  • Feature Selection - Perform feature selection and explain the same in detail. Note that, in each model, the "derived features" might be different than the "original variables". For example, you may see a feature like “The previous/delta value of the variable X”. We would like to see importance analysis at the “original variable” level, instead of the “derived feature” level. We would like to see some conclusion like "variable X is critically impactful (positively or negatively) to variable Y". 

  • Hyper-parameter Optimization - Select one well-performing model and tune the model hyperparameters in order to further improve performance

  • Better Model Interpretation - Utilize SHAP Analysis as a minimum, and other feature importance algorithms (optional if insightful) to explain the model output. 
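The Data Preparation point above can be illustrated with a minimal sketch in plain Python. It assumes a single monthly series held in a list; the helper names `difference` and `make_supervised` and the sample values are made up for illustration and are not part of the provided models. First-order differencing is only one way to move a series towards stationarity; for seasonal monthly data a seasonal difference (for example `lag=12`) may be more appropriate.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag]; the result is `lag` items shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def make_supervised(series, n_lags):
    """Frame a series as (X, y): each row of X holds the n_lags previous values."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # inputs: the n_lags values before t
        y.append(series[t])             # target: the value at t
    return X, y


# Made-up monthly values, for illustration only.
monthly = [10, 12, 15, 19, 24, 30]
diffed = difference(monthly)              # [2, 3, 4, 5, 6]
X, y = make_supervised(diffed, n_lags=2)  # X = [[2, 3], [3, 4], [4, 5]], y = [4, 5, 6]
```

Forecasts produced on the differenced scale need to be cumulatively summed back onto the last observed level before MAPE is computed.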


Please consider these hypotheses, which are based on business knowledge and insight.

  • Tortoise, Rabbit, Cheetah are products that follow a Classic 'bell curve' product life cycle.  The three products are at different stages of the cycle

    • Tortoise (Closing) Closing base is in steady decline as the product is at the tail end of the Product Life cycle.  Customers wishing to upgrade would move to Rabbit.

    • Rabbit (Closing) Customer base is in steady growth but its rate of growth is slowing, and it may be topping out.  Customers wishing to upgrade would move to Cheetah.

    • Cheetah (Closing) Customer base is in rapid growth as it is at the early stage of the lifecycle with its growth rate accelerating.

  • The five target variables are subject to different business drivers and behave differently - See hypothesis sheet in datasheet for full detail.

    • Gross Adds, Leavers.  These variables are seasonal; variations from month to month are almost entirely dependent on the market and on competitor pressures at that point in time.

      • Gross Adds - primarily impacted by sales and marketing inputs

        • Acquisition Price for the product in question and the next adjacent product

        • Footprint.  Defines the potential market size for this product

        • Marketing Spend, NPS - to a lesser extent

      • Leavers - primarily impacted by factors causing dissatisfaction with the current product, or reasons to upgrade or switch providers

        • Brand NPS - level of customer satisfaction

        • Price increase - to the Customer base

        • Acquisition Price from competitors encouraging customers to leave their current provider.

    • ARPU, Closing Base, and Revenue.  These variables have a significant monthly recurring component, with only small incremental monthly changes (positive or negative) driving a gradual shift in the value of the variable.

      • ARPU - Changes in ARPU from month to month will be driven by the number of customers joining (Gross Adds) and their ARPU (Acquisition ARPU) and the number of customers leaving (Leavers) and their ARPU (Leavers ARPU). In addition, price increases (see below) have an impact over time.


New data sets introduced following previous iteration:

Three new or enhanced data sets have been added to this modeling iteration following a review of the previous forecast and discrepancies or apparent errors.  These datasets are known to be significant and are known to be correlated to forecasts.

  • Trading Weeks.  This is an artificial construct of the client which imposes a fixed number of whole weeks onto the trading calendar.  This means that each month is either 4 or 5 whole weeks, and is not 28, 30, or 31 calendar days. This definitely impacts the business metrics Gross Adds and Leavers.  The impact of Trading Weeks on the other business metrics will be significantly less.

  • Price Increase to the Closing base.  On a regular but infrequent basis, the client increases the price of their service to their existing, contracted customers.  This price increase directly impacts ARPU, Revenue and appears to drive up Leavers to a lesser extent.
    Note:  There will be both a ‘lead’ and a ‘lag’ impact of this variable on ARPU, Revenue, and Leavers.  This is because some customers will preemptively choose to leave before the price increase takes effect, while others will leave after the date/month of the price increase.  This proactive, as well as reactive, impact needs to be built into the model.

  • Footprint.  Cheetah footprint dataset has been extended from Nov 18 up to Mar 2019.  This variable is known to significantly impact Cheetah Gross Adds, Cheetah Revenue, and Closing Base because it dramatically increases the number of customers able to buy this superior product.  There may also be an impact on Rabbit product numbers as customers opt for the superior product as it becomes available.
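The ‘lead’ and ‘lag’ impact noted for the price increase can be exposed to a model as shifted copies of a monthly indicator. The following is a minimal sketch in plain Python, assuming a 0/1 indicator per month; the helper names, the window sizes, and the sample data are hypothetical. A lead feature uses knowledge of a future price increase, so it is only legitimate if the increase date is known in advance at forecast time, which is an assumption here.

```python
def shift(values, k, fill=0):
    """Shift a list by k steps: k > 0 gives a lag, k < 0 gives a lead."""
    n = len(values)
    if k == 0:
        return list(values)
    if k > 0:
        # Lag: the value seen at month t is the original value at month t - k.
        return [fill] * min(k, n) + list(values[:max(n - k, 0)])
    k = -k
    # Lead: the value seen at month t is the original value at month t + k.
    return list(values[min(k, n):]) + [fill] * min(k, n)


def lead_lag_features(indicator, max_lead=1, max_lag=2):
    """Build a dict of lead, current and lag copies of a monthly indicator."""
    feats = {}
    for k in range(-max_lead, max_lag + 1):
        if k < 0:
            name = f"lead_{-k}"
        elif k == 0:
            name = "current"
        else:
            name = f"lag_{k}"
        feats[name] = shift(indicator, k)
    return feats


# Made-up indicator: a price increase in the fourth month.
price_increase = [0, 0, 0, 1, 0, 0]
feats = lead_lag_features(price_increase)
# feats["lead_1"] -> [0, 0, 1, 0, 0, 0]  (month before the increase)
# feats["lag_1"]  -> [0, 0, 0, 0, 1, 0]  (month after the increase)
```

The same shifting approach could be applied to the trading-weeks count or the footprint series if their effect is expected to be delayed.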


Other points to keep in mind:

  • Univariate and multivariate analysis can be done for the metrics being forecast

  • Linear and non-linear model and forecasting techniques can both be utilized and optimized for accuracy

  • Variance of the forecast model error from the average should be minimized. For example, a variance decomposition or forecast error variance decomposition (FEVD) can be used to aid in the interpretation of a vector autoregression (VAR) model once it has been fitted

  • VAR - Vector Autoregression can be a possible technique

  • Neural networks such as AR-NN or RNN can be utilized, but ensure the data is scaled. For example, the default activation function for LSTMs is the hyperbolic tangent (tanh), which outputs values between -1 and 1. This is the preferred range for the time series data.
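As an illustration of the scaling point above, here is a minimal min-max sketch in plain Python that maps values into the range -1 to 1 and back. The function names and sample values are hypothetical. The bounds are fitted on the training data only, so that no information from the test period leaks into the scaler; test values outside the training range will fall outside [-1, 1].

```python
def fit_minmax(train):
    """Return (lo, hi) bounds learned from the training values only."""
    lo, hi = min(train), max(train)
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return lo, hi


def to_unit_range(values, lo, hi):
    """Map values linearly so that lo -> -1 and hi -> 1."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def from_unit_range(scaled, lo, hi):
    """Invert to_unit_range, returning values on the original scale."""
    return [(s + 1.0) / 2.0 * (hi - lo) + lo for s in scaled]


# Made-up training values, for illustration only.
train = [100.0, 150.0, 200.0]
lo, hi = fit_minmax(train)
scaled = to_unit_range(train, lo, hi)        # [-1.0, 0.0, 1.0]
restored = from_unit_range(scaled, lo, hi)   # [100.0, 150.0, 200.0]
```

Predictions made on the scaled data should be passed through `from_unit_range` before MAPE is calculated, since MAPE is defined on the original units.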


Training Data

The training data set covers all data before 18/19_Q4_Mar. Each row describes an item on a certain date as follows.

  • Generic Group

  • Generic Brand

  • Generic Product Category

  • Generic Product

  • Generic Variable

  • Generic Sub-Variable

  • Generic LookupKey

  • Units

  • Time Period (a month)


The items include metrics like revenue, volume base, gross adds, leavers, net migrations and average revenue per customer (see Background section) for Broadband for the Consumer market, also broken down to the Product level.


The ground truth file has the same number of rows, but only has one column, i.e., the revenue. You can use this data set to train and test your algorithm locally.


Testing Data

The testing data set covers a few months starting from 18/19_Q4_Mar until now. It has the same format as the training set, but no ground truth is provided.


You are asked to make predictions for the testing data. You will need to append a final column named “Value” to the testing data. The newly added column should be filled with your model’s predictions.


Time Horizon

Forecast period needed: 4 months ahead (multi-step, from Apr-2019 to Jul-2019), with code flexible enough to be extended to 12 months. Judging will be based on the 4-month forecasts only.
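One way to keep the horizon flexible is to make it a parameter of the forecasting routine rather than a fixed constant. The sketch below uses a seasonal-naive baseline (repeat the value from the same month one year earlier) purely as a stand-in for a real model; the function name, the 12-month season, and the sample history are assumptions for illustration.

```python
def seasonal_naive(history, horizon=4, season=12):
    """Forecast `horizon` steps by repeating the value from one season earlier."""
    history = list(history)
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    forecast = []
    for h in range(horizon):
        # Look one season back; earlier forecasts are reused once the
        # horizon extends beyond a full season.
        source = history + forecast
        forecast.append(source[len(history) + h - season])
    return forecast


# Made-up history of 24 monthly values: 1, 2, ..., 24.
history = list(range(1, 25))
four_months = seasonal_naive(history)                # [13, 14, 15, 16]
twelve_months = seasonal_naive(history, horizon=12)  # 13, 14, ..., 24
```

A baseline of this kind also gives a reference MAPE against which more elaborate models can be compared.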


Final Submission Guidelines

Submission Format

Your submission must include the following items:

  • The filled test data. We will evaluate the results quantitatively (See below)

  • A report about your model, including data analysis, model details, local cross validation results, and variable importance. 

  • Deployment instructions describing how to install the required libraries and how to run the code.

Expected in Submission

  1. Working Python code which works on the different sets of data in the same format

  2. Report with a clear explanation of all the steps taken to solve the challenge (refer to the section “Challenge Details”) and of how to run the code

  3. No hardcoding (e.g., column names, possible values of each column, ...) in the code is allowed

  4. All models in one code with clear inline comments 

  5. Calculation of MAPE in the code and representation in the tabular format/spreadsheet

  6. SHAP Analysis (Variable Importance) as the recommended solution, or an alternative interpretation model if it is felt to be superior.  Results are needed in tabular/graphical format with the weightage (and explanation) for the target variables for the three products.  Variables need to refer to the original variable set, as opposed to any ‘derived’ or ‘transformed’ variable dataset.

  7. If reducing MAPE below the target/threshold is not possible, please focus on forecasts that follow the trend in the data

  8. Flexibility to extend the code to forecast for additional months
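Item 6 asks for importance at the original-variable level rather than the derived-feature level. One possible convention, sketched below in plain Python, is to take the mean absolute attribution of each derived feature (for example its SHAP values across samples) and sum these over all derived features that come from the same original variable. This aggregation rule is an assumption and not something the challenge prescribes; the feature names, mapping, and attribution values are made up. Taking absolute values discards the sign, so the direction of impact has to be reported separately.

```python
from collections import defaultdict


def aggregate_importance(attributions, feature_names, feature_to_variable):
    """Roll per-feature attributions up to original variables.

    attributions: list of rows, one per sample, one value per derived feature
    feature_names: derived feature names, in column order
    feature_to_variable: maps each derived feature to its original variable
    """
    n_samples = len(attributions)
    totals = defaultdict(float)
    for j, name in enumerate(feature_names):
        # Mean absolute attribution of this derived feature across samples.
        mean_abs = sum(abs(row[j]) for row in attributions) / n_samples
        totals[feature_to_variable[name]] += mean_abs
    # Most important original variable first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical derived features and attribution values.
feature_names = ["price_lag1", "price_delta", "nps_lag1"]
feature_to_variable = {
    "price_lag1": "Acquisition Price",
    "price_delta": "Acquisition Price",
    "nps_lag1": "Brand NPS",
}
attributions = [
    [0.5, -0.25, 0.25],
    [-0.5, 0.25, 0.75],
]
ranking = aggregate_importance(attributions, feature_names, feature_to_variable)
# ranking -> [("Acquisition Price", 0.75), ("Brand NPS", 0.5)]
```

In real code the mapping would be built while the derived features are generated, which also avoids hardcoding column names.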

Quantitative Scoring

Given two values, one groundtruth value (gt) and one predicted value (pred), we define the relative error as:


    MAPE(gt, pred) = |gt - pred| / gt


We then compute the raw_score(gt, pred) as


    raw_score(gt, pred) = max{ 0, 1 - MAPE(gt, pred) }


That is, if the relative error exceeds 100%, you will receive a zero score in this case.


The final score is computed based on the average of raw_score, and then multiplied by 100.

Final score = 100 * average( raw_score(gt, pred) )


We will use this as a part of evaluation.
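The scoring formulas above translate directly into a few lines of Python. This is a minimal sketch for local evaluation, not the official scorer: the function names and sample numbers are illustrative, and it assumes every ground-truth value is positive, since the statement does not say how zero or negative values are handled.

```python
def relative_error(gt, pred):
    """Relative error |gt - pred| / gt for one ground-truth/prediction pair."""
    # Assumption: gt > 0, otherwise the ratio is undefined or negative.
    return abs(gt - pred) / gt


def raw_score(gt, pred):
    """1 minus the relative error, floored at 0."""
    return max(0.0, 1.0 - relative_error(gt, pred))


def final_score(gts, preds):
    """Average raw score over all pairs, multiplied by 100."""
    scores = [raw_score(g, p) for g, p in zip(gts, preds)]
    return 100.0 * sum(scores) / len(scores)


# Made-up values: relative errors are 10%, 10% and 200%.
gts = [100.0, 200.0, 50.0]
preds = [110.0, 180.0, 150.0]
score = final_score(gts, preds)  # raw scores 0.9, 0.9, 0.0 give about 60
```

Note that a single prediction with more than 100% error contributes zero rather than a negative value, so the final score is bounded between 0 and 100.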

Judging Criteria

Your solution will be evaluated using a hybrid of quantitative and qualitative criteria.

  • Effectiveness (50%)

    • We will evaluate your forecasts by comparing them to the ground truth data. Please check the “Quantitative Scoring” section for details.

    • Your model must achieve a better MAPE than the threshold MAPE.

  • Clarity (20%)

    • The model is clearly described, with reasonable justifications about the choice.

  • Reproducibility (20%)

    • The results must be reproducible. We understand that there might be some randomness for ML models, but please try your best to keep the results the same or at least similar across different runs.

  • Completeness (10%)

    • The submission must be complete. 

    • The winners should also provide the variable importance analysis for their models. Specifically, we are looking for the 5 to 10 most important variables to support a “what if” functionality. In future applications, we will provide a slider for each of these features to the end user, so they can experiment with the feature values.


Reliability Rating and Bonus

For challenges that have a reliability bonus, the bonus depends on the reliability rating at the moment of registration for that project. A participant with no previous projects is considered to have no reliability rating, and therefore gets no bonus. Reliability bonus does not apply to Digital Run winnings. Since reliability rating is based on the past 15 projects, it can only have 15 discrete values.


Final Review:

Community Review Board


User Sign-Off


Review Scorecard