The objective of this challenge is to generate time-series forecasts with the most accurate predictions possible for the 5 financial variables outlined below only.
The accuracy of the forecast must at least improve on the Threshold targets quoted for each variable - targets based on the results of a previous challenge.
The model should be tailored to a 12-month forecast horizon but must be extendable beyond this time period.
The accuracy of a prediction will be evaluated using MAPE (Mean Absolute Percentage Error) on the privatised data set over a period of 6 months and on the maximum Absolute Percentage Error on any one month over this same period.
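The two evaluation measures described above can be sketched as follows. This is a minimal illustration, not the official scoring code: the function names and the six sample values are hypothetical, and the sketch assumes no actual value is exactly zero.

```python
def absolute_percentage_errors(actuals, forecasts):
    """Return the absolute percentage error (in %) for each month."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    # Assumption: no actual value is zero, otherwise the ratio is undefined.
    return [abs((a - f) / a) * 100.0 for a, f in zip(actuals, forecasts)]


def mape(actuals, forecasts):
    """Mean Absolute Percentage Error over the evaluation window."""
    apes = absolute_percentage_errors(actuals, forecasts)
    return sum(apes) / len(apes)


def max_ape(actuals, forecasts):
    """Largest Absolute Percentage Error on any single month."""
    return max(absolute_percentage_errors(actuals, forecasts))


# Hypothetical six-month evaluation window (illustrative values only).
actuals = [100.0, 110.0, 120.0, 125.0, 130.0, 128.0]
forecasts = [98.0, 112.0, 117.0, 130.0, 126.0, 131.0]
print("MAPE (%):", round(mape(actuals, forecasts), 2))
print("Max APE (%):", round(max_ape(actuals, forecasts), 2))
```

Reporting both numbers together shows whether a low average error hides one badly missed month.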
A mobile network, referred to here as Sandesh, has been creating high-quality, high-accuracy forecasts for a number of key financial metrics.
The first challenge in this creation process was held in February 2020 and a complete set of time-series forecasts was developed. A selection of these now needs further improvement to achieve the required target accuracy over the evaluation period - neither the MAPE nor the prediction trends were sufficiently close to the actuals.
The five target variables are financial metrics:
(Mobile Data) Revenue - Panther - the revenue generated by the subscriber base per month for the ‘airtime and data’ service for product Panther. The initial prediction failed to anticipate the flattening performance from Apr ‘19.
The training data set provided now has been extended forward to Aug ‘19.
Average Revenue per user (ARPU) - Panther - the average monthly revenue paid by a subscriber per month for their ‘airtime and data allowance’ service.
The initial prediction failed to model the steady decline from Apr ‘19, and potentially starting from May ‘18.
The training data set provided now has been extended forward to Aug ‘19.
Leavers (aka Churn) - Leopard and Panther - the number of subscribers per product who terminated service with the brand during that month.
The initial predictions failed to anticipate the steady climb in volumes from Apr ‘19 onwards.
The data set has been extended to Aug ‘19. Please see ‘Switching - new regulation’ section for further insight.
Gross adds – Panther - the number of new subscribers to Panther joining the brand during a month.
A completely new data set for this variable has now been provided, invalidating the previous predictions.
Additional data sets are included to provide a range of independent variables that may prove to be valuable in forecasting the 5 target variables. A complete list is included with the data set, but a selected number are described here:
Brand Net Promoter Score (Brand NPS). A measure of customer satisfaction with the product based on customers’ willingness to recommend the product to their friends and colleagues.
Leavers - Number of customers about to leave each month. This is the number of customers in the final month of their contract. These customers will therefore be legally allowed to leave the following month, or upgrade their contract without punitive charges. This metric is a strong indicator of ‘Leavers’ volumes - please see Business insight section.
This metric is closely linked to Gross Adds volumes from 24 months earlier.
Out of Contract %. This is a measure of the proportion of the customer base that is no longer within its contract period. These customers are legally allowed to leave without cost, and this measure will therefore have a clear link to ‘Leavers’. For historic reasons, this measure should only be considered from April 2016 onwards.
Switching - New regulation
In July 2019 the regulations changed and it became easier for customers to leave (churn). This change in market conditions needs to be modeled into the predictions.
The underlying Churn performance has dramatically increased across both Products and needs to be reflected in any predictions.
In June and July, the actuals were distorted by the introduction of the regulations. June performance is believed to have been reduced by a temporary drop in demand, and July performance increased as a result of the pent-up demand. These two months can be treated as outliers - any treatment needs to be clearly documented, and reproducible as part of the model when used on the real data set.
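One possible documented and reproducible treatment is sketched below. It is an assumption, not a method prescribed by the challenge: the two distorted months are replaced by their mean, so the demand that is believed to have moved from June into July is spread evenly while the two-month total is preserved. The function name, the 'YYYY-MM' keys, and the sample values are hypothetical, and the months are assumed to be June and July 2019.

```python
def smooth_regulation_months(series, months=("2019-06", "2019-07")):
    """Return a copy of `series` with the listed months replaced by their mean.

    `series` maps 'YYYY-MM' strings to Leavers volumes. The input is left
    untouched so that the raw actuals remain available for auditing.
    """
    missing = [m for m in months if m not in series]
    if missing:
        raise KeyError(f"months not present in series: {missing}")
    mean_value = sum(series[m] for m in months) / len(months)
    adjusted = dict(series)
    for m in months:
        adjusted[m] = mean_value
    return adjusted


# Hypothetical Leavers volumes (illustrative values only).
raw = {"2019-05": 10.0, "2019-06": 6.0, "2019-07": 16.0, "2019-08": 12.0}
adjusted = smooth_regulation_months(raw)
print(adjusted)
```

Because the adjustment is a pure function of the input series, the same step can be rerun unchanged on the real data set.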
Training data has therefore been provided until Aug 2019.
Factors impacting Leavers
It is important to note that a number of secondary factors will be contributing to the Churn performance. These factors, if included, may improve accuracy. In addition to the change in switching regulations, the following factors may cause changes in Churn:
Out of Contract %. Describes the proportion of customers eligible to leave.
Leavers - Number of customers about to churn per month. Describes the number of customers about to come out of contract and drive an increase in ‘Out of Contract%’
Gross Adds. The Gross Adds volume from two years previously for Leopard, and from one year previously for Panther, closely defines the ‘Number of customers about to churn per month’, as these customers come to the end of their contract period.
Upgrades - volume of existing customers changing their contract (not shown on the above graph)
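The relationship between Gross Adds and later Leavers described in the list above can be turned into a lagged feature. The sketch below assumes monthly series held as plain lists in chronological order; the names `CONTRACT_LAG_MONTHS` and `lagged` and the sample values are hypothetical.

```python
# Lags taken from the description above: two years for Leopard, one year for Panther.
CONTRACT_LAG_MONTHS = {"Leopard": 24, "Panther": 12}


def lagged(values, lag):
    """Shift a monthly series forward by `lag` months.

    The first `lag` positions have no history, so they are filled with None.
    The result always has the same length as the input.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    values = list(values)
    padded = [None] * lag + values
    return padded[:len(values)]


# Hypothetical Panther Gross Adds for 15 consecutive months (illustrative values only).
panther_gross_adds = [50, 52, 55, 53, 60, 58, 57, 61, 63, 59, 62, 64, 66, 65, 70]
panther_feature = lagged(panther_gross_adds, CONTRACT_LAG_MONTHS["Panther"])
print(panther_feature)
```

Rows where the feature is None would need to be dropped or imputed before model fitting; which option is better depends on how much history is available.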
Financial Year modeling:
Sandesh reports its financial year from April to March. This may contribute to seasonality based on the financial year and its quarters (ending Jun, Sep, Dec, and Mar), rather than on the calendar year.
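Financial-year seasonality can be exposed to a model through simple calendar features. The sketch below assumes the financial year is labelled by the calendar year in which it starts, which the brief does not specify; the function name and the returned keys are hypothetical.

```python
from datetime import date


def financial_period(d, start_month=4):
    """Describe where date `d` falls in a financial year starting in `start_month`."""
    months_into_fy = (d.month - start_month) % 12  # 0 for April ... 11 for March
    # Assumption: the financial year is labelled by the year in which it starts.
    fy_start_year = d.year if d.month >= start_month else d.year - 1
    return {
        "fy_start_year": fy_start_year,
        "fy_month": months_into_fy + 1,          # 1 = April, 12 = March
        "fy_quarter": months_into_fy // 3 + 1,   # Q1 = Apr-Jun ... Q4 = Jan-Mar
        "is_quarter_end": months_into_fy % 3 == 2,  # Jun, Sep, Dec, Mar
    }


print(financial_period(date(2019, 8, 1)))
print(financial_period(date(2020, 3, 1)))
```

The `is_quarter_end` flag gives a model a direct handle on any quarter-end effects without having to learn them from the month number alone.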
Anonymised and Privatised data set:
‘Z-score’ is used to privatise the real data.
For all the variables, the following formula is used to privatise the data:
z_i = (x_i - μ) / σ
where z_i = z-score of the i-th value for the given variable
x_i = actual value
μ = mean of the given variable
σ = standard deviation for the given variable
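The transformation and its inverse can be sketched as follows. The brief does not say whether σ is the population or the sample standard deviation; the sketch assumes the population form. The function names and sample values are hypothetical, and the real μ and σ are not available to participants.

```python
from statistics import fmean, pstdev


def privatise(values):
    """Return (z_scores, mu, sigma) for one variable."""
    mu = fmean(values)
    sigma = pstdev(values)  # Assumption: population standard deviation.
    if sigma == 0:
        raise ValueError("a constant series cannot be z-scored")
    return [(x - mu) / sigma for x in values], mu, sigma


def deprivatise(z_values, mu, sigma):
    """Invert the transformation: x_i = z_i * sigma + mu."""
    return [z * sigma + mu for z in z_values]


# Hypothetical real-scale values (illustrative only).
real = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z, mu, sigma = privatise(real)
print(z, mu, sigma)
print(deprivatise(z, mu, sigma))
```

Note that a z-scored series has mean zero, so some privatised values will sit at or near zero even when the real values are all large and positive.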
Targets and Thresholds
The performance of the models on privatised data is not directly correlated with performance on Real Data. Please use the provided foundation model - 351686, as a benchmark.
Your submission will be judged on two criteria.
Minimizing error (MAPE).
Achieving the Thresholds and Targets designated in the tables above.
It is recommended to optimise the models to minimise RMSE rather than MAPE, because the privatisation method used (see the ‘Anonymised and Privatised data set’ section) can distort the error analysis.
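The reason for this recommendation can be illustrated with a small numeric sketch. Under z-scoring, every error is divided by the same σ, so RMSE on privatised data is simply the real-scale RMSE divided by σ. MAPE, by contrast, divides each error by the privatised actual, which can be close to zero. The values of `mu` and `sigma` and the sample data below are hypothetical.

```python
from math import sqrt


def rmse(actuals, forecasts):
    """Root Mean Squared Error."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals))


def mape(actuals, forecasts):
    """Mean Absolute Percentage Error in %, assuming no actual is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actuals, forecasts)) / len(actuals)


def to_z(values, mu, sigma):
    """Apply the z-score transformation with a given mean and standard deviation."""
    return [(x - mu) / sigma for x in values]


# Hypothetical privatisation parameters and real-scale data (illustrative only).
mu, sigma = 100.0, 10.0
real_actuals = [99.0, 101.0, 120.0]
real_forecasts = [97.0, 103.0, 118.0]

z_actuals = to_z(real_actuals, mu, sigma)
z_forecasts = to_z(real_forecasts, mu, sigma)

print("RMSE real:       ", rmse(real_actuals, real_forecasts))
print("RMSE z x sigma:  ", rmse(z_actuals, z_forecasts) * sigma)
print("MAPE real (%):   ", mape(real_actuals, real_forecasts))
print("MAPE z-score (%):", mape(z_actuals, z_forecasts))
```

In this example the forecast misses every month by the same real-scale amount, yet the privatised MAPE is dominated by the two months whose actuals lie close to the mean.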
The details will be outlined in the Quantitative Scoring section below.