Predictive Analytics: How Big Data Gets Smarter
Predicting the Future via Past Sets of Big Data
TopCoder runs a special type of contest called the Marathon Match. These long-form algorithmic challenges are created for problems that are larger and typically more complex, and that therefore require more time for outstanding solutions to be developed and submitted. That is the case for a challenge the TopCoder Community is now tackling: predicting violent crime and major health emergencies in the city of Boston, MA.
Much of the skill in creating value from Big Data through Open Innovation, which is powered by exceptional algorithmic solutions, lies in how you set a problem up to be digested by a community (or crowd; we at TopCoder prefer the term community). To see the level of preparation that goes into readying a challenge such as this for competition, we invite you to view the Boston Risk Prediction Problem Statement. Please note that you will have to log in to the TopCoder Arena to view the problem statement, but for those interested in seeing Open Innovation in action, it is worth your time. Below is a snippet from the problem statement.
“The aim of this contest is to produce an algorithm that can identify addresses, streets and neighborhoods in the city of Boston, Massachusetts in that are – at a given time – at risk for events of major public concern, like health emergencies or violent crime. Predictions will be based on information contained in the City of Boston’s administrative data. These data include sources that update regularly, like citations from Inspection Services (e.g. housing code violations) or public complaints, and those that are more static, like geography data or local demographic characteristics. The predictions are evaluated on different spatial and temporal resolutions. The more accurate a prediction is for time, type and location more useful it becomes for the City – and the higher the algorithm will score.”
Also from the problem statement:
“Neighborhood characteristics have long been noted to have direct, causal impacts on individual and community health. Violence, blight, degraded infrastructure and sub-standard housing all affect a neighborhood’s ability to thrive and are therefore major public health concerns. Though the presence of such conditions strongly predicts a variety of negative health outcomes in residents, the algorithm solicited here would specify the spatial and temporal patterns that underlie these relationships. This could then inform policy measures that intend to make public services more responsive to risk factors, and more effective in preventing major health crises.”
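To make the task described in the problem statement a little more concrete, here is a minimal sketch of the simplest possible baseline: score each area by how often a given event type occurred there in the past, then rank areas by that score. Everything in it is an assumption for illustration. The area labels, the record layout, and the function names `risk_scores` and `rank_areas` are invented; the actual data format and scoring rules are defined in the problem statement and are not reproduced here. Competitive submissions would be expected to do far better than this.

```python
from collections import Counter

# Hypothetical past incident records: (area, week_number, event_type).
# These rows are invented for illustration; they are not City of Boston data.
PAST_EVENTS = [
    ("area_a", 1, "health_emergency"),
    ("area_a", 2, "violent_crime"),
    ("area_a", 3, "violent_crime"),
    ("area_b", 1, "violent_crime"),
    ("area_b", 3, "health_emergency"),
    ("area_c", 2, "health_emergency"),
]


def risk_scores(events, event_type, num_weeks):
    """Return the average weekly count of event_type for every area seen."""
    counts = Counter(area for area, _week, kind in events if kind == event_type)
    areas = {area for area, _week, _kind in events}
    # Areas with no matching events get a score of 0 (Counter returns 0).
    return {area: counts[area] / num_weeks for area in areas}


def rank_areas(scores):
    """Sort areas from highest to lowest score, breaking ties by name."""
    return sorted(scores, key=lambda area: (-scores[area], area))


scores = risk_scores(PAST_EVENTS, "violent_crime", num_weeks=3)
ranking = rank_areas(scores)
print(ranking)
```

A historical-frequency baseline like this ignores the regularly updating sources the problem statement mentions, such as code violations and public complaints. Its main use is as a reference point against which a more sophisticated predictor can be compared.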
Why Use Open Innovation to Better Solutions Born From Big Data?
A picture is worth a thousand words, and in this era of Instagram that valuation can probably be revised upward. Take a look at the screen capture below, taken on 11/26/2012:
This is why industries, enterprises and governments are turning to Open Innovation and Crowdsourcing to drive innovation in this era of Big Data and predictive analytics. With 14 days remaining in this Marathon Match, 33 competitors have already submitted 191 different potential solutions. As we discussed in the recent TopCoder article “What You Really Want From Open Innovation and Crowdsourcing”, in the space of algorithms and analytics you are really after extreme-value outcomes and new discoveries that no other methodology could bring to light so rapidly.
To predict the exact outcome of this Open Innovation challenge, we may very well need those “Pre-Cogs”. But at TopCoder we have seen time and again that defining a challenge in exacting detail, properly exposing the corresponding data sets, and then inviting the world’s top algorithmists, data scientists, and mathematicians to compete to create the “smartest” solution is an exceptional way to create tremendous new value from existing sets of data.