The best 6 performers of this contest with a positive score (according to system test results) will receive the following prizes:
The top 3 solutions of the first match are available here. Please feel free to use anything from their submissions.
QuakeFinder (QF) has been involved for over 12 years in research on several electromagnetic indicators that may eventually provide short-term (weeks to days) warning for large seismic events. These electromagnetic signals or indicators appear to be related to stress-induced release of charge conductors (p-hole carriers) deep in the earth, near future earthquake hypocenters. See this document. The indicators manifest themselves as unusual magnetic (unipolar) pulses, large increases in air conductivity near the future epicenter, "earthquake lights" (extreme cases of air ionization), and apparent infrared patterns seen near the epicenter as detected via various IR weather and environmental satellites. There may also be other manifestations that we are currently not aware of. You will be provided with several specific data sets (e.g. 3-axis induction magnetometer channels and an air conductivity sensor channel) from ground instruments near several earthquakes in California and Peru.
Your task is to develop a software algorithm to uniquely identify the electromagnetic pulses that may precede an earthquake by days to weeks.
Each data set contains measurements from one site. The site provides 3 channels of information, measured at a frequency of 32 or 50 samples per second. Hourly data for the channels will be given to your algorithm. Your algorithm should return the odds ratio of an earthquake event happening for every coming hour at each site, for a period of 90 days.
Because this is real, raw magnetometer data, it may contain many signals whose origins are unrelated to earthquakes. Many of those signals may have an even stronger amplitude than the signals from the events themselves. Here are a few examples of known signals of that type: vehicle engines, lightning, solar flares, electrical interference, magnetometer resets, ...
You should implement the init method, which provides you with information about the sample rate. The init method will be called once for every test case. sampleRate contains the number of samples (H) measured each second.
You should implement the forecast method. The method will be called once for each hour of data. hour contains the zero-based index of the hour to which the data belongs. The data array contains the measurements for all channels for the specific hour. The array will contain H*3600*3 elements. The ith measurement for channel c will be at data[c*(H*3600) + i]. The values in the data array are in the range [0,2^24]. Measurements are not always available: a value of -1 in the data array indicates that the measurement is not provided. K contains the planetary magnetic activity index at the given hour and is in the range [0,10].
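The indexing rule above can be sketched as follows. This is a minimal illustration, not part of the official interface: the helper names `split_channels` and `channel_mean` are hypothetical, and mapping the -1 sentinel to `None` is just one possible way to handle missing measurements.

```python
from typing import List, Optional

SECONDS_PER_HOUR = 3600
NUM_CHANNELS = 3
MISSING = -1  # sentinel for an unavailable measurement, per the statement


def split_channels(data: List[int], sample_rate: int) -> List[List[Optional[int]]]:
    """Split the flat hourly array into one list per channel.

    Sample i of channel c sits at data[c * (H * 3600) + i], where H is the
    sample rate. Missing samples (-1) are mapped to None (an assumed choice).
    """
    per_channel = sample_rate * SECONDS_PER_HOUR
    if len(data) != per_channel * NUM_CHANNELS:
        raise ValueError("unexpected data length for this sample rate")
    channels = []
    for c in range(NUM_CHANNELS):
        start = c * per_channel
        block = data[start:start + per_channel]
        channels.append([None if v == MISSING else v for v in block])
    return channels


def channel_mean(values: List[Optional[int]]) -> Optional[float]:
    """Mean of the available samples, or None if the whole hour is missing."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None
```

Separating missing samples early keeps the -1 sentinel from being mistaken for a real reading in later statistics.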
You should return an array N of size 2160 (90 days of 24 hours). Each value in the array should be the odds ratio of an earthquake event happening at time t. In other words, if h is the value of hour passed to the call, the value at return[t] should contain the odds ratio of an earthquake event at the site at hour h+t+1. (Important: the output has changed compared to the first match.)
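A skeleton of the two methods might look like the sketch below. The class name `QuakePredictor`, the exact parameter lists, and the return value of `init` are assumptions inferred from the prose; the real signatures may carry additional arguments. The constant forecast of zeros is only a placeholder, not a suggested strategy.

```python
from typing import List

FORECAST_HOURS = 2160  # 90 days * 24 hours, the required length of N


class QuakePredictor:
    """Hypothetical class name; signatures are assumed from the statement."""

    def init(self, sample_rate: int) -> None:
        # Called once per test case; remember H, the samples per second.
        self.sample_rate = sample_rate

    def forecast(self, hour: int, data: List[int], k: float) -> List[float]:
        # Placeholder: return a flat forecast of zeros for the next 2160 hours.
        return [0.0] * FORECAST_HOURS


def target_hour(hour: int, t: int) -> int:
    """Absolute hour that entry t of the returned array refers to (h + t + 1)."""
    return hour + t + 1
```

For a call with hour 768, entry 0 of the returned array refers to hour 769 and the last entry (index 2159) to hour 2928.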
Each test case will have either one earthquake event or none. For event-positive test cases, your forecast method will stop being called once the event occurs. For event-negative test cases, your forecast method will stop being called at a randomly selected moment before the end of the data set.
Your forecast will be scored from day 32 (hour 768) onwards, until the earthquake event occurs. Your returned array N will not be normalized. Let Z=9. Let G be the entry in the array N that corresponds to the hour at which the actual event happened. If the test case contains an earthquake event, your score (F) for a single hour will be:
F = 2 * G - (Sum of squared values in N) / (Z * sizeof(N)) - 1
If the test case contains no earthquake event, let W[h] be a hidden weight for hour h. Your score for a single hour will then be:
F = - (Sum of squared values in N) / (Z * sizeof(N)) * W[h]
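For offline experiments, the two per-hour formulas can be sketched as below. The function names are hypothetical, sizeof(N) is read as the number of entries in N, and `event_index` is assumed to be the position t in N for which h+t+1 equals the event hour. The weights W[h] are hidden, so any value passed as `weight` is a guess.

```python
from typing import Sequence

Z = 9  # constant given in the statement


def penalty(n: Sequence[float]) -> float:
    """Sum of squared values in N divided by Z * sizeof(N)."""
    return sum(v * v for v in n) / (Z * len(n))


def score_event_hour(n: Sequence[float], event_index: int) -> float:
    """Per-hour score F when the test case contains an event."""
    g = n[event_index]  # G: the entry of N at the hour of the actual event
    return 2.0 * g - penalty(n) - 1.0


def score_no_event_hour(n: Sequence[float], weight: float) -> float:
    """Per-hour score F when the test case has no event; weight stands in for W[h]."""
    return -penalty(n) * weight
```

Under these formulas, an all-zero forecast earns -1 for each scored hour of an event-positive case and 0 for each hour of an event-negative case.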
Your raw score for a test case will then be the sum of all the F scores. Finally, your total score is equal to the sum of raw scores on all test cases. If your system test total score is negative, you will receive a zero system test total score. Your total score shown on the leaderboard will be divided by the total number of hours for which F is calculated on event-positive test cases. Finally, it will be multiplied by 1000000.
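The aggregation described above can be approximated offline as in the sketch below. The function name and argument layout are hypothetical, and the order of operations (clamp negative totals to zero, then normalize and scale) is an assumption, since the statement does not spell it out.

```python
from typing import Sequence


def leaderboard_score(raw_scores: Sequence[float], scored_event_hours: int) -> float:
    """Combine per-test-case raw scores into a leaderboard-style number.

    raw_scores: one raw score (sum of F values) per test case.
    scored_event_hours: total number of hours for which F was calculated
        on event-positive test cases.
    """
    total = sum(raw_scores)
    total = max(total, 0.0)  # assumed: a negative total is reported as zero
    return total / scored_event_hours * 1_000_000
```

For example, raw scores of 1.5 and -0.5 over 100 scored event-positive hours give a value of about 10000.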
The training data sets consist of 50% of all the sets we have available for the contests. The data can be accessed here.
A visualization tool is provided for offline testing and can be downloaded here.
- The time limit is 60 minutes per test case (this includes only the time spent in your code).
- The memory limit is 4096 megabytes.
- There is no explicit code size limit. The implicit source code size limit is around 1 MB (it is not advisable to submit code of a size close to that or larger). Once your code is compiled, the binary size should not exceed 1 MB.
- The compilation time limit is 30 seconds. You can find information about the compilers that we use and the compilation options here.
This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2020, TopCoder, Inc. All rights reserved.