PINS Master: Extracting OI Ionogram Parameters

Challenge Overview

Introduction

Despite starting nearly 100 kilometers above the surface of the Earth, the ionosphere plays an active role in our day-to-day lives affecting High-Frequency (HF) Radio propagation. International air traffic controllers, oceanographers using surface wave radars, the space launch community, and many others are all affected by electron density distribution in the ionosphere. The ionosphere lets you hear distant AM radio stations in your car but it can also affect the quality of long-range air traffic control communications.

Modeling the impacts of the ionosphere on HF Radio can be a significant challenge. Installing and operating ionospheric bottomside sounding systems, called ionosondes, requires a large amount of electricity, human resources, and the construction of an entire infrastructure of high-profile antennas. However, passively receiving a characterized or non-characterized sounder transmission is considerably more convenient. It requires a fraction of the power and resources, and utilizes lower-profile equipment that can be installed temporarily. 

The IARPA Passive Ionospheric Non-Characterized Sounding (PINS) Challenge is an open innovation competition that asks Solvers to develop an algorithm that characterizes, monitors, and models ionospheric variation effects on high frequency emissions. The PINS Challenge invites Solvers from around the world to develop innovative solutions that can lead to a greater understanding of the ionosphere and the effects it has on our technology.

Solvers are challenged to characterize the ionosphere with selected digitized radio-frequency (RF) spectrum recordings from sounder receiver data, but not any transmitter data. The PINS Challenge takes place in two stages: Explorer and Master. Part I, the Explorer challenge, dealt with the detection and characterization of sounder signals in vertical incidence. Part II, the Master challenge, focuses on a similar task in oblique incidence. This document is the problem specification of Part II, the Master challenge.

For solvers who participated in the Explorer challenge, the most significant changes are briefly highlighted below; details follow later in this document. Naturally, you are advised to read the whole specification carefully, as other important details may also have changed since the previous challenge.

  • All data comes from live RF recordings; there are no simulated signals.

  • Different set of parameters to extract.

  • No explicit scoring for detecting presence / absence of layers.

  • Support for multiple sets of extracted parameters per test case.

  • Submission must contain annotated ionogram images.

Task overview

In these challenges your task is to process In-phase/Quadrature voltages (known as I/Q signals, see the References section at the end of this document for more information) measured by a broadband HF antenna that is connected to a set of Software Defined Radios (SDR). The SDRs were time-synchronized with Global Positioning System (GPS) timing. All PINS I/Q recordings were made at several locations in the United States. Within these recordings are many unique signals with a wide variety of modulations and signal strengths, and the recorded signals propagated between sites by ground wave and/or skywave.

The training data sets will have a good distribution of 'easy' and 'hard' signal environments. Using these data sets, the noise and interference environments can be gradually escalated in order to improve the performance of your algorithm.

In the current challenge (Master Challenge), you are asked to calculate ionospheric parameters from samples of I/Q data collected in oblique incidence (OI) measurements. The goal of the Master Challenge is to derive a specification of the HF sky-wave environment (the bottom-side ionosphere) across a longer circuit. This will be accomplished through passive reception of active sounders at oblique incidence. It requires you to create algorithms that determine ionospheric characteristics from Oblique Incidence datasets. The parameters of interest are presented below in Figure 1.

 Figure 1:  OI Ionogram Parameters

Extracting these ionogram parameters requires you to extract an OI sounder's parameters from the given I/Q data. For linear sweep sounding you will need to determine the sounder's sweep rate and start time. For pulsed soundings derived parameters also include - among other variables - the number of pulse repeats per frequency and inter-pulse period. See [1] and [2] for a detailed list of sounder parameters that you will need to work with and also for the range and possible values of each sounder parameter.
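To illustrate why the sweep rate and start time matter for linear sweep soundings, here is a minimal sketch of "dechirping", a common way to process linear-FM sounder signals. It is not necessarily the method described in [2]. A local replica of the sweep is mixed against the received signal; a propagation delay then shows up as a constant beat frequency equal to the sweep rate times the delay. All names and the toy numbers (200 Hz sampling rate, 100 Hz/s sweep rate, 0.2 s delay) are assumptions chosen so that the example runs quickly in pure Python; real recordings use MHz-scale rates and millisecond-scale delays.

```python
import cmath
import math

SPEED_OF_LIGHT_KM_PER_S = 299792.458


def chirp(n_samples, fs_hz, rate_hz_per_s, delay_s=0.0):
    """Complex linear-FM sweep starting at 0 Hz, delayed by delay_s seconds."""
    samples = []
    for n in range(n_samples):
        t = n / fs_hz - delay_s
        samples.append(cmath.exp(1j * math.pi * rate_hz_per_s * t * t))
    return samples


def beat_frequency_hz(reference, received, fs_hz):
    """Mix the reference against the received sweep; return the strongest DFT bin in Hz."""
    mixed = [s * r.conjugate() for s, r in zip(reference, received)]
    n = len(mixed)
    best_bin, best_mag = 0, -1.0
    for k in range(n):  # plain O(n^2) DFT, acceptable for a toy example
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(mixed))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    if best_bin > n // 2:  # map the upper half of the bins to negative frequencies
        best_bin -= n
    return best_bin * fs_hz / n


def group_path_km(delay_s):
    """Group path length that corresponds to a propagation delay."""
    return SPEED_OF_LIGHT_KM_PER_S * delay_s


fs, rate, true_delay = 200.0, 100.0, 0.2  # toy values, not from the data set
reference = chirp(200, fs, rate)
received = chirp(200, fs, rate, delay_s=true_delay)
estimated_delay = beat_frequency_hz(reference, received, fs) / rate
print(estimated_delay)
```

An error in the assumed start time shifts every estimated delay by the same amount, and an error in the assumed sweep rate smears the beat tone across frequency, which is why both parameters have to be recovered before ionogram traces can be read off.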

For each test case (i.e. a continuous sample of raw HF I/Q data) you are given a description of the sounding (e.g. whether it originates from a continuous linear sweep or from a coded pulse sounding). The first challenge contained real (i.e. observed, measured) data as well as synthetic data. In the current challenge you have to work only with real data.

For each test case your algorithm must calculate the following basic ionospheric parameters:

  • JF 

  • h’F2 ([8], page 54)

  • fmaxE ([8], page 53) 

Successful extraction of these parameters requires that your code identifies and segregates the E / F1 / F2 layers.

The quality of your algorithm will be judged by how closely these extracted parameters match those determined by domain experts. See Scoring for details.

Input files

Training and provisional test data can be downloaded from this AWS S3 bucket:

Bucket name: pins2-data-public
Access key ID: AKIAT52LNUDUK6JUL7KN
Secret access key: tDTQEKRh1xvziTqxrQaBBqbLyNXEpWuhPEXm49fH

(If you are unfamiliar with AWS S3 technology then this document can help you get started. Note that the document was prepared for a different contest, but most of its content - like how to set up an AWS account or what are the tools you can use - is relevant in this challenge as well.)

Note that the total size of the training and provisional testing data is huge: ~2.93TB. We recommend that, before downloading the full data set, you familiarize yourself with the data by downloading a small subset, e.g. a single IQ file (between 8GB and 16GB). The access keys are the same as in the Explorer challenge; only the bucket name is different.

The /training folder of the pins2-data-public bucket has the following structure:

- list.txt: contains metadata about each IQ data file in CSV format. Each line of list.txt describes one IQ file. The lines are formatted as follows:

file-id,sounder-type,fmaxE,JF,h'F2

where

  • file-id is the unique identifier of the test case. 

  • sounder-type is either linear or pulsed

  • fmaxE is the critical frequency of the E layer measured in MHz. If the E layer is not present or the value can not be determined, it contains the value nan.

  • JF is the junction frequency of the F2 layer measured in MHz. In a small number of cases when the value can not be determined it contains nan.

  • h'F2 is the virtual height of the F2 layer measured in km. In a small number of cases when the value can not be determined it contains nan.

- <file-id>.bin: raw IQ data. See [1] for details on the format and content of the file. Two important pieces of metadata (note that these are different from the Explorer challenge): 

  1. Sampling rate. Linear samples have a sampling rate of 20 MHz. Pulsed samples have a sampling rate of 10 MHz. 

  2. Center frequency. This value is inherent to the recorder and it is what the antenna signals are mixed against. Linear samples have a center frequency of 12 MHz. Pulsed samples have a center frequency of 7 MHz.

- <file-id>.png: contains the ionogram generated from the corresponding raw IQ data. 

- sounding-params-linear.txt and sounding-params-pulsed.txt: contain sounder parameters of the measurements, for the linear sweep and pulsed test cases, respectively. These are two CSV files, containing one line of information for each sample. The format of the line depends on the sounder-type:

  • for linear soundings (sounding-params-linear.txt) the line contains these fields, in this order:

    • file-id,

    • start time: the time when the linear sweep starts, measured in seconds from the start of the file,

    • start frequency: initial frequency of the sounder, measured in Hz,

    • sweep rate: speed of the frequency sweep, in kHz per second,

    • end frequency: final frequency of the sounder, measured in Hz.

 
  • for pulsed soundings (sounding-params-pulsed.txt) the line contains these fields, in this order:

    • file-id,

    • start time: the time when the first pulse starts, measured in seconds from the start of the file,

    • inter pulse period: time difference between the start of two consecutive pulses, measured in seconds,

    • number of pulses per frequency,

    • start frequency: initial frequency of the sounder, measured in Hz,

    • frequency step: frequency step size, in Hz,

    • end frequency: final frequency of the sounder, measured in Hz,

    • polarization: one of {O, O/X},

    • phase shifting enabled: one of {true, false}.
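The field lists above mix units (Hz for frequencies, kHz per second for the sweep rate), which is easy to get wrong. The following minimal sketch parses the linear sweep file and derives the sweep duration and the transmit frequency at a given time. It also computes the frequency band covered by a recording, assuming complex I/Q sampling so that the band spans the center frequency plus or minus half the sampling rate. The class and function names and the example line are hypothetical; whether the real files contain a header line is not stated, so the parser simply skips lines it cannot convert to numbers.

```python
import csv
import io
from typing import NamedTuple, Optional


class LinearSounding(NamedTuple):
    file_id: str
    start_time_s: float          # seconds from the start of the file
    start_freq_hz: float
    sweep_rate_khz_per_s: float  # note: kHz per second, not Hz per second
    end_freq_hz: float

    def duration_s(self) -> float:
        """Time the sweep needs to go from the start to the end frequency."""
        rate_hz_per_s = self.sweep_rate_khz_per_s * 1e3
        return (self.end_freq_hz - self.start_freq_hz) / rate_hz_per_s

    def freq_at(self, t_s: float) -> Optional[float]:
        """Transmit frequency in Hz at t_s seconds into the file, or None outside the sweep."""
        dt = t_s - self.start_time_s
        if dt < 0 or dt > self.duration_s():
            return None
        return self.start_freq_hz + self.sweep_rate_khz_per_s * 1e3 * dt


def parse_linear_params(text: str) -> dict:
    """Parse the content of sounding-params-linear.txt into {file_id: LinearSounding}."""
    soundings = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        try:
            numbers = [float(value) for value in row[1:]]
        except ValueError:
            continue  # skip a header line, if one is present
        file_id = row[0].strip()
        soundings[file_id] = LinearSounding(file_id, *numbers)
    return soundings


def band_edges_hz(center_hz: float, sample_rate_hz: float) -> tuple:
    """Lowest and highest frequency covered by a complex I/Q recording."""
    return (center_hz - sample_rate_hz / 2, center_hz + sample_rate_hz / 2)


# Invented example line; the values are for illustration only.
example = "train-000,1.5,2000000,100,22000000\n"
sounding = parse_linear_params(example)["train-000"]
print(sounding.duration_s(), sounding.freq_at(101.5))
print(band_edges_hz(12e6, 20e6))  # linear samples
print(band_edges_hz(7e6, 10e6))   # pulsed samples
```

Under the stated assumption, linear samples cover 2 MHz to 22 MHz and pulsed samples cover 2 MHz to 12 MHz.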

 

The /testing folder has the following structure:

- list.txt: contains one piece of metadata about each full bandwidth IQ data file in CSV format: whether the sample's sounder type is linear or pulsed.

- <file-id>.bin: raw IQ data. Each such file corresponds to one test case, your task is to extract the required ionogram parameters from each of these IQ files.

Output files

1. Extracted parameters. The extracted fmaxE, JF and h'F2 values must be listed in a single CSV file. This file should contain the required ionogram parameters for every raw IQ file in the test set found in the AWS bucket referenced above. The file must be named solution.csv and have the following format:

file-id,fmaxE,JF,h'F2

where

  • file-id is the unique identifier of the test case,

  • fmaxE is the critical frequency of the E layer measured in MHz,

  • JF is the junction frequency of the F2 layer measured in MHz,

  • h'F2 is the virtual height of the F2 layer measured in km.

In contrast to the Explorer challenge, you can now list multiple {fmaxE,JF,h'F2} triplets for a test case, to address the fact that signals coming from more than one sounder may be present in the data. The maximum number of such triplets is 10. To include a second or further extracted parameter set, simply append the value triplets to the end of the line, comma-separated (see the sample lines below). See "Scoring" on how multiple extracted parameter sets are scored.

Your solution file may or may not include the above header line. The rest of the lines should specify the extracted parameters, one test case per line.

Sample lines:

test-001,10.2,11.73,499.9,13.44,15.2,521.0
test-002,12,13.887,480.2

Note the presence of two predicted sets of parameters in the first sample line.
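As a minimal sketch of producing lines in this format, the helper below joins a file-id with one to ten parameter triplets. The function names are hypothetical. The lower bound of one triplet is an assumption based on the requirement of at least one ionogram image per test case, and how an undeterminable value should be reported in solution.csv is not specified here, so the sketch does not handle that case.

```python
MAX_TRIPLETS = 10  # upper limit stated in the problem specification
HEADER = "file-id,fmaxE,JF,h'F2"


def format_solution_line(file_id, triplets):
    """Build one solution.csv line from a list of (fmaxE MHz, JF MHz, h'F2 km) triplets."""
    if not 1 <= len(triplets) <= MAX_TRIPLETS:
        raise ValueError(f"expected 1 to {MAX_TRIPLETS} triplets, got {len(triplets)}")
    fields = [file_id]
    for fmax_e, jf, h_f2 in triplets:
        fields.extend([str(fmax_e), str(jf), str(h_f2)])
    return ",".join(fields)


def format_solution(results, with_header=True):
    """Build the whole file content from {file_id: [triplet, ...]}."""
    lines = [HEADER] if with_header else []
    for file_id in sorted(results):
        lines.append(format_solution_line(file_id, results[file_id]))
    return "\n".join(lines) + "\n"


print(format_solution({
    "test-001": [(10.2, 11.73, 499.9), (13.44, 15.2, 521.0)],
    "test-002": [(12, 13.887, 480.2)],
}))
```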

2. Annotated ionograms. The master challenge seeks automated solutions. The challenge team expects an ionogram (in annotated picture form) for each set of ionospheric parameter answers.

Multiple sets of answers will not be accepted from the same ionogram. The challenge team may disqualify a solver’s entire solution set if the challenge team determines the solver is trying to “game the scoring system” instead of giving quantitative answers per sounding directly derived from the data.

If a solver finds only one sounding in a file, the solver should not attempt to produce more than one set of ionospheric parameter answers; doing so will risk disqualification and a zero for that test case.

Your submission must contain one or more ionogram images for each test case. The image must be named <file-id>-<n>.<ext>, where 

  • file-id is the unique identifier of the test case,

  • n is the 1-based ordinal number of the parameter set you found in the signal,

  • ext is either png or jpg.

The image must contain annotations for the nth parameter triplet that your algorithm extracted and that is listed in your solution.csv file for the corresponding test case. The style of the annotation is up to you, but it must be possible for a domain expert to correlate the annotations to the reported parameter triplets.

 

Submission format

This match uses a combination of the "submit data" and "submit code" submission styles. The required format of the submission package is specified in a submission template document. This current document gives only requirements that are either additional or override the requirements listed in the template.

  • You must not submit more than 3 times a day. The submission platform does not enforce this limit; it is your responsibility to comply with it. Not observing this rule may lead to disqualification.

  • An exception to the above rule: if your submission scores 0, then you may make a new submission after a delay of 1 hour.

  • The /solution folder of the submission package must contain the solution.csv file, and also the annotated ionograms. As an example, if there are 3 test cases, named test-001, test-002 and test-003, and you found 2 sounders in the first test case, and a single one in the rest, then your package must look like this:
    /code
      Dockerfile
      test.sh
      train.sh
      ... // anything else required to run your system
    /solution
      solution.csv   // having 3 lines (or 4 if you add a header)
      test-001-1.png // or test-001-1.jpg
      test-001-2.png
      test-002-1.png
      test-003-1.png 
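Before packaging, it can help to check that every parameter triplet in solution.csv has a matching annotated ionogram. The sketch below is one way to do that under the naming rule <file-id>-<n>.<ext>; the function names are hypothetical, and the header is recognized simply by its first field being file-id.

```python
import csv
import io
import os


def expected_image_stems(solution_text):
    """List the image names (without extension) implied by the content of solution.csv."""
    stems = []
    for row in csv.reader(io.StringIO(solution_text)):
        if not row or row[0].strip().lower() == "file-id":
            continue  # skip blank lines and the optional header
        n_values = len(row) - 1
        if n_values == 0 or n_values % 3 != 0:
            raise ValueError(f"line for {row[0]!r} does not hold whole triplets")
        for n in range(1, n_values // 3 + 1):  # n is the 1-based ordinal of the triplet
            stems.append(f"{row[0].strip()}-{n}")
    return stems


def missing_images(solution_text, filenames):
    """Return the stems that have neither a .png nor a .jpg file in filenames."""
    available = set()
    for name in filenames:
        stem, ext = os.path.splitext(name)
        if ext.lower() in (".png", ".jpg"):
            available.add(stem)
    return [stem for stem in expected_image_stems(solution_text) if stem not in available]


solution = "test-001,10.2,11.73,499.9,13.44,15.2,521.0\ntest-002,12,13.887,480.2\n"
print(missing_images(solution, ["solution.csv", "test-001-1.png", "test-002-1.jpg"]))
```

In practice the file list would come from something like os.listdir on the /solution folder.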

Scoring

During scoring your solution.csv file (as contained in your submission file during provisional testing, or generated by your docker container during final testing) will be matched against  expected ground truth data using the following algorithm. 

If your solution is invalid (e.g. if the tester tool can't successfully parse its content), you will receive a score of 0.

Otherwise your score for a test case is calculated as follows:

score = (sc_fE + sc_JF + sc_hF2) / (N * 1000), where

  • sc_fE = max(0, 1000 - diff), where diff is the absolute difference of the expected and extracted fmaxE values in kHz. If the expected value of fmaxE is unknown then sc_fE = 0, regardless of the value you extracted.

  • sc_JF = max(0, 1000 - diff), where diff is the absolute difference of the expected and extracted JF values in kHz. If the expected value of JF is unknown then sc_JF = 0, regardless of the value you extracted.

  • sc_hF2 = max(0, 1000 - diff), where diff is the absolute difference of the expected and extracted h'F2 values in km. If the expected value of h'F2 is unknown then sc_hF2 = 0, regardless of the value you extracted.

  • N is the number of known expected parameter values. E.g. if all 3 parameters have valid expected values then N = 3. N is always at least 1.

The above calculation is repeated for all parameter triplets you reported for this test case. The test case score is the maximum of these score values. Note that because of using the maximum in this step, it is possible to increase the score by adding more parameter triplets to the set of extractions in the hope that one of them will get a higher score. Doing so is against the spirit of this problem, it is not allowed and may lead to disqualification. Stakeholders of the contest will regularly spot check submissions to verify that reported parameter sets correspond to the traces you annotated in the ionograms, and whether these different traces can be plausibly attributed to separate detected sounders.

Finally your score is calculated as 100 * average of test case scores. 

Note that there is a minimum quality requirement for prize winning solutions which is not directly related to the score calculated as described above. See the "Final prizes" section later.

Final testing

The details of the final testing workflow and the requirements for the /code folder of your submission are also specified in the submission template document. This current document gives only requirements or pieces of information that are either additional or override those given in the template. You may ignore this section until you start preparing your system for final testing.

  • The allowed time limit for the train.sh script is not specified at the launch date of the contest. It will be determined after the end of the submission phase, based on discussions with the top few ranked solvers. 

  • The training data within your docker container will look like this (folder names are example only, you should not assume anything about them):
         data/
           training/
                list.txt
                sounding-params-linear.txt
                sounding-params-pulsed.txt
                train-000.bin
                train-000.png
                ... other .bin and .png files

  • The signature of the test script is the same as given in the template,
      test.sh <data-folder> <output_path>
    however, the script must output not only a CSV file but a set of PNG or JPG ionograms as well. The ionograms must be placed into the same folder where the CSV file must go. As an example, if there are 3 test cases, named test-001, test-002 and test-003, and the test script is executed as
      ./test.sh /data/test-images /wdata/output/your-handle.csv
    then your script must create these files:
      /wdata/output/your-handle.csv
      /wdata/output/test-001.png // or .jpg
      /wdata/output/test-002.png
      /wdata/output/test-003.png

  • The allowed time limit for the test.sh script is 48 hours. The testing data folder contains similar data, in the same structure, as is available to you during the coding phase. The final testing data will be similar in size and in content to the provisional testing data.

  • Testing data within your docker container will look like this (folder names are example only, you should not assume anything about them):
         data/
           testing/
                list.txt
                test-000.bin
                test-001.bin
                ... other .bin files

  • Hardware specification. Your docker image will be built and run on a Linux AWS instance with this configuration: m4.4xlarge.  Please see here for the details of this instance type.

  • To speed up the final testing process the contest admins may decide not to build and run the dockerized version of each contestant's submission. It is guaranteed however that if there are N main prizes then at least the top 2*N ranked submissions (based on the provisional leader board at the end of the submission phase) will be final tested.

General Notes

  • This match is not rated

  • Teaming is allowed. Topcoder members are permitted to form teams for this competition. If you want to compete as a team, please complete the teaming form. After forming a team, Topcoder members of the same team are permitted to collaborate with other members of their team. To form a team, a Topcoder member may recruit other Topcoder members, and register the team by completing this Topcoder Teaming Form. Each team must declare a Captain. All participants in a team must be registered Topcoder members in good standing. All participants in a team must individually register for this Competition and accept its Terms and Conditions prior to joining the team. Team Captains must apportion prize distribution percentages for each teammate on the Teaming Form. The sum of all prize portions must equal 100%. The minimum permitted size of a team is 1 member, the maximum permitted team size is 5 members. Only team Captains may submit a solution to the Competition. Notwithstanding Topcoder rules and conditions to the contrary, solutions submitted by any Topcoder member who is a member of a team on this challenge but is not the Captain of the team are not permitted, are ineligible for award, may be deleted, and may be grounds for dismissal of the entire team from the challenge. The deadline for forming teams is 11:59pm ET on the 21st day following the start date of each scoring period. Topcoder will prepare a Teaming Agreement for each team that has completed the Topcoder Teaming Form, and distribute it to each member of the team. Teaming Agreements must be electronically signed by each team member to be considered valid. All Teaming Agreements are void, unless electronically signed by all team members by 11:59pm ET of the 28th day following the start date of each scoring period. Any Teaming Agreement received after this period is void. Teaming Agreements may not be changed in any way after signature.

  • Use the match forum to ask general questions or report problems, but please do not post comments and questions that reveal information about the problem itself or possible solution techniques.

  • In this match you may use any programming language and libraries, including commercial solutions, provided Topcoder is able to run it free of any charge. You may also use open source languages and libraries, with the restrictions listed in the next section below. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see “Requirements to Win a Prize” section). Submissions will be deleted/destroyed after they are confirmed. Topcoder will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run by Topcoder free of cost, and with all necessary licenses pre-installed in your solution. Topcoder is not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.     

  • You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by the client. If your solution includes licensed elements (software, data, programming language, etc) make sure that all such elements are covered by licenses that explicitly allow commercial use.

  • If your solution includes licensed software (e.g. commercial software, open source software, etc), you must include the full license agreements with your submission. Include your licenses in a folder labeled “Licenses”. Within the same folder, include a text file labeled “README” that explains the purpose of each licensed software package as it is used in your solution.     

  • External data sets and pre-trained models are allowed for use in the competition provided the following are satisfied:

    • The external data and pre-trained models are unencumbered with legal restrictions that conflict with its use in the competition.

    • The data source or data used to train the pre-trained models is defined in the submission description.

Final prizes

In order to receive a final prize, you must do all the following:

  • Achieve a score in the top five according to final system test results. See the "Scoring" and "Final testing" sections above.

  • Satisfy the following minimum quality requirement:

    • averaged over the final test cases, your extracted JF, h'F2 and fmaxE values must be within 35% of the expected values.

  • Once the final scores are posted and winners are announced, the prize winner candidates have 7 days to submit a report outlining their final algorithm and explaining the logic behind it and the steps of its approach. You will receive a template that helps you create your final report.

  • If you place in a prize winning rank but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.
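The minimum quality requirement above does not spell out exactly how the 35% figure is computed. One possible reading, shown here purely as an assumption, is a mean relative error over all test cases with a known expected value. The sketch can be used to monitor that quantity on the training data; the function names are hypothetical, and whether the average is taken per parameter or over all three parameters together is left to the caller.

```python
import math


def mean_relative_error(pairs):
    """Mean of |extracted - expected| / |expected| over (expected, extracted) pairs.

    Pairs whose expected value is nan or zero are skipped.
    """
    errors = [
        abs(extracted - expected) / abs(expected)
        for expected, extracted in pairs
        if not math.isnan(expected) and expected != 0
    ]
    if not errors:
        return float("nan")
    return sum(errors) / len(errors)


def meets_quality_bar(pairs, limit=0.35):
    """True when the mean relative error does not exceed the limit (assumed reading)."""
    return mean_relative_error(pairs) <= limit


jf_pairs = [(10.0, 12.0), (15.0, 15.0), (float("nan"), 9.0)]  # invented JF values in MHz
print(mean_relative_error(jf_pairs), meets_quality_bar(jf_pairs))
```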

PINS Challenge References

[1] Dao, Eugene (2019, March): Data Description and Target Sounder Signals

[2] Dao, Eugene (2019, March): HF Sounder Signal Processing

[3] https://nvlpubs.nist.gov/nistpubs/Legacy/MONO/nbsmonograph80.pdf

[4] http://www.cnofs.org/Handbook_of_Geophysics_1985/Chptr10.pdf

[5] https://www.ngdc.noaa.gov/stp/iono/Dynasonde/

[6] https://github.com/MITHaystack

[7] https://www.iarpa.gov/index.php/research-programs/hfgeo

Additional References

Solvers can be successful solely using the preceding references, but for those seeking more in-depth knowledge of RF signal propagation and ionospheric characterization the following references may be of interest.

[8] Leo F. McNamara (1991) The Ionosphere: Communications, Surveillance, and Direction Finding (Orbit: A Foundation Series), Krieger Publishing Company, Malabar FL

[9] Goodman, J. M. (1991), HF Communications: Science and Technology, Van Nostrand Reinhold, New York, NY

Additional Rules and Conditions

There are a number of additional rules and conditions associated with this competition. Please review the PINS Challenge Rules Document for supplementary information about Payment Terms, Intellectual Property Agreements, Eligibility Requirements, Warranties, and Limitations of Liability.