Detecting Wheezing Sounds for Asthma

Key Information

The challenge is finished.

Challenge Overview

The first 10 members to cross the threshold score are eligible for an additional $100 bonus prize.

Asthma attacks typically occur outside of a hospital, such as at home or on the go. Therefore, the treatment and management of such attacks is usually handled not by a doctor but by parents. The symptoms of asthma vary from child to child, and infants are especially at risk, as they cannot express their symptoms clearly.

Parents often do not know the side effects of the medicine, and it is often not administered correctly. This causes deep regret and added stress for the parents: why wasn’t I able to do something about this before the asthma attack happened? Parents are thus fighting asthma with anxiety and regret.

The goal of this match is to help a global healthcare company create an algorithm that detects wheezing sounds from a special recorder placed on top of the child’s chest. By finding small seizures of the lungs that patients and parents would otherwise find hard to notice, such an algorithm could enable medication at the appropriate time, and accordingly ease the troubles of parents with asthmatic children and prevent major episodes.


Problem Description

Your code should read several sound clip files and determine which segments of those files contain the “wheezing” sounds of interest. The final output of your code should be one CSV file per sound file, in which each line has the form “startMs, endMs”, indicating that the code detected a wheezing sound from timestamp startMs to endMs.

Virtual Machine Access

Because all of the data for this contest is protected by various confidentiality concerns and legal regulations, the data sets, including the training data, cannot be made generally available for offline download. Recording the data is prohibited. This match is under NDA, and sharing anything you heard in the data files is also prohibited.
Development and testing must therefore be performed in a virtual environment using Amazon WorkSpaces in which Internet access is blocked.
You will get 2 virtual environments.
  1. One for development, which has all the training data available. You can choose between a Linux and a Windows machine. Although Internet access is blocked, you can upload any file you want by following the notes described below.
  2. One for listening to sound files. Since the development WorkSpaces does not have sound enabled, you will need to copy files to this listening WorkSpaces if you want to listen to the actual sound files. You can copy at most 100 files or 750 MB of data from your development machine to the listening machine over the entire duration of the contest.

Once you have registered for the challenge, you will receive an email with a form where you can request a Linux or Windows development environment. After we have successfully launched your instance, we will send you another email with login credentials and a URL to a page dedicated to you, where you can do the following:
  1. Upload any file to your development WorkSpaces
  2. Make a submission
  3. Copy any sound files to your listening WorkSpaces

The number of virtual machines we can provide is limited to 100 for the first week of the match, and they are allocated on a first-come, first-served basis. If you stay logged out of your virtual machine for 48 hours, it will be shut down and reassigned to the next participant. Please don’t forget to submit your solution if you plan to be logged out for 48 hours!

Anyone who works during the first week of the contest without being logged out for more than 48 hours will be able to keep their WorkSpaces with no further logout time limit.
Anyone joining after May 12th, 9 am ET is subject to the 48-hour logout rule, and will be able to keep their WorkSpaces once they have made a submission without having been logged out for 48 hours.
We will close the registration phase once the number of WorkSpaces we can provide has reached the maximum.

Please also note that the Amazon WorkSpaces client does not support Linux-based clients, and the Web Access client is disabled. Please use a supported device (https://clients.amazonworkspaces.com/) to access your environments.


For this match you will NOT submit your solutions through the Topcoder submissions page; you just need to push the “submit” button on the controller.html page we send you after your environment is ready.
You will need to package your submission into a folder on the file server mounted to your WorkSpaces. The folder must include a file named start.sh (for Linux) or start.bat (for Windows). We will copy your submission to a scoring environment identical to the initial WorkSpaces we gave you and run this script against the dev data set.

The dev data are located at /data/wav/* on Linux WorkSpaces and C:\data\wav\* on Windows WorkSpaces. You must write your prediction CSVs to /data/pred/ on Linux WorkSpaces or C:\data\pred on Windows WorkSpaces.

Data Description

There are 3 sets of sounds: train, dev, and test. The train data is accessible on Shared Drives inside your virtual machine.
The statistics of these sets are summarized in the following table.

      | Files with wheezing | Files without wheezing | Total
train | 75                  | 139                    | 214
dev   | 93                  | 81                     | 174
test  | 93                  | 81                     | 174

For each sound file named $NAME (e.g., 00001), we provide the following data:
  • $NAME.wav (e.g., 00001.wav). This is the input sound file and is always available.
  • $NAME.csv (e.g., 00001.csv). This is the ground truth and is only available for training data.

You must submit your predictions in the following format:
  • You must read all the files in the dev data directory and output a prediction file for every sound file.
  • Each prediction file must be named $NAME.csv.
  • Inside the CSV, there must be exactly 2 columns, startMs and endMs, with no header row.
  • A prediction file must have no rows if the model did not detect any wheezing in the sound file.
Please note that the ground truth files were created following the CORSA guideline, which states that "the dominant frequency of a wheeze is usually >100 Hz and the duration >100 ms". This means that wheezing sounds in the ground truth are always longer than 100 ms. (For details, see the wheezes section on page 593 of this paper: https://pdfs.semanticscholar.org/4fc5/e86a83d4628051ceb1b5eccfa29f49346709.pdf)
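The output rules above can be sketched in Python as follows. This is only an illustration, not the official harness: the function names and the `detect` callable are hypothetical, and writing the timestamps as integers is an assumption, since the page does not say whether fractional milliseconds are accepted.

```python
import csv
from pathlib import Path


def write_prediction(pred_dir, name, segments):
    """Write one headerless two-column CSV (startMs, endMs) for a sound file.

    An empty `segments` list produces an empty file (no rows).
    """
    pred_dir = Path(pred_dir)
    pred_dir.mkdir(parents=True, exist_ok=True)
    out_path = pred_dir / f"{name}.csv"
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for start_ms, end_ms in sorted(segments):
            # Assumption: timestamps are written as whole milliseconds.
            writer.writerow([int(start_ms), int(end_ms)])
    return out_path


def predict_all(wav_dir, pred_dir, detect):
    """Run `detect` on every .wav file and write one CSV per file.

    `detect` is a hypothetical callable: it takes the Path of a .wav file
    and returns a list of (start_ms, end_ms) tuples, empty if no wheezing
    was found.
    """
    written = []
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        segments = detect(wav_path)
        written.append(write_prediction(pred_dir, wav_path.stem, segments))
    return written
```

On a Linux WorkSpaces, a start.sh script could call something like `predict_all("/data/wav", "/data/pred", my_detector)`, where `my_detector` stands for your own model.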

Sound file format
The sounds are collected with a special recording device that is placed on top of the patient’s chest. The Left channel is recorded at the position on top of the body or lung (capturing the body sound), and the Right channel is recorded from the reverse side (capturing the environmental sound).
The sample rate is 44.1 kHz and the bit depth is 16 bit for all the sound files.
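As a minimal sketch of how the two channels could be separated, the following uses only the Python standard library. The function name is hypothetical, and the code assumes the files are ordinary PCM WAV files with interleaved left/right samples, which the page implies but does not state explicitly.

```python
import array
import sys
import wave


def read_stereo_wav(path):
    """Return (sample_rate, left, right) for a 16-bit stereo PCM WAV file.

    `left` holds the body-sound channel and `right` the environment-sound
    channel, each as an array of signed 16-bit samples.
    """
    with wave.open(str(path), "rb") as wf:
        if wf.getnchannels() != 2 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Stereo frames are interleaved: L0, R0, L1, R1, ...
    left = samples[0::2]
    right = samples[1::2]
    return sample_rate, left, right
```

Sample index `i` corresponds to time `1000 * i / sample_rate` in milliseconds, which is the unit used by the CSV files.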

Ground Truth CSV format
There are several columns in the CSV file. The first column denotes the start of wheezing in ms, and the second column denotes the end time.
FYI: A more detailed classification of the sound data is available in the reference folder if you would like to see it. The columns “flagA” and “flagB” denote the type of wheezing and other sounds. For this challenge, you only need to detect true wheezing sounds.
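A ground truth file could be loaded with a short helper like the one below. It is a sketch under assumptions: the function name is hypothetical, only the first two columns are used, and rows whose first two fields are not numeric are skipped, because the page does not say whether the ground truth files carry a header row.

```python
import csv


def read_ground_truth(path):
    """Return a list of (start_ms, end_ms) taken from the first two columns."""
    segments = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # blank or malformed line
            try:
                start_ms, end_ms = float(row[0]), float(row[1])
            except ValueError:
                continue  # assumed to be a header or other non-numeric row
            segments.append((start_ms, end_ms))
    return segments
```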


Your score will be calculated in 2 steps.

First, your solution must meet the following 2 criteria, or the score will be zero.
  • Of all the files that contain wheezing sounds, your solution must detect wheezing in at least 85%.
  • Of all the files that do not contain wheezing sounds, your solution must detect wheezing in no more than 20%.
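The two file-level criteria can be checked locally with a sketch like the following. It assumes that a file counts as "detected as wheezing" whenever its prediction contains at least one row; the official scorer may define this differently, and the function name is hypothetical.

```python
def passes_file_level_gate(truth, pred, min_detect=0.85, max_false_alarm=0.20):
    """Check the two file-level criteria.

    `truth` and `pred` map a file name to a list of (start_ms, end_ms)
    segments. Assumption: a file is "detected" if its prediction list is
    non-empty.
    """
    wheeze_files = [name for name, segs in truth.items() if segs]
    clean_files = [name for name, segs in truth.items() if not segs]

    detected = sum(1 for name in wheeze_files if pred.get(name))
    false_alarms = sum(1 for name in clean_files if pred.get(name))

    # Guard against empty groups so the sketch never divides by zero.
    detect_rate = detected / len(wheeze_files) if wheeze_files else 1.0
    false_alarm_rate = false_alarms / len(clean_files) if clean_files else 0.0

    return detect_rate >= min_detect and false_alarm_rate <= max_false_alarm
```

With the dev set sizes listed in the table (93 files with wheezing, 81 without), this corresponds to detecting wheezing in at least 80 of the 93 wheezing files and in at most 16 of the 81 non-wheezing files.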

Second, your score will be the micro-F1 score of your predictions against the actual ground truth, scaled to 100.0.

To determine the F1 score, we will break each sound clip into 10 ms intervals, label each interval as true or false for wheezing, and compare these labels against the submitted predictions to compute precision and recall.


The micro-F1 score is the F1 score computed over all the sound files combined.

The F1 score is defined over intervals. For each interval (10 ms), there are 4 cases:
  • True Positive: Your prediction is wheezing (1) and the truth is wheezing (1).
  • False Positive: Your prediction is wheezing (1) and the truth is normal (0).
  • False Negative: Your prediction is normal (0) and the truth is wheezing (1).
  • True Negative: Your prediction is normal (0) and the truth is normal (0).
Let TP be the total number of True Positives, FP the total number of False Positives, FN the total number of False Negatives, and TN the total number of True Negatives. TP, FP, FN, and TN are totals over all the sound files combined.

The precision (P) is defined as TP / (TP + FP) and the recall (R) is defined as TP / (TP + FN). The F1 score is computed as 2 * P * R / (P + R). In the special case when TP = 0, F1 is defined as 1 if FN = 0 and FP = 0 (i.e., there is actually no wheezing sound and none was predicted); otherwise, F1 is 0.
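For local validation, the metric can be approximated with the sketch below. How the official scorer maps a segment onto 10 ms intervals is not specified; here an interval is assumed to be positive if the segment overlaps it at all, and the function names are hypothetical. TN is not needed for F1, so it is not counted.

```python
import math

INTERVAL_MS = 10


def to_intervals(segments):
    """Return the set of 10 ms interval indices touched by the segments.

    Assumption: an interval counts as wheezing if any part of a segment
    falls inside it.
    """
    covered = set()
    for start_ms, end_ms in segments:
        first = int(start_ms // INTERVAL_MS)
        last = math.ceil(end_ms / INTERVAL_MS)  # exclusive upper bound
        covered.update(range(first, last))
    return covered


def micro_f1(truth, pred):
    """Micro-F1 over all files, scaled to 100.0.

    `truth` and `pred` map a file name to a list of (start_ms, end_ms).
    """
    tp = fp = fn = 0
    for name, true_segments in truth.items():
        t = to_intervals(true_segments)
        p = to_intervals(pred.get(name, []))
        tp += len(t & p)  # predicted wheezing, truly wheezing
        fp += len(p - t)  # predicted wheezing, truly normal
        fn += len(t - p)  # predicted normal, truly wheezing
    if tp == 0:
        # Special case from the definition above.
        return 100.0 if (fp == 0 and fn == 0) else 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)
```

Because the counts are summed over all files before precision and recall are computed, long recordings with a lot of wheezing weigh more heavily than short ones.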

Solution Description

The top-scoring contestants, in order to be eligible to receive a prize payment, must submit the following:
  • A complete archive including their code and any pre-trained models or supplementary data that is used for the computation.
  • In the case that any pre-trained models are required, the code for training and instructions for using such code should also be included.
  • Instructions for running the code, such that results similar to what were submitted during the contest can be replicated. (It is understood that some algorithms rely on randomness and thus may produce small variations.)
  • Within 7 days of the announcement of the contest winners, submit a report at least 2 pages long describing the solution, including most of the following:
    • How the code works
    • Why the given approach was used
    • Limitations of the approach
    • Other approaches that were tried but did not work
    • Any information that you feel may be of use
    • Use of any open source libraries and their URLs/sources
If you place in the top 5 but fail to do any of the above, you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.


The following is a list of a few public web pages, in case you’d like to understand what wheezing sounds are:

General Notes

  • This match is rated
  • If your solution includes licensed software (e.g. commercial software, open source software, etc), even just in the training stage, please ask in the forum. If the proposal gets approved, you must include the full license agreements with your submission. Include your licenses in a folder labeled "Licenses". Within the same folder, include a text file labeled "README" that explains the purpose of each licensed software package as it is used in your solution.
  • In this match you may use any programming language and libraries, including commercial solutions, provided Topcoder is able to run it free of any charge. You may also use open source languages and libraries, with the restrictions listed in the next section below. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see "Requirements to Win a Prize" section). Submissions will be deleted/destroyed after they are confirmed. Topcoder will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run by Topcoder free of cost, and with all necessary licenses pre-installed in your solution. Topcoder is not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.
  • You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by the client.
  • The usage of external resources (additional wheezing sounds, etc) is allowed as long as they are freely available.


Topcoder will compensate members in accordance with our standard payment policies, unless otherwise specified in this challenge. For information on payment policies, setting up your profile to receive payments, and general payment questions, please refer to Payment Policies and Instructions.