Challenge Overview

Your task in this challenge is to analyze videos showing the driver's cabin of a garbage collection truck. You have to create a tool that helps the client determine whether the driver's behaviour complies with the company's code of conduct: nobody is allowed to smoke inside the driver’s cabin.

More specifically, we need a tool that:

Takes a video as input.
Creates a text file (events.txt) as output that contains timestamped events of several kinds. An event means that one or more of the following parameters change. All parameters are booleans with possible values {yes, no}.
- driver: the presence of the driver, 'yes' means the driver is present in the cabin.
- driver_smoking: whether the driver is smoking.

In events.txt you have to report every event in which any of these parameters changes. For smoking it is not possible to tell exactly when an event starts and ends, and different human annotators are very likely to set these timestamps differently. However, there should be good agreement on whether the driver was smoking at all over a longer period of time, say, in the past minute. For this reason your tool will not be judged on whether it reports these events at exactly the right time, but on whether it detects their presence or absence. See the Evaluation section for details.

Data files

Very Important: Participants must delete the video contents from their personal devices after the challenge is over. This data must not be shared with the public or with any other person or system.

Raw video files and corresponding annotations are available from the challenge forum after registration. Some videos feature normal daily operations, others show simulated scenes where people follow prescribed behaviour patterns. (Note that some of the simulated videos show unrealistic scenes like people using various objects instead of a cigarette for smoking. You may use these videos for training but we will use more realistic videos for evaluation.)

Please study the annotation format carefully. Your tool is expected to create the events.txt output file in exactly the same format:

A comma-separated text file.
The first element of each line is the time in mm:ss format, measured from the beginning of the video.
The rest of the elements are either:
- driver:{yes,no}
- driver_smoking:{yes,no}

The first line should describe the initial state of the video at time 00:00, it should list all the parameters in their initial state, e.g.:
00:00,driver:yes,driver_smoking:no
The rest of the lines will typically contain a single parameter change, e.g. if the driver starts smoking at 1 min 30 sec:
01:30,driver_smoking:yes
There may be multiple elements present for the same time stamp, e.g. if the driver enters the cabin while smoking at 1 min 30 sec:
01:30,driver:yes,driver_smoking:yes
The published annotation files contain additional parameters (moving, driver_phone, passenger_count, p_smoking, p_phone). These describe other event types, such as the presence and behaviour of passengers in the cabin or whether the truck is moving. They can be ignored in the current project and need not be present in your output files.
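As an illustration of the format described above, here is a minimal Python sketch that reads an annotation file, keeps only the two parameters required in this challenge, and writes the result back in the same layout. The function names (parse_events, format_events) and the in-memory representation (a list of (seconds, changes) pairs) are assumptions made for this example; they are not part of the challenge materials.

```python
from typing import Dict, List, Tuple

# The only parameters that must appear in the output file.
PARAMS = ("driver", "driver_smoking")

Event = Tuple[int, Dict[str, str]]


def parse_time(mmss: str) -> int:
    """Convert an mm:ss timestamp into seconds from the start of the video."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def format_time(total_seconds: int) -> str:
    """Convert seconds from the start of the video into mm:ss."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}"


def parse_events(text: str) -> List[Event]:
    """Parse annotation text, dropping parameters outside PARAMS."""
    events: List[Event] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(",")
        changes: Dict[str, str] = {}
        for field in fields[1:]:
            name, value = field.strip().split(":", 1)
            if name in PARAMS:
                changes[name] = value
        # Lines that only change ignored parameters are skipped entirely.
        if changes:
            events.append((parse_time(fields[0].strip()), changes))
    return events


def format_events(events: List[Event]) -> str:
    """Serialize events as comma-separated lines, one per timestamp."""
    lines = []
    for seconds, changes in events:
        parts = [format_time(seconds)]
        parts += [f"{name}:{changes[name]}" for name in PARAMS if name in changes]
        lines.append(",".join(parts))
    return "\n".join(lines) + "\n"
```

For example, parsing a published annotation that also contains a 'moving' parameter and formatting it again yields a file with only the 'driver' and 'driver_smoking' elements.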

Compliance

Based on the low-level parameters described above, the compliance state is calculated as follows: at any point in time, non-compliance is detected if the 'driver_smoking' parameter was 'yes' at any time in the previous 30 seconds (whether for a single second or for the whole 30-second period). If 'driver_smoking' was constantly 'no' in the last 30 seconds, the driver's behaviour is compliant.
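The rule above can be sketched in Python as follows. This is one possible reading, not the organizers' scoring code: it assumes the state is sampled once per second, and that a second t counts as non-compliant when 'driver_smoking' was 'yes' at some second s with t - 30 <= s <= t. The exact boundary handling is not specified in this brief, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

WINDOW_SECONDS = 30  # look-back period stated in the compliance rule


def smoking_per_second(changes: List[Tuple[int, bool]], duration: int) -> List[bool]:
    """Expand (second, driver_smoking) change events into one state per second."""
    pending = sorted(changes)
    states: List[bool] = []
    current = False  # assumed state before the first event
    index = 0
    for second in range(duration):
        # Apply every change that has taken effect by this second.
        while index < len(pending) and pending[index][0] <= second:
            current = pending[index][1]
            index += 1
        states.append(current)
    return states


def non_compliant_per_second(smoking: List[bool],
                             window: int = WINDOW_SECONDS) -> List[bool]:
    """Flag each second as non-compliant if smoking occurred within the window."""
    flags: List[bool] = []
    last_smoking: Optional[int] = None
    for second, is_smoking in enumerate(smoking):
        if is_smoking:
            last_smoking = second
        # Assumed boundary: a gap of exactly `window` seconds still counts.
        flags.append(last_smoking is not None and second - last_smoking <= window)
    return flags
```

Under these assumptions, a driver who smokes during seconds 10 and 11 of a video is flagged as non-compliant from second 10 through second 41.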

Evaluation

Your submitted tools will be evaluated by running them on new videos. The compliance states calculated from the output of your tool will be compared against compliance states calculated from hand-made annotations. To measure submission quality we will use the F-score metric (https://en.wikipedia.org/wiki/F1_score), which considers both the recall and the precision of the algorithm, i.e. it rewards high recall (catching a high number of non-compliant cases) and also high precision (not raising too many false alarms). You need to achieve an F-score of at least 75% to be eligible for prizes.

To calculate the F-score we will use 1-second-long time windows and count True Positives (the number of seconds when both your tool and the expected annotation report non-compliance), False Positives (your tool reports non-compliance but it is not expected) and False Negatives (there is non-compliance but your tool doesn't detect it).
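The counting described above can be sketched in Python as shown below. The function name f_score and the per-second boolean lists are assumptions for illustration. Returning 0.0 when there are no true positives is a convention chosen here, and this brief does not say how scores are aggregated across several videos.

```python
from typing import Sequence


def f_score(predicted: Sequence[bool], expected: Sequence[bool]) -> float:
    """F1 score over 1-second windows; True means non-compliance."""
    if len(predicted) != len(expected):
        raise ValueError("both sequences must cover the same number of seconds")

    pairs = list(zip(predicted, expected))
    true_pos = sum(1 for p, e in pairs if p and e)
    false_pos = sum(1 for p, e in pairs if p and not e)
    false_neg = sum(1 for p, e in pairs if e and not p)

    if true_pos == 0:
        # Convention chosen for this sketch; avoids division by zero.
        return 0.0

    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

With one true positive, one false positive and one false negative, precision and recall are both 0.5, so the F-score is 0.5, which is below the 75% eligibility threshold.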

Very Important: Final prizes depend on contestants reaching the F-score targets listed below, for “Driver’s Smoking Detection” only:

The prizes are as follows: 1st place: $2500, 2nd place: $1500, 3rd place: $1000. No prizes will be paid unless the submission meets the 75% F-score threshold.

We'll pay a $500 bonus on top of the 1st prize if the winning submission reaches an 85% F-score.

We'll pay a $1000 bonus on top of the 1st prize if the winning submission reaches a 90% F-score.

Final Submission Guidelines

The submission package should include the following items.

A document in PDF / Word format to describe your approach.
- It should be a minimum of 2 pages.
- It should be written in English. You’re not being judged on your facility with English. We won’t deduct points for spelling mistakes or grammatical issues.
- Leveraging charts, diagrams or tables to explain your ideas is encouraged to complement your proposal.
The output (one events file per video) your tool generates on the following videos:
- ch06_20180603131655_20180603133847.mp4
- Mobile Device and Smoking.mp4
Your implementation and deployment guide.
- We need step-by-step instructions on how to run your code, including a description of all dependencies.
- The code may be implemented in any programming language; however, the stakeholders of this contest have a strong preference for Python or R.
Your source code and build scripts.
All dependent 3rd party libraries and tools. (Alternatively pointers to them if their size is large.)
If your solution includes prebuilt models (like neural network weights), make sure we can run your tool without having to run training first: either include your model files or link to where they are hosted. We must also be able to reproduce your training process.
Code requirements.
- Your code may use open source libraries as long as they are free to use by the client in a commercial deployment. Apache2, BSD and MIT licenses are permitted. GPL and LGPL are not permitted. All other libraries require permission. To request permission, post your request to the challenge Forum.
- The eventual end system of the client will run in real time. This means that processing a video should not take longer than the running time of the video.
- The end system will NOT contain a GPU; you must make sure that the target performance can be achieved using a CPU-only solution.
- Your solution must process the video sequentially and must not look ahead into future video frames.
- Your tool will eventually be run in the client's MS Azure environment, and you must make sure that this will be possible. This requires checking that none of the libraries you use have any known problems in Azure.
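The real-time and no-look-ahead requirements above can be checked with a small timing harness. The Python sketch below is only an illustration under stated assumptions: frames arrive from any iterable (for example a video decoder of your choice), fps is known, and detect is a hypothetical per-frame function supplied by you. A timing measured on a development machine says nothing definite about the client's CPU-only environment.

```python
import time
from typing import Any, Callable, Iterable, List, Tuple


def process_sequentially(frames: Iterable[Any],
                         fps: float,
                         detect: Callable[[Any], bool]) -> Tuple[List[bool], float, float]:
    """Run detect on each frame in order and time the whole pass.

    Returns (per-frame results, processing seconds, video seconds).
    """
    results: List[bool] = []
    frame_count = 0
    start = time.perf_counter()
    for frame in frames:
        # Only the current frame is visible here; later frames are never read ahead.
        results.append(detect(frame))
        frame_count += 1
    processing_seconds = time.perf_counter() - start
    video_seconds = frame_count / fps
    return results, processing_seconds, video_seconds


def runs_in_real_time(processing_seconds: float, video_seconds: float) -> bool:
    """True if processing took no longer than the running time of the video."""
    return processing_seconds <= video_seconds
```

Any state the detector needs from earlier frames would have to be kept inside detect (or an object it closes over), since the loop never hands it more than the current frame.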

Titan Eye - Driver Behaviour Smoking Detection DS Sprint Series (Part 1)


ID: 30067794