Samasource serves marginalized women and youth by providing digital work at a living wage. We source this work through clients that have large, messy data projects. One vertical that we specialize in is machine learning. Our workers in East Africa, Haiti, Ghana and India help hone algorithms by correcting the data sets that those algorithms process. For example, we help one client identify elephants in aerial shots in order to count the elephant population and trigger alerts when poaching activity is detected.
In this challenge we’re going to develop an algorithm for tracking objects in series shots, where a video is split into many images. We’d like a service that predicts the annotations of subsequent images based on the work submitted for the initial image in the series. The efficiency gain from this predictive annotation service would allow workers to complete more tasks and therefore earn more money each day.
We have two series of images which have been provided:
The first images in each series will be tagged by Samasource workers like so:
The coordinates of tags are recorded in JSON objects like so:
In this application we’ll need to track objects that have been previously tagged through a series of images. Samasource has provided two tagged images for us at the beginning of each sequence:
Corresponding images in the above .zip file are:
The JSON for the tagging can be found here:
1. The application should provide annotations in the form of JSON objects for each image provided and for each physical object designated in the initial tagging. There is no image processing required by the application. The format of the JSON documents is the same as the samples provided.
2. The objects tracked in the images will change size as they move closer to or farther from the camera. Your annotations should reflect these changes as well.
3. Object types are provided in the annotations. They outline in general terms the type of shape that you’re attempting to tag.
4. Objects might move out of view in the pictures. If they aren’t visible you don’t need to provide annotations for them any more.
5. Road lines that have been previously tagged by the human agents should be monitored continuously. Although the lines are sometimes broken (a dotted line between lanes), you should still attempt to tag the visible lines even when the initially tagged section has moved off the current image.
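Requirements 2 and 4 can be sketched as follows. This is only a minimal illustration, assuming a hypothetical box representation of (x, y, width, height) in pixels; the real annotation format is the one in the sample JSON documents, and the function names scale_box and clip_to_image are invented for this example.

```python
def scale_box(box, factor):
    """Scale a box about its center.

    Assumption: box is (x, y, width, height) in pixels. A factor above 1
    models an object moving closer to the camera, below 1 moving away.
    """
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * factor, h * factor
    return (cx - new_w / 2.0, cy - new_h / 2.0, new_w, new_h)


def clip_to_image(box, image_width, image_height):
    """Clip a box to the image bounds; return None if nothing is visible."""
    x, y, w, h = box
    left = max(x, 0.0)
    top = max(y, 0.0)
    right = min(x + w, float(image_width))
    bottom = min(y + h, float(image_height))
    if right <= left or bottom <= top:
        return None  # object has left the frame, so no annotation is needed
    return (left, top, right - left, bottom - top)
```

A tracker could call clip_to_image on every predicted box and simply omit the object from that image's JSON document when the result is None.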
OpenCV has Python libraries which can assist with this challenge:
Meanshift and Camshift
OpenCV Track Object Movement
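To show the idea behind mean shift tracking, here is a small pure-Python sketch. It is not OpenCV's implementation: it assumes the image has already been reduced to a 2D grid of non-negative weights (for example, a histogram back-projection of the tagged object's colors), and the names mean_shift, weights and window are illustrative. In practice you would more likely call OpenCV's cv2.meanShift, or cv2.CamShift, which also adapts the window size and orientation.

```python
import math


def mean_shift(weights, window, max_iter=20):
    """Move an (x, y, w, h) window toward the weighted centroid under it.

    weights is a list of rows of non-negative numbers. The window size is
    kept fixed; only its position changes.
    """
    x, y, w, h = window
    rows, cols = len(weights), len(weights[0])
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        for r in range(max(y, 0), min(y + h, rows)):
            for c in range(max(x, 0), min(x + w, cols)):
                p = weights[r][c]
                total += p
                sum_x += p * c
                sum_y += p * r
        if total == 0:
            break  # no evidence under the window; leave it in place
        # Re-center the window on the weighted centroid (round half up).
        new_x = int(math.floor(sum_x / total - (w - 1) / 2.0 + 0.5))
        new_y = int(math.floor(sum_y / total - (h - 1) / 2.0 + 0.5))
        if (new_x, new_y) == (x, y):
            break  # converged
        x, y = new_x, new_y
    return (x, y, w, h)
```

Starting each image's search from the previous image's window is what lets the tag follow an object through the series.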
Here is an academic paper on the subject as well:
Validation: Fifty percent of your submission score will be based on your tagging accuracy compared to the other submissions for this challenge. Topcoder will perform a visual inspection of the results compared with images that have been tagged manually. The solutions will be ranked from top to bottom. The most accurate solution will receive a 10, the next most accurate a 9, etc.
Final Submission Guidelines
1. You should use Python 2.7 for this challenge.
2. You should submit a complete set of annotations with your submission to facilitate performance testing. Please create a data folder in your submission. Under the ~/data folder, please have a directory for each sequence, just as the image files have been provided: carsequence1 and carsequence2. The annotation for each image file should be in a separate JSON file. The name of each file should match the base image name: canstockphoto29741751 001.json, canstockphoto29741751 002.json, etc. So the complete path for the canstockphoto29741751 001.json file would be ~/data/carsequence1/canstockphoto29741751 001.json.
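The folder layout above could be produced with a helper along these lines. This is a sketch under assumptions: the function names are invented for illustration, the .jpg extension in the usage note is a guess about the provided images, and the annotation argument stands in for whatever JSON structure the samples define.

```python
import json
import os


def annotation_path(data_root, sequence, image_name):
    """Map an image file name to its annotation path under data_root."""
    base, _ext = os.path.splitext(os.path.basename(image_name))
    return os.path.join(data_root, sequence, base + ".json")


def write_annotation(data_root, sequence, image_name, annotation):
    """Write one annotation document, creating the sequence folder if needed."""
    path = annotation_path(data_root, sequence, image_name)
    folder = os.path.dirname(path)
    if not os.path.isdir(folder):
        os.makedirs(folder)
    with open(path, "w") as handle:
        json.dump(annotation, handle, indent=2)
    return path
```

For example, annotation_path("data", "carsequence1", "canstockphoto29741751 001.jpg") yields the path data/carsequence1/canstockphoto29741751 001.json on a POSIX system.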
3. Please include instructions on how to build and execute your application. Please describe any dependencies in the documentation (pip install …).
4. Accuracy rather than performance is the primary concern for this challenge, but if your solution requires more than 20 seconds to tag one picture, your performance and quality scores for the app will suffer.
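One simple way to check the 20-second budget during development is to time each call to your tagging routine. The sketch below assumes a hypothetical tag_image callable that takes one image and returns its annotation; the name timed and its parameters are illustrative only.

```python
import time


def timed(tag_image, image, budget_seconds=20.0):
    """Run tag_image(image) and report whether it stayed within the budget.

    Returns (result, elapsed_seconds, within_budget).
    """
    start = time.time()
    result = tag_image(image)
    elapsed = time.time() - start
    return result, elapsed, elapsed <= budget_seconds
```

Logging the elapsed time for every image in both sequences makes it easy to spot the slow cases before submitting.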