DNA SEQUENCING, 5X IMPROVED
DNA sequencing has revolutionized biological research and is set to transform medicine as well as this technology scales. Machines like Ion Torrent’s Proton sequencer will deliver a $1000 genome in a day. One of the keys to processing genetic data economically is compressing and decompressing the data with high efficiency, resulting in small file sizes and short processing times. TopCoder members were challenged to make the data storage as efficient as possible in both time and space.
The top submissions from TopCoder’s members used many refinements of the standard compression techniques of Huffman coding and delta coding, with hardware-specific processing speedups. The winning solution improved compression by 41 times. The follow-up challenge, also solved by TopCoder members, produced a solution that compressed files in the client environment to 1.1% of the original file size. The actual implementation of the solution into Ion Torrent’s software pipeline showed a 5X improvement with no effect on data quality.
-- The new solution will lower the cost of life-saving genetic analysis by compressing data into small files at high speeds.
-- Life Technologies can modify the solution for the best fit into their new pipeline for Proton relatively easily.
-- TopCoder members produced a solution that was faster and more efficient without compromising the quality of the data.