The challenge is finished.

Challenge Overview

Background

The Bill & Melinda Gates Foundation (BMGF) Healthy Birth, Growth, and Development (HBGD) program addresses the dual problem of growth faltering/stunting and poor neurocognitive development, including contributing factors such as fetal growth restriction and preterm birth. The HBGD program is creating a unified strategy for integrated interventions to solve complex questions about (1) life cycle, (2) pathophysiology, (3) interventions, and (4) scaling intervention delivery. The HBGDki Open Innovation platform was developed to mobilize the global “unusual suspects” data science community to better understand how to improve neurocognitive and physical health for children worldwide. The data science contests aim to develop predictive models and tools that quantify the geographic, regional, cultural, socioeconomic, and nutritional trends that contribute to poor neurocognitive and physical growth outcomes in children. The application developed in this challenge will support the efforts of the HBGDki Open Innovation initiative.

Current Challenge

One of BMGF's research partners has developed C code that reads a matrix stored in a delimited text file and saves it in a binary format. They would now like the Topcoder community to take this code a step further. In addition to reading the matrix and storing it as a binary file, the application should also store the transpose of the matrix in the same binary format.

Input Format 
Input: one tab-separated text file of unknown dimensions, with at least 2 rows and 2 columns. Values are integers in the range [-127, 127], and there are no missing values. A sample input file, test.in, is provided.
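
Because the dimensions are not known in advance, one possible first step is to scan the file once and count rows and columns before allocating any memory. The sketch below is only an illustration: `count_dims` is a hypothetical helper name, not part of the existing code, and it assumes every row has the same number of fields as the first row.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: scan a tab-separated stream once and report its
 * dimensions. Columns are counted on the first line only; every later
 * line is assumed (not checked) to have the same number of fields. */
static int count_dims(FILE *fp, long *rows, long *cols)
{
    int c, prev = '\n';
    long r = 0, tabs = 0;

    assert(fp != NULL && rows != NULL && cols != NULL);
    while ((c = fgetc(fp)) != EOF) {
        if (r == 0 && c == '\t')
            tabs++;              /* tabs on the first line separate columns */
        if (c == '\n')
            r++;                 /* each newline ends one row */
        prev = c;
    }
    if (ferror(fp))
        return -1;               /* read error */
    if (prev != '\n')
        r++;                     /* last row had no trailing newline */

    *rows = r;
    *cols = (r > 0) ? tabs + 1 : 0;
    return 0;
}
```

A caller would typically open the input file, call `count_dims`, then `rewind` the stream and parse the values in a second pass once the buffer size is known.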

Output Format
The binary format of the output files is as follows:

type   | bytes   | field     | description
----------------------------------------------------------------------------
int    | 4       | 0         | zero
int    | 4       | col       | number of columns in the output matrix
int    | 4       | row       | number of rows in the output matrix
int    | 4       | maxp1     | maximum data value + 1
int    | 4       | 0         | zero
int[]  | col*4   | maxp1_col | vector with (max value + 1) for every column
char[] | col*row | data      | final matrix, column-major order

Note that the maxp1_col array uses 4 bytes per value, but the data array uses 1 byte per value.
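
As an illustration of the layout above, the following sketch writes one output file from a matrix that is already held in memory in column-major order. The function name `write_matrix`, its parameters, and the caller-supplied `maxp1_col` scratch buffer are assumptions made for this example. It also assumes a 4-byte int written in native byte order (little-endian on x86_64), which the format description does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical writer for the binary layout described above.
 * `data` holds rows*cols values in column-major order, so element
 * (r, c) lives at data[c*rows + r]. `maxp1_col` is a caller-supplied
 * buffer of `cols` ints that receives (column max + 1). */
static int write_matrix(FILE *out, int32_t rows, int32_t cols,
                        const signed char *data, int32_t *maxp1_col)
{
    int32_t maxp1 = INT32_MIN;

    assert(out && data && maxp1_col && rows > 0 && cols > 0);

    /* One pass over each column to find its maximum value. */
    for (int32_t c = 0; c < cols; c++) {
        const signed char *col = data + (size_t)c * (size_t)rows;
        int32_t m = col[0];
        for (int32_t r = 1; r < rows; r++)
            if (col[r] > m)
                m = col[r];
        maxp1_col[c] = m + 1;
        if (m + 1 > maxp1)
            maxp1 = m + 1;       /* track the global maximum + 1 */
    }

    /* Five 4-byte header fields: 0, col, row, maxp1, 0. */
    int32_t header[5] = { 0, cols, rows, maxp1, 0 };
    size_t n = (size_t)rows * (size_t)cols;

    if (fwrite(header, sizeof header[0], 5, out) != 5)
        return -1;
    if (fwrite(maxp1_col, sizeof maxp1_col[0], (size_t)cols, out)
            != (size_t)cols)
        return -1;
    if (fwrite(data, 1, n, out) != n)   /* 1 byte per matrix value */
        return -1;
    return 0;
}
```

With this layout, the data array of the transposed matrix holds the same bytes as the original matrix laid out in row-major order, with row and col swapped in the header.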

In addition, the application has the following requirements:

1. The current application depends on the GSL math library and uses the GSL matrix object to store the matrix representation. You should remove this dependency. No external libraries may be used, except for the C standard libraries and the pthread library discussed below.

2. The software should not use more RAM than the final size of the largest output matrix. For example, if row = 10000 and col = 20000, the software should not use more than (10000*20000) bytes + (20000*4) bytes + (4*5) bytes = 200,080,020 bytes (about 200.1 MB).

3. Parallelization techniques are encouraged. The test system will have the pthread library installed and up to 16 available CPU cores.

4. The test system architecture is x86_64 GNU/Linux.

5. You must provide all source files, along with a Makefile or other instructions for compilation and execution.

6. The software will be judged on speed and memory efficiency. You should add command-line output that displays the start time, end time, and duration of program execution in seconds, to three decimal places. Ideally, transposing the matrix should take no more than 150% of the time required to construct the matrix in binary format. Please provide timing output for both the initial construction of the matrix and the transposition.
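
One possible way to produce the timing output requested in item 6 is to sample a monotonic clock before and after each phase. This is a sketch rather than a required approach: the helper names `now_monotonic`, `elapsed_seconds`, and `print_phase` are hypothetical, and the choice of the POSIX `clock_gettime` call with `CLOCK_MONOTONIC` is an assumption about what is acceptable on the x86_64 GNU/Linux test system.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Read the monotonic clock, which is not affected by wall-clock changes. */
static struct timespec now_monotonic(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;                    /* silence unused warning under NDEBUG */
    return ts;
}

/* Seconds elapsed between two samples, as a double. */
static double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Print start, end, and duration of one phase with three decimal places.
 * Start and end are raw monotonic clock readings in seconds. */
static void print_phase(const char *label,
                        struct timespec start, struct timespec end)
{
    printf("%s: start %.3f s, end %.3f s, duration %.3f s\n",
           label,
           (double)start.tv_sec + (double)start.tv_nsec / 1e9,
           (double)end.tv_sec + (double)end.tv_nsec / 1e9,
           elapsed_seconds(start, end));
}
```

A program could call `now_monotonic()` before and after building the matrix and again around the transposition, then call `print_phase` once per phase so the two durations can be compared against the 150% target.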



Final Submission Guidelines

- Please submit all your source code and documentation as a zip archive.

- You should submit a Makefile to build your application. You may use the existing Makefile as a starting point.

- The previously written code for this application can be found in the Code Documents forum attached to this challenge. Instructions on how to execute the existing code are provided in readme.txt.

 

ELIGIBLE EVENTS:

2016 TopCoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off


ID: 30053009