
Sensitive Data Lookup and Trimming

Challenge Overview

Our frequent client, John Hancock, has brought us a new job. One of their teams has a new compliance requirement: the variable-length text fields in their databases must not contain sensitive financial information such as credit card numbers, bank account numbers, or Social Security numbers. They want us to create a tool that identifies such fields and trims the sensitive data, replacing it with dummy placeholders.

In this challenge you will create a Node.js script that takes a CSV file as input (it contains the database content in key/value format; an example is available in the challenge forum), goes through it line by line, checks whether each value contains any sensitive data, and outputs CSV files containing the updated key/value pairs.
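
The line-by-line flow described above could be sketched roughly as follows, using Node's built-in readline module. This is only an illustration under stated assumptions: the names processStream and splitKeyValue are hypothetical, and the sketch assumes each line is a plain "key,value" pair split at the first comma, with no quoted fields.

```javascript
const readline = require('readline');

// Hypothetical helper: splits "key,value" at the first comma only.
// Assumption: quoted fields and embedded newlines are NOT handled.
function splitKeyValue(line) {
  const idx = line.indexOf(',');
  if (idx === -1) return { key: line, value: '' };
  return { key: line.slice(0, idx), value: line.slice(idx + 1) };
}

// Reads `input` one line at a time, applies `transform` to each value,
// and writes the updated pair to `output` straight away, so the whole
// file is never held in memory. Resolves with the number of lines read.
function processStream(input, output, transform) {
  return new Promise((resolve, reject) => {
    const reader = readline.createInterface({ input, crlfDelay: Infinity });
    let count = 0;
    reader.on('line', (line) => {
      const { key, value } = splitKeyValue(line);
      output.write(`${key},${transform(value)}\n`);
      count += 1;
    });
    reader.on('close', () => resolve(count));
    input.on('error', reject);
  });
}

// Possible usage (file names are placeholders):
// const fs = require('fs');
// processStream(
//   fs.createReadStream('input.csv'),
//   fs.createWriteStream('output.csv'),
//   value => value,
// );
```

The sketch ignores the return value of output.write, so it does not handle backpressure; a real submission would probably need to pause the reader when the output stream is saturated, and to use a proper CSV parser if the input can contain quoted fields.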

Technically, it must work as follows:
  • Don’t read the entire CSV into memory; process it line by line as you read it, and generate the output the same way;
  • Each value should be passed through a trim(string) function that takes an arbitrary string as input and returns the trimmed string along with auxiliary data describing the modifications made. Internally, the function should look similar to this:
    function trim(string) {
      let res = { value: string, trimsDone: [] };
      res = rule01(res);
      res = rule02(res);
      // ...
      res = ruleNN(res);
      return res;
    }

    The idea is to run the value sequentially through multiple functions, each checking certain rules and making its replacements, thus reducing to the final result. This way each rule can be kept in a separate code file, which makes the code easy to maintain in the future (for example, adding new rules or removing old ones). Of course, the names should be more descriptive than ruleNN(...).
  • For this challenge, you should implement just a single rule that detects and trims credit card numbers. Use the rules located here to determine whether a string is a credit card number, and mask it.
  • Each type of sensitive data should be replaced by its own placeholder, configurable via a config file.
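
As an illustration of the requirements above, here is a minimal sketch of a single credit card rule plugged into the trim(string) pipeline. The detection criteria are an assumption, not the rules referenced in the challenge: the sketch treats any run of 13 to 19 digits (optionally separated by single spaces or hyphens) that passes the Luhn checksum as a card number. The names passesLuhn, trimCreditCards and CREDIT_CARD_PLACEHOLDER, as well as the shape of the trimsDone entries, are hypothetical.

```javascript
// Assumption: in a real submission the placeholder would be read from the
// config file; it is hard-coded here to keep the sketch standalone.
const CREDIT_CARD_PLACEHOLDER = '[CREDIT_CARD]';

// Luhn checksum over a string containing only digits.
function passesLuhn(digits) {
  let sum = 0;
  let doubleIt = false;
  for (let i = digits.length - 1; i >= 0; i -= 1) {
    let d = Number(digits[i]);
    if (doubleIt) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}

// Hypothetical rule: masks runs of 13-19 digits (optionally separated by
// single spaces or hyphens) that pass the Luhn check. Returns a new result
// object instead of mutating the one passed in.
function trimCreditCards(res) {
  const trimsDone = res.trimsDone.slice();
  const pattern = /\b\d(?:[ -]?\d){12,18}\b/g;
  const value = res.value.replace(pattern, (match) => {
    if (!passesLuhn(match.replace(/[ -]/g, ''))) return match;
    trimsDone.push({ rule: 'creditCard', placeholder: CREDIT_CARD_PLACEHOLDER });
    return CREDIT_CARD_PLACEHOLDER;
  });
  return { value, trimsDone };
}

// Rules run in order; adding or removing a rule only changes this list.
const rules = [trimCreditCards];

function trim(string) {
  return rules.reduce((res, rule) => rule(res), { value: string, trimsDone: [] });
}
```

Keeping the rules in an array and folding over them with reduce is one way to get the sequential behaviour of the rule01 ... ruleNN chain while letting each rule live in its own file.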
On the more technical side, be sure to use / set up:
  • Babel (ES6)
  • ESLint (AirBnB config)
  • Git (initial commit, .gitignore, README.md)
  • Lodash
  • Node 8.9.3 (latest LTS version)
  • Node config
  • NVM (.nvmrc referring to the Node version)
In case of any doubts, do not hesitate to raise them in the challenge forum!

Final Submission Guidelines

Submit the source code (all setup / usage instructions and related notes should be put into README.md).

Reliability Rating and Bonus

For challenges that have a reliability bonus, the bonus depends on the reliability rating at the moment of registration for that project. A participant with no previous projects is considered to have no reliability rating, and therefore gets no bonus. Reliability bonus does not apply to Digital Run winnings. Since reliability rating is based on the past 15 projects, it can only have 15 discrete values.

ELIGIBLE EVENTS:

2018 Topcoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

CHALLENGE LINKS:

Review Scorecard
