
Computer Vision - Duplicated Receipts Detector - Improvement of The Initial PoC

Key Information

The challenge is finished.

Challenge Overview

In the previous challenge we developed a proof-of-concept (PoC) solution for automated detection of duplicate receipts submitted to the client’s expense reimbursement process. The winning solution (available in the challenge forum) implements a MongoJS / NodeJS / Python API to solve the problem, along with a simple Angular frontend to demonstrate it in action. Our client approved the outcome of that challenge and wants to build further on that PoC, hence this new challenge in the series.

Here is the list of issues the client has with the current code version, to be addressed in this challenge:
  1. The current PoC does not support multiple bills in a single image. We should update the algorithm (adding further algorithms to the processing flow) to detect individual receipts in images that contain several, so that each can be handled as a separate receipt in the rest of the flow. This operation should be the first in the entire processing flow.
  2. The current PoC does not perform well when resized images are submitted: in some cases it detects resized versions of the same image as duplicates, but it does so inconsistently, and it does not work well when the size difference is large.
  3. The current PoC does not work well with content-tampered bills, i.e. when the applicant edits part of an old receipt and submits it again, for example changing the date and resubmitting the bill the next month. We want to add algorithms that can detect receipt manipulation without comparison against other images; i.e. this part of the processing flow should take a single receipt image as input and report whether the image is authentic or shows traces of manipulation of the receipt or of the image itself. This operation should be the second in the entire processing flow, and if a receipt is suspected of tampering, the code should report it right away, without proceeding to the duplicate comparison stage.
  4. The current PoC does not work well with skewed bills.
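
Issue 2 concerns duplicates that differ only in size. One common way to make a comparison less sensitive to resizing is to reduce every image to a small fixed grid before comparing. The sketch below illustrates that idea with a simple average hash in plain Python. It is not the method used by the existing PoC, and `resize_area`, `average_hash`, and `hamming` are hypothetical helper names. It assumes a grayscale image given as a list of rows of integers; a real implementation would more likely rely on an image library.

```python
def resize_area(pixels, size=8):
    """Shrink a grayscale image (list of rows) to size x size by block averaging."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)  # at least one source row per cell
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)  # at least one source column
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels, size=8):
    """Return a size*size-bit integer: one bit per cell, set if the cell is above the mean."""
    flat = [v for row in resize_area(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes; small values suggest similar images."""
    return bin(a ^ b).count("1")
```

Two images would then be treated as candidate duplicates when `hamming(average_hash(a), average_hash(b))` falls below a threshold chosen on real data. The threshold, and whether this kind of hash is robust enough for receipts, are open questions that this sketch does not answer.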
An important additional requirement is that algorithms #1 and #3 must process the input image in real time (i.e. under 5 seconds for images containing 1-4 receipts). Because they are the first operations in the processing flow, the app should fail an input image that contains a tampered receipt within 5 seconds.
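
The ordering and the time limit described above could be wired together roughly as follows. This is a minimal sketch, not the PoC's actual code: `process_image`, `FlowResult`, and the three callables (`split_receipts`, `is_tampered`, `is_duplicate`) are hypothetical names standing in for whatever the real detection, tampering, and comparison steps turn out to be. Only the 5-second budget and the order of the stages come from the requirements.

```python
import time
from dataclasses import dataclass

REALTIME_BUDGET_SECONDS = 5.0  # limit for stages 1 and 2, taken from the requirements


@dataclass
class FlowResult:
    status: str                 # "tampered", "duplicate", or "ok"
    receipts: int               # number of receipts found in the input image
    early_stage_seconds: float  # time spent on receipt detection plus tamper checks
    within_budget: bool         # True if the early stages met the real-time limit


def process_image(image, split_receipts, is_tampered, is_duplicate,
                  budget=REALTIME_BUDGET_SECONDS):
    """Run the stages in the required order, failing fast on tampering."""
    start = time.perf_counter()

    # Stage 1: detect the individual receipts in the image.
    receipts = split_receipts(image)

    # Stage 2: check each receipt on its own, without comparing to other images.
    tampered = any(is_tampered(r) for r in receipts)

    elapsed = time.perf_counter() - start
    within_budget = elapsed <= budget

    if tampered:
        # Report right away; the duplicate comparison stage is skipped.
        return FlowResult("tampered", len(receipts), elapsed, within_budget)

    # Stage 3: duplicate comparison, outside the real-time budget.
    found = any(is_duplicate(r) for r in receipts)
    return FlowResult("duplicate" if found else "ok",
                      len(receipts), elapsed, within_budget)
```

Passing the stages in as callables keeps the sketch independent of any particular algorithm, and the `within_budget` flag makes it easy to check the real-time requirement in verification runs.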

Final Submission Guidelines

Submit updated code, along with verification instructions and a demo video.

Reliability Rating and Bonus

For challenges that have a reliability bonus, the bonus depends on the reliability rating at the moment of registration for that project. A participant with no previous projects is considered to have no reliability rating, and therefore gets no bonus. Reliability bonus does not apply to Digital Run winnings. Since reliability rating is based on the past 15 projects, it can only have 15 discrete values.

ELIGIBLE EVENTS:

2018 Topcoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

CHALLENGE LINKS:

Review Scorecard
