Hewlett Packard has developed a set of JSON-based REST APIs that provide “Big Data”-type processing capabilities, allowing developers to process information embedded in unstructured text and images, formats that were previously inaccessible. This platform is called IDOL OnDemand. It is currently in the Early Access release phase and is open for all Innovators to use.
As a proof of concept for the APIs, the [topcoder] community has developed a mobile web application that reads business cards and stores the data in Salesforce.com. The basic development for the web application is complete, but there are a few significant issues that we need to address:
1. Images from mobile phones can often be as large as 3MB, which causes time-outs when these files are sent to the IDOL OnDemand platform. After the initial image files are uploaded to the Business Card Reader web server, we need to process them so that they are 100-200K in size at most, ideally without losing quality. This should prevent many of the timeout issues that we're currently experiencing.
2. The HP IDOL OnDemand OCR Document API call returns a certain amount of garbage text when it doesn't read an image cleanly. Let's use regular expressions or some other mechanism to clean up non-ASCII characters. It would also be helpful to filter out nonsense words. Hopefully, this will assist the entity extraction process.
3. The OCR extraction process isn't currently working that well. We need to do some experimentation to explore the optimal conditions and parameters for text extraction. There is an article on the HP IDOL OnDemand Community site which gives some guidance on this subject here.
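As a starting point for requirements #1 and #2, the following is a minimal sketch rather than the application's actual code. It uses only the standard javax.imageio and java.util.regex facilities. The class name CardPreprocessor, its method names, the 1600-pixel dimension cap, and the "at least half letters or digits" rule for spotting nonsense tokens are all assumptions made for illustration and should be tuned against the attached sample cards.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper; not part of the existing Business Card Reader code.
final class CardPreprocessor {

    /** Scales the image so its longer side is at most maxDim pixels; never upscales. */
    static BufferedImage scaleDown(BufferedImage src, int maxDim) {
        int longer = Math.max(src.getWidth(), src.getHeight());
        double ratio = Math.min(1.0, (double) maxDim / longer);
        int w = Math.max(1, (int) Math.round(src.getWidth() * ratio));
        int h = Math.max(1, (int) Math.round(src.getHeight() * ratio));
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return dst;
    }

    /** Encodes the image as JPEG with an explicit quality between 0.0 and 1.0. */
    static byte[] encodeJpeg(BufferedImage img, float quality) {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (MemoryCacheImageOutputStream out = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(out);
            writer.write(null, new IIOImage(img, null, null), param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /** Downscales, then lowers JPEG quality step by step until the size target is met. */
    static byte[] shrinkToJpeg(BufferedImage src, int maxDim, int maxBytes) {
        BufferedImage scaled = scaleDown(src, maxDim);
        byte[] jpeg = encodeJpeg(scaled, 0.85f);
        // Stops at 35% quality even if the target is still missed, to keep the card readable.
        for (int pct = 75; jpeg.length > maxBytes && pct >= 35; pct -= 10) {
            jpeg = encodeJpeg(scaled, pct / 100f);
        }
        return jpeg;
    }

    /** Replaces non-printable and non-ASCII characters with spaces, then drops noisy tokens. */
    static String cleanOcrText(String raw) {
        String ascii = raw.replaceAll("[^\\x20-\\x7E\\r\\n]", " ");
        StringBuilder out = new StringBuilder();
        for (String line : ascii.split("\\r?\\n")) {
            List<String> kept = new ArrayList<>();
            for (String token : line.trim().split("\\s+")) {
                if (looksLikeWord(token)) {
                    kept.add(token);
                }
            }
            if (!kept.isEmpty()) {
                if (out.length() > 0) {
                    out.append('\n');
                }
                out.append(String.join(" ", kept));
            }
        }
        return out.toString();
    }

    /** Assumed heuristic: keep a token if at least half of its characters are letters or digits. */
    static boolean looksLikeWord(String token) {
        if (token.isEmpty()) {
            return false;
        }
        long alnum = token.chars().filter(Character::isLetterOrDigit).count();
        return alnum * 2 >= token.length();
    }
}
```

A caller might run CardPreprocessor.shrinkToJpeg(ImageIO.read(uploadedFile), 1600, 200 * 1024) before sending the image to the OCR API, then pass the returned text through CardPreprocessor.cleanOcrText before entity extraction. Note that the token heuristic also drops standalone symbols such as "&", so it may need loosening for company names.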
Getting Started with HP IDOL OnDemand
Before you can use the APIs you’ll need to sign up for an IDOL OnDemand developer account:
Please indicate that you heard about IDOL OnDemand through [topcoder] in the “How did you hear about IDOL OnDemand?” field:
Once your account has been verified you’ll be assigned a developer account and API Key that will allow you to make API calls. Complete information about available IDOL OnDemand APIs can be found here:
You’ll need to register for a developer account with HP in order to get access to additional Try functionality in the API console. Use of the APIs is free and restricted to non-commercial use at this time. Commercial use and pricing will be announced in the near future.
Before you compete in an IDOL-related challenge on [topcoder], please create a topcoder-specific key in your IDOL OnDemand account. You can do this by clicking on Account->API Keys from the developer home page.
Simply generate a new key and rename it to “topcoder” as shown above. This should be the key that you use in [topcoder] challenge completion. This will also give you visibility into Preview APIs which may not yet be in public release.
You should be all set!
Final Submission Guidelines
- Submit a .zip file with your code. Please use the existing maven build scripts to build and deploy your code. You should (of course) add any additional dependencies that are required.
- Use the existing code as a starting point for your submission. This application is an HTML5 Java-based Web Application. This code is attached to the challenge.
- Update the code as required to meet requirements #1, #2, and #3 outlined above.
- You may change the size and/or format of the images processed by the Business Card Reader Application and loaded to SFDC, but the images still need to be readable both in the application and when users download them from SFDC.
- A zip file of sample business card images is attached for experimentation purposes.
- Written documentation for your submission
- Video with a screenshare of your code in action
- A Salesforce.com developer instance has been whitelisted and set up for your convenience. There is a Remote Application for the Business Card Reader configured in this organization as well. URL: https://login.salesforce.com Username: email@example.com Password: @ppirio123 Security Token: QyYsehqOM2OCYff1kXR2Tmjif
Employees and direct and indirect subcontractors of Hewlett-Packard Company and its subsidiaries and other affiliates (“HP”), and employees and direct and indirect subcontractors of HP’s partners (including TopCoder and its affiliates) are not eligible to participate in the challenge.