Welcome to the John Hancock - Create Email Parser Contest. In this contest, you will write an email parser that extracts meaningful text from emails.
Our client wants to create a new Skill for Amazon Alexa. The Skill lets the user ask Alexa to read out a list of news headlines and then, after listening to them, ask Alexa to read the detailed news for a specified headline.
The source of this news will be email. The client receives periodic emails that contain news headlines and their details, and would like to use them as the source of the Skill’s news.
The setup will be as follows:
We will upload the emails in .txt format to Amazon S3.
You need to write an AWS Lambda function that periodically checks S3 for new emails and processes them. You can trigger it with Scheduled Events so that it runs periodically (once every 24 hours).
The emails will always follow a specific content format, so you need to go through each one and extract the news headlines from it. We believe regular expressions should be able to extract the data, but you can use an alternative approach if you think it is better. You also need to keep track of the files already processed so that you don’t process them again.
To store the news and the details of the processed emails, you can use a database of your choosing. The database, however, must be a cloud-based solution. Amazon DynamoDB or Mongolab’s MongoDB are some suggestions.
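The parsing and "already processed" steps above can be sketched as follows. This is only an illustration under stated assumptions: the real email format will be shared in the contest forum, so the HEADLINE:/DETAILS: layout used here is hypothetical, as are the function names parseEmail and selectUnprocessed. The syntax is kept to what Node.js v6 supports.

```javascript
'use strict';

// HYPOTHETICAL email layout, used only for illustration. The real format
// will be shared in the contest forum and the pattern must be adapted to it.
//
//   HEADLINE: <one line of text>
//   DETAILS: <one or more lines of text>
//   <blank line separates one news item from the next>
const ITEM_PATTERN = /^HEADLINE:[ \t]*(.+)\r?\nDETAILS:[ \t]*([\s\S]+)$/;

// Splits the email into blank-line-separated blocks and keeps only the
// blocks that match the assumed HEADLINE/DETAILS layout.
function parseEmail(text) {
  return text
    .split(/\r?\n\s*\r?\n/)
    .map(block => ITEM_PATTERN.exec(block.trim()))
    .filter(match => match !== null)
    .map(match => ({
      headline: match[1].trim(),
      // Collapse line breaks inside the details into single spaces.
      details: match[2].replace(/\s+/g, ' ').trim()
    }));
}

// Returns the file keys that have not been recorded as processed yet.
// `processedKeys` would come from the chosen cloud database.
function selectUnprocessed(allKeys, processedKeys) {
  const seen = new Set(processedKeys);
  return allKeys.filter(key => !seen.has(key));
}

// Usage with a made-up email body and made-up file names.
const sampleEmail = [
  'Daily news digest',
  '',
  'HEADLINE: Markets rally',
  'DETAILS: Stocks rose sharply',
  'after the announcement.',
  ''
].join('\n');

console.log(parseEmail(sampleEmail));
console.log(selectUnprocessed(['mail-1.txt', 'mail-2.txt'], ['mail-1.txt']));
```

Splitting into blocks first keeps the regular expression short, which should make it easier to adjust once the actual format is known.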
Points To Note
Use Node.js v6 for your Lambda function.
The major requirement for this contest is to get the parsing of the email correct. Extract the news from the email and format the data correctly (the format will be shared in the contest forum).
The minor requirement for this contest is the setup itself - creating the Lambda function and storing the data in the database.
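One possible way to wire the setup together is sketched below, using the callback-style handler signature (event, context, callback) that Lambda provides for Node.js. Everything in deps is a hypothetical name introduced for this sketch (listEmailKeys, isProcessed, readEmail, parseEmail, saveResult); in a real submission these would wrap the S3 client, the parser, and the chosen cloud database. Passing them in keeps the handler testable without AWS access.

```javascript
'use strict';

// Builds a Lambda handler from injected dependencies (all hypothetical):
//   listEmailKeys(cb)          -> cb(err, arrayOfKeys)   e.g. list the S3 bucket
//   isProcessed(key, cb)       -> cb(err, boolean)       lookup in the database
//   readEmail(key, cb)         -> cb(err, text)          fetch the .txt body
//   parseEmail(text)           -> array of news items
//   saveResult(key, news, cb)  -> cb(err)                store news, mark key done
function makeHandler(deps) {
  return function handler(event, context, callback) {
    deps.listEmailKeys((listErr, keys) => {
      if (listErr) { return callback(listErr); }
      const handled = [];

      // Process keys one at a time so that a failure stops the run early.
      function next(index) {
        if (index >= keys.length) {
          return callback(null, { processed: handled });
        }
        const key = keys[index];
        deps.isProcessed(key, (checkErr, done) => {
          if (checkErr) { return callback(checkErr); }
          if (done) { return next(index + 1); } // skip files seen before
          deps.readEmail(key, (readErr, text) => {
            if (readErr) { return callback(readErr); }
            const news = deps.parseEmail(text);
            deps.saveResult(key, news, (saveErr) => {
              if (saveErr) { return callback(saveErr); }
              handled.push(key);
              next(index + 1);
            });
          });
        });
      }

      next(0);
    });
  };
}
```

In a deployed function the result would typically be assigned to exports.handler, with the Scheduled Event acting only as the trigger; the event payload itself is not used in this sketch.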