The challenge is finished.

    Challenge Overview

    The objective of the Megahack is simple: develop a search portal that lets users search pending EPA laws and regulations using criteria that aren’t available today, see whether and how each result relates to Clean Air Act enforcement, and view links from those laws or regulations to programs that might be related.

    Contest Objective

    To associate proposed rules with informative facts about them, such as NAICS codes, program names, and the CFR (Code of Federal Regulations) citations that implement them, the EPA provides a dataset called LRS.  LRS stands for Legislative Rules Service.  This dataset contains lists of current legislation (laws), indexed by chapter, subchapter and section.  It also establishes relationships that associate these laws and chapters with other relevant information, such as the Programs the EPA uses to regulate activity and the NAICS codes that tie a law to a specific type of industry.

    In this challenge, your job is to create a Node.js parser whose output will be used in later contests to associate proposed rules (pulled from regulations.gov) with programs, codes and citations.  The interface will use the programs and codes to populate drop-downs.
    ---Build a Node.js module to parse LRS XML files (provided in forum)
    ---LRS data are SKOS (Simple Knowledge Organization System) XML files
    ---Node.js 6.x should be used
    ---Files should be parsed and stored in a database (in-memory is fine) for easy querying.

    The XML files should be parsed such that users of the parser can:

    1. Retrieve a list of NAICS codes

    2. Retrieve a list of Program names, and the CFR citations associated with them

    3. Retrieve a list of Laws, and the CFR citations associated with them

    Note that a likely use case for lists 1 and 2 is to populate drop-down autocomplete lists, so the ability to filter out duplicate entries will be useful.
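The three retrieval requirements and the duplicate filter can be sketched as a small in-memory store. This is only an illustration under stated assumptions: `createStore`, the method names, and the record shapes (`termID`, `code`, `name`, `citations`) are hypothetical and not prescribed by the challenge.

```javascript
'use strict';

// Hypothetical in-memory store. The record shapes are assumptions:
//   naics:    { termID, code, name }
//   programs: { termID, name, citations: [] }
//   laws:     { termID, name, citations: [] }
function createStore() {
  const data = { naics: [], programs: [], laws: [] };

  // Keep the first record seen for each value of `key`, preserving order.
  function uniqueBy(records, key) {
    const seen = new Set();
    return records.filter(function (record) {
      if (seen.has(record[key])) return false;
      seen.add(record[key]);
      return true;
    });
  }

  // Case-insensitive prefix match, as an autocomplete box might need.
  // An empty or missing prefix matches every record.
  function matchPrefix(records, key, prefix) {
    const wanted = String(prefix || '').toLowerCase();
    return records.filter(function (record) {
      return String(record[key]).toLowerCase().indexOf(wanted) === 0;
    });
  }

  function query(kind, prefix) {
    return uniqueBy(matchPrefix(data[kind], 'name', prefix), 'name');
  }

  return {
    add: function (kind, record) { data[kind].push(record); },
    getNaicsCodes: function (prefix) { return query('naics', prefix); },
    getPrograms: function (prefix) { return query('programs', prefix); },
    getLaws: function (prefix) { return query('laws', prefix); }
  };
}
```

Because a NAICS preferred label begins with its code (for example "61111 Elementary and Secondary Schools"), matching on the start of `name` also covers users who type a code. A real module would export `createStore` and might swap the arrays for an embedded database.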

    To help you understand how the LRS files work, the following tips are provided.  Note that the requirements themselves are listed above.

    1. Retrieve NAICS codes

    NAICS: WHERE TO GET A LIST OF THEM
    This file: NAICS2012_LRSRelationships_RDF-SKOS_20160825
    Code names: key on <skos:prefLabel>, e.g. <skos:prefLabel>61111 Elementary and Secondary Schools</skos:prefLabel>
    Code: <zthes:label>, e.g. <zthes:label>61111</zthes:label>
    Linking: use <zthes:termID>, e.g. <zthes:termID>2944589</zthes:termID>
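The NAICS tips above can be sketched as a small extraction routine. The sample input uses only the three elements named in the tips; wrapping them in a `<skos:Concept>` element is an assumption about the real file layout, and `firstText` and `parseNaics` are hypothetical helper names. Regular expressions keep the sketch dependency-free, but a real submission would more likely use a proper XML parser.

```javascript
'use strict';

// Sample input built from the elements named in the tips. The enclosing
// <skos:Concept> element is an assumption about the real file layout.
const sampleNaicsXml = [
  '<skos:Concept>',
  '  <skos:prefLabel>61111 Elementary and Secondary Schools</skos:prefLabel>',
  '  <zthes:label>61111</zthes:label>',
  '  <zthes:termID>2944589</zthes:termID>',
  '</skos:Concept>'
].join('\n');

// Return the trimmed text of the first <tag>...</tag> in a fragment, or null.
function firstText(fragment, tag) {
  const pattern = new RegExp('<' + tag + '(?:\\s[^>]*)?>([^<]*)</' + tag + '>');
  const match = pattern.exec(fragment);
  return match ? match[1].trim() : null;
}

// Collect one { termID, code, name } record per concept that has both
// a termID and a code; concepts missing either are skipped.
function parseNaics(xml) {
  const conceptPattern = /<skos:Concept(?:\s[^>]*)?>([\s\S]*?)<\/skos:Concept>/g;
  const records = [];
  let match;
  while ((match = conceptPattern.exec(xml)) !== null) {
    const body = match[1];
    const termID = firstText(body, 'zthes:termID');
    const code = firstText(body, 'zthes:label');
    const name = firstText(body, 'skos:prefLabel');
    if (termID && code) records.push({ termID, code, name });
  }
  return records;
}
```

Calling `parseNaics(sampleNaicsXml)` yields a single record whose `termID` can later be used for linking.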

    2. Retrieve Programs and their associated regulations

    PROGRAMS: WHERE TO GET A LIST OF THEM
    In EPAProgramProject_LRSRelationships_RDF-SKOS_20160825.xml, key on <skos:prefLabel>THIS WILL BE THE PROGRAM NAME</skos:prefLabel> to find your program names (like "Brownfields").
    Use that to build a list of Programs and their associated termIDs.
    Match each termID to <skm:PC rdf:resource="#<termId>" in any "CFR2015Title40..." file to find the regulations related to that program.
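The termID matching described above can be sketched as a two-step join: index the CFR concepts by the program termIDs they reference, then attach the matches to each program. Everything in the sample data is made up for illustration (the termIDs, the regulation labels, and the assumption that each regulation is a `<skos:Concept>` with a `<skos:prefLabel>`); only the `skm:PC rdf:resource="#..."` link comes from the tips. The function names are hypothetical.

```javascript
'use strict';

// Made-up sample data: term IDs and regulation labels are placeholders.
const samplePrograms = [
  { termID: '1001', name: 'Brownfields' },
  { termID: '1002', name: 'Sample Program' }
];

// Assumed layout: one <skos:Concept> per regulation, linking to programs
// through <skm:PC rdf:resource="#termID"> elements.
const sampleCfrXml = [
  '<skos:Concept>',
  '  <skos:prefLabel>Sample regulation A</skos:prefLabel>',
  '  <skm:PC rdf:resource="#1001"/>',
  '</skos:Concept>',
  '<skos:Concept>',
  '  <skos:prefLabel>Sample regulation B</skos:prefLabel>',
  '  <skm:PC rdf:resource="#1001"/>',
  '  <skm:PC rdf:resource="#1002"/>',
  '</skos:Concept>'
].join('\n');

// Build a Map from program termID to the labels of concepts referencing it.
function indexRegulationsByProgram(cfrXml) {
  const index = new Map();
  const conceptPattern = /<skos:Concept(?:\s[^>]*)?>([\s\S]*?)<\/skos:Concept>/g;
  let concept;
  while ((concept = conceptPattern.exec(cfrXml)) !== null) {
    const body = concept[1];
    const label = /<skos:prefLabel(?:\s[^>]*)?>([^<]*)<\/skos:prefLabel>/.exec(body);
    if (!label) continue; // skip concepts without a preferred label
    const linkPattern = /<skm:PC\s+rdf:resource="#([^"]+)"/g;
    let link;
    while ((link = linkPattern.exec(body)) !== null) {
      const termID = link[1];
      if (!index.has(termID)) index.set(termID, []);
      index.get(termID).push(label[1].trim());
    }
  }
  return index;
}

// Return a copy of each program with the citations found for its termID.
function attachRegulations(programList, index) {
  return programList.map(function (program) {
    return {
      termID: program.termID,
      name: program.name,
      citations: index.get(program.termID) || []
    };
  });
}
```

Building the index once per CFR file and merging the results keeps lookups cheap when many programs are joined against many volumes.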

    3. Retrieve Clean Air Act rules information

    REGULATIONS: HOW TO FIND THEM IN LRS DATA
    They're in the files labeled CFR2015Title40.....xml. Ignore any that don't begin with "CFR2015Title40".
    Note that these files contain more regulations than we care about.
    We care only about Parts 50 through 98, which are spread across many of the XML volumes. Search them to find the right Parts.
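The two filters described above (file name prefix, then Part range) can be sketched as follows. How a Part number is actually encoded in the LRS data is not stated in this brief, so `partNumber` assumes it appears in a label as "Part <number>"; that assumption, the helper names, and the sample file name used in the usage note are all hypothetical.

```javascript
'use strict';

const FILE_PREFIX = 'CFR2015Title40';
const FIRST_PART = 50;
const LAST_PART = 98;

// Keep only XML files whose names begin with the CFR2015Title40 prefix.
function selectCfrFiles(fileNames) {
  return fileNames.filter(function (name) {
    return name.indexOf(FILE_PREFIX) === 0 && /\.xml$/i.test(name);
  });
}

// ASSUMPTION: the Part number appears in the label as "Part <number>".
// Returns the number, or null when no such pattern is found.
function partNumber(label) {
  const match = /\bPart\s+(\d+)\b/i.exec(label);
  return match ? parseInt(match[1], 10) : null;
}

// True when the label names a Part from 50 through 98, inclusive.
function isPartInScope(label) {
  const part = partNumber(label);
  return part !== null && part >= FIRST_PART && part <= LAST_PART;
}
```

For example, `selectCfrFiles(['CFR2015Title40_sample.xml', 'EPAProgramProject_LRSRelationships_RDF-SKOS_20160825.xml'])` keeps only the first name, and `isPartInScope` can then be applied to each regulation found in the selected files.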

    Since there is no Node.js module to parse SKOS data, other options can be used. One example is a Python SKOS library (https://github.com/geo-data/python-skos). This is just a suggestion; competitors are free to suggest other options.

    Reference Documents
    ---Will be posted in the contest forum.

    Final Submission Guidelines

    ---A Node.js module that can be installed using npm
    ---GitHub or GitLab repository link with the source code. Add handles coderReview and rsial2 as collaborators.
    ---Deployment and usage instructions should be included in a README.md file

    Reliability Rating and Bonus

    For challenges that have a reliability bonus, the bonus depends on the reliability rating at the moment of registration for that project. A participant with no previous projects is considered to have no reliability rating, and therefore gets no bonus. Reliability bonus does not apply to Digital Run winnings. Since reliability rating is based on the past 15 projects, it can only have 15 discrete values.


    Final Review:

    Community Review Board


    User Sign-Off


    Review Scorecard