1. PROJECT OVERVIEW
People are exposed to many man-made chemicals throughout their lives. These include food ingredients and additives, pesticides, cosmetics, medicines, cleaners, solvents, etc. Historically, a series of standard animal studies have been used as a means to evaluate whether a chemical can cause a range of different adverse effects and at what dose these effects occur. The term “systemic toxicity” is often used because the effects can occur in different organ systems such as the liver, kidney, lungs, or reproductive system. The systemic Lowest Effect Level or LEL is the lowest dose that shows adverse effects in these animal toxicity tests. The LEL is then conservatively adjusted in different ways by regulators to derive a value that can be used by the Agency to set exposure limits that are expected to be tolerated by the majority of the population.
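To make the "conservatively adjusted" step concrete: regulators commonly divide an animal-study effect level by a product of uncertainty factors to derive a human reference dose. The sketch below is a simplified illustration only; the function name and default factor values are typical textbook defaults chosen for this example, not official EPA methodology.

```python
def reference_dose(lel_mg_per_kg_day: float,
                   interspecies_uf: float = 10.0,   # extrapolating animal -> human
                   intraspecies_uf: float = 10.0,   # variability among humans
                   lel_to_noael_uf: float = 10.0    # an LEL is used instead of a no-effect level
                   ) -> float:
    """Derive a conservative reference dose from a systemic LEL.

    Illustrative sketch: divides the LEL by the product of the
    uncertainty factors supplied.
    """
    total_uf = interspecies_uf * intraspecies_uf * lel_to_noael_uf
    return lel_mg_per_kg_day / total_uf

# Example: an LEL of 50 mg/kg/day with three 10x factors
# gives a reference dose of 0.05 mg/kg/day.
print(reference_dose(50.0))  # 0.05
```

The individual factor values vary by chemical and by the quality of the underlying studies; the structure of the calculation (effect level divided by stacked uncertainty factors) is the part this sketch is meant to show.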
Ideally, every chemical to which we are exposed would have a well-defined LEL. However, the full battery of animal studies required to estimate the LEL costs millions of dollars and takes many months to complete. As a result, thousands of chemicals lack the data needed to estimate an LEL. To help fill this gap, the EPA has screened nearly 2,000 chemicals across a battery of more than 700 biochemical and cell-based in vitro assays to identify what proteins, pathways, and cellular processes these chemicals interact with and at what concentration they interact. The goal of this challenge is to develop an algorithm that uses data from high-throughput in vitro assays, chemical properties, or chemical structural descriptors to quantitatively predict a chemical’s systemic LEL. Chemicals causing toxicity through inhibition of acetylcholinesterase (a common mechanism for neurotoxicity) are excluded from the challenge since they may skew the LEL values and can be identified in other ways.
2. ToxCast Project Background
In 2005, the federal government launched Tox21, an initiative to use in vitro high-throughput screening (HTS) to identify what proteins, pathways, and cellular processes chemicals interact with and at what concentration they interact. The screening data will be used to more cost-effectively and efficiently prioritize the thousands of chemicals that need in vivo toxicity testing and, in the future, to predict the potential toxicity of chemicals. Tox21 currently pools the resources and expertise of the EPA, the National Institute of Environmental Health Sciences/National Toxicology Program, the National Institutes of Health/National Center for Advancing Translational Sciences, and the Food and Drug Administration to screen almost 10,000 chemicals.
One of EPA’s main contributions to Tox21 is the Toxicity Forecaster or ToxCast for short. The first phase of ToxCast was designed as a “proof-of-concept” and was completed in 2009. It evaluated approximately 300 chemicals in over 500 biochemical and cell-based in vitro HTS assays. The 300 chemicals selected for the first phase were primarily data rich pesticides that have a large battery of in vivo toxicity studies performed on them. Data collection on the second phase of ToxCast was completed in 2013. It evaluated approximately 1,800 chemicals in an expanded set of over 700 biochemical and cell-based in vitro HTS assays. The 1,800 chemicals were from a broad range of sources, including industrial and consumer products, food additives, and potentially "green" substances that could be safer alternatives to existing chemicals. These chemicals were not as data rich as those selected for the first phase and many do not have in vivo toxicity studies. The in vitro data are accessible through the interactive Chemical Safety for Sustainability Dashboard (iCSS) and raw data files are also posted to the Dashboard web page.
In addition to the in vitro HTS data, the EPA has created a complementary Toxicity Reference Database (ToxRefDB), which captures the in vivo animal toxicity studies that have been performed on the ToxCast chemicals. For the first time, it provides detailed chemical toxicity data covering over 30 years and $2 billion in animal testing, in a publicly accessible and computable format.
In traditional toxicology, as described in the Project Overview, a series of standard animal studies is used to evaluate whether a chemical causes adverse effects across organ systems and at what dose, with the systemic LEL being the lowest dose that shows adverse effects in those tests. Regulators then conservatively adjust the LEL to derive exposure limits expected to be tolerated by the majority of the population. However, many chemicals do not have the animal studies required to derive an LEL.
3. CONTEST OVERVIEW
We are launching a search-and-mine-the-internet idea generation challenge to find software libraries and/or methods that can describe the chemical structure of various compounds. Our vision is to use such libraries and methods to give members different options for generating chemical structure descriptors. The chemical structure descriptors will be used together with the in vitro assays in a subsequent marathon match to predict the in vivo systemic LEL. At a high level, the structural descriptors will also help detect similarity between compound structures, making it possible to predict the response of an unknown chemical whose structure closely resembles that of a compound for which full information is available.
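To make the similarity idea concrete, here is a minimal, self-contained sketch. It assumes each compound has already been reduced to a set of structural features; the feature names and compound sets below are hypothetical. Real descriptor libraries (RDKit and the Chemistry Development Kit are two open-source examples) produce far richer fingerprints, but the comparison step is essentially the same Tanimoto calculation.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets:
    shared features divided by all features present in either set."""
    if not a and not b:
        return 1.0  # two empty fingerprints are trivially identical
    return len(a & b) / len(a | b)

# Hypothetical structural-feature sets for three compounds
# (illustrative labels, not output of any real descriptor library).
aspirin_like   = {"benzene_ring", "carboxylic_acid", "ester"}
salicylic_like = {"benzene_ring", "carboxylic_acid", "hydroxyl"}
alkane_like    = {"methyl", "chain_C4"}

print(tanimoto(aspirin_like, salicylic_like))  # 0.5 (2 shared of 4 total features)
print(tanimoto(aspirin_like, alkane_like))     # 0.0 (no shared features)
```

A high similarity score between an untested chemical and a well-characterized one is the basis for read-across: transferring the known compound's toxicity profile to its structural neighbor.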
We have opted for the idea generation format because we are looking for different options (ideas) among available libraries (if any) for describing chemical structure. This is an open-ended search: we place no specific restrictions on how structure is modeled, and we want to see as many ideas and libraries as you can propose.
Please remember that we are looking for libraries that can be reused as part of a larger piece of software or algorithm, not functionality embedded deep inside a tool where it cannot be reused as a component.
4. SUBMISSION GUIDELINES
- Please submit a document describing all such libraries and their usage details. Also explain how you think each library will be useful in generating chemical structure descriptors for our purpose and whether it is able to detect similarities in the structures of different compounds.
- Please provide complete information about the library's inputs and outputs. Also give details about the data formats the library uses.
- Please support your findings with examples. It would also be good if you could cite a paper or other work where your proposed resource has been successfully used.
- Include information on both proprietary and open-source libraries.
- For open-source libraries that are accessible to you, please submit the libraries themselves.
- We do NOT want just a list of libraries with web links. We want complete information about your proposal for the library and how it will help the project.
- You may provide web links where extensive documentation exists or where references are required.
5. REFERENCE RESOURCES
The following websites will provide all the information that can be used as reference for achieving the goals of this contest:
1. http://www.epa.gov/ncct/ - The data resource links under the Dashboard, Models, and Tools section on the left side are particularly important and useful.
2. iCSS Dashboard: http://actor.epa.gov/actor/faces/CSSDashboardLaunch.jsp (URL will be active the afternoon of 12/17/2013 Eastern US Time)
3. EPA Challenge Web Page: http://epa.gov/ncct/challenges.html
6. REVIEW GUIDELINES
The winner will be chosen by the client.
Please Note: This contest does NOT have a milestone round.
There will be a screening phase before the client review.