Accelerating Workflows with Generative AI: The EPA’s Document Processing Journey
The United States Environmental Protection Agency (EPA) is stepping into the future with its initiative, “Powering the Great American Comeback,” which aims to position the U.S. as the global hub for artificial intelligence (AI). Within this framework, the EPA’s Office of Chemical Safety and Pollution Prevention (OCSPP) and Office of Pesticide Programs (OPP) offer concrete examples of the mission in action. At the AWS Summit Washington, DC 2025, they presented their collaboration with the AWS Generative AI Innovation Center (GenAIIC), unveiling two proofs of concept (POCs) intended to speed up workflows and save taxpayer money.
The EPA’s core mission revolves around safeguarding human health and the environment, actively protecting communities from risks associated with pesticides and chemicals. In an effort to leverage technological advances, the agency, alongside the AWS GenAIIC, developed these POCs in 2024 to streamline operations and enhance efficiency in evaluating studies—a task often laden with intensive manual labor.
AI as an Ally in Science
The EPA’s approach to implementing AI focuses on its role as a supportive tool for staff rather than as an independent decision-maker. This careful consideration ensures that scientific integrity remains at the forefront. The collaboration has established several foundational principles:
- Transparency: The AI system provides confidence scores for its outputs and clearly indicates when AI-generated content has undergone human review.
- Verification Tools: Scientists can engage with an integrated chatbot to clarify AI-generated responses by posing specific queries such as, “What was the sample size in this experiment?”
- Human Control: All AI-generated content undergoes vetting by scientists, who have the final say in any modifications before the information is published.
- Responsible AI Practices: The collaboration employs Amazon Bedrock Guardrails to support accurate scientific results and to guard against AI “hallucination.”
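One way to picture how these principles fit together is a record that carries a confidence score and a human-review flag alongside each AI answer. The sketch below is illustrative only: the class name `AIEvaluation`, its fields, and the `review_queue` helper are assumptions made for this example, not the EPA's or AWS's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AIEvaluation:
    """One AI-generated answer for one criterion (hypothetical schema)."""
    criterion: str
    ai_answer: str
    confidence: float                   # model-reported score, assumed to lie in [0, 1]
    reviewed_by: Optional[str] = None   # scientist who vetted the answer, if any
    final_answer: Optional[str] = None  # set only once a human has signed off

    def approve(self, reviewer: str, edited_answer: Optional[str] = None) -> None:
        """Record human sign-off, optionally replacing the AI text."""
        self.reviewed_by = reviewer
        self.final_answer = self.ai_answer if edited_answer is None else edited_answer

    @property
    def publishable(self) -> bool:
        """Only human-reviewed content is eligible for publication."""
        return self.reviewed_by is not None


def review_queue(evals: List[AIEvaluation]) -> List[AIEvaluation]:
    """Return unreviewed items, lowest confidence first."""
    pending = [e for e in evals if not e.publishable]
    return sorted(pending, key=lambda e: e.confidence)
```

Sorting the queue by confidence is a design choice for this sketch: it surfaces the answers the model is least sure about first, while still requiring a scientist's approval on every item.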
Streamlining Chemical Risk Assessments
Evaluating the vast array of chemicals in commerce is a monumental task for the EPA. Traditionally, two scientists assess each study against a series of quality criteria, such as reporting quality and exposure methods. The process is painstakingly slow and demands considerable time and staff effort.
Sean Watford, a Senior Environmental Data and Systems Scientist at the EPA, recognized the inefficiencies within this framework and set out to find an alternative solution in collaboration with the AWS GenAIIC. They created a POC to concurrently evaluate studies against pre-established criteria, aiming to significantly cut down on the current review timelines.
The proposed system aims to bring tasks that once took an hour down to mere minutes, freeing experts from prolonged bottlenecks so they can focus on critical issues requiring human judgment. When research papers are uploaded to Amazon S3, Amazon Textract extracts and processes the text, which AI tools then evaluate.
The Chemical Assessment Process
- Intelligent Document Processing (IDP): Papers uploaded to S3 are processed by Textract, which efficiently pulls relevant information while preserving essential layouts.
- Automated Evaluation: Integration with Amazon Bedrock and Anthropic Claude 3.7 allows the system to generate responses across nine historical evaluation criteria, drastically reducing manual labor.
- Human-In-The-Loop Review: Scientists can prompt the AI, using a user-friendly interface and chatbot, to verify information and review AI-generated outputs.
- Quality Control: Experts maintain the ability to modify the AI’s evaluations before publication, ensuring sustained accuracy and reliability.
- Multi-Document Intelligence: The AI offers insights across multiple studies, enabling a broader understanding of patterns and implications.
- Seamless Integration: The solution is designed to mesh perfectly with existing EPA workflows for minimal disruption.
- Cost-Effectiveness: Batch processing of studies dramatically cuts down on Amazon Bedrock inference costs.
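The first three steps above can be sketched as a small pipeline: extract text once, ask the model one question per criterion, and mark every answer as unreviewed. This is a minimal sketch under stated assumptions, not the POC's implementation. The function names, the prompt wording, and the two-item `CRITERIA` tuple are hypothetical (the article names only two of the nine criteria), and the `extract_text` and `ask_model` callables stand in for whatever wraps the Amazon Textract and Amazon Bedrock calls in a real deployment.

```python
from typing import Callable, Dict, Sequence

# Placeholder list: the article names only these two of the nine criteria.
CRITERIA: Sequence[str] = ("Reporting quality", "Exposure methods")


def build_prompt(study_text: str, criterion: str) -> str:
    """Compose one evaluation prompt for a single criterion (hypothetical wording)."""
    return (
        f"Evaluate the study below against this criterion: {criterion}.\n"
        "Give a rating and a short justification.\n\n"
        f"STUDY TEXT:\n{study_text}"
    )


def evaluate_study(
    s3_key: str,
    extract_text: Callable[[str], str],
    ask_model: Callable[[str], str],
    criteria: Sequence[str] = CRITERIA,
) -> Dict[str, dict]:
    """Extract the text once, then query the model once per criterion.

    Every result starts with human_reviewed=False so that a scientist
    must sign off before anything is published.
    """
    text = extract_text(s3_key)
    return {
        c: {"ai_answer": ask_model(build_prompt(text, c)), "human_reviewed": False}
        for c in criteria
    }


if __name__ == "__main__":
    # Stubs stand in for the document-extraction and model-inference services.
    fake_extract = lambda key: f"(text extracted from {key})"
    fake_model = lambda prompt: "Rating: Medium. Justification: stub."
    results = evaluate_study("papers/study-001.pdf", fake_extract, fake_model)
    for criterion, result in results.items():
        print(criterion, "->", result["ai_answer"])
```

Passing the two services in as callables keeps the evaluation logic testable offline and makes it easy to swap in batch inference later.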
The likely outcomes of these innovative measures include:
- An 85% reduction in processing time: Some tasks that previously took months can now conclude in a matter of hours or days.
- An 85% accuracy rate for AI-generated evaluations, allowing EPA scientists to allocate their expertise towards more complex inquiries.
- An overall cost-effective solution, with a processing price of just $40 for 250 research papers, compared to the traditional method that consumed hundreds of hours.
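To make these figures concrete, the short calculation below works out the per-paper cost implied by the reported $40 for 250 papers, and shows what an 85% time reduction means for a given baseline. The 100-hour baseline is a made-up input chosen only for illustration; the article does not give a specific figure.

```python
# Reported figures from the POC.
total_cost_usd = 40.0
papers = 250

cost_per_paper = total_cost_usd / papers
print(f"Cost per paper: ${cost_per_paper:.2f}")  # prints "Cost per paper: $0.16"


def time_after_reduction(baseline_hours: float, reduction: float = 0.85) -> float:
    """Hours remaining after cutting a baseline by the given fraction."""
    return baseline_hours * (1.0 - reduction)


# Hypothetical baseline of 100 hours, used only to illustrate the 85% figure.
print(f"Remaining hours: {time_after_reduction(100.0):.1f}")
```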
For Watford, this initiative marked a significant turning point, revealing that foundational models can indeed assist scientists in their critical missions.
Streamlining FIFRA Applications
The Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA) requires EPA approval for pesticide products to ensure they do not pose undue risks. This responsibility brings an influx of over ten thousand regulatory applications annually—often resulting in OPP scientists facing extensive workloads, primarily due to insufficient staffing and inefficient review processes.
In 2024, Daniel Schoeff, a Senior Advisor at OPP, collaborated with the AWS GenAIIC to develop a POC aimed at expediting the creation of data evaluation records (DERs) required for each study submission. Leveraging technologies like Amazon Textract and Bedrock, this initiative promises revolutionary improvements.
The FIFRA Streamlining Process
- IDP Implementation: A batch system takes the necessary documents, extracting text efficiently through Amazon Textract.
- Vector Embeddings: DERs are converted into a format conducive to intelligent search, improving researchers’ ability to navigate extensive data.
- Intelligent Search Capabilities: Advanced retrieval techniques allow scientists to conduct detailed inquiries across the FIFRA dataset using Amazon Bedrock Knowledge Bases.
- HITL Verification: By comparing DERs generated through the AI with manual evaluations, the EPA gauges the consistency and reliability of AI outputs.
- Seamless Compatibility: The created systems are designed for easy integration into existing EPA workflows to further enhance efficiency.
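The idea behind the vector embedding and search steps can be shown with a toy example: turn each DER into a vector, turn the query into a vector, and rank documents by cosine similarity. The sketch below is not how Amazon Bedrock Knowledge Bases works internally. It swaps in simple word counts for a real embedding model, and the DER snippets and IDs are invented for illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts, a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, index: Dict[str, Counter], top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, most similar first."""
    q = embed(query)
    scored = [(doc_id, cosine(q, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Invented DER snippets, for illustration only.
ders = {
    "DER-001": "Acute oral toxicity study in rats with observed mortality at high dose",
    "DER-002": "Honey bee contact toxicity study for a foliar insecticide",
    "DER-003": "Residue chemistry field trial on corn grain",
}
index = {doc_id: embed(text) for doc_id, text in ders.items()}

best_id, best_score = search("bee toxicity", index, top_k=1)[0]
print(best_id)  # prints "DER-002"
```

A production system would use learned embeddings, which match on meaning rather than exact words, but the ranking step follows the same pattern.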
Expected improvements from this project include:
- A staggering 99% reduction in processing time, transforming tasks that typically required four months into mere seconds.
- An increase in accuracy of AI-generated evaluations, especially notable in the lower to medium complexity batches.
- A radical 99% reduction in cost associated with creating DERs, leading to substantial savings for the agency.
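The HITL comparison between AI-generated and manual DERs, broken down by complexity batch, could be tallied along the following lines. This is a sketch under assumptions: the article does not describe how agreement was scored, so exact-match comparison of a single conclusion field, the tier labels, and the function name `agreement_by_tier` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def agreement_by_tier(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Fraction of AI conclusions matching the manual one, per complexity tier.

    Each record is (complexity_tier, ai_conclusion, manual_conclusion).
    Matching ignores case and surrounding whitespace (an assumption of this sketch).
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for tier, ai, manual in records:
        totals[tier] += 1
        hits[tier] += int(ai.strip().lower() == manual.strip().lower())
    return {tier: hits[tier] / totals[tier] for tier in totals}


# Made-up records, for illustration only.
sample = [
    ("low", "Acceptable", "acceptable"),
    ("low", "Acceptable", "Acceptable"),
    ("high", "Acceptable", "Unacceptable"),
    ("high", "Supplemental", "Supplemental"),
]
print(agreement_by_tier(sample))  # prints {'low': 1.0, 'high': 0.5}
```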
Schoeff considers this a remarkable success, with aspirations for full-scale implementation in fiscal year 2026.
Lessons in Federal Government Innovation
The EPA’s experience illustrates key insights for federal agencies exploring AI integration:
- Start with a Mission Focus: Identify specific processes where AI can yield significant improvements.
- Emphasize Human-Centric Design: Ensure that domain experts remain at the helm while AI handles repetitive tasks.
- Measure Progress: Regularly assess accuracy and efficiency gains to validate value and guide innovations.
- Prioritize Cost-Effectiveness: AI technologies can provide robust returns on investment by streamlining operations and enabling staff to focus on higher-value tasks.
The Path Forward
The EPA’s collaboration with AWS illuminates how generative AI can transform government operations. By automating labor-intensive areas while preserving rigorous scientific standards, the agency exemplifies how federal authorities can responsibly harness advanced technologies to better serve the community and uphold public health and safety.
Furthermore, ongoing advancements suggest a promising horizon for even more effective practices in regulatory processes, ultimately enhancing the protection of our environment and populations.
For continued updates on how AWS is impacting the future through generative AI capabilities, consider visiting the AWS Generative AI Innovation Center.
Disclaimer: The US EPA and its employees did not contribute to the writing of this post. EPA and its employees do not endorse any commercial products, services, or enterprises.