
GSI Agent: Domain Knowledge Enhancement for Large Language Models in Green Stormwater Infrastructure


Shaohuang Wang

arXiv:2603.15643v1

Abstract: Green Stormwater Infrastructure (GSI) systems, such as permeable pavement, rain gardens, and bioretention facilities, require continuous inspection and maintenance to ensure long-term performance. However, domain knowledge about GSI is often scattered across municipal manuals, regulatory documents, and inspection forms. As a result, non-expert users and maintenance staff may struggle to obtain reliable and actionable guidance from field observations. Although Large Language Models (LLMs) have demonstrated strong general reasoning and language generation capabilities, they often lack domain-specific knowledge and may produce inaccurate or hallucinated answers in engineering scenarios. This limitation restricts their direct application to professional infrastructure tasks. In this paper, we propose GSI Agent, a domain-enhanced LLM framework designed to improve performance in GSI-related tasks. Our approach integrates three complementary strategies: (1) supervised fine-tuning (SFT) on a curated GSI instruction dataset, (2) retrieval-augmented generation (RAG) over an internal GSI knowledge base constructed from municipal documents, and (3) an agent-based reasoning pipeline that coordinates retrieval, context integration, and structured response generation. We also construct a new GSI Dataset aligned with real-world GSI inspection and maintenance scenarios. Experimental results show that our framework significantly improves domain-specific performance while maintaining general knowledge capability. On the GSI dataset, BLEU-4 improves from 0.090 to 0.307, while performance on the common knowledge dataset remains stable (0.304 vs. 0.305). These results demonstrate that systematic domain knowledge enhancement can effectively adapt general-purpose LLMs to professional infrastructure applications.

Executive Summary

This study presents GSI Agent, a domain-enhanced Large Language Model (LLM) framework for Green Stormwater Infrastructure (GSI) tasks. The framework combines supervised fine-tuning on a curated GSI instruction dataset, retrieval-augmented generation over a knowledge base built from municipal documents, and an agent-based reasoning pipeline. On the authors' GSI dataset, BLEU-4 rises from 0.090 to 0.307, while performance on a common knowledge dataset remains stable (0.304 vs. 0.305). The framework is aimed at helping non-expert users and maintenance staff obtain reliable, actionable guidance for GSI inspection and maintenance.

Key Points

  • GSI Agent is a domain-enhanced LLM framework designed to improve performance in GSI-related tasks
  • The framework integrates three strategies: supervised fine-tuning, retrieval-augmented generation, and agent-based reasoning
  • On the GSI dataset, BLEU-4 improves from 0.090 to 0.307, while performance on the common knowledge dataset remains stable (0.304 vs. 0.305)
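The agent pipeline described in the abstract (retrieval, context integration, structured response generation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Passage` type, the token-overlap retriever, the prompt wording, and the `generate` callable are all hypothetical stand-ins, since the abstract does not specify the retriever, the prompt format, or the model interface.

```python
import re
from collections.abc import Callable
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical document label, e.g. a manual name
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, kb: list[Passage], k: int = 2) -> list[Passage]:
    """Return up to k passages sharing the most tokens with the query.

    Token overlap is an assumed stand-in for whatever retriever the paper uses.
    """
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in kb]
    matches = [(s, p) for s, p in scored if s > 0]
    matches.sort(key=lambda sp: sp[0], reverse=True)  # stable: ties keep KB order
    return [p for _, p in matches[:k]]


def build_prompt(query: str, passages: list[Passage]) -> str:
    """Integrate retrieved context and ask for a structured answer."""
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        f"Context:\n{context}\n\n"
        f"Field observation: {query}\n"
        "Respond with: likely issue, recommended action, cited sources."
    )


def answer(query: str, kb: list[Passage], generate: Callable[[str], str]) -> dict:
    """Coordinate retrieval, context integration, and generation."""
    passages = retrieve(query, kb)
    prompt = build_prompt(query, passages)
    return {
        "query": query,
        "sources": [p.source for p in passages],
        "response": generate(prompt),
    }
```

In this sketch, `generate` would wrap the fine-tuned model; any callable that maps a prompt string to a response string works, which keeps the retrieval and prompt-building steps testable without a model.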

Merits

Strength in Domain Knowledge Enhancement

The proposed framework adapts a general-purpose LLM to a professional infrastructure application by drawing on a knowledge base constructed from municipal documents and on a curated GSI instruction dataset.

Improved Performance in GSI-Related Tasks

On the GSI dataset, BLEU-4 rises from 0.090 to 0.307 (roughly a 3.4-fold increase), while the score on the common knowledge dataset is essentially unchanged (0.304 vs. 0.305). This indicates the potential of GSI Agent for supporting GSI inspection and maintenance tasks.
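For readers unfamiliar with the metric, BLEU-4 combines clipped 1- to 4-gram precision with a brevity penalty. The following is a minimal single-reference, unsmoothed sketch of that definition. The abstract does not state which BLEU implementation, tokenization, or smoothing the authors used, so whitespace tokenization and the absence of smoothing here are assumptions, and scores from this sketch are not directly comparable to the reported figures.

```python
import math
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-4 against one reference, without smoothing."""
    cand, ref = candidate.split(), reference.split()  # assumed whitespace tokenization
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # unsmoothed: one zero precision zeroes the whole score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty applies only when the candidate is shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / 4)
```

Because the score depends only on n-gram overlap with a reference answer, it rewards matching wording rather than verifying that a maintenance recommendation is correct.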

Demerits

Limitation in Generalizability

The study focuses on GSI-related tasks, and it is unclear whether the proposed framework can be generalized to other professional infrastructure applications or domains.

Dependence on Curated Instruction Dataset

The framework's performance relies on the quality and completeness of the curated GSI instruction dataset, which may limit its effectiveness in real-world scenarios.

Expert Commentary

The study presents a promising approach to adapting LLMs to professional infrastructure applications. Two questions remain open: whether the framework generalizes beyond GSI, and whether the reported gains carry over to real-world use. The findings should also be read in the context of the broader discussion on the role of LLMs in professional infrastructure tasks.

Recommendations

  • Future research should investigate the generalizability of GSI Agent to other professional infrastructure applications and domains.
  • The study's findings should be replicated in real-world scenarios to evaluate the framework's effectiveness and potential limitations.
