Chai, Chengzhang
PDF: Accepted Post-Print Version (729 kB)
Abstract
Bridge inspection is crucial for infrastructure maintenance. Current computer vision-based inspections focus primarily on identifying simple defects such as cracks or corrosion, and these detection results serve merely as preliminary references for bridge inspection reports. To generate detailed reports, on-site engineers must still describe the structural conditions in lengthy textual passages, a process that is time-consuming, costly, and prone to human error. To bridge this gap, we propose a deep learning-based framework that generates detailed and accurate textual descriptions, laying the foundation for automating bridge inspection reports. The framework is built around an encoder-decoder architecture, using Convolutional Neural Networks (CNN) to encode image features and Gated Recurrent Units (GRU) as the decoder, combined with a dynamically adaptive attention mechanism. The experimental results demonstrate the approach's effectiveness and show that introducing the attention mechanism improves the generated descriptions. Notably, comparative experiments on image restoration revealed that the model's explainability requires further improvement. In summary, this study demonstrates the potential and practical application of image captioning techniques for bridge defect detection, and future research can further explore the integration of domain knowledge with artificial intelligence (AI).
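The core mechanism the abstract describes — attending over CNN-encoded image regions at each GRU decoding step — can be illustrated with a minimal sketch. The paper's exact "dynamically adaptive" formulation is not given here, so the following uses plain scaled dot-product attention in pure Python; the function names and toy dimensions are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(region_features, hidden_state):
    """Score each CNN region feature against the current decoder hidden
    state, normalise the scores with softmax, and return the attention
    weights plus the weighted context vector fed to the decoder."""
    d = len(hidden_state)
    scores = [dot(f, hidden_state) / math.sqrt(d) for f in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [sum(w * f[i] for w, f in zip(weights, region_features))
               for i in range(dim)]
    return weights, context

# Toy example: three image regions with 4-dim features; the decoder
# state "asks" for the first feature direction, so region 0 dominates.
regions = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.5, 0.5, 0.0, 0.0]]
h = [1.0, 0.0, 0.0, 0.0]
weights, context = attend(regions, h)
```

In a full captioning model these weights are recomputed at every decoding step, so the GRU can ground each generated word (e.g. "crack", "spalling") in a different part of the image.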
Item Type: | Conference or Workshop Item (Paper)
---|---
Status: | Published
Schools: | Engineering
Subjects: | T Technology > TA Engineering (General). Civil engineering (General)
Date of First Compliant Deposit: | 7 September 2024
Date of Acceptance: | 7 July 2024
Last Modified: | 2 November 2024 02:30
URI: | https://orca.cardiff.ac.uk/id/eprint/171911