Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Autonomous complex knowledge mining and graph representation through natural language processing and transfer learning

Zhu, Xiaofeng, Li, Haijiang and Su, Tengxiang 2023. Autonomous complex knowledge mining and graph representation through natural language processing and transfer learning. Automation in Construction 155, 105074. 10.1016/j.autcon.2023.105074
Item availability restricted.

PDF - Accepted Post-Print Version
Restricted to Repository staff only until 6 September 2024 due to copyright restrictions.
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Regulatory documents play a significant role in securing engineering project quality, standard process management and long-term sustainable development. With the digitisation of knowledge in the AEC industry, demand for automated knowledge mining has emerged to cope with the substantial volume of regulations. However, current approaches to interpreting regulatory documents are still mostly labour-intensive and handle complex knowledge poorly. Based on transfer learning (BERT) and natural language processing (e.g., syntactic parsing), this paper proposes a fully automated knowledge mining framework that converts complex knowledge in textual regulations into graph-based knowledge representations. The framework uses a BERT-based engine, fine-tuned on a self-developed domain dataset, to extract clauses from regulation documents. A constituent extractor is developed to process provisions containing complex knowledge and extract their constituents. A knowledge modelling engine integrates the extracted constituents into a graph-based regulation knowledge model, which can be queried, visualised, and directly applied to downstream applications. The results demonstrate promising performance in complex knowledge mining and knowledge graph modelling in a case study based on ISO 19650. This research can effectively convert textual regulation documents into a corresponding regulatory knowledge base, contributing to automated knowledge acquisition and multi-domain knowledge fusion toward regulation digitalisation.
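
As a rough illustration of the final stage described in the abstract (integrating extracted constituents into a graph that can be queried), the following minimal sketch may help. It is not the authors' implementation: the (subject, relation, object) schema, the example constituents, and the function names `build_graph`, `obligations_of` and `reachable` are all assumptions made for illustration, and the clause extraction and constituent extraction steps are assumed to have already run.

```python
from collections import defaultdict

# Hypothetical constituents, as if already extracted from regulatory clauses.
# The wording and the (subject, relation, object) schema are assumptions made
# for illustration; they are not taken from the paper or quoted from ISO 19650.
constituents = [
    ("appointing party", "shall establish", "information requirements"),
    ("appointing party", "shall appoint", "lead appointed party"),
    ("lead appointed party", "shall deliver", "information model"),
]


def build_graph(triples):
    """Store triples as an adjacency map: subject -> list of (relation, object)."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


def obligations_of(graph, subject):
    """Return the (relation, object) pairs attached to one subject node."""
    return graph.get(subject, [])


def reachable(graph, start):
    """Collect every node reachable from `start` by following outgoing edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


graph = build_graph(constituents)
print(obligations_of(graph, "appointing party"))
print(sorted(reachable(graph, "appointing party")))
```

A plain adjacency map is used here only to keep the sketch dependency-free; a production knowledge model would more plausibly sit in a graph database or an RDF store, which the abstract does not specify.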

Item Type: Article
Date Type: Publication
Status: Published
Schools: Engineering
Publisher: Elsevier
ISSN: 0926-5805
Date of First Compliant Deposit: 6 September 2023
Date of Acceptance: 26 August 2023
Last Modified: 15 Nov 2023 16:31
