Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Data sampling strategies for accurate fault analyses: A scale-independent test based on a Machine Learning approach

Alves, Tiago M. ORCID: https://orcid.org/0000-0002-2765-3760, Taylor, Joshua, Corcoran, Padraig ORCID: https://orcid.org/0000-0001-9731-3385 and Ze, Tao 2025. Data sampling strategies for accurate fault analyses: A scale-independent test based on a Machine Learning approach. Journal of Structural Geology, 105342. doi:10.1016/j.jsg.2025.105342

PDF - Accepted Post-Print Version
Available under License Creative Commons Attribution.

License URL: http://creativecommons.org/licenses/by/4.0/
License Start date: 11 January 2025

Abstract

Seismic and outcrop data from SE Brazil, Greece and SW England are used to develop a new method to correctly identify tectonic fault segments, either active or quiescent, using a machine learning approach. Three-dimensional (3D) analyses of tectonic faults are often based on the mapping of throw values (T) along their full length (D) or depth (Z) using a wide range of data. Yet, the collection of these throw values using geophysical or outcrop data is often time-consuming and onerous. In contrast to many empirical measurements of T/D and T/Z, our new method supports the mapping of active (or potentially active) fault segments and limits data undersampling, a caveat that results in the grouping of faults as single zones, systematically overlooking their natural segmentation. The new method is scale-independent and resulted in the definition of a minimum sampling ratio necessary for accurate fault segment mapping. Determined through the gradual downsampling of T/D and T/Z data to a critical point of information loss, the minimum sampling interval (δ) in T/D and T/Z data, expressed as a percentage of fault length, or height, is: a) for faults that are longer or higher than 3.5 km; b) for isolated faults that are shorter than 3.5 km in either length or height. This work is therefore important as it shows that one should never acquire T/D and T/Z data at sampling intervals above a threshold value of 4% of fault length or height if the aim is to identify successive, linked fault segments, whatever their scale. Total accuracy in fault-segment detection is only assured for δ values of 1% in the presence of fault zones with segments longer than 3.5 km. As a corollary, we confirm that T/D and T/Z data are often undersampled in the published literature, leading to a significant bias of subsequent interpretations towards coherent constant-length growth models when analyzing both active and old, quiescent fault systems.
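
The effect of undersampling described in the abstract can be illustrated with a minimal sketch. It is not the authors' machine learning method: it assumes a purely synthetic throw-distance (T/D) profile built from five linked segments, samples it at an interval δ expressed as a percentage of fault length, and uses a simple count of local throw maxima as a crude stand-in for segment detection. The function names, the half-sine profile shape and the peak-counting rule are all hypothetical choices made for illustration.

```python
import math


def throw_at(x, n_segments=5):
    """Hypothetical normalised throw at fractional distance x (0 to 1) along a
    fault made of n_segments linked segments. Throw falls to zero at each
    linkage point and peaks at the centre of each segment (assumed shape)."""
    return abs(math.sin(math.pi * n_segments * x))


def sample_profile(delta_percent, n_segments=5):
    """Sample the synthetic T/D profile every delta_percent of fault length.
    Assumes delta_percent divides 100 evenly."""
    n_steps = round(100.0 / delta_percent)
    return [throw_at(i / n_steps, n_segments) for i in range(n_steps + 1)]


def count_throw_maxima(throws):
    """Count interior local throw maxima, used here as a crude proxy for the
    number of fault segments resolved by the sampled profile."""
    count = 0
    for left, mid, right in zip(throws, throws[1:], throws[2:]):
        # Strict on the left, non-strict on the right, so that a flat-topped
        # peak spanning two samples is counted once.
        if mid > left and mid >= right:
            count += 1
    return count


if __name__ == "__main__":
    for delta in (1.0, 25.0):
        n_found = count_throw_maxima(sample_profile(delta))
        print(f"delta = {delta:g}% of fault length: {n_found} segment(s) resolved")
```

Under these assumptions, sampling at δ = 1% resolves all five synthetic segments, whereas sampling at δ = 25% leaves a single throw maximum, so the five segments would be grouped as one fault zone. The intervals at which information is lost in this toy profile depend entirely on its assumed shape and should not be read as the thresholds reported in the paper.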

Item Type: Article
Date Type: Published Online
Status: In Press
Schools: Earth and Environmental Sciences; Computer Science & Informatics
Additional Information: License information from Publisher: LICENSE 1: URL: http://creativecommons.org/licenses/by/4.0/, Start Date: 2025-01-11
Publisher: Elsevier
ISSN: 0191-8141
Date of First Compliant Deposit: 15 January 2025
Date of Acceptance: 9 January 2025
Last Modified: 15 Jan 2025 11:00
URI: https://orca.cardiff.ac.uk/id/eprint/175288
