Data sampling strategies for accurate fault analyses: A scale-independent test based on a machine learning approach

Alves, Tiago M., Taylor, Joshua, Corcoran, Padraig and Ze, Tao 2025. Data sampling strategies for accurate fault analyses: A scale-independent test based on a machine learning approach. Journal of Structural Geology 191, 105342. 10.1016/j.jsg.2025.105342

PDF - Published Version
Available under License Creative Commons Attribution.

License URL: http://creativecommons.org/licenses/by/4.0/

License Start date: 11 January 2025

Official URL: https://doi.org/10.1016/j.jsg.2025.105342

Abstract

Seismic and outcrop data from SE Brazil, Greece and SW England are used to develop a new method to correctly identify tectonic fault segments – either active or quiescent – using a machine learning approach. Three-dimensional (3D) analyses of tectonic faults are often based on the mapping of throw values (T) along their full length (D) or depth (Z) using a wide range of data. Yet, the collection of these throw values using geophysical or outcrop data is often time-consuming and onerous. In contrast to many empirical measurements of T/D and T/Z, our new method supports the mapping of active (or potentially active) fault segments and limits data undersampling, a caveat that results in the grouping of faults as single zones, systematically overlooking their natural segmentation. The new method is scale-independent and resulted in the definition of a minimum sampling ratio necessary for accurate fault segment mapping. Determined through the gradual downsampling of T/D and T/Z data to a critical point of information loss, the minimum sampling interval (δ) in T/D and T/Z data, expressed as a percentage of fault length, or height, is: a) for faults that are longer or higher than 3.5 km; b) for isolated faults that are shorter than 3.5 km in either length or height. This work is therefore important as it shows that one should never acquire T/D and T/Z data above a threshold value of 4% to identify successive, linked fault segments, whatever their scale. Total accuracy in fault-segment detection is only assured for δ values of 1% in the presence of fault zones with segments longer than 3.5 km. As a corollary, we confirm that T/D and T/Z data are often undersampled in the published literature, leading to a significant bias of subsequent interpretations towards coherent constant-length growth models when analyzing both active and old, quiescent fault systems.
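
The abstract expresses the sampling interval δ as a percentage of fault length or height. A minimal sketch of that arithmetic, and of downsampling a throw profile to a chosen δ, is given below. It is not the authors' machine learning workflow: the function names, the nearest-point selection rule and the synthetic throw profile are all assumptions made for illustration only.

```python
import math


def spacing_from_delta(extent_m: float, delta_percent: float) -> float:
    """Distance in metres between successive throw measurements.

    extent_m is the fault length (or height); delta_percent is the
    sampling interval expressed as a percentage of that extent.
    """
    if extent_m <= 0 or delta_percent <= 0:
        raise ValueError("extent and delta must be positive")
    return extent_m * delta_percent / 100.0


def downsample_profile(profile, extent_m, delta_percent):
    """Keep the measurement nearest to each multiple of the spacing.

    profile is a list of (distance_m, throw_m) pairs sorted by distance.
    Nearest-point selection is an illustrative choice, not the paper's.
    """
    step = spacing_from_delta(extent_m, delta_percent)
    n_intervals = math.floor(100.0 / delta_percent + 1e-9)
    kept = []
    for i in range(n_intervals + 1):
        target = i * step
        nearest = min(profile, key=lambda point: abs(point[0] - target))
        if not kept or nearest != kept[-1]:  # avoid duplicate picks
            kept.append(nearest)
    return kept


# Hypothetical synthetic profile: one throw value every 10 m along a 5 km fault.
extent = 5000.0
dense = [
    (float(x), 100.0 * abs(math.sin(2.0 * math.pi * x / extent)))
    for x in range(0, 5001, 10)
]

fine = downsample_profile(dense, extent, 1.0)    # delta = 1%
coarse = downsample_profile(dense, extent, 4.0)  # delta = 4%
print(len(dense), len(fine), len(coarse))
```

For a fault 3.5 km long, `spacing_from_delta(3500.0, 1.0)` gives a 35 m spacing and `spacing_from_delta(3500.0, 4.0)` gives 140 m, which shows how the same percentage translates into different absolute spacings at different fault scales.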

Item Type:	Article
Date Type:	Publication
Status:	Published
Schools:	Schools > Earth and Environmental Sciences; Schools > Computer Science & Informatics
Additional Information:	License information from Publisher: LICENSE 1: URL: http://creativecommons.org/licenses/by/4.0/, Start Date: 2025-01-11
Publisher:	Elsevier
ISSN:	0191-8141
Date of First Compliant Deposit:	15 January 2025
Date of Acceptance:	9 January 2025
Last Modified:	05 Feb 2025 12:00
URI:	https://orca.cardiff.ac.uk/id/eprint/175288
