Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Representation learning and deep neural approaches for international trade product classification

Anggoro, Angga 2025. Representation learning and deep neural approaches for international trade product classification. PhD Thesis, Cardiff University.
Item availability restricted.

PDF - Accepted Post-Print Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

International trade underpins the global economy: trade volumes reflect cross-border exchange, and tariffs function both as government revenue and as instruments of domestic market protection. The extensive documentation involved in international trade administration makes digitalisation necessary for effective trade facilitation. The complexity of text-based trade administration presents challenges in optimising information processing, validation and utilisation. Language models have emerged as a practical paradigm for handling the complexities of natural language, presenting research opportunities for optimising automated machine learning tasks in trade document processing. This thesis first addresses the challenge of classifying thousands of product categories by applying effective preprocessing strategies to trade data containing heterogeneous product attributes. Contextualised representations are generated from selected modern pretrained language models, fine-tuned on domain-specific trade datasets, and subsequently fed into a novel hierarchical classification network. This pipeline approach shows substantial performance improvements. Second, supervised contrastive learning is implemented using a Siamese network to enhance the sentence embeddings that represent trade product declarations. This approach allows flexible integration with machine learning classifiers and outperforms well-established methods for product classification. Third, to detect product misclassification, this thesis proposes a text anomaly detection framework based on autoencoders. Input representations from pretrained language models, combined with autoencoders, effectively identify contextualised anomalies under both semi-supervised and unsupervised learning settings. Finally, a bilingual product classification model is developed that incorporates cross-lingual keywords during model training. The proposed method aligns product representations and achieves superior performance in challenging low-resource language settings, utilising bilingual trade data. This thesis concludes that integrating product representations from modern language models with deep learning architectures shows strong potential for automating trade processes. Additionally, this thesis could serve as groundwork for both the methodological and practical aspects of machine learning-based systems in the trade domain.
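
The Siamese, supervised contrastive step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which loss the thesis uses, so the classic pairwise margin-based contrastive loss is assumed here purely for illustration. The function names, the margin value, the toy 2-D embeddings and the label strings are all hypothetical and are not taken from the thesis.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_contrastive_loss(embeddings, labels, margin=1.0):
    """Mean margin-based contrastive loss over all unordered pairs in a batch.

    Assumption: pairs sharing a product label are positives (pulled together),
    all other pairs are negatives (pushed at least `margin` apart).
    """
    losses = []
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            # Positive pair: penalise any remaining distance.
            losses.append(d ** 2)
        else:
            # Negative pair: penalise only if closer than the margin.
            losses.append(max(0.0, margin - d) ** 2)
    # A batch with fewer than two items has no pairs, hence no loss.
    return sum(losses) / len(losses) if losses else 0.0


# Toy embeddings standing in for encoder outputs of four declarations.
toy_embeddings = [[0.0, 0.0], [0.1, 0.0], [2.0, 0.0], [2.1, 0.0]]

# Labels consistent with the geometry: nearby points share a label.
clustered_loss = pairwise_contrastive_loss(
    toy_embeddings, ["code_A", "code_A", "code_B", "code_B"]
)

# Labels that contradict the geometry: nearby points differ in label.
mixed_loss = pairwise_contrastive_loss(
    toy_embeddings, ["code_A", "code_B", "code_A", "code_B"]
)
```

In this toy setting the loss is small when declarations with the same label already sit close together and large when they do not, which is the signal a Siamese encoder would be trained to minimise. Embeddings trained this way could then be passed to any downstream classifier, in line with the flexibility the abstract describes.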

Item Type: Thesis (PhD)
Date Type: Completion
Status: Unpublished
Schools: Schools > Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA76 Computer software
Funders: Indonesia Endowment Fund for Education Agency (LPDP)
Date of First Compliant Deposit: 3 March 2026
Date of Acceptance: 1 March 2026
Last Modified: 4 March 2026 10:40
URI: https://orca.cardiff.ac.uk/id/eprint/185433
