Anggoro, Angga 2025. Representation learning and deep neural approaches for international trade product classification. PhD Thesis, Cardiff University.
Item availability restricted.
PDF - Accepted Post-Print Version (5MB)
Available under License Creative Commons Attribution Non-commercial No Derivatives.

PDF (Cardiff University Electronic Publication Form) - Supplemental Material (282kB)
Restricted to Repository staff only.
Abstract
International trade underpins the global economy, with trade volumes reflecting cross-border exchange and tariffs functioning both as government revenue and as instruments of domestic market protection. The extensive documentation in international trade administration underlines the need for digitalisation to ensure effective trade facilitation. The complexity of text-based trade administration presents challenges in optimising information processing, validation and utilisation. Employing language models has emerged as a practical paradigm for dealing with the complexities of natural language, presenting research opportunities for optimising automated machine learning tasks for trade document processing. This thesis first addresses the challenge of classifying thousands of product categories by applying effective preprocessing strategies to trade data containing heterogeneous product attributes. Contextualised representations were generated from selected modern pretrained language models, fine-tuned on domain-specific trade datasets, and subsequently fed into a novel hierarchical classification network. This pipeline approach shows substantial performance improvements. Second, supervised contrastive learning was implemented using a Siamese network to enhance sentence embeddings that represent trade product declarations. This approach integrates flexibly with machine learning classifiers and outperforms well-established methods for product classification. Third, to detect product misclassification, this thesis proposes a text anomaly detection framework based on autoencoders. Input representations from pretrained language models, combined with autoencoders, effectively identified contextualised anomalies under both semi-supervised and unsupervised learning settings. Finally, a bilingual product classification model was developed that incorporated cross-lingual keywords during model training. The proposed method aligns product representations and achieves superior performance in challenging low-resource language settings, utilising bilingual trade data. This thesis concludes that the integration of product representations from modern language models with deep learning architectures presents strong potential to automate trade processes. Additionally, this thesis could serve as groundwork for both methodological and practical aspects of machine learning-based systems in the trade domain.
| Item Type: | Thesis (PhD) |
|---|---|
| Date Type: | Completion |
| Status: | Unpublished |
| Schools: | Schools > Computer Science & Informatics |
| Subjects: | Q Science > QA Mathematics > QA76 Computer software |
| Funders: | Indonesia Endowment Fund for Education Agency (LPDP) |
| Date of First Compliant Deposit: | 3 March 2026 |
| Date of Acceptance: | 1 March 2026 |
| Last Modified: | 04 Mar 2026 10:40 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/185433 |