Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Using DistilBERT to assign HS codes to international trading transactions

Anggoro, Angga Wahyu, Corcoran, Padraig, De Widt, Dennis and Li, Yuhua 2023. Using DistilBERT to assign HS codes to international trading transactions. Presented at: World Conference on Information Systems and Technologies, Pisa, Italy, 4 - 6 April 2023.
Item availability restricted.

PDF (189 final submitted.pdf) - Submitted Pre-Print Version
Restricted to Repository staff only

One significant source of national revenue for many countries is the tax levied on international trade. Tax collection relies on accurately classifying internationally traded commodities according to Harmonised System (HS) codes, which are then used to impose customs duty and tax rates. The current approach relies on HS codes that are filled out by international traders and manually inspected by customs officers. This approach is tedious and prone to error, potentially leading to fraudulent activity. Automating the task is difficult, however, because commodity texts are hard to classify: they are short, noisy, ambiguous, and full of technical terms. To address these challenges, our research aims to determine HS codes automatically from the commodity description texts in trading transactions using text classification techniques. This paper proposes using transformer models, namely BERT and its variant DistilBERT, which is claimed to be lighter and faster than BERT and therefore has the advantage of being deployable in environments with constrained computational resources. The proposed approach adopts a transfer learning procedure to fine-tune the hyperparameters of BERT and DistilBERT. It is evaluated on real-world customs data for multi-class classification of commodity transactions in international trading. Experimental results indicate that both models achieve comparable performance.

Item Type: Conference or Workshop Item (Paper)
Status: Unpublished
Schools: Business (Including Economics)
Computer Science & Informatics
Subjects: H Social Sciences > HG Finance
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Uncontrolled Keywords: Customs Clearance, Harmonised System codes, Product Description Classification, Transfer Learning, DistilBERT
Last Modified: 07 Nov 2023 06:53
