Anggoro, Angga Wahyu, Corcoran, Padraig ORCID: https://orcid.org/0000-0001-9731-3385, De Widt, Dennis ORCID: https://orcid.org/0000-0002-7299-5663 and Li, Yuhua ORCID: https://orcid.org/0000-0003-2913-4478
2023.
Using DistilBERT to assign HS codes to international trading transactions.
Presented at: World Conference on Information Systems and Technologies,
Pisa, Italy,
4 - 6 April 2023.
Item availability restricted.
PDF - Submitted Pre-Print Version (248kB). Restricted to repository staff only.
Abstract
One significant source of national revenue for many countries is the tax levied on international trade. Collecting this tax depends on accurately classifying traded commodities according to Harmonised System (HS) codes, which are then used to determine customs duty and tax rates. The current approach relies on HS codes that are filled out by international traders and manually inspected by customs officers. This approach is tedious and prone to error, potentially leading to fraudulent activity. Commodity texts, however, are hard to classify because they are short, noisy, ambiguous, and full of technical terms. To address these challenges, our research aims to determine HS codes automatically from the commodity description texts in trading transactions using text classification techniques. This paper proposes using transformer models, namely BERT and its variant DistilBERT, which is claimed to be lighter and faster than BERT and therefore better suited to deployment in computationally resource-constrained environments. The proposed approach adopts a transfer learning procedure to fine-tune the hyperparameters of BERT and DistilBERT. It is evaluated on real-world customs data for multi-class classification of commodity transactions in international trading. Experimental results indicate that both models achieve comparable performance.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Status: | Unpublished |
Schools: | Advanced Research Computing @ Cardiff (ARCCA); Business (Including Economics); Computer Science & Informatics |
Subjects: | H Social Sciences > HG Finance; Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Uncontrolled Keywords: | Customs Clearance, Harmonised System codes, Product Description Classification, Transfer Learning, DistilBERT |
Last Modified: | 15 Jun 2024 06:04 |
URI: | https://orca.cardiff.ac.uk/id/eprint/155999 |