Aldahawi, Hanaa and Allen, Stuart Michael
Abstract
The rapid growth in social media data has motivated the development of a real-time framework to understand and extract the meaning of the data. Text categorization is a well-known method for understanding text. It can take many forms, such as authorship detection and text mining, in which useful information is extracted from documents so that a set of documents can be sorted automatically into predefined categories. Here, we propose a method for classifying the authors of tweets into categories. The task is performed by extracting key features from tweets and passing them to a machine learning classifier. The research shows that this multi-class classification task is very difficult, in particular the building of a domain-independent machine learning classifier. Our problem specifically concerned tweets about oil companies, most of which were noisy enough to affect classification accuracy. The analytical technique used here provided structured and valuable information for oil companies.
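The pipeline summarized in the abstract (extract features from each tweet, then pass them to a classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the features, the classifier, or the author categories used in the paper, so this sketch uses bag-of-words features with a multinomial Naive Bayes classifier, the hypothetical labels `media` and `public`, and a handful of invented toy tweets.

```python
import math
import re
from collections import Counter, defaultdict

# Matches URLs first, then words with an optional leading @ or #.
TOKEN_RE = re.compile(r"https?://\S+|[@#]?\w+")


def extract_features(tweet):
    """Turn a tweet into a bag of lowercase tokens (hypothetical feature set)."""
    features = Counter()
    for token in TOKEN_RE.findall(tweet.lower()):
        if "://" in token:
            features["<url>"] += 1      # collapse every URL into one feature
        elif token.startswith("@"):
            features["<mention>"] += 1  # collapse every @mention into one feature
        else:
            features[token] += 1
    return features


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha smoothing (an assumed classifier)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, tweets, labels):
        for tweet, label in zip(tweets, labels):
            feats = extract_features(tweet)
            self.class_counts[label] += 1
            self.token_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, tweet):
        feats = extract_features(tweet)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_tokens = sum(self.token_counts[label].values())
            denom = total_tokens + self.alpha * len(self.vocab)
            for token, count in feats.items():
                if token in self.vocab:  # ignore tokens never seen in training
                    num = self.token_counts[label][token] + self.alpha
                    score += count * math.log(num / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data with hypothetical author categories, not the paper's data.
train = [
    ("Oil company reports quarterly profits http://example.com/a", "media"),
    ("Breaking: oil company announces new drilling project http://example.com/b", "media"),
    ("@oilco my local station was out of fuel again", "public"),
    ("@oilco why are fuel prices so high", "public"),
]
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
```

Calling `model.predict(...)` on a new tweet returns the label with the highest smoothed log-probability. A toy model like this says nothing about the difficulty reported in the abstract; noisy tweets and domain dependence are exactly what such a simple bag-of-words model handles poorly.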
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics |
Subjects: | H Social Sciences > HM Sociology; Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Publisher: | Springer |
ISBN: | 9783319181165 |
ISSN: | 0302-9743 |
Last Modified: | 01 Nov 2022 09:21 |
URI: | https://orca.cardiff.ac.uk/id/eprint/87679 |
Citation Data
Cited 1 time in Scopus.