Abstract:
With the significant growth of user-generated content on the Web, sentiment analysis has gained increasing importance as a way to draw insights from online social data and turn it into a valuable asset for supporting decision making. Many efforts have adopted text mining techniques with linguistic features to detect and track people's opinions, yet the results are not satisfactory. In this paper, we explore the impact of combining emoji-based features with various forms of textual features on the sentiment classification of dialectal Arabic tweets; emojis are pictographic symbols that are becoming more commonly used in social media. We extract textual features using four different methods: Bag-of-Words (BoW), Latent Semantic Analysis (LSA), and two forms of word embedding. The effect of fusing emojis with textual features is analyzed using a support vector classifier with and without feature selection. We observe that simpler models with much better results can be constructed when emojis are merged with word embeddings and only the most relevant subset of features is selected as input to the classifier.
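The fusion step described in the abstract can be illustrated with a minimal sketch. It covers only the Bag-of-Words variant and stops before classification; in the pipeline the abstract describes, the fused rows would then go to a support vector classifier, optionally after feature selection. The abstract does not say how emojis are detected or encoded, so the Unicode ranges, the whitespace tokenizer, the count-based emoji features, the function names, and the toy tweets below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

# Assumption: code points in these ranges are treated as emojis.
# This is a rough approximation, not a complete emoji definition.
EMOJI_RANGES = ((0x1F300, 0x1FAFF), (0x2600, 0x27BF))


def is_emoji(ch):
    """Return True if the character falls in one of the assumed emoji ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def split_tweet(tweet):
    """Separate a tweet into lowercase word tokens and a list of emoji symbols."""
    emojis = [ch for ch in tweet if is_emoji(ch)]
    text = "".join(" " if is_emoji(ch) else ch for ch in tweet)
    return text.lower().split(), emojis


def build_vocab(items_per_doc):
    """Map each distinct item to a column index, in sorted order."""
    distinct = sorted({item for doc in items_per_doc for item in doc})
    return {item: index for index, item in enumerate(distinct)}


def count_vector(items, vocab):
    """Count occurrences of each vocabulary item, in column order."""
    counts = Counter(items)
    return [counts.get(item, 0) for item in vocab]


def fuse_features(tweets):
    """Build BoW and emoji count vectors and concatenate them per tweet."""
    split = [split_tweet(t) for t in tweets]
    word_vocab = build_vocab(words for words, _ in split)
    emoji_vocab = build_vocab(emojis for _, emojis in split)
    rows = [
        count_vector(words, word_vocab) + count_vector(emojis, emoji_vocab)
        for words, emojis in split
    ]
    return rows, list(word_vocab) + list(emoji_vocab)


if __name__ == "__main__":
    # Toy, made-up tweets used only to show the shape of the output.
    rows, names = fuse_features(["great match 😀", "bad service 😡 😡"])
    print(names)
    for row in rows:
        print(row)
```

Concatenating the two count vectors keeps the emoji columns separate from the word columns, so a later feature selection step can keep or drop them independently.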
https://ieeexplore.ieee.org/document/8355456