Monday, May 7, 2018

Combining emojis with Arabic textual features for sentiment classification

.
Abstract:
With the significant growth of user-generated content on the Web, sentiment analysis has gained increasing importance as a way to draw insights from online social data and turn it into a valuable asset for supporting decision making. Many efforts have adopted text mining techniques with linguistic features to detect and track people's opinions. Yet, the results are not satisfactory. In this paper, we aim at exploring the impact of combining emoji-based features with various forms of textual features on the sentiment classification of dialectal Arabic tweets; emojis are pictographic symbols that are becoming more commonly used in social media. We extract textual features using four different methods: Bag-of-Words (BoW), Latent Semantic Analysis, and two forms of Word Embedding. The effect of fusing emojis with textual features is analyzed using a support vector classifier with and without feature selection. It has been observed that simpler models with much better results can be constructed when emojis are merged with word embeddings and only the most relevant subset of features is selected as input to the classifier.
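As a rough illustration of the fusion idea in the abstract above, the sketch below concatenates a Bag-of-Words count vector with an emoji count vector. It is a minimal sketch under stated assumptions, not the authors' pipeline: the emoji list, the English placeholder vocabulary, and the function names are all hypothetical, and the support vector classifier and feature selection steps are not shown.

```python
from collections import Counter

# Hypothetical toy emoji lexicon; the paper's actual emoji feature set
# is not given in the abstract.
EMOJI_VOCAB = ["😀", "😢", "😡", "❤"]


def bow_vector(tokens, vocab):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def emoji_vector(text, emoji_vocab=EMOJI_VOCAB):
    """Count how often each emoji in the lexicon occurs in the raw text."""
    return [text.count(emoji) for emoji in emoji_vocab]


def fuse(text, word_vocab):
    """Concatenate textual (BoW) features with emoji features."""
    tokens = text.split()  # naive whitespace tokenization, for illustration only
    return bow_vector(tokens, word_vocab) + emoji_vector(text)


# English placeholder words stand in for dialectal Arabic tokens.
features = fuse("great service 😀😀", ["great", "bad", "service"])
```

The fused vector would then be passed to a classifier; in the paper that is a support vector classifier, optionally after selecting the most relevant features.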

.
https://ieeexplore.ieee.org/document/8355456

Thursday, February 1, 2018

Emotion detection of tweets in Indonesian language using LDA and expression symbol conversion


.
Abstract:
Twitter is one of the social networks that attract many Indonesian people because it is considered a medium for expressing opinions and feelings about certain topics. Twitter's popularity makes it an efficient source of sentiment data for marketing or social studies. One social study that can be applied to Twitter analysis is emotion detection. Emotion detection has the potential to be applied in a wide range of applications, ranging from health applications, counseling, and business to community population studies. This research utilizes one of the most popular and simplest topic modeling models, Latent Dirichlet Allocation (LDA), together with expression symbol conversion (emoticons/emojis): symbols that show the emotion or topic in a tweet are converted in order to expand the vocabulary that represents emotion. The advantage of the proposed LDA method is that it can detect several emotions in a tweet, because the detection is not rigid and is able to show the proportion of each emotion in the tweet. This research compares emotion detection using LDA with expression symbol conversion against emotion detection using LDA without expression symbol conversion. The result shows that emotion detection using LDA with expression symbol conversion is better, with an average accuracy difference of 14.096%.
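The expression symbol conversion described in the abstract above can be pictured as a preprocessing step that replaces emoticons and emojis with emotion words before topic modeling. The following is a minimal sketch under stated assumptions: the mapping table, the English emotion words, and the function name are hypothetical, since the abstract does not give the paper's actual conversion table, and the LDA step itself is not shown.

```python
# Hypothetical symbol-to-word table; the paper's actual conversion
# table is not given in the abstract.
SYMBOL_TO_WORD = {
    ":)": "happy",
    ":(": "sad",
    "😡": "angry",
}


def convert_symbols(tweet, mapping=SYMBOL_TO_WORD):
    """Replace each known emoticon/emoji with its emotion word."""
    for symbol, word in mapping.items():
        # Pad with spaces so the word does not fuse with neighbouring text.
        tweet = tweet.replace(symbol, f" {word} ")
    # Collapse the extra whitespace introduced by the padding.
    return " ".join(tweet.split())


converted = convert_symbols("stuck in traffic again :( 😡")
```

The converted text, which now carries extra emotion vocabulary, would then be tokenized and passed to an LDA implementation to estimate the proportion of each emotion in the tweet.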
.
https://ieeexplore.ieee.org/document/8276371