Friday, December 1, 2017

Language-independent data set annotation for machine learning-based sentiment analysis


.
Abstract:
Social media platforms provide large amounts of user-generated text which can be utilized for text-based Sentiment Analysis in order to obtain insights about opinions on many aspects of life. Current approaches that are based on supervised learning require manually annotated data sets that are time-consuming to create and are specific to a single language. In this work, we present an approach to generate ground truth sentiment values for a data set from Twitter. We use a sentiment emoji lexicon and distribute known polarities of hashtags over neighbors in a graph we build on them. This approach is language-independent. Native speakers of five different languages evaluate the accuracy of the sentiment values assigned by our method on a corpus of Tweets. Our experiments show that the quality of our automatically assigned sentiment values is sufficiently high to be used for training of machine learning-based sentiment analysis.
.
https://ieeexplore.ieee.org/document/8122930
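
The abstract above mentions distributing known hashtag polarities over neighbors in a graph, but gives no details of the procedure. The following is only a minimal sketch of one common way to do this (iterative neighbor averaging, a simple form of label propagation), not the authors' actual method. The function names, the co-occurrence rule for building edges, the polarity range of [-1, 1], and the iteration count are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_hashtag_graph(tweets):
    """Connect two hashtags whenever they appear in the same tweet.

    `tweets` is a list of hashtag lists, one list per tweet (assumed input shape).
    """
    graph = defaultdict(set)
    for tags in tweets:
        for a, b in combinations(sorted(set(tags)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def propagate_polarity(graph, seeds, iterations=10):
    """Spread seed polarities (assumed to lie in [-1, 1]) to unlabeled hashtags.

    Seed hashtags keep their known value; every other hashtag takes the
    mean score of its neighbors on each pass.
    """
    scores = {tag: seeds.get(tag, 0.0) for tag in graph}
    for _ in range(iterations):
        updated = {}
        for tag, neighbors in graph.items():
            if tag in seeds:
                updated[tag] = seeds[tag]
            else:
                updated[tag] = sum(scores[n] for n in neighbors) / len(neighbors)
        scores = updated
    return scores


# Hypothetical toy data: hashtags only, so no language-specific text is needed.
tweets = [["#happy", "#sunny"], ["#sunny", "#beach"], ["#sad", "#rain"]]
seeds = {"#happy": 1.0, "#sad": -1.0}
scores = propagate_polarity(build_hashtag_graph(tweets), seeds)
```

Because the sketch only looks at which hashtags co-occur, it never inspects the words of a tweet, which is one way an approach like this can stay language-independent.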

Saturday, July 1, 2017

Automatic Construction of an Emoji Sentiment Lexicon

.
Abstract:
Emojis have been frequently used to express users' sentiments, emotions, and feelings in text-based communication. To facilitate sentiment analysis of users' posts, an emoji sentiment lexicon with positive, neutral, and negative scores has been recently constructed using manually labeled tweets. However, the number of emojis listed in the lexicon is smaller than that of currently existing emojis, and expanding the lexicon manually requires time and effort to reconstruct the labeled dataset. This paper presents a simple and efficient method for automatically constructing an emoji sentiment lexicon with arbitrary sentiment categories. The proposed method extracts sentiment words from WordNet-Affect and calculates the cooccurrence frequency between the sentiment words and each emoji. Based on the ratio of the number of occurrences of each emoji among the sentiment categories, each emoji is assigned a multidimensional vector whose elements indicate the strength of the corresponding sentiment. In experiments conducted on a collection of tweets, we show a high correlation between the conventional lexicon and our lexicon for three sentiment categories. We also show the results for a new lexicon constructed with additional sentiment categories.
.
https://dl.acm.org/doi/10.1145/3110025.3110139
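
The co-occurrence idea described in the abstract above can be sketched roughly as follows. This is an illustrative simplification, not the paper's implementation: the tiny `category_words` dictionary stands in for sentiment words that the paper extracts from WordNet-Affect, whitespace tokenization is assumed (so emojis must be separated from words by spaces), and the function name and input shapes are hypothetical.

```python
from collections import Counter, defaultdict


def build_emoji_lexicon(tweets, category_words, emojis):
    """Assign each emoji a vector of per-category co-occurrence ratios."""
    counts = defaultdict(Counter)
    for text in tweets:
        tokens = text.lower().split()
        present_emojis = [t for t in tokens if t in emojis]
        for category, words in category_words.items():
            # Number of sentiment words of this category in the tweet.
            hits = sum(1 for t in tokens if t in words)
            for emoji in present_emojis:
                counts[emoji][category] += hits

    lexicon = {}
    for emoji, per_category in counts.items():
        total = sum(per_category.values())
        if total:  # skip emojis that never co-occur with a sentiment word
            lexicon[emoji] = {
                cat: per_category[cat] / total for cat in category_words
            }
    return lexicon


# Hypothetical toy data; a real run would use a large tweet collection.
tweets = ["great day 😀", "awful day 😢", "love it 😀", "bad but great 😀"]
category_words = {"positive": {"great", "love"}, "negative": {"awful", "bad"}}
lexicon = build_emoji_lexicon(tweets, category_words, {"😀", "😢"})
```

Adding a sentiment category here only requires another entry in `category_words`, which mirrors the abstract's point that the lexicon can be built for arbitrary categories without relabeling tweets by hand.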

Wednesday, March 1, 2017

The Effects of Emoji in Sentiment Analysis


.
Abstract: This study investigates the usage of Emoji characters on social networks and the effects of Emoji in text mining and sentiment analysis. As it provides live access to text-based public opinions, we chose Twitter as our information source in our analysis. We collected text data for some global positive and negative events to analyze the impact of Emoji characters in sentiment analysis. In our analysis, we noticed that the utilization of Emoji characters in sentiment analysis results in higher sentiment scores.
Furthermore, we observed that the usage of Emoji characters in sentiment analysis appeared to have a higher impact on the overall sentiments of the positive opinions in comparison to the negative opinions.

Keywords: Emoji, opinion mining, sentiment analysis, Twitter.
.

Thursday, February 2, 2017

Multi-sentiment Modeling with Scalable Systematic Labeled Data Generation via Word2Vec Clustering


.
Abstract:
Social networks are now a primary source for news and opinions on topics ranging from sports to politics. Analyzing opinions with an associated sentiment is crucial to the success of any campaign (product, marketing, or political). However, there are two significant challenges that need to be overcome. First, social networks produce large volumes of data at high velocities. Using traditional (semi-) manual methods to gather training data is, therefore, impractical and expensive. Second, humans express more than two emotions; therefore, the typical binary good/bad or positive/negative classifiers are no longer sufficient to address the complex needs of the social marketing domain. This paper introduces a hugely scalable approach to gathering training data by using emojis as a proxy for user sentiments. This paper also introduces a systematic Word2Vec-based clustering method to generate emoji clusters that arguably represent different human emotions (multi-sentiment). Finally, this paper also introduces a threshold-based formulation for predicting one or two class labels (multi-label) for a given document. Our scalable multi-sentiment multi-label model produces a cross-validation accuracy of 71.55% (± 0.22%). To compare against other models in the literature, we also trained a binary (positive vs. negative) classifier. It produces a cross-validation accuracy of 84.95% (± 0.17%), which is arguably better than several results reported in the literature thus far.
.
https://ieeexplore.ieee.org/document/7836770
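
The abstract above does not spell out its threshold-based formulation, so the following is just one plausible reading, given as a sketch: always keep the most probable class, and add the runner-up only if its probability reaches a cutoff. The function name, the threshold value of 0.3, and the example class names are assumptions, not values from the paper.

```python
def predict_labels(probabilities, threshold=0.3):
    """Return one or two class labels from a dict of class probabilities.

    The top class is always returned; the second class is added only when
    its probability is at least `threshold` (an assumed cutoff).
    """
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    labels = [ranked[0][0]]
    if len(ranked) > 1 and ranked[1][1] >= threshold:
        labels.append(ranked[1][0])
    return labels


# Hypothetical classifier outputs for two documents.
mixed = predict_labels({"joy": 0.6, "anger": 0.35, "sadness": 0.05})
clear = predict_labels({"joy": 0.8, "anger": 0.15, "sadness": 0.05})
```

In this sketch a document with one dominant emotion receives a single label, while a document whose second emotion is nearly as likely receives two.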

Saturday, January 14, 2017

What is E-Business?

E-business (electronic business) is the conduct of business processes on the Internet. These electronic business processes include buying and selling products, supplies and services; servicing customers; processing payments; managing production control; collaborating with business partners; sharing information; running automated employee services; recruiting; and more.

http://searchcio.techtarget.com/definition/e-business