Title
[Paper] 2016. Opinion polarity detection in Twitter data combining shrinkage regression and topic modeling
Date
2019.04.13
Author
소셜오믹스
Post content

Yoon, H. G., Kim, H., Kim, C. O., & Song, M. (2016). Opinion polarity detection in Twitter data combining shrinkage regression and topic modeling. Journal of Informetrics, 10(2), 634-644.


https://doi.org/10.1016/j.joi.2016.03.006


Abstract
We propose a method to analyze public opinion about political issues online by automatically detecting polarity in Twitter data. Previous studies have focused on the polarity classification of individual tweets. However, to understand the direction of public opinion on a political issue, it is important to analyze the degree of polarity on the major topics at the center of the discussion in addition to the individual tweets. The first stage of the proposed method detects polarity in tweets using the Lasso and Ridge models of shrinkage regression. The models are beneficial in that the regression results provide sentiment scores for the terms that appear in tweets. The second stage identifies the major topics via a latent Dirichlet analysis (LDA) topic model and estimates the degree of polarity on the LDA topics using term sentiment scores. To the best of our knowledge, our study is the first to predict the polarities of public opinion on topics in this manner. We conducted an experiment on a mayoral election in Seoul, South Korea and compared the total detection accuracy of the regression models with five support vector machine (SVM) models with different numbers of input terms selected by a feature selection algorithm. The results indicated that the performance of the Ridge model was approximately 7% higher on average than that of the SVM models. Additionally, the degree of polarity on the LDA topics estimated using the proposed method was compared with actual public opinion responses. The results showed that the polarity detection accuracy of the Lasso model was 83%, indicating that the proposed method was valid in most cases.


Research significance

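This study proposes a method for automatically detecting and analyzing the polarity of public opinion on political issues in Twitter data. To detect polarity in tweets, it combines the Lasso and Ridge models of shrinkage regression (regularized multiple regression) with LDA topic modeling. The Lasso and Ridge models yield sentiment scores for the terms that appear in tweets, and these term scores are then used to estimate the degree of polarity on the major topics identified by LDA. In an experiment on the Seoul mayoral election, the Ridge model achieved approximately 7% higher accuracy on average than the SVM models, and the topic-level polarity estimated with the Lasso model agreed with actual public opinion responses in 83% of cases.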
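
The two-stage idea summarized above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and everything concrete in it is an assumption made for illustration: the toy tweets and labels, the helper names `fit_ridge` and `topic_polarity`, the gradient-descent fitting, and the hand-written topic-term probabilities that stand in for a fitted LDA model. In particular, the post does not state how term scores are aggregated per topic, so the probability-weighted sum used here is only one plausible choice.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real tweets need more care)."""
    return text.lower().split()


def fit_ridge(docs, labels, alpha=1.0, lr=0.05, epochs=2000):
    """Stage 1: ridge regression on binary bag-of-words features.

    Minimizes (1/2n) * sum((y - w.x - b)^2) + (alpha/2n) * ||w||^2 by batch
    gradient descent and returns ({term: weight}, bias). Each weight is read
    as the sentiment score of its term.
    """
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: i for i, t in enumerate(vocab)}
    # Each document becomes the sorted list of vocabulary indices it contains.
    rows = [sorted({index[t] for t in tokenize(d)}) for d in docs]
    n = len(docs)
    w = [0.0] * len(vocab)
    b = 0.0
    for _ in range(epochs):
        grad_w = [alpha * wi / n for wi in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for row, y in zip(rows, labels):
            err = y - (b + sum(w[j] for j in row))
            for j in row:
                grad_w[j] -= err / n
            grad_b -= err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return dict(zip(vocab, w)), b


def topic_polarity(term_probs, term_scores):
    """Stage 2: probability-weighted sum of term sentiment scores for one topic.

    This aggregation rule is an assumption, not taken from the paper.
    """
    return sum(p * term_scores.get(t, 0.0) for t, p in term_probs.items())


# Toy labeled tweets: +1 = positive, -1 = negative (hypothetical data).
docs = [
    "great policy support",
    "support candidate great",
    "terrible policy scandal",
    "scandal candidate terrible",
]
labels = [1, 1, -1, -1]

weights, bias = fit_ridge(docs, labels)

# Hand-written topic-term distributions standing in for LDA output.
topics = {
    "welfare": {"great": 0.5, "support": 0.3, "policy": 0.2},
    "corruption": {"scandal": 0.6, "terrible": 0.3, "candidate": 0.1},
}
polarity = {name: topic_polarity(dist, weights) for name, dist in topics.items()}

for name, score in polarity.items():
    print(name, "positive" if score > 0 else "negative", round(score, 3))
```

In this toy setup, terms that appear only in positive tweets receive positive weights, terms shared evenly across both classes stay near zero, and each topic inherits the sign of its dominant terms. Swapping the L2 penalty for an L1 penalty would give the Lasso variant, which tends to drive uninformative term weights to exactly zero.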