Automatic analysis of document sentiment

Automatic analysis of document sentiment (2006)

Sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has attracted a great deal of attention. Potential applications include question-answering systems that address opinions as opposed to facts and business intelligence systems that analyze user feedback. The research issues raised by such applications are often quite challenging compared to fact-based analysis. This thesis presents several sentiment analysis tasks to illustrate the new challenges and opportunities. In particular, we describe how we modeled different types of relations in approaching several sentiment analysis problems; our models can have implications outside this area as well. One task is polarity classification, where we classify a movie review as "thumbs up" or "thumbs down" from textual information alone. We consider a number of approaches, including one that applies text categorization techniques to just the subjective portions of the document. Extracting these portions can be a hard problem in itself; we describe an approach based on efficient techniques for finding minimum cuts in graphs that incorporate sentence-level relations. The second task, which can be viewed as a non-standard multi-class classification task, is the rating-inference problem, where one must determine the reviewer's evaluation with respect to a multi-point scale (e.g., one to five "stars"). We apply a meta-algorithm, based on a metric-labeling formulation of the problem, that explicitly exploits relations between classes. A different type of relationship between text units is considered in the third task, where we investigate whether one can determine from the transcripts of U.S. Congressional floor debates whether the speeches represent support of or opposition to proposed legislation. In particular, we exploit the fact that these speeches occur as part of a discussion. We find that the incorporation of information regarding relationships between discourse segments yields substantial improvements over classifying speeches in isolation. Lastly, we introduce and discuss a sentiment analysis problem arising in Web-search applications: given documents on the same focused topic, we wish to rank the subjective documents before objective ones. We present early results with unsupervised approaches that do not assume prior linguistic or domain-specific knowledge.
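
The minimum-cut idea mentioned for subjectivity extraction can be illustrated with a small sketch. This is not the thesis's implementation: the function name `min_cut_subjective`, the score inputs, and the toy numbers are all hypothetical, and the max-flow routine is a plain Edmonds-Karp chosen for brevity. It assumes each sentence already has an individual score for being subjective and for being objective (for example from some sentence-level classifier), plus pairwise association weights that penalize putting two related sentences in different classes.

```python
from collections import deque


def min_cut_subjective(ind_subj, ind_obj, assoc):
    """Return indices of sentences placed on the 'subjective' side of a minimum cut.

    ind_subj[i], ind_obj[i]: hypothetical non-negative per-sentence scores.
    assoc: dict mapping (i, j) sentence pairs to a non-negative association weight.
    """
    n = len(ind_subj)
    s, t = n, n + 1  # source stands for "subjective", sink for "objective"
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = ind_subj[i]  # this edge is cut if sentence i is labeled objective
        cap[i][t] = ind_obj[i]   # this edge is cut if sentence i is labeled subjective
    for (i, j), w in assoc.items():
        cap[i][j] += w           # cut if i and j end up in different classes
        cap[j][i] += w

    def bfs_parents():
        # Breadth-first search over edges with remaining (residual) capacity.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs_parents()
        if parent[t] == -1:
            break  # no augmenting path left, so the flow is maximal
        # Find the bottleneck capacity along the path, then push that much flow.
        flow, v = float("inf"), t
        while v != s:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u

    # Sentences still reachable from the source lie on the subjective side of the cut.
    return [i for i in range(n) if parent[i] != -1]


# Toy example with made-up scores for three sentences.
subj_scores = [8, 4, 1]
obj_scores = [2, 6, 9]
print(min_cut_subjective(subj_scores, obj_scores, {}))           # [0]
print(min_cut_subjective(subj_scores, obj_scores, {(0, 1): 5}))  # [0, 1]
```

In the toy example, sentence 1 leans objective on its own, but a strong association with the clearly subjective sentence 0 makes it cheaper to keep the two together, which is the kind of sentence-level relation the abstract refers to.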
text unit, document sentiment, sentiment analysis task, text categorization technique, hard problem, rating-inference problem, non-standard multi-class classification task, different type, automatic analysis, sentiment analysis problem, sentiment analysis, fact-based analysis