1.

A link-bridged topic model for cross-domain document classification

Pei Yang, Wei Gao, Qi Tan, Kam-Fai Wong 《Information processing & management》2013

Transfer learning utilizes labeled data available from some related domain (source domain) for achieving effective knowledge transfer to the target domain. However, most state-of-the-art cross-domain classification methods treat documents as plain text and ignore the hyperlink (or citation) relationship existing among the documents. In this paper, we propose a novel cross-domain document classification approach called Link-Bridged Topic model (LBT). LBT consists of two key steps. Firstly, LBT utilizes an auxiliary link network to discover the direct or indirect co-citation relationship among documents by embedding the background knowledge into a graph kernel. The mined co-citation relationship is leveraged to bridge the gap across different domains. Secondly, LBT simultaneously combines the content information and link structures into a unified latent topic model. The model is based on an assumption that the documents of source and target domains share some common topics from the point of view of both content information and link structure. By mapping the data of both domains into the latent topic spaces, LBT encodes the knowledge about domain commonality and difference as the shared topics with associated differential probabilities. The learned latent topics must be consistent with the source and target data, as well as content and link statistics. Then the shared topics act as the bridge to facilitate knowledge transfer from the source to the target domains. Experiments on different types of datasets show that our algorithm significantly improves the generalization performance of cross-domain document classification.
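
As a rough illustration of the first step in the abstract above, the sketch below counts direct co-citations: two documents are related when they link to the same auxiliary page. It is not the paper's graph kernel, which also captures indirect relationships. The function name `cocitation_counts` and the toy link network are hypothetical.

```python
def cocitation_counts(links):
    """Count shared outgoing links for every pair of documents.

    links maps a document id to the set of auxiliary pages it links to or cites.
    Only direct co-citation is counted; indirect relationships are out of scope here.
    """
    docs = sorted(links)
    counts = {}
    for i, a in enumerate(docs):
        for b in docs[i + 1:]:
            shared = len(links[a] & links[b])
            if shared:
                counts[(a, b)] = shared
    return counts


# Hypothetical toy link network: one source-domain and one target-domain document
# share a single auxiliary page, so they are bridged by one co-citation.
links = {
    "src_doc": {"page1", "page2"},
    "tgt_doc": {"page2", "page3"},
    "other": {"page4"},
}
print(cocitation_counts(links))
```

A pair with a nonzero count is a candidate bridge between domains; documents that share no page, such as `other` here, stay unconnected.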

2.

Cognitive-inspired domain adaptation of sentiment lexicons

Frank Z. Xing, Filippo Pallucchini, Erik Cambria 《Information processing & management》2019,56(3):554-564

Sentiment lexicons are essential tools for polarity classification and opinion mining. In contrast to machine learning methods that only leverage text features or raw text for sentiment analysis, methods that use sentiment lexicons embrace higher interpretability. Although a number of domain-specific sentiment lexicons are made available, it is impractical to build an ex ante lexicon that fully reflects the characteristics of the language usage in endless domains. In this article, we propose a novel approach to simultaneously train a vanilla sentiment classifier and adapt word polarities to the target domain. Specifically, we sequentially track the wrongly predicted sentences and use them as the supervision instead of addressing the gold standard as a whole to emulate the life-long cognitive process of lexicon learning. An exploration-exploitation mechanism is designed to trade off between searching for new sentiment words and updating the polarity score of one word. Experimental results on several popular datasets show that our approach significantly improves the sentiment classification performance for a variety of domains by means of improving the quality of sentiment lexicons. Case studies also illustrate how polarity scores of the same words are discovered for different domains.

3.

Aspect-based sentiment analysis with alternating coattention networks

Chao Yang, Hefeng Zhang, Bin Jiang, Keqin Li 《Information processing & management》2019,56(3):463-478

Aspect-based sentiment analysis aims to predict the sentiment polarities of specific targets in a given text. Recent research shows great interest in modeling the target and context with attention networks to obtain more effective feature representations for the sentiment classification task. However, the use of an average vector of the target for computing the attention score for the context is unfair. Besides, the interaction mechanism is simple and thus needs to be further improved. To solve the above problems, this paper first proposes a coattention mechanism which models both target-level and context-level attention alternately so as to focus on the key words of targets and learn a more effective context representation. On this basis, we implement a Coattention-LSTM network which learns nonlinear representations of context and target simultaneously and can extract more effective sentiment features from the coattention mechanism. Further, a Coattention-MemNet network which adopts a multi-hop coattention mechanism is proposed to improve the sentiment classification result. Finally, we propose a new location-weighted function which considers location information to enhance the performance of the coattention mechanism. Extensive experiments on two public datasets demonstrate the effectiveness of all proposed methods, and our findings in the experiments provide new insight for future developments of using attention mechanisms and deep neural networks for aspect-based sentiment analysis.

4.

Image-based 3D model retrieval via disentangled feature learning and enhanced semantic alignment

《Information processing & management》2023,60(2):103159

With the development of 3D technology and the increase in 3D models, 2D image-based 3D model retrieval tasks have drawn increased attention from scholars. Previous works align cross-domain features via adversarial domain alignment and semantic alignment. However, the extracted features of previous methods are disturbed by the residual domain-specific features, and the lack of labels for 3D models makes the semantic alignment challenging. Therefore, we propose disentangled feature learning associated with enhanced semantic alignment to address these problems. On one hand, the disentangled feature learning decouples the entangled raw features into isolated domain-invariant and domain-specific features, and the domain-specific features are dropped while performing adversarial domain alignment and semantic alignment to acquire domain-invariant features. On the other hand, we mine the semantic consistency by compacting each 3D model sample and its nearest neighbors to further enhance semantic alignment for the unlabeled 3D model domain. We conduct comprehensive experiments on two public datasets, and the results demonstrate the superiority of the proposed method. In particular, on the MI3DOR-2 dataset, our method outperforms the current state-of-the-art methods with gains of 2.88% for the strictest retrieval metric NN.

5.

Aspect sentiment analysis with heterogeneous graph neural networks

《Information processing & management》2022,59(4):102953

Aspect-based sentiment analysis technologies may be a very practical methodology for securities trading, commodity sales, movie rating websites, etc. Most recent studies adopt recurrent neural network or attention-based neural network methods to infer aspect sentiment using opinion context terms and sentence dependency trees. However, because a sentence often expresses sentiment toward multiple aspects, these models struggle to achieve satisfactory classification results. In this paper, we address these problems by encoding the sentence syntax tree, word relations and opinion dictionary information in a unified framework. We call this method heterogeneous graph neural networks (Hete_GNNs). Firstly, we adopt the interactive aspect words and contexts to encode the sentence sequence information for parameter sharing. Then, we use a novel heterogeneous graph neural network to encode the sentences’ syntax dependency tree, a prior sentiment dictionary, and some part-of-speech tagging information for sentiment prediction. We perform the Hete_GNNs sentiment judgment and report the experiments on five domain datasets, and the results confirm that the heterogeneous context information can be better captured with heterogeneous graph neural networks. The improvement of the proposed method is demonstrated by comparison on the aspect sentiment classification task.

6.

Multi-facet sentiment classification of e-commerce reviews based on deep learning (基于深度学习的电商评论信息多刻面情感分类研究)

岑咏华, 李文敬, 李莉 《情报科学》2021,39(9):67-73

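[Purpose/Significance] Text sentiment classification has been one of the research hotspots in information science in recent years. Most existing studies focus on single-sentiment classification of a target text. This paper explores a deep-learning-based method for multi-facet sentiment classification of e-commerce reviews. [Method/Process] We propose a multi-facet sentiment classification model based on Attention-BiGRU-CNN, which captures contextual information and local features through BiGRU and CNN and uses an attention mechanism to optimize hidden-layer weights, so as to mine the implicit semantics of the text and effectively characterize multi-facet sentiment. [Results/Conclusions] Experiments on a corpus of Chinese e-commerce reviews show that, compared with other neural network models, the proposed method effectively improves the accuracy of multi-facet sentiment classification. [Innovation/Limitations] The work enriches the methods available for multi-facet sentiment classification and provides a reference for mining e-commerce reviews in depth and for optimizing product and marketing strategies. The corpus is drawn mainly from e-commerce reviews of a single category and focuses on sentiment classification over facets that can be enumerated; further research could address larger corpora that span diverse categories and require facets to be extracted by deep learning.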

7.

TDAM: A topic-dependent attention model for sentiment analysis

《Information processing & management》2019,56(6):102084

We propose a topic-dependent attention model for sentiment classification and topic extraction. Our model assumes that a global topic embedding is shared across documents and employs an attention mechanism to derive local topic embeddings for words and sentences. These are subsequently incorporated in a modified Gated Recurrent Unit (GRU) for sentiment classification and extraction of topics bearing different sentiment polarities. Those topics emerge from the words’ local topic embeddings learned by the internal attention of the GRU cells in the context of a multi-task learning framework. In this paper, we present the hierarchical architecture, the new GRU unit and the experiments conducted on users’ reviews, which demonstrate classification performance on a par with the state-of-the-art methodologies for sentiment classification and topic coherence outperforming the current approaches for supervised topic extraction. In addition, our model is able to extract coherent aspect-sentiment clusters despite using no aspect-level annotations for training.

8.

Monitoring online public opinion events through video bullet comments (面向视频弹幕的网络舆情事件监测研究)

黄立赫, 石映昕 《情报杂志》2022,41(2):146-154

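[Research purpose] From the perspective of video bullet comments (danmaku), this study mines the topic drift patterns of online public opinion events and improves the precision of video sentiment retrieval for such events. [Research method] Topic and sentiment analysis of bullet comments is used to improve the precision of online monitoring of public opinion events. On this basis, a bullet-comment migration index is proposed and constructed, and a sentiment monitoring method based on this index is established. The method first extracts topic information from bullet comments with the BTM topic model and computes the sentiment category and sentiment intensity of topics in different time windows using a sentiment lexicon and a kaomoji (emoticon) lexicon, yielding a monitoring model for online public opinion events oriented to bullet comments. It then constructs a topic migration index from two perspectives, the change in topic content and the popularity of the video, and constructs a sentiment migration index from the change in topic sentiment intensity. Finally, the topic migration index and the sentiment migration index are weighted to obtain the bullet-comment migration index, which enables online monitoring of public opinion events. [Research conclusion] Using real data from a bullet-comment video community, the soundness of the model is verified at the logical level. The results show that the method identifies the key time windows of public opinion event migration fairly accurately, and it offers a practical theoretical exploration for sentiment visualization on video-sharing platforms.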

9.

Incorporating stock prices and news sentiments for stock market prediction: A case of Hong Kong

《Information processing & management》2020,57(5):102212

Stock prediction via market data analysis is an attractive research topic. Both stock prices and news articles have been employed in the prediction processes. However, how to combine technical indicators from stock prices and news sentiments from textual news articles, and enable the prediction model to learn sequential information within time series in an intelligent way, is still an unsolved problem. In this paper, we build a stock prediction system and propose an approach that 1) represents numerical price data by technical indicators via technical analysis, and represents textual news articles by sentiment vectors via sentiment analysis, 2) sets up a layered deep learning model to learn the sequential information within the market snapshot series constructed from the technical indicators and news sentiments, and 3) sets up a fully connected neural network to make stock predictions. Experiments have been conducted on more than five years of Hong Kong Stock Exchange data using four different sentiment dictionaries, and results show that 1) the proposed approach outperforms the baselines on both validation and test sets using two different evaluation metrics, 2) models incorporating prices and news sentiments outperform models that only use either technical indicators or news sentiments, at both the individual stock level and the sector level, and 3) among the four sentiment dictionaries, the finance domain-specific sentiment dictionary (Loughran–McDonald Financial Dictionary) models the news sentiments better, which brings more prediction performance improvements than the other three dictionaries.

10.

ALDONAr: A hybrid solution for sentence-level aspect-based sentiment analysis using a lexicalized domain ontology and a regularized neural attention model

《Information processing & management》2020,57(3):102211

Aspect-based sentiment analysis allows one to compute the sentiment for an aspect in a certain context. One problem in this analysis is that words possibly carry different sentiments for different aspects. Moreover, an aspect’s sentiment might be highly influenced by the domain-specific knowledge. In order to tackle these issues, in this paper, we propose a hybrid solution for sentence-level aspect-based sentiment analysis using A Lexicalized Domain Ontology and a Regularized Neural Attention model (ALDONAr). The bidirectional context attention mechanism is introduced to measure the influence of each word in a given sentence on an aspect’s sentiment value. The classification module is designed to handle the complex structure of a sentence. The manually created lexicalized domain ontology is integrated to utilize the field-specific knowledge. Compared to the existing ALDONA model, ALDONAr uses BERT word embeddings, regularization, the Adam optimizer, and different model initialization. Moreover, its classification module is enhanced with two 1D CNN layers providing superior results on standard datasets.

11.

Joint deep feature learning and unsupervised visual domain adaptation for cross-domain 3D object retrieval

《Information processing & management》2020,57(5):102275

With the widespread application of 3D capture devices, diverse 3D object datasets from different domains have emerged recently. Consequently, how to retrieve 3D objects from different domains is becoming a significant and challenging task. The existing approaches mainly focus on the task of retrieval from the identical dataset, which significantly constrains their implementation in real-world applications. This paper addresses cross-domain object retrieval in an unsupervised manner, where the labels of samples from the source domain are provided while the labels of samples from the target domain are unknown. We propose a joint deep feature learning and visual domain adaptation method (Deep-VDA) to solve the cross-domain 3D object retrieval problem through end-to-end learning. Specifically, benefiting from the advantages of deep learning networks, Deep-VDA employs MVCNN for deep feature extraction and domain alignment for unsupervised domain adaptation. The framework enables the statistical and geometric shift between domains to be minimized in an unsupervised manner, which is accomplished by preserving both common and unique characteristics of each domain. Deep-VDA can improve the robustness of object features from different domains, which is important to maintain remarkable retrieval performance.

12.

Self-training from labeled features for sentiment analysis

Yulan He, Deyu Zhou 《Information processing & management》2011

Sentiment analysis is concerned with automatically identifying the sentiment or opinion expressed in a given piece of text. Most prior work either uses prior lexical knowledge defined as the sentiment polarity of words, or views the task as a text classification problem and relies on labeled corpora to train a sentiment classifier. While lexicon-based approaches do not adapt well to different domains, corpus-based approaches require expensive manual annotation effort.

13.

A comparative study of automated legal text classification using random forests and deep learning

《Information processing & management》2022,59(2):102798

Automated legal text classification is a prominent research topic in the legal field. It lays the foundation for building an intelligent legal system. Current literature focuses on international legal texts, such as Chinese cases, European cases, and Australian cases. Little attention is paid to text classification for U.S. legal texts. Deep learning has been applied to improving text classification performance. Its effectiveness needs further exploration in domains such as the legal field. This paper investigates legal text classification with a large collection of labeled U.S. case documents by comparing the effectiveness of different text classification techniques. We propose a machine learning algorithm using domain concepts as features and random forests as the classifier. Our experimental results on 30,000 full U.S. case documents in 50 categories demonstrate that our approach significantly outperforms a deep learning system built on multiple pre-trained word embeddings and deep neural networks. In addition, applying only the top 400 domain concepts as features for building the random forests could achieve the best performance. This study provides a reference for selecting machine learning techniques to build high-performance text classification systems in the legal domain or other fields.

14.

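Social network users' behavior of spreading others' private information: a study based on text emotion classification (基于文本情绪分类的社交网络用户传播他人隐私信息行为研究)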

马达, 卢嘉蓉, 朱侯 《情报科学》2023,41(2):60-68

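[Purpose/Significance] To explore an effective deep-learning-based emotion classification method for Weibo text, and to study the relationship between the emotion types of users' reposted comments and the spread of private information during trending Weibo events. [Method/Process] Four deep learning classification models, BERT, BERT+CNN, BERT+RNN and ERNIE, are compared experimentally, and the better-performing model is identified on a newly rebuilt seven-class emotion corpus. Four trending Weibo cases are selected for empirical analysis from four angles: emotion distribution, frequency of emotion words, reposting time, and number of reposts. [Results/Conclusions] The empirical study finds that users spread private information rapidly and briefly; that negative emotions, typified by "anger" and "disgust", dominate during spreading; and that emotion types and modes of expression differ depending on who the subject of the private information is. [Innovation/Limitations] The study examines the emotional characteristics of users when they spread private information and the connection between the two, providing a valuable theoretical and practical basis for protecting the privacy of social network users. However, the constructed corpus is still too small to train a high-accuracy deep learning model, and the model performs poorly on ironic and sarcastic text.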

15.

Bi-view semi-supervised active learning for cross-lingual sentiment classification

Mohammad Sadegh Hajmohammadi, Roliana Ibrahim, Ali Selamat 《Information processing & management》2014

Recently, sentiment classification has received considerable attention within the natural language processing research community. However, since most recent works regarding sentiment classification have been done in the English language, there are accordingly not enough sentiment resources in other languages. Manual construction of reliable sentiment resources is a very difficult and time-consuming task. Cross-lingual sentiment classification aims to utilize annotated sentiment resources in one language (typically English) for sentiment classification of text documents in another language. Most existing research works rely on automatic machine translation services to directly project information from one language to another. However, different term distribution between original and translated text documents and translation errors are two main problems faced in the case of using only machine translation. To overcome these problems, we propose a novel learning model based on active learning and semi-supervised co-training to incorporate unlabelled data from the target language into the learning process in a bi-view framework. This model attempts to enrich training data by adding the most confident automatically-labelled examples, as well as a few of the most informative manually-labelled examples from unlabelled data in an iterative process. Further, in this model, we consider the density of unlabelled data so as to select more representative unlabelled examples in order to avoid outlier selection in active learning. The proposed model was applied to book review datasets in three different languages. Experiments showed that our model can effectively improve the cross-lingual sentiment classification performance and reduce labelling efforts in comparison with some baseline methods.
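
The selection step described in the abstract above can be sketched roughly as follows, assuming a binary classifier that outputs a positive-class probability per unlabelled example. Confident examples (far from 0.5) would be auto-labelled, and the least confident ones would go to a human annotator. The function name and thresholding by distance from 0.5 are assumptions, and the density weighting mentioned in the abstract is not modelled.

```python
def select_examples(probs, n_confident=1, n_informative=1):
    """Split unlabelled examples into auto-label and ask-a-human candidates.

    probs maps an example id to the predicted probability of the positive class.
    Confidence is measured as the distance of that probability from 0.5.
    """
    by_confidence = sorted(probs, key=lambda k: abs(probs[k] - 0.5), reverse=True)
    confident = by_confidence[:n_confident]
    # Walk from the least confident end, skipping anything already auto-labelled.
    informative = [k for k in reversed(by_confidence) if k not in confident][:n_informative]
    return confident, informative


# Hypothetical predictions for four unlabelled reviews.
probs = {"a": 0.95, "b": 0.52, "c": 0.30, "d": 0.10}
confident, informative = select_examples(probs, n_confident=2, n_informative=1)
print(confident, informative)
```

In an iterative loop, both groups would be added to the training set and the classifiers retrained before the next round of selection.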

16.

A model of the laws of knowledge life forms (知识生活型规律模型)

刘福林, 李淑萍, 宋唯一, 康洁, 刘丹 《科学学研究》2012,30(10):1454-1461,1467

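To reveal the laws governing the survival types of knowledge within a knowledge community, and following the idea of cross-domain mapping of knowledge DNA, this study combines bionic deduction with empirical research. Taking the laws of biological life forms as the prototype, it proposes, by cross-domain mapping, a model of the laws of knowledge life forms, covering thinking influence and the concept, types, laws and model of knowledge life forms. Verification on real cases shows that the model reveals the laws governing the survival types of knowledge within a knowledge community. The study then outlines application prospects in knowledge communities, knowledge mining, knowledge visualization and knowledge transformation, and raises the new topic of visualizing tacit knowledge. The result is a significant basic theory and a heuristic new viewpoint derived from bionics; it is applicable to research on knowledge innovation and knowledge management and has notable scientific significance and application value.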

17.

On the class separability of contextual embeddings representations – or “The classifier does not matter when the (text) representation is so good!”

《Information processing & management》2023,60(4):103336

The literature has not fully and adequately explained why contextual (e.g., BERT-based) representations are so successful at improving the effectiveness of some Natural Language Processing tasks, especially Automatic Text Classification (ATC). In this article, we show that such representations, when properly tuned to a target domain, produce an extremely separable space that makes the classification task very effective, independently of the classifier employed for solving the ATC task. To test our hypothesis, we perform a thorough class separability analysis in order to visualize and measure how well BERT-based embeddings separate documents of different classes in comparison with other widely used representation approaches, e.g., TFIDF BoW, static embeddings (e.g., fastText) and zero-shot (non-tuned) contextual embeddings. We also analyze separability in the context of transfer learning and compare BERT-based representations with those obtained from other transformers (e.g., RoBERTa, XLNET). Our experiments covering sixteen datasets in topic and sentiment classification, eight classification methods and three class separability metrics show that the fine-tuned BERT embeddings are highly separable in the corresponding space (e.g., they are 67% more separable than the static embeddings). As a consequence, they allow the simplest classifiers to achieve similar effectiveness as the most complex methods. We also find moderate to high correlations between separability and effectiveness in all experimented scenarios. Overall, our main finding is that more discriminative (i.e., separable) textual representations constitute a critical part of the ATC solutions that, given the current state-of-the-art in classification algorithms, are more prominent than the algorithmic (classifier) method for solving the task.

18.

Towards a real-time processing framework based on improved distributed recurrent neural network variants with fastText for social big data analytics

《Information processing & management》2020,57(1):102122

Big data generated by social media is a valuable source of information and offers an excellent opportunity to mine valuable insights. In particular, user-generated content such as reviews, recommendations, and users’ behavior data is useful for supporting the marketing activities of many companies. Knowing what users are saying about the products they bought or the services they used through reviews in social media represents a key factor for making decisions. Sentiment analysis is one of the fundamental tasks in Natural Language Processing. Although deep learning for sentiment analysis has achieved great success and allowed several firms to analyze and extract relevant information from their textual data, as the volume of data grows, a model that runs in a traditional environment cannot be effective, which implies the importance of efficient distributed deep learning models for social big data analytics. Besides, social media analysis is a complex process that involves a set of complex tasks. Therefore, it is important to address the challenges and issues of social big data analytics and enhance the performance of deep learning techniques in terms of classification accuracy to obtain better decisions. In this paper, we propose an approach for sentiment analysis that adopts fastText with recurrent neural network variants to represent textual data efficiently. It then employs the new representations to perform the classification task. Its main objective is to enhance the performance of well-known Recurrent Neural Network (RNN) variants in terms of classification accuracy and to handle large-scale data. In addition, we propose a distributed intelligent system for real-time social big data analytics. It is designed to ingest, store, process, index, and visualize huge amounts of information in real time. The proposed system adopts distributed machine learning with our proposed method for enhancing decision-making processes. Extensive experiments conducted on two benchmark data sets demonstrate that our proposal for sentiment analysis outperforms well-known distributed recurrent neural network variants (i.e., Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and Gated Recurrent Unit (GRU)). Specifically, we tested the efficiency of our approach using the three different deep learning models. The results show that our proposed approach is able to enhance the performance of the three models. The current work can provide several benefits for researchers and practitioners who want to collect, handle, analyze and visualize several sources of information in real time. Also, it can contribute to a better understanding of public opinion and user behaviors using our proposed system with the improved variants of the most powerful distributed deep learning and machine learning algorithms. Furthermore, it is able to increase the classification accuracy of several existing works based on RNN models for sentiment analysis.

19.

Online tracking of topic sentiment based on the OTSCM model (基于OTSCM模型的主题情感在线追踪)

刘玉文, 刘月华, 杨枢, 张钰 《现代情报》2017,37(12):35-41

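Online analysis of topic sentiment in online public opinion plays an important role in assessing and managing public opinion. Current topic-sentiment models suffer from loosely coupled modeling of topics and sentiment and from skewed sentiment mining, which can easily lead to misjudgment of public opinion. This paper introduces sentiment parameters into the OLDA (On-Line Latent Dirichlet Allocation) model, proposes the idea of sentiment inheritance, and builds OTSCM (On-Line Topic and Sentiment Combining Model), an online topic-sentiment mixture model based on sentiment inheritance. The model takes the topic-sentiment distribution in time slice t-1 as the prior for the topic-sentiment distribution in time slice t. By constructing a topic-sentiment evolution matrix, it generates three distributions for time slice t: document-topic, topic-feature word, and topic-sentiment word. Finally, it uses cross-entropy to compute the similarity between the topic distribution in time slice t and the topic distributions before t-1, yielding the topic-sentiment evolution result for time slice t. OTSCM is validated on five datasets and compared with other popular algorithms; the experiments show that the proposed method performs well in online topic-sentiment identification.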
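
The abstract above compares topic distributions across time slices with cross-entropy. A minimal sketch of that comparison, assuming each topic is a discrete probability distribution over the same vocabulary, is given below. The function name, the smoothing constant `eps`, and the toy distributions are illustrative assumptions and are not taken from the paper.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i) between two discrete distributions.

    eps is a small smoothing constant (an assumption here) that avoids log(0).
    A lower value means q is closer to p.
    """
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))


# Hypothetical word distributions of one topic in consecutive time slices.
topic_prev = [0.7, 0.2, 0.1]
topic_same = [0.7, 0.2, 0.1]
topic_drifted = [0.1, 0.2, 0.7]

print(cross_entropy(topic_prev, topic_same))
print(cross_entropy(topic_prev, topic_drifted))
```

The drifted topic yields the larger value, so a threshold on this quantity could flag a topic whose word distribution has changed between slices.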

20.

Exploring temporal representations by leveraging attention-based bidirectional LSTM-RNNs for multi-modal emotion recognition

《Information processing & management》2020,57(3):102185

Emotion recognition helps to automatically perceive the user’s emotional response to multimedia content through implicit annotation, which further benefits the establishment of effective user-centric services. Approaches based on physiological signals have increasingly attracted researchers’ attention because of their objectiveness in emotion representation. Conventional approaches to emotion recognition have mostly focused on the extraction of different kinds of hand-crafted features. However, hand-crafted features always require domain knowledge for the specific task, and designing proper features may be more time consuming. Therefore, exploring the most effective physiological-based temporal feature representation for emotion recognition becomes the core problem of most works. In this paper, we propose a multimodal attention-based BLSTM network framework for efficient emotion recognition. Firstly, raw physiological signals from each channel are transformed into spectrogram images to capture their time and frequency information. Secondly, attention-based Bidirectional Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) are utilized to automatically learn the best temporal features. The learned deep features are then fed into a deep neural network (DNN) to predict the probability of emotional output for each channel. Finally, a decision-level fusion strategy is utilized to predict the final emotion. The experimental results on the AMIGOS dataset show that our method outperforms other state-of-the-art methods.
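
The final step in the abstract above, decision-level fusion, combines per-channel predictions into a single label. The abstract does not state the fusion rule, so the sketch below assumes plain averaging of the per-channel class probabilities followed by an argmax. The function name and the example probabilities are hypothetical.

```python
def fuse_decisions(channel_probs):
    """Fuse per-channel class probabilities by averaging (an assumed fusion rule).

    channel_probs is a list with one probability list per channel,
    all over the same ordered set of emotion classes.
    Returns the index of the winning class and the averaged probabilities.
    """
    n_channels = len(channel_probs)
    n_classes = len(channel_probs[0])
    averaged = [sum(p[c] for p in channel_probs) / n_channels for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


# Hypothetical outputs of three physiological channels over two emotion classes.
channel_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
label, averaged = fuse_decisions(channel_probs)
print(label, averaged)
```

Here one channel favors class 0 but the other two favor class 1, so the fused decision is class 1.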