Similar literature
 Found 20 similar articles (search time: 593 ms)
1.
The problem of social spam detection has traditionally been modeled as a supervised classification problem. Despite the initial success of this detection approach, later analysis of proposed systems and detection features has shown that, as with email spam, the dynamic and adversarial nature of social spam makes the performance achieved by supervised systems hard to maintain. In this paper, we investigate the possibility of using the output of previously proposed supervised classification systems as a tool for spammer discovery. The hypothesis is that these systems are still highly capable of detecting spammers reliably even when their recall is far from perfect. We then propose to use the output of these classifiers as prior beliefs in a probabilistic graphical model framework. This framework allows beliefs to be propagated to similar social accounts. Basing similarity on a who-connects-to-whom network has been empirically critiqued in recent literature, and we propose here an alternative definition based on a bipartite users-content interaction graph. For evaluation, we build a Markov Random Field on a graph of similar users and compute prior beliefs using a selection of state-of-the-art classifiers. We apply Loopy Belief Propagation to obtain posterior predictions on users. The proposed system is evaluated on a recent Twitter dataset that we collected and manually labeled. Classification results show a significant increase in recall while precision is maintained. This validates that formulating the detection problem with an undirected graphical model framework makes it possible to restore the deteriorated performance of previously proposed statistical classifiers and to effectively mitigate the effect of spam evolution.
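To make the propagation step in this abstract concrete, the following is a minimal sketch of sum-product Loopy Belief Propagation on a pairwise binary Markov Random Field, with classifier scores used as node priors. It is an illustration under stated assumptions, not the authors' implementation: the function name `loopy_bp`, the homophily parameter `coupling`, and the toy graph are all hypothetical.

```python
from math import prod


def loopy_bp(priors, edges, coupling=0.8, iters=20):
    """Sum-product loopy belief propagation on a pairwise binary MRF.

    priors:   {node: P(spammer)} taken from a supervised classifier
    edges:    undirected (u, v) pairs linking similar users
    coupling: assumed probability that two linked users share a label
    """
    nbrs = {n: set() for n in priors}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    # Node potentials over the states (0 = legitimate, 1 = spammer).
    phi = {n: (1.0 - p, p) for n, p in priors.items()}
    # Edge potential: linked users tend to share the same state.
    psi = ((coupling, 1.0 - coupling), (1.0 - coupling, coupling))
    # msg[(u, v)] is the message sent from u to v, initialised uniform.
    msg = {(u, v): (0.5, 0.5) for u in nbrs for v in nbrs[u]}
    for _ in range(iters):
        new = {}
        for (u, v) in msg:
            out = []
            for xv in (0, 1):
                total = 0.0
                for xu in (0, 1):
                    # Product of messages into u from every neighbour except v.
                    incoming = prod(msg[(w, u)][xu] for w in nbrs[u] if w != v)
                    total += phi[u][xu] * psi[xu][xv] * incoming
                out.append(total)
            z = out[0] + out[1]
            new[(u, v)] = (out[0] / z, out[1] / z)
        msg = new
    beliefs = {}
    for n in nbrs:
        b = [phi[n][x] * prod(msg[(w, n)][x] for w in nbrs[n]) for x in (0, 1)]
        beliefs[n] = b[1] / (b[0] + b[1])  # posterior P(spammer)
    return beliefs


# Toy example: "a" is confidently flagged, "b" and "c" are undecided, "d" is isolated.
posterior = loopy_bp({"a": 0.9, "b": 0.5, "c": 0.5, "d": 0.1},
                     [("a", "b"), ("b", "c")])
```

In this toy graph the undecided accounts linked to the flagged account end up with a posterior above 0.5, which is the recall-raising effect the abstract describes; the isolated account keeps its prior.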

2.
Recently, phishing scams have become one of the most serious types of crime on Ethereum, the second-largest blockchain-based cryptocurrency platform. Existing phishing scam detection techniques for Ethereum mostly use traditional machine learning or network representation learning to mine key information from the transaction network and identify phishing addresses. However, these methods typically crop the temporal transaction graph into snapshot sequences or construct temporal random walks to model the dynamic evolution of the topology of the transaction graph. In this paper, we propose PDTGA, a method that applies graph representation learning based on temporal graph attention to improve the effectiveness of phishing scam detection on Ethereum. Specifically, we learn the functional representation of time directly and model the time signal through the interactions between the time encoding function and node features, edge features, and the topology of the graph. We collected a real-world Ethereum phishing scam dataset containing over 250,000 transaction records between more than 100,000 account addresses, and divided it into three datasets of different sizes. Through data analysis, we first summarized the periodic pattern of Ethereum phishing scam activities. Then we constructed 14 kinds of account node features and 3 kinds of transaction edge features. Experimental evaluations on the three datasets demonstrate that PDTGA, with a 94.78% AUC score and an 88.76% recall score, outperforms the state-of-the-art methods.

3.
Fault or anomaly detection is one of the key problems faced by the chemical process industry in achieving safe and reliable operation. In this study, a novel methodology, the spectral weighted graph autoencoder (SWGAE), is proposed, in which the problem of anomaly detection is addressed with the help of graphs. The proposed approach entails the following key steps. First, a spectral weighted graph is constructed, in which each time step of a process variable in the multivariate time series dataset is modelled as a node in an appropriately tuned moving window. Subsequently, we propose to monitor the weights of the edges that connect pairs of nodes. Faulty instances are identified based on the discrepancy in the weight pattern compared to normal operating data. To this end, once the weights are determined, they are fed to the autoencoder network, where the reconstruction loss is calculated, and a fault is flagged if the reconstruction loss exceeds a threshold. Further, to make the proposed approach comprehensive, a fault isolation methodology is also proposed to identify the faulty nodes once the faulty variables are identified. The proposed approach is validated using Tennessee-Eastman benchmark data and real-time plant data from a pressurized heavy water nuclear reactor. The results indicate that the SWGAE method, when compared to other state-of-the-art methods, yielded more accurate results in correctly detecting faulty nodes and isolating them.

4.
With the prosperity and development of the digital economy, many fraudsters have emerged on e-commerce platforms who fabricate fraudulent reviews to mislead consumers’ shopping decisions for profit. Moreover, in order to evade fraud detection, fraudsters continue to evolve and exhibit adversarial camouflage and collaborative attacks. In this paper, we propose a novel temporal burstiness and collaborative camouflage aware method (TBCCA) for fraudster detection. Specifically, we capture the hidden temporal burstiness features behind the camouflage strategy using a time series prediction model, and identify highly suspicious target products by assigning suspicion scores as node priors. Meanwhile, a propagation graph integrating review collusion is constructed, and an iterative fraud confidence propagation algorithm based on Loopy Belief Propagation (LBP) is designed to infer the labels of nodes in the graph. Comprehensive experiments are conducted to compare TBCCA with state-of-the-art fraudster detection approaches, and experimental results show that TBCCA can effectively identify fraudsters in real review networks, achieving a 6%–10% performance improvement over other baselines.

5.
Dynamic link prediction is a critical task in network research that seeks to predict future network links based on the behavior of prior network changes. However, most existing methods overlook mutual interactions between neighbors and long-distance interactions, and their predictions lack interpretability. To tackle these issues, in this paper, we propose a temporal group-aware graph diffusion network (TGGDN). First, we construct a group affinity matrix to describe mutual interactions between neighbors, i.e., group interactions. Then, we merge the group affinity matrix into the graph diffusion to form a group-aware graph diffusion, which simultaneously captures group interactions and long-distance interactions in dynamic networks. Additionally, we present a transformer block that models the temporal information of dynamic networks using self-attention, allowing the TGGDN to pay greater attention to task-related snapshots while also providing interpretability to better understand the network's evolutionary patterns. We compare the proposed TGGDN with state-of-the-art methods on five real-world datasets of different sizes, ranging from 1k to 20k nodes. Experimental results show that TGGDN achieves average improvements of 8.3% and 3.8% in terms of ACC and AUC, respectively, across all datasets, demonstrating the superiority of TGGDN in the dynamic link prediction task.

6.
Semi-supervised anomaly detection methods leverage a few anomaly examples to yield drastically improved performance compared to unsupervised models. However, they still suffer from two limitations: 1) unlabeled anomalies (i.e., anomaly contamination) may mislead the learning process when all the unlabeled data are employed as inliers for model training; 2) only discrete supervision information (such as binary or ordinal data labels) is exploited, which leads to suboptimal learning of anomaly scores that essentially take on a continuous distribution. Therefore, this paper proposes a novel semi-supervised anomaly detection method, which devises contamination-resilient continuous supervisory signals. Specifically, we propose a mass interpolation method to diffuse the abnormality of labeled anomalies, thereby creating new data samples labeled with continuous abnormal degrees. Meanwhile, the contaminated area can be covered by new data samples generated via combinations of data with correct labels. A feature learning-based objective is added to serve as an optimization constraint to regularize the network and further enhance the robustness w.r.t. anomaly contamination. Extensive experiments on 11 real-world datasets show that our approach significantly outperforms state-of-the-art competitors by 20%–30% in AUC-PR and obtains more robust and superior performance in settings with different anomaly contamination levels and varying numbers of labeled anomalies.

7.
The recent significant growth of social media has drawn researchers' attention to monitoring the enormous amount of streaming data using real-time approaches. This data may appear in different forms such as streaming text, images, audio, and video. In this paper, we address the problem of deciding the appropriateness of streaming videos with the help of on-demand crowdsourcing. We propose a novel crowd-powered model, ViSSa, an open crowdsourcing platform that helps to automatically detect the appropriateness of videos being uploaded online by employing the viewers of existing videos. The proposed model presents a unique approach of not only identifying unsafe videos but also detecting the inappropriate portion (in terms of the platform’s vulnerabilities). Our experiments with 47 crowd contributors demonstrate the effectiveness of the proposed approach. On the designed ViSSa platform, 18 safe videos are initially posted. After gaining access, different users add 20 new videos. These videos are assessed (and marked as safe or unsafe) by users, and finally a consensus judgment is obtained through judgment analysis. The approach detects the unsafe videos with high accuracy (95%) and points out the inappropriate portion. Interestingly, changing the mode of video segment allocation (homogeneous or heterogeneous) is found to have a significant impact on the viewers’ feedback. However, the proposed approach performs consistently well in different modes of viewing (with varying diversity of opinions), and with arbitrary video size and type. The users are found to be motivated by their sense of responsibility. This paper also highlights the importance of identifying spammers through such models.

8.
With the information explosion of news articles, personalized news recommendation has become important for helping users quickly find news that they are interested in. Existing methods for news recommendation mainly include collaborative filtering methods, which rely on direct user-item interactions, and content-based methods, which characterize the content of a user's reading history. Although these methods have achieved good performance, they still suffer from the data sparsity problem, since most of them fail to extensively exploit high-order structure information (similar users tend to read similar news articles) in news recommendation systems. In this paper, we propose to build a heterogeneous graph to explicitly model the interactions among users, news and latent topics. The incorporated topic information helps indicate a user’s interest and alleviates the sparsity of user-item interactions. We then take advantage of graph neural networks to learn user and news representations that encode high-order structure information by propagating embeddings over the graph. The user embeddings learned from complete historic user clicks capture the users’ long-term interests. We also model a user’s short-term interest from the recent reading history with an attention-based LSTM model. Experimental results on real-world datasets show that our proposed model significantly outperforms state-of-the-art methods on news recommendation.

9.
Opinion summarization can facilitate users’ decision-making by mining the salient review information. However, due to the lack of sufficient annotated data, most early works are based on extractive methods, which restricts the performance of opinion summarization. In this work, we aim to improve the informativeness of opinion summarization to provide better guidance to users. We consider the setting with only reviews and no corresponding summaries, and propose an aspect-augmented model for unsupervised abstractive opinion summarization, denoted as AsU-OSum. We first employ an aspect-based sentiment analysis system to extract opinion phrases from reviews. Then, we construct a heterogeneous graph consisting of reviews and opinion clusters as nodes, which is used to enhance the Transformer-based encoder–decoder framework. Furthermore, we design a novel cascaded attention mechanism to prompt the decoder to pay more attention to the aspects that are more likely to appear in the summary. During training, we introduce a sentiment accuracy reward that further enhances the learning ability of our model. We conduct comprehensive experiments on the Yelp, Amazon, and Rotten Tomatoes datasets. Automatic evaluation results show that our model is competitive and performs better than the state-of-the-art (SOTA) models on some ROUGE metrics. Human evaluation results further verify that our model can generate more informative summaries and reduce redundancy.

10.
Many science and engineering problems can be represented by a network, a generalization of which is a graph. Examples of problems that can be represented by a graph include cyclic sequential circuits, organic molecule structures, and mechanical structures. The most fundamental issue in these problems (e.g., designing a molecule structure) is the identification of structure, which in turn reduces to the identification of a graph. The problem of identifying a graph is called graph isomorphism. The graph isomorphism problem is an NP problem according to computational complexity theory. Numerous methods and algorithms have been proposed to solve this problem. Elsewhere we presented an approach called the eigensystem approach. This approach is based on a combination of the eigenvalues and eigenvectors associated with the adjacency matrix. The eigensystem approach has been shown to be very effective but requires that a graph contain at least one distinct eigenvalue. The adjacency matrix has not been shown to meet this requirement. In this paper, we propose a new matrix, called the adjusted adjacency matrix, that meets this requirement. We show that the eigensystem approach based on the adjusted adjacency matrix is not only effective but also more efficient than that based on the adjacency matrix.
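As a simplified illustration of why spectral information from the adjacency matrix is useful here, the sketch below compares the characteristic polynomials of two adjacency matrices, computed exactly with the Faddeev-LeVerrier recurrence. This is not the eigensystem approach of the abstract (which also uses eigenvectors and an adjusted adjacency matrix); it is only a necessary condition for isomorphism, since non-isomorphic cospectral graphs exist. The function names `char_poly` and `may_be_isomorphic` are hypothetical.

```python
from fractions import Fraction


def char_poly(adj):
    """Coefficients [c_n, ..., c_0] of det(x*I - A), via Faddeev-LeVerrier."""
    n = len(adj)
    a = [[Fraction(x) for x in row] for row in adj]
    # M_1 is the identity matrix.
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        # am = A * M_k
        am = [[sum(a[i][t] * m[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(am[i][i] for i in range(n)) / k  # c_{n-k} = -trace(A M_k) / k
        coeffs.append(c)
        # M_{k+1} = A M_k + c_{n-k} I
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs


def may_be_isomorphic(adj_g, adj_h):
    """False means certainly not isomorphic; True means only 'not ruled out'."""
    return len(adj_g) == len(adj_h) and char_poly(adj_g) == char_poly(adj_h)


path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]        # path 0-1-2
relabeled = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]   # same path, centre labeled 0
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
```

Here `may_be_isomorphic(path, relabeled)` holds because relabeling vertices does not change the spectrum, while the triangle is ruled out immediately.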

11.
The advent of connected devices and the omnipresence of the Internet have paved the way for intruders to attack networks, which leads to cyber-attacks, financial loss, information theft in healthcare, and cyber war. Hence, network security analytics has become an important area of concern and has of late gained intensive attention among researchers, specifically in the domain of network anomaly detection, which is considered crucial for network security. However, preliminary investigations have revealed that the existing approaches to detecting anomalies in networks are not effective enough, particularly for detecting them in real time. The inefficacy of current approaches is mainly due to the amassment of massive volumes of data through the connected devices. Therefore, it is crucial to propose a framework that effectively handles real-time big data processing and detects anomalies in networks. In this regard, this paper attempts to address the issue of detecting anomalies in real time. Accordingly, this paper surveys the state-of-the-art real-time big data processing technologies related to anomaly detection and the vital characteristics of the associated machine learning algorithms. This paper begins with an explanation of the essential context and taxonomy of real-time big data processing, anomaly detection, and machine learning algorithms, followed by a review of big data processing technologies. Finally, the identified research challenges of real-time big data processing in anomaly detection are discussed.

12.
The graph convolutional network (GCN) is a powerful tool for processing graph data and has achieved satisfactory performance in the task of node classification. In general, a GCN uses a fixed graph to guide the graph convolutional operation. However, the fixed graph from the original feature space may contain noise or outliers, which may degrade the effectiveness of the GCN. To address this issue, in this paper, we propose a robust graph learning convolutional network (RGLCN). Specifically, we design a robust graph learning model based on a sparse constraint and a strong connectivity constraint to achieve smoothness in the graph learning. In addition, we introduce the graph learning model into the GCN to explore the representative information, aiming to learn a high-quality graph for the downstream task. Experiments on citation network datasets show that the proposed RGLCN outperforms the existing comparison methods on the task of node classification.

13.
Anomalous event recognition requires an instant response to reduce the loss of human life and property; however, existing automated systems show limited performance because they focus on the temporal domain of the videos and ignore the significant role of spatial information. Furthermore, although current surveillance systems can detect anomalous events, they require human intervention to recognise their nature and to select appropriate countermeasures, as there are no fully automatic surveillance techniques that can simultaneously detect and interpret anomalous events. Therefore, we present a framework called Vision Transformer Anomaly Recognition (ViT-ARN) that can detect and interpret anomalies in smart city surveillance videos. The framework consists of two stages: the first involves online anomaly detection, for which a customised, lightweight, one-class deep neural network is developed to detect anomalies in a surveillance environment, while in the second stage, the detected anomaly is further classified into the corresponding class. The size of our anomaly detection model is compressed using a filter pruning strategy based on a geometric median, with the aim of easy adaptability for resource-constrained devices. Anomaly classification is based on vision transformer features and is followed by a bottleneck attention mechanism to enhance the representation. The refined features are passed to a multi-reservoir echo state network for a detailed analysis of real-world anomalies such as vandalism and road accidents. A total of 858 and 1600 videos from two datasets are used to train the proposed model, and extensive experiments on the LAD-2000 and UCF-Crime datasets comprising 290 and 400 testing videos reveal that our framework can recognise anomalies more effectively, outperforming other state-of-the-art approaches with increases in accuracy of 10.14% and 3% on the LAD-2000 and UCF-Crime datasets, respectively.

14.
This paper investigates whether senders of large amounts of irrelevant or unsolicited information (commonly called “spammers”) distort the network structure of social networks. Two large social networks are analyzed, the first extracted from the Twitter discourse about a big telecommunication company, and the second obtained from three years of email communication among 200 managers working for a large multinational company. This work compares network robustness and the stability of centrality and interaction metrics, as well as the use of language, after removing spammers and the most and least connected nodes. The results show that, for most of the social indicators, spammers do not significantly alter the structure of the information-carrying network. The authors additionally investigate the correlation between e-mail subject line and content by tracking language sentiment, emotionality, and complexity, addressing the cases where collecting email bodies is not permitted for privacy reasons. The findings extend the research on the robustness and stability of social network metrics after the application of graph simplification strategies. The results have practical implications for network analysts and for those company managers who rely on network analytics (applied to company emails and social media data) to support their decision-making processes.
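One common way to quantify the kind of robustness comparison described in this abstract is to measure how much of the network stays in the largest connected component after a set of nodes is removed. The sketch below does this with a breadth-first search; the metric choice, the function name `largest_component_share`, and the toy network are assumptions for illustration and are not taken from the paper.

```python
from collections import deque


def largest_component_share(edges, removed=()):
    """Share of the remaining nodes that lie in the largest connected component."""
    removed = set(removed)
    nodes = {n for edge in edges for n in edge} - removed
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search over one component, counting its nodes.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in adj[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        best = max(best, size)
    return best / len(nodes) if nodes else 0.0


# Toy network: hub "h" linked to "a", "b", "c", plus one link between "a" and "b".
toy = [("h", "a"), ("h", "b"), ("h", "c"), ("a", "b")]
share_without_hub = largest_component_share(toy, removed=["h"])
share_without_leaf = largest_component_share(toy, removed=["c"])
```

Removing the hub fragments the toy network, whereas removing a peripheral node leaves it fully connected; the same comparison could be run with a list of suspected spammer accounts as `removed`.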

15.
This paper focuses on extracting temporal and parent–child relationships between news events in social news. Previous methods have shown that syntactic features are effective. However, most of them directly use the static output of syntactic parsing tools, and task-irrelevant or erroneous parses inevitably degrade the performance of the model. In addition, many implicit higher-order connections that are directly related and critical to the task are not explicitly exploited. In this paper, we propose a novel syntax-based dynamic latent graph model (SDLG) for this task. Specifically, we first apply a syntactic type-enhanced attention mechanism to assign different weights to different connections in the parsing results, which helps to filter out noisy connections and better fuse the information in the syntactic structures. Next, we introduce a dynamic event pair-aware induction graph to mine the task-related latent connections. It constructs a latent attention matrix to complement and correct the supervised syntactic features, using the semantics of the event pairs as a guide. Finally, the latent graph, together with the syntactic information, is fed into a graph convolutional network to obtain an improved representation of the event for relational reasoning. We have conducted extensive experiments on four public benchmarks: MATRES, TCR, HiEve and TB-Dense. The results show that our model outperforms the state-of-the-art model by 0.4%, 1.5%, 3.0% and 1.3% in F1 score on the four datasets, respectively. Finally, we provide detailed analyses to show the effectiveness of each proposed component.

16.
Knowledge graph representation learning (KGRL) aims to infer the missing links between target entities based on existing triples. Graph neural networks (GNNs) have recently been introduced as one of the latest architectures that serve the KGRL task by aggregating neighborhood information. However, current GNN-based methods have fundamental limitations both in modelling distant multi-hop neighbors and in selecting relation-specific neighborhood information from a vast number of neighbors. In this study, we propose a new relation-specific graph transformation network (RGTN) for the KGRL task. Specifically, the proposed RGTN is the first model that transforms a relation-based graph into a new path-based graph by generating useful paths that connect heterogeneous relations and multi-hop neighbors. Unlike existing GNN-based methods, our approach is able to adaptively select the most useful paths for each specific relation and to effectively build path-based connections between unconnected distant entities. The transformed graph structure opens a new way to model multi-hop neighbors at arbitrary path lengths, which leads to more effective embedding learning. To verify the effectiveness of our proposed model, we conduct extensive experiments on three standard benchmark datasets, namely WN18RR, FB15k-237 and YAGO3-10-DR. Experimental results show that the proposed RGTN achieves promising results and even outperforms other state-of-the-art models on the KGRL task (e.g., compared to other state-of-the-art GNN-based methods, our model achieves a 2.5% improvement in H@10 on WN18RR, a 1.2% improvement in H@10 on FB15k-237 and a 6% improvement in H@10 on YAGO3-10-DR).
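The H@10 figures quoted above refer to the Hits@k metric commonly used in link prediction: the share of test triples for which the correct entity is ranked within the top k candidates. A minimal sketch follows; the helper names `rank_of` and `hits_at_k` and the example scores are hypothetical, and ties are broken optimistically here as an assumption.

```python
def rank_of(scores, true_index):
    """1-based rank of the true candidate; higher score means better."""
    target = scores[true_index]
    return 1 + sum(s > target for s in scores)


def hits_at_k(ranks, k=10):
    """Share of test triples whose correct entity is ranked in the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(r <= k for r in ranks) / len(ranks)


# Ranks of the correct entity for four hypothetical test triples.
example_ranks = [1, 3, 12, 50]
example_h10 = hits_at_k(example_ranks, k=10)
```

With these four ranks, two of the correct entities fall within the top 10, so `example_h10` is 0.5.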

17.
Online recommender systems have been shown to be vulnerable to group shilling attacks, in which the attackers of a shilling group collaboratively inject fake profiles with the aim of increasing or decreasing the frequency with which particular items are recommended. Existing detection methods mainly use frequent itemset (dense subgraph) mining or clustering to generate candidate groups and then utilize hand-crafted features to identify shilling groups. However, such two-stage detection methods have two limitations. On the one hand, due to the sensitivity of the support threshold or clustering parameter settings, it is difficult to guarantee the quality of the candidate groups generated. On the other hand, they all rely on manual feature engineering to extract detection features, which is costly and time-consuming. To address these two limitations, we present a shilling group detection method based on a graph convolutional network. First, we model the given dataset as a graph by treating users as nodes and co-rating relations between users as edges. By assigning edge weights and filtering normal user relations, we obtain the suspicious user relation graph. Second, we use principal component analysis to refine the rating features of users and obtain the user feature matrix. Third, we design a three-layer graph convolutional network model with a neighbor filtering mechanism and perform user classification by combining both the structure and rating features of users. Finally, we detect shilling groups by identifying target items rated by the attackers according to the user classification results. Extensive experiments show that the classification accuracy and detection performance (F1-measure) of the proposed method can reach 98.92% and 99.92% on the Netflix dataset and 93.18% and 92.41% on the Amazon dataset.
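The first step of this method, turning rating data into a weighted user graph, can be sketched as follows. The abstract does not say how edge weights are assigned or which relations are filtered, so this example simply assumes that the weight is the number of co-rated items and that edges below a threshold are dropped; `co_rating_graph` and `min_weight` are hypothetical names.

```python
from itertools import combinations


def co_rating_graph(ratings, min_weight=2):
    """Build a weighted user graph from {user: {item: rating}}.

    Assumed weighting: an edge's weight is the number of items both users rated.
    Assumed filter: edges with fewer than `min_weight` shared items are dropped.
    """
    edges = {}
    for u, v in combinations(sorted(ratings), 2):
        shared = len(ratings[u].keys() & ratings[v].keys())
        if shared >= min_weight:
            edges[(u, v)] = shared
    return edges


toy_ratings = {
    "u1": {"i1": 5, "i2": 5, "i3": 5},
    "u2": {"i1": 5, "i2": 5},
    "u3": {"i3": 1},
}
suspicious = co_rating_graph(toy_ratings)
```

In the toy data only `u1` and `u2` share enough items to remain connected; the resulting edge dictionary is the kind of structure a graph convolutional network would then consume alongside per-user rating features.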

18.
19.
Multiagent systems are becoming increasingly popular among researchers spanning multiple fields of study. However, existing studies model the communication interaction between agents only as either fixed or switching topologies described by crisp graphs supported by algebraic graph theory. In this paper, we propose an alternative approach to describing agent interactions using fuzzy graphs. Our approach is aimed at opening up new research avenues and defining new problems in coordination control, especially in terms of the dynamics between agents’ states, graph topologies and coordination objectives. This paper studies distributed coordination on fuzzy graphs where the edge weights modeling network topologies depend on the states of the agents in the network. That is, the network weights are adjustable based on the situational state of the agents. First, we introduce the concept of fuzzy graphs and give some features that distinguish them from crisp or fixed graphs. Next, we provide some membership functions to define the state-dependent weights. Finally, we use simulations to demonstrate the convergence of the proposed consensus algorithms, especially for cases where the agents are subject to system failures.
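A minimal simulation can illustrate what consensus with state-dependent edge weights looks like. The sketch assumes a Gaussian membership function and a simple discrete-time averaging rule; neither is taken from the paper, and the names `membership`, `consensus_step` and `simulate` are hypothetical. Agent failures, which the abstract also mentions, are not modelled.

```python
from math import exp


def membership(xi, xj, width=1.0):
    """Assumed Gaussian membership: weight in (0, 1] that decays as states diverge."""
    return exp(-((xi - xj) / width) ** 2)


def consensus_step(x, edges, step=0.2):
    """One synchronous update; each edge weight is recomputed from current states."""
    new = list(x)
    for i, j in edges:
        w = membership(x[i], x[j])
        new[i] += step * w * (x[j] - x[i])
        new[j] += step * w * (x[i] - x[j])
    return new


def simulate(x0, edges, steps=200):
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, edges)
    return x


# Three agents on a path graph 0-1-2 with initial states 0, 1 and 2.
final = simulate([0.0, 1.0, 2.0], [(0, 1), (1, 2)])
```

Because each edge update is symmetric, the average of the states is preserved, and in this toy run all three agents converge to the initial mean of 1.0.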

20.
李叶叶  李贺  沈旺  曹阳  涂敏 《情报科学》2022,39(2):65-73
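[Purpose/Significance] With the spread of online shopping, online reviews have become important data that influence the decisions of consumers, sellers, and producers. In the big data era, online reviews are multi-source, heterogeneous, and growing explosively, which makes it difficult for them to provide strong intelligence support for users' purchase decisions and for merchants' competition. [Method/Process] This paper uses multi-source heterogeneous online review data to build a knowledge graph and proposes a framework for constructing knowledge graphs from multi-source heterogeneous data. The schema layer is built around the source, content, and form of online reviews, ultimately forming the conceptual framework of the knowledge graph. word2vec is used to obtain entities, relations, and attributes from the multi-source heterogeneous text, followed by data fusion and knowledge graph analysis. [Results/Conclusions] Taking online reviews of mobile phones as an example, the experiments verify the effectiveness of the constructed knowledge graph for research on and mining of online reviews. The results reveal the characteristics of multi-source heterogeneous online review data and offer a new research perspective for organizing, presenting, and mining online review information in a big data environment. [Innovation/Limitations] Using a knowledge graph to describe online reviews effectively addresses problems such as information overload and the fusion of multi-source heterogeneous information. The knowledge graph in this paper is built semi-automatically; future work will consider introducing unsupervised methods to improve construction efficiency.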

