1.

石梦舒 韩雅萱 黄元生 刘敦楠 《科技管理研究》2023,(1):163-170

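To meet the power system's need for accurate short-term load forecasting, this paper takes the long short-term memory (LSTM) algorithm as its basis and improves it with a self-adaptive differential evolution algorithm, yielding a machine-learning hybrid algorithm (SaDE-LSTM) for short-term power load forecasting. The improved hybrid algorithm is tested on China's monthly social electricity load data for 2004 to 2018. The self-adaptive mutation and crossover factors of differential evolution are first used to optimize the initial parameters of the LSTM, and the LSTM is then trained with the parameters found by this search to obtain the optimized forecast. To demonstrate its advantage, the same data are also forecast with support vector machine (SVM), back-propagation neural network, and autoregressive integrated moving average algorithms. Comparing each method's forecasts with the actual values shows that SaDE-LSTM requires relatively little time-series data and achieves higher forecasting accuracy than the other, traditional algorithms. The improved algorithm can serve as a reference for entities that need high-accuracy forecasts from small samples, such as virtual power plants and load aggregators participating in power system dispatch.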
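The parameter search described in this abstract can be illustrated with a small sketch of differential evolution in which each individual carries its own mutation factor F and crossover rate CR. This is only a generic jDE-style scheme under stated assumptions, not the paper's SaDE-LSTM: the function names, the resampling probability of 0.1, and the `toy_loss` objective (a hypothetical stand-in for the validation error of an LSTM trained with the candidate parameters) are all assumptions made for illustration.

```python
import numpy as np


def self_adaptive_de(objective, bounds, pop_size=20, generations=100, seed=0):
    """Minimize `objective` with DE/rand/1/bin, adapting F and CR per individual."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in pop])
    F = np.full(pop_size, 0.5)   # per-individual mutation factors
    CR = np.full(pop_size, 0.9)  # per-individual crossover rates
    for _ in range(generations):
        for i in range(pop_size):
            # Occasionally resample the control parameters (assumed probability 0.1).
            Fi = 0.1 + 0.9 * rng.random() if rng.random() < 0.1 else F[i]
            CRi = rng.random() if rng.random() < 0.1 else CR[i]
            # Pick three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.choice(others, size=3, replace=False)
            mutant = np.clip(pop[a] + Fi * (pop[b] - pop[c]), low, high)
            # Binomial crossover; force at least one coordinate from the mutant.
            cross = rng.random(dim) < CRi
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            f_trial = objective(trial)
            # Greedy selection: parameters that produced a better trial survive.
            if f_trial <= fitness[i]:
                pop[i], fitness[i], F[i], CR[i] = trial, f_trial, Fi, CRi
    best = int(np.argmin(fitness))
    return pop[best], float(fitness[best])


def toy_loss(params):
    # Hypothetical stand-in for an LSTM validation error; minimum at (0.3, -1.2).
    target = np.array([0.3, -1.2])
    return float(np.sum((params - target) ** 2))


best_params, best_loss = self_adaptive_de(toy_loss, bounds=[(-5, 5), (-5, 5)])
```

In a real pipeline the objective would train and score a model for each candidate, which is far more expensive than this toy function, so population size and generation count would need to be chosen with that cost in mind.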

2.

Region-action LSTM for mouse interaction sequence based search satisfaction evaluation

《Information processing & management》2020,57(6):102349

Mouse interaction data contain a lot of interaction information between users and Search Engine Result Pages (SERPs), which can be useful for evaluating search satisfaction. Existing studies use aggregated features or anchor elements to capture the spatial information in mouse interaction data, which might lose valuable mouse cursor movement patterns for estimating search satisfaction. In this paper, we leverage regions together with actions to extract sequences from mouse interaction data. Using regions to capture the spatial information in mouse interaction data preserves more details of the interaction processes between users and SERPs. To model mouse interaction sequences for search satisfaction evaluation, we propose a novel LSTM unit called Region-Action LSTM (RALSTM), which can capture the interactive relations between regions and actions without subjecting the network to higher training complexity. In addition, we propose a data augmentation strategy, Multi-Factor Perturbation (MFP), to increase the pattern variations in mouse interaction sequences. We evaluate the proposed approach on open datasets. The experimental results show that the proposed approach achieves a significant performance improvement over the state-of-the-art search satisfaction evaluation approach.

3.

Forecasting movements of stock time series based on hidden state guided deep learning approach

《Information processing & management》2023,60(3):103328

Stock movement forecasting is usually formalized as a sequence prediction task based on time series data. Recently, more and more deep learning models with good nonlinear mapping ability have been used to fit dynamic stock time series, but few of them attempt to unveil a market system’s internal dynamics. For instance, the driving force (state) behind a stock's rise may be the company’s good profitability or concept marketing, and knowing it helps in judging the stock's future trend. To address this issue, we regard the explored pattern as an organic component of the hidden mechanism. Considering the effective hidden state discovery ability of the Hidden Markov Model (HMM), we aim to integrate it into the training process of the deep learning model. Specifically, we propose a deep learning framework called Hidden Markov Model-Attentive LSTM (HMM-ALSTM) to model stock time series data, which guides the hidden state learning of deep learning methods via the market’s pattern (learned by the HMM) that generates the time series data. In addition, extensive experiments on 6 real-world data sets against 13 stock prediction baselines are carried out for predicting stock movement and return rate. Our proposed HMM-ALSTM achieves an average 10% improvement on all data sets compared to the best baseline.

4.

Real-time big data processing for anomaly detection: A Survey

《International Journal of Information Management》2019

The advent of connected devices and the omnipresence of the Internet have paved the way for intruders to attack networks, which leads to cyber-attacks, financial loss, information theft in healthcare, and cyber war. Hence, network security analytics has become an important area of concern and has of late gained intensive attention among researchers, specifically in the domain of network anomaly detection, which is considered crucial for network security. However, preliminary investigations have revealed that existing approaches to detecting anomalies in networks are not effective enough, particularly for detecting them in real time. The inefficacy of current approaches is mainly due to the massive volumes of data amassed through connected devices. Therefore, it is crucial to propose a framework that effectively handles real-time big data processing and detects anomalies in networks. In this regard, this paper attempts to address the issue of detecting anomalies in real time. Accordingly, it surveys the state-of-the-art real-time big data processing technologies related to anomaly detection and the vital characteristics of the associated machine learning algorithms. The paper begins with an explanation of the essential contexts and a taxonomy of real-time big data processing, anomaly detection, and machine learning algorithms, followed by a review of big data processing technologies. Finally, the identified research challenges of real-time big data processing in anomaly detection are discussed.

5.

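Comparison and Analysis of Research Hotspot Trend Prediction Models Based on Machine Learning Algorithms: BP Neural Network, Support Vector Machine, and LSTM Model (total citations: 2; self-citations: 0; citations by others: 2)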

李静 徐路路 《现代情报》2019,39(4):23-33

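[Purpose/Significance] To analyze, at fine granularity, how hot topics in a disciplinary field develop, and to use machine learning algorithms to predict their future trends accurately. [Method/Process] A research hotspot trend prediction method and analysis framework based on machine learning algorithms is proposed. Taking genetic engineering as an example, a probabilistic topic model is used to identify hot research topics in the abstracts of papers in the WOS Core Collection and to construct topic evolution links. Three typical machine learning algorithms (BP neural network, support vector machine, and LSTM) are then selected for prediction. Finally, the RE indicator and a precision indicator are used to evaluate the prediction performance of the algorithms, and research trends of genetic engineering in areas such as medicine and health and agriculture and food are analyzed. [Result/Conclusion] Experiments show that the LSTM model predicts the future trends of hot topics most accurately, followed by the support vector machine, while the BP neural network performs poorly and lacks prediction stability. Combined with expert consultation and a literature survey, the results indicate that the proposed method can quickly identify research topics and development trends in the genetics field and can provide decision support and reference for assessing disciplinary trends and adjusting disciplinary structure in China.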

6.

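Forecasting Innovation and Entrepreneurship Development and Comparing Regional Development Based on an LSTM Neural Network: The Case of Shaanxi and Sichuan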

查博《科技管理研究》2021,41(19):76-85

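To make forecast data serve innovation and entrepreneurship better, this paper proposes a prediction model built around an LSTM neural network and trains it on data related to innovation and entrepreneurship development. Indicators reflecting the level of innovation and entrepreneurship development are forecast, and comparison with a traditional regression model and a BP neural network model shows that the LSTM model performs better. On this basis, a comparison of innovation and entrepreneurship development in Shaanxi and Sichuan, two important western provinces, shows that the two have similar overall levels of development but differ in specific details and emphasis. Sichuan has larger and better basic inputs, whereas Shaanxi has a higher level of technological output. The efficiency of innovation and entrepreneurship development is therefore higher in Shaanxi than in Sichuan, and this higher efficiency compensates for the smaller quantity of human and material inputs.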

7.

Time-series classification with SAFE: Simple and fast segmented word embedding-based neural time series classifier

《Information processing & management》2022,59(5):103044

Dictionary-based classifiers are an essential group of approaches in the field of time series classification. Their distinctive characteristic is that they transform time series into segments made of symbols (words) and then classify time series using these words. Dictionary-based approaches are suitable for datasets containing time series of unequal length. The prevalence of dictionary-based methods inspired the research in this paper. We propose a new dictionary-based classifier called SAFE. The new approach transforms the raw numeric data into a symbolic representation using the Simple Symbolic Aggregate approXimation (SAX) method. We then partition the symbolic time series into a sequence of words and employ a word embedding neural model known from Natural Language Processing to train the classifying mechanism. The proposed scheme was applied to classify 30 benchmark datasets and compared with a range of state-of-the-art time series classifiers. The name SAFE comes from our observation that this method is safe to use. Empirical experiments have shown that SAFE gives excellent results: it is always in the top 5%-10% when we rank the classification accuracy of state-of-the-art algorithms for various datasets. Our method ranks third in the list of state-of-the-art dictionary-based approaches (after the WEASEL and BOSS methods).

8.

Multi-view clustering via spectral partitioning and local refinement

《Information processing & management》2016,52(4):618-627

Cluster analysis using multiple representations of data is known as multi-view clustering and has attracted much attention in recent years. The major drawback of existing multi-view algorithms is that their clustering performance depends heavily on hyperparameters that are difficult to set. In this paper, we propose the Multi-View Normalized Cuts (MVNC) approach, a two-step algorithm for multi-view clustering. In the first step, an initial partitioning is performed using a spectral technique. In the second step, a local search procedure is used to refine the initial clustering. MVNC has been evaluated and compared to state-of-the-art multi-view clustering approaches using three real-world datasets. Experimental results have shown that MVNC significantly outperforms existing algorithms in terms of clustering quality and computational efficiency. In addition to its superior performance, MVNC is parameter-free, which makes it easy to use.
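For readers unfamiliar with the spectral step, the sketch below shows the standard single-view relaxation of the normalized cut: the graph is split by the sign of the second eigenvector of the symmetric normalized Laplacian. It is an assumption-laden illustration only. MVNC itself combines several views and adds a local refinement step, neither of which is shown, and the function name and toy graph are invented here.

```python
import numpy as np


def spectral_bipartition(W):
    """Split a connected weighted graph in two using the normalized-cut relaxation."""
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the second smallest.
    _, eigvecs = np.linalg.eigh(L_sym)
    # Map back to the generalized eigenvector of (D - W) y = lambda D y.
    y = d_inv_sqrt * eigvecs[:, 1]
    return (y > 0).astype(int)


# Toy graph: two triangles joined by one weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.1

labels = spectral_bipartition(W)
```

Which triangle receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful.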

9.

Real-world model for bitcoin price prediction

《Information processing & management》2022,59(4):102968

Cryptocurrency is a new kind of digital asset that has evolved as a result of advances in financial technology, and it has provided a significant research opportunity. Many algorithms, such as LSTM and ARIMA, are used for cryptocurrency price prediction. However, LSTM-based RNNs are difficult to interpret, and gaining intuition into their behavior is hard. Producing decent results also requires rigorous hyperparameter tuning. Furthermore, cryptocurrencies do not closely follow past data, and patterns change quickly, which reduces the accuracy of predictions. Cryptocurrency price forecasting is difficult because of price volatility and dynamism. Because the data are dynamic and heavily influenced by seasonality, the ARIMA model is unable to handle them. A new model is therefore required to provide better price predictions for crypto traders. The objective of the study is to apply the Fbprophet model as the key model, because it is superior in functionality to LSTM and ARIMA and removes the pitfalls that arise with the LSTM and ARIMA models when analyzing cryptocurrency data. This study provides a methodology for predicting the future price of bitcoin that does not rely solely on past data, owing to the seasonality in historical data. After the seasonality is fitted and the data are smoothed, a model is constructed that can be useful for real-world use cases. For cryptocurrencies, where little historical data is available and patterns are hard to find, the proposed method can readily handle this type of problem. The overall difference between predicted and actual values is low compared with other models, even when seasonal data was available.

10.

Towards a real-time processing framework based on improved distributed recurrent neural network variants with fastText for social big data analytics

《Information processing & management》2020,57(1):102122

Big data generated by social media is a valuable source of information and offers an excellent opportunity to mine valuable insights. In particular, user-generated content such as reviews, recommendations, and users’ behavior data is useful for supporting several marketing activities of many companies. Knowing what users are saying in social media reviews about the products they bought or the services they used is a key factor in making decisions. Sentiment analysis is one of the fundamental tasks in Natural Language Processing. Although deep learning for sentiment analysis has achieved great success and allowed several firms to analyze and extract relevant information from their textual data, a model that runs in a traditional environment cannot remain effective as the volume of data grows, which implies the importance of efficient distributed deep learning models for social big data analytics. Besides, social media analysis is known to be a complex process that involves a set of complex tasks. Therefore, it is important to address the challenges and issues of social big data analytics and to enhance the classification accuracy of deep learning techniques in order to obtain better decisions. In this paper, we propose an approach for sentiment analysis that adopts fastText with Recurrent Neural Network variants to represent textual data efficiently. It then employs the new representations to perform the classification task. Its main objective is to enhance the classification accuracy of well-known Recurrent Neural Network (RNN) variants and to handle large-scale data. In addition, we propose a distributed intelligent system for real-time social big data analytics. It is designed to ingest, store, process, index, and visualize huge amounts of information in real time.
The proposed system adopts distributed machine learning with our proposed method for enhancing decision-making processes. Extensive experiments conducted on two benchmark data sets demonstrate that our proposal for sentiment analysis outperforms well-known distributed recurrent neural network variants (i.e., Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and Gated Recurrent Unit (GRU)). Specifically, we tested the efficiency of our approach using the three different deep learning models. The results show that our proposed approach is able to enhance the performance of the three models. The current work can provide several benefits for researchers and practitioners who want to collect, handle, analyze, and visualize several sources of information in real time. It can also contribute to a better understanding of public opinion and user behaviors using our proposed system with the improved variants of the most powerful distributed deep learning and machine learning algorithms. Furthermore, it is able to increase the classification accuracy of several existing works based on RNN models for sentiment analysis.

11.

An Exploration of Hybrid Forecasting Methods for Chaotic Time Series

吕瑞华 王卫亚 (Lü Rui-hu, WANG Wei-ya) 《中国软科学》2006,(2):150-154

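Based on Kolmogorov's continuity theorem, this paper builds a chaos-neural network (C-ANN) forecasting model; proposes a chaotic forecasting model and method based on a genetic algorithm and a neural network (the C-ANN-GA hybrid forecasting method); solves the non-analytic forecasting problem for chaotic time series; and thereby improves and extends forecasting methods for chaotic time series.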

12.

Time Series Forecasting Based on Independent Component Analysis and Wavelet Neural Networks

吴焱《中国科技纵横》2014,(8):27-28,31

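Time series forecasting is an important data mining technique. This paper combines independent component analysis with a wavelet neural network to build an ICA-WNN forecasting model and applies it to forecasting wind power generation time series. Simulation results show that the model generalizes well and achieves high forecasting accuracy.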

13.

Temporal burstiness and collaborative camouflage aware fraud detection

《Information processing & management》2023,60(2):103170

With the prosperity and development of the digital economy, many fraudsters have emerged on e-commerce platforms who fabricate fraudulent reviews to mislead consumers’ shopping decisions for profit. Moreover, in order to evade fraud detection, fraudsters continue to evolve and now exhibit adversarial camouflage and collaborative attacks. In this paper, we propose a novel temporal burstiness and collaborative camouflage aware method (TBCCA) for fraudster detection. Specifically, we capture the hidden temporal burstiness features behind the camouflage strategy with a time series prediction model, and identify highly suspicious target products by assigning suspicion scores as node priors. Meanwhile, a propagation graph integrating review collusion is constructed, and an iterative fraud confidence propagation algorithm based on Loop Belief Propagation (LBP) is designed for inferring the labels of nodes in the graph. Comprehensive experiments are conducted to compare TBCCA with state-of-the-art fraudster detection approaches, and the experimental results show that TBCCA can effectively identify fraudsters in real review networks, achieving a 6%-10% performance improvement over other baselines.

14.

Fault detection and isolation of multi-variate time series data using spectral weighted graph auto-encoders

《Journal of The Franklin Institute》2023,360(10):6783-6803

Fault or anomaly detection is one of the key problems faced by the chemical process industry in achieving safe and reliable operation. In this study, a novel methodology, the spectral weighted graph autoencoder (SWGAE), is proposed, in which the problem of anomaly detection is addressed with the help of graphs. The proposed approach entails the following key steps. First, a spectral weighted graph is constructed, where each time step of a process variable in the multivariate time series dataset is modelled as a node in an appropriately tuned moving window. Subsequently, we propose to monitor the weights of the edges between two connected nodes. Faulty instances are identified based on the discrepancy between their weight pattern and that of normal operating data. To this end, once the weights are determined, they are fed to the auto-encoder network, where the reconstruction loss is calculated, and faults are identified if the reconstruction loss exceeds a threshold. Further, to make the proposed approach comprehensive, a fault isolation methodology is also proposed to identify the faulty nodes once the faulty variables are identified. The proposed approach is validated using Tennessee-Eastman benchmark data and real-time plant data from a pressurized heavy water nuclear reactor. The results indicate that the SWGAE method, compared to other state-of-the-art methods, yields more accurate results in detecting and isolating faulty nodes.

15.

Estimation fusion of nonlinear cost functions with application to multisensory Kalman filtering

Il Young Song Vladimir Shin Seokhyoung Lee Won Choi 《Journal of The Franklin Institute》2014

This paper focuses on four fusion algorithms for the estimation of a nonlinear cost function (NCF) in a multisensory environment. In multisensory filtering and control problems, the NCF represents a nonlinear multivariate functional of state variables, which can indicate useful information about the target systems for automatic control. To estimate the NCF using multisensory information, we propose one centralized and three decentralized estimation fusion algorithms. For multivariate polynomial NCFs, we propose a simple closed-form computation procedure. For general NCFs, the most popular procedure for the evaluation of their estimates is based on the unscented transformation. The effectiveness and estimation accuracy of the proposed fusion algorithms are demonstrated with theoretical and numerical examples.
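The contrast between a closed-form estimate for a polynomial cost and an unscented-transformation estimate for a general cost can be made concrete with a quadratic example. The sketch below assumes a Gaussian state estimate with mean `m` and covariance `P` and a hypothetical quadratic cost x^T A x, whose expectation is m^T A m + tr(A P). It illustrates only these two evaluation procedures, not the paper's fusion algorithms, and all names and numbers are assumptions for illustration.

```python
import numpy as np


def quadratic_mean_closed_form(m, P, A):
    """E[x^T A x] for x with mean m and covariance P: m^T A m + trace(A P)."""
    return float(m @ A @ m + np.trace(A @ P))


def unscented_mean(f, m, P, kappa=1.0):
    """Estimate E[f(x)] with the basic unscented transformation (2n + 1 points)."""
    n = m.size
    # Columns of the Cholesky factor of (n + kappa) P are the sigma-point offsets.
    S = np.linalg.cholesky((n + kappa) * P)
    points = [m]
    points += [m + S[:, i] for i in range(n)]
    points += [m - S[:, i] for i in range(n)]
    weights = [kappa / (n + kappa)] + [1.0 / (2.0 * (n + kappa))] * (2 * n)
    return float(sum(w * f(x) for w, x in zip(weights, points)))


m = np.array([1.0, -2.0])                 # assumed state estimate
P = np.array([[0.5, 0.1], [0.1, 0.3]])    # assumed error covariance
A = np.array([[2.0, 0.5], [0.5, 1.0]])    # hypothetical cost weights

exact = quadratic_mean_closed_form(m, P, A)
approx = unscented_mean(lambda x: x @ A @ x, m, P)
```

For a quadratic cost the two values coincide up to rounding, because the sigma points match the mean and covariance exactly; for higher-order or non-polynomial costs the unscented estimate is only an approximation.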

16.

LA-MGFM: A legal judgment prediction method via sememe-enhanced graph neural networks and multi-graph fusion mechanism

《Information processing & management》2023,60(5):103455


17.

Unsupervised graph-based rank aggregation for improved retrieval

《Information processing & management》2019,56(4):1260-1279

This paper presents a robust and comprehensive graph-based rank aggregation approach, used to combine the results of isolated ranker models in retrieval tasks. The method follows an unsupervised scheme, which is independent of how the isolated ranks are formulated. Our approach is able to combine arbitrary models, defined in terms of different ranking criteria, such as those based on textual, image, or hybrid content representations. We reformulate the ad-hoc retrieval problem as document retrieval based on fusion graphs, which we propose as a new unified representation model capable of merging multiple ranks and expressing inter-relationships of retrieval results automatically. By doing so, we claim that the retrieval system can benefit from learning the manifold structure of datasets, thus leading to more effective results. Another contribution is that our graph-based aggregation formulation, unlike existing approaches, allows for encapsulating contextual information encoded from multiple ranks, which can be used directly for ranking, without further computations or post-processing steps over the graphs. Based on the graphs, a novel similarity retrieval score is formulated using an efficient computation of minimum common subgraphs. Finally, another benefit over existing approaches is the absence of hyperparameters. A comprehensive experimental evaluation was conducted on diverse well-known public datasets composed of textual, image, and multimodal documents. The experiments demonstrate that our method reaches top performance, yielding better effectiveness scores than state-of-the-art baseline methods and large gains over the rankers being fused, thus demonstrating the capability of the proposal to represent queries with a unified graph-based model of rank fusions.

18.

Computing controversy: Formal model and algorithms for detecting controversy on Wikipedia and in search queries

Kazimierz Zielinski Radoslaw Nielek Adam Wierzbicki Adam Jatowt 《Information processing & management》2018,54(1):14-36

Controversy is a complex concept that has been attracting the attention of scholars from diverse fields. In the era of the Internet and social media, detecting controversy and controversial concepts by means of automatic methods is especially important. Web searchers could be alerted when the content they consume is controversial or when they attempt to acquire information on disputed topics. Presenting users with indications and explanations of the controversy should offer them a chance to see the “wider picture” rather than letting them obtain one-sided views. In this work we first introduce a formal model of controversy as the basis of computational approaches to detecting controversial concepts. Then we propose a classification-based method for automatic detection of controversial articles and categories in Wikipedia. Next, we demonstrate how to use the obtained results to estimate the controversy level of search queries. The proposed method can be incorporated into search engines as a component responsible for detecting queries related to controversial topics. The method is independent of the search engine’s retrieval and search results recommendation algorithms, and is therefore unaffected by a possible filter bubble. Our approach can also be applied in Wikipedia or other knowledge bases to support the detection of controversy and content maintenance. Finally, we believe that our results could be useful to social science researchers for understanding the complex nature of controversy and for fostering their studies.

19.

Dependence measures for model selection in singular spectrum analysis

《Journal of The Franklin Institute》2019,356(15):8906-8928

Selection of the optimal dimension of the trajectory matrix in singular spectrum analysis plays an important role in signal reconstruction from noisy time series. A noisy time series is embedded into a Hankel matrix, and the dimension of this matrix depends on the window length considered for the time series. The window length required for a time series depends on its underlying data generating mechanism. Since the number of columns in a Hankel structured trajectory matrix is a function of the number of rows (the window length), dimension dependency occurs naturally in the trajectory matrix, and this dependency is characterized by the statistical properties of the time series. In this paper, we develop an entropy-based dimension dependency measure that accounts for changes in the information content of the matrix in response to changes in window length for a time series. We examine the performance of this measure using simulation experiments and by analyzing real data sets. Results obtained from simulation experiments show that the dimension dependency measure finds a reasonably meaningful dimension of the trajectory matrix and provides better forecasting outcomes when applied to some popular climatic time series and production indices.
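The embedding step described in this abstract, and the way the column count follows from the window length, can be shown with a short sketch. It covers only the construction of the Hankel trajectory matrix; the entropy-based dependency measure is not reproduced, and the function name `trajectory_matrix` is an assumption for illustration.

```python
import numpy as np


def trajectory_matrix(series, window):
    """Embed a series of length N into a (window x K) Hankel matrix, K = N - window + 1."""
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(series)")
    k = n - window + 1
    # Column j holds the lagged vector x[j], ..., x[j + window - 1].
    return np.column_stack([x[j:j + window] for j in range(k)])


X = trajectory_matrix([1, 2, 3, 4, 5, 6], window=3)
# X has 3 rows and 6 - 3 + 1 = 4 columns; each anti-diagonal is constant.
```

Changing `window` changes both dimensions at once, which is the dependency between rows and columns that the abstract refers to.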

20.

Decision Problems under Two VMI Modes in Supply Chains with Different Power Structures

秦娟娟《软科学》2011,25(5):41-46

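Using game theory, this paper analyzes members' decisions under two VMI modes in supply chains with two power structures, balanced power and unbalanced power. In the VMI- mode the retailer bears the inventory holding cost; in the VMI+ mode the supplier bears it. Starting from the VMI operating mode, the paper first analyzes members' decisions in supply chains with different power structures under the VMI- and VMI+ modes, and then compares, for a given power structure, which VMI mode is optimal for the members of the chain.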