Similar literature
20 similar articles found (search time: 718 ms)
1.
In recent years, sparse subspace clustering (SSC) has shown clear advantages in the field of subspace clustering. In general, SSC first learns a representation matrix of the data through self-expression and then constructs an affinity matrix from the resulting sparse representation. Finally, the clustering result is obtained by applying spectral clustering to the affinity matrix. Existing SSC algorithms therefore often learn the sparse representation and the affinity matrix separately, and because the two steps are independent, the clustering result may not be optimal. To this end, we propose a novel clustering algorithm that learns the representation and the affinity matrix jointly. The proposed method learns the sparse representation and the affinity matrix in a unified framework, using a graph regularizer derived from the affinity matrix. Experimental results show that the proposed method achieves better clustering results than other subspace clustering approaches.
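
The conventional two-stage pipeline described in this abstract builds the affinity matrix from an already learned representation matrix. A minimal sketch of that one step, assuming the common symmetrization W = |C| + |C|^T, is given below. The function name and the toy coefficient values are hypothetical, and this is not the joint method the abstract proposes.

```python
def affinity_from_representation(C):
    """Symmetrize a self-expressive coefficient matrix: W = |C| + |C|^T."""
    n = len(C)
    return [[abs(C[i][j]) + abs(C[j][i]) for j in range(n)] for i in range(n)]


# Toy coefficient matrix (hypothetical values). The diagonal is zero
# because a point is not allowed to represent itself.
C = [
    [0.0, 0.8, 0.0],
    [0.5, 0.0, -0.1],
    [0.0, 0.2, 0.0],
]
W = affinity_from_representation(C)
```

In the two-stage setting, spectral clustering would then be run on W. The point of the abstract is that learning C and W separately like this can be suboptimal.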

2.
High-resolution probabilistic load forecasting can comprehensively characterize both the uncertainties and the dynamic trends of the future load. Such information is key to the reliable operation of the future power grid with a high penetration of renewables. To this end, various high-resolution probabilistic load forecasting models have been proposed in recent decades. It is widely acknowledged that combining different models, known as model ensemble, can enhance prediction performance beyond that of a single model. However, existing model ensemble approaches for load forecasting are based on linear combinations, such as mean value ensemble, weighted average ensemble, and quantile regression. Linear combinations may not fully utilize the advantages of the different models, which seriously limits the performance of the model ensemble. We propose a learning ensemble approach that adopts a machine learning model to learn the optimal nonlinear combination directly from data. We theoretically demonstrate that the proposed learning ensemble approach can outperform conventional ensemble approaches. Based on the proposed learning ensemble model, we also introduce a Shapley value-based method to evaluate the contribution of each model to the model ensemble. Numerical studies on field load data verify the remarkable performance of the proposed approach.
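
The linear baselines named in this abstract can be sketched in a few lines. The example below shows a mean value ensemble, a weighted average ensemble, and the pinball (quantile) loss commonly used to score quantile forecasts. All function names and numbers are illustrative assumptions, and the nonlinear learning ensemble proposed in the abstract is not reproduced here.

```python
def mean_ensemble(forecasts):
    """Average the member forecasts element-wise (one list per model)."""
    n_models = len(forecasts)
    return [sum(values) / n_models for values in zip(*forecasts)]


def weighted_ensemble(forecasts, weights):
    """Combine member forecasts with fixed weights (one weight per model)."""
    return [sum(w * v for w, v in zip(weights, values)) for values in zip(*forecasts)]


def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss of predictions y_pred for quantile level q."""
    losses = [max(q * (y - p), (q - 1) * (y - p)) for y, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Two hypothetical models, each forecasting two time steps.
members = [[1.0, 2.0], [3.0, 4.0]]
avg = mean_ensemble(members)
wavg = weighted_ensemble(members, [0.25, 0.75])
```

Both combinations are linear in the member forecasts, which is the limitation the abstract argues a learned nonlinear combiner can overcome.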

3.
In practical applications, the class distribution of imbalanced data sets is skewed. Because traditional clustering methods are mainly designed to improve overall learning performance, the majority class tends to be clustered well while the minority class, which is often more valuable, may be ignored. Moreover, the performance of existing clustering methods can be limited in imbalanced and high-dimensional domains. In this paper, we present one-step spectral rotation clustering for imbalanced high-dimensional data (OSRCIH), which integrates self-paced learning and spectral rotation clustering in a unified learning framework where sample selection and dimensionality reduction are considered simultaneously and updated mutually and iteratively. Specifically, the imbalance problem is handled by selecting the same number of training samples from each intrinsic group of the training data, where the sample-weight vector is obtained by self-paced learning. Dimensionality reduction is conducted by combining subspace learning and feature selection. Experimental analysis on synthetic and real datasets shows that OSRCIH can recognize important samples and features and increase their weights, which prevents the clustering method from favoring the majority class and effectively improves clustering performance.

4.
The Hammerstein–Wiener model is a nonlinear system with three blocks, in which a dynamic linear block is sandwiched between two static nonlinear blocks. For parameter learning of the Hammerstein–Wiener model, synchronous parameter learning methods, such as the over-parameterization method, the subspace method and the maximum likelihood method, learn the model parameters by constructing a hybrid model of the three series blocks. These methods, however, produce product terms of the model parameters during learning, so a parameter separation method must then be applied to separate the hybrid parameters, which increases the complexity of parameter learning. To address this issue, this paper develops a novel three-stage parameter learning method that uses combined signals for the neuro-fuzzy based Hammerstein–Wiener model corrupted by process noise. The combined signals are designed to completely separate the parameter learning problems of the static input nonlinear block, the linear dynamic block and the static output nonlinear block, which effectively simplifies parameter learning for the Hammerstein–Wiener model. Parameter learning proceeds in three stages. First, the parameters of the static output nonlinear block are learned using two sets of separable signals with different sizes. Second, the parameters of the linear dynamic block are estimated by means of the correlation analysis method, which effectively handles the problem of unmeasurable intermediate variables. Finally, the parameters of the static input nonlinear block and the moving average noise model are determined using a recursive extended least squares scheme. Simulation results illustrate that the proposed learning approach yields high learning accuracy and good robustness for the Hammerstein–Wiener model corrupted by process noise.

5.
Specific emitter identification (SEI), an important problem in situational awareness, identifies emitters via their unique characteristics. However, most current SEI methods struggle to set an appropriate trade-off between comprehensiveness and efficiency when extracting fingerprint features. To address this issue, this paper provides a novel SEI framework with a separate representation module. Within the framework, manifolds are proposed as signal representations, and multi-level manifold features are extracted as fingerprint features. We first build the SEI model from the nonlinear dynamics perspective, where the SEI process identifies nonlinear systems via a measurement sequence. Then, we demonstrate that manifolds can represent emitters equivalently and prove the one-to-one correspondence between manifolds and individual emitters. Hence, manifolds can highlight unique nonlinear dynamic characteristics and simultaneously describe the complete working process of the system. The coordinate delay technique and manifold learning methods are employed to reconstruct the phase space and the manifold, respectively. To accomplish the identification task, multi-level manifold features, comprising intrinsic dimension, topological features, conformal features, and Riemannian metric features, are extracted from the reconstructed manifolds and fed into the AdaBoost ensemble learning scheme. Extensive simulations and real-world experiments agree with our analytical conclusions and confirm the efficiency of the proposed method. The results also demonstrate that the proposed method achieves high recognition accuracy, outstanding adaptability, and strong robustness.
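
Phase space reconstruction by coordinate delay, as mentioned in this abstract, is usually done with a time-delay embedding. A minimal sketch follows, assuming a scalar measurement sequence. The function name and the parameters `dim` (embedding dimension) and `tau` (delay in samples) are illustrative and are not taken from the paper.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for the requested dim and tau")
    return [[series[i + k * tau] for k in range(dim)] for i in range(n_vectors)]


# A short toy sequence embedded in 3 dimensions with a delay of 2 samples.
points = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
```

Each row of `points` is one point in the reconstructed phase space. A manifold learning method would then operate on these points.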

6.
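CLIQUE is a hybrid clustering method that is both density-based and grid-based. It clusters effectively in high-dimensional spaces and can discover clusters embedded in subspaces of a high-dimensional data space. However, the CLIQUE algorithm has many limitations, chiefly two: first, its subspace pruning; second, its pursuit of methodological simplicity. To address these limitations, constraint-based clustering, adaptive grid, and boundary adjustment techniques are used to improve CLIQUE, and the CAG-CLIQUE algorithm, based on constraints and adaptive grids, is proposed.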

7.
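Faced with the massive multidimensional data in power systems, traditional visual data mining cannot meet the needs of spatial data processing, and multidimensional data visualization does not help users acquire knowledge. A new model for visual data mining of power grids, VSDMmodel, based on SOM (self-organizing feature map) clustering is therefore proposed. The model uses an improved SOM clustering algorithm to reduce the dimensionality of high-dimensional grid data and introduces a color-mapping-based visualization method to present the clustering results in low dimensions. This speeds up users' understanding of the mining results and lets users analyze regions of interest in greater depth, enabling visual mining of massive power system data.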

8.
Nonlinear characteristics are widespread in industrial processes. Many approaches based on kernel methods and machine learning have been developed for nonlinear process monitoring. However, fault isolation for nonlinear processes has rarely been studied in previous work. In this paper, a process monitoring and fault isolation framework based on the variational autoencoder (VAE) model is proposed for nonlinear processes. First, based on the probabilistic graphical model of the VAE, a uniform monitoring index is calculated from the probability density of the observation variables. Then, the fault variables are estimated from the normal variables by a missing value estimation method. The optimal fault variable set is searched for with the branch and bound (BAB) algorithm. The proposed method can resolve the "smearing effect" problem found in traditional fault isolation methods. Finally, a numerical case and a hot strip mill process case are used to verify the proposed method.

9.
Similarity search with hashing has become one of the fundamental research topics in computer vision and multimedia. Current research on semantic-preserving hashing mainly focuses on exploring the semantic similarities between pointwise or pairwise samples in the visual space to generate discriminative hash codes. However, such learning schemes fail to explore the intrinsic latent features embedded in the high-dimensional feature space and have difficulty capturing the underlying topological structure of the data, yielding low-quality hash codes for image retrieval. In this paper, we propose an ordinal-preserving latent graph hashing (OLGH) method, which derives the objective hash codes from the latent space and preserves the high-order local topological structure of the data in the learned hash codes. Specifically, we design a triplet-constrained topology-preserving loss to uncover the ordinal-inferred local features in binary representation learning. With this loss, the learning system can implicitly capture the high-order similarities among samples during feature learning. Moreover, latent subspace learning is built on sparse-constrained supervised learning to acquire noise-free latent features, so that the latent, under-explored characteristics of the data are fully employed in subspace construction. Furthermore, the latent ordinal graph hashing is formulated by jointly exploiting latent space construction and ordinal graph learning. An efficient optimization algorithm is developed to solve the resulting problem. Extensive experiments conducted on diverse datasets show the effectiveness and superiority of the proposed method compared with several advanced learning-to-hash algorithms for fast image retrieval. The source code of this paper is available at https://github.com/DarrenZZhang/OLGH .

10.
Deep multi-view clustering (MVC) aims to mine and employ the complex relationships among views to learn compact data clusters with deep neural networks in an unsupervised manner. Recent deep contrastive learning (CL) methods have shown promising performance in MVC by learning cluster-oriented deep feature representations, which is realized by contrasting positive and negative sample pairs. However, most existing deep contrastive MVC methods focus on only one side of contrastive learning, such as feature-level or cluster-level contrast, and fail to integrate the two sides or bring in other important aspects of contrast. Additionally, most of them work in a separate two-stage manner, i.e., feature learning first and data clustering second, so the two stages cannot benefit each other. To address these challenges, in this paper we propose a novel joint contrastive triple-learning framework to learn multi-view discriminative feature representations for deep clustering. The framework is threefold: feature-level alignment-oriented CL, feature-level commonality-oriented CL, and cluster-level consistency-oriented CL. The first two submodules contrast the encoded feature representations of data samples at different feature levels, while the last contrasts the data samples in the cluster-level representations. Benefiting from the triple contrast, more discriminative representations of the views can be obtained. Meanwhile, a view weight learning module is designed to learn and exploit the quantitative complementary information across the learned discriminative features of each view. The contrastive triple-learning module, the view weight learning module and the data clustering module with these fused features are performed jointly, so that the modules are mutually beneficial. Extensive experiments on several challenging multi-view datasets show the superiority of the proposed method over many state-of-the-art methods, notably large accuracy improvements of 15.5% and 8.1% on Caltech-4V and CCV. Owing to its promising performance on visual datasets, the proposed method can be applied to many practical visual applications such as visual recognition and analysis. The source code of the proposed method is provided at https://github.com/ShizheHu/Joint-Contrastive-Triple-learning.

11.
This study employs our proposed semi-supervised clustering method, called Constrained-PLSA, to cluster tagged documents with a small number of labeled documents, and uses two data sets to evaluate system performance. The first data set is a document set whose cluster boundaries are not clear, while the second has clear boundaries among clusters. The study uses the abstracts of papers and the tags annotated by users to cluster documents. Four combinations of tags and words are used for feature representation. The experimental results indicate that almost all of the methods benefit from tags. However, unsupervised learning methods fail to function properly on the data set with noisy information, whereas Constrained-PLSA functions properly. In many real applications, background knowledge is readily available, which makes it appropriate to employ that knowledge in the clustering process so that learning is faster and more effective.

12.
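For shape features, a multi-class automatic object recognition method based on an active boundary primitive model is proposed. The method builds a dictionary from active boundary primitives, which accurately describes the shape structure of each object class and is unaffected by changes in scale, rotation, and the like. Contextual information is then analyzed comprehensively for probabilistic learning, and a cascade framework with Bootstrap dynamic sampling is used to train an optimal boundary classifier, which recognizes object classes, localizes objects, and recovers precise shapes. Experimental results show that the method effectively extracts objects of many types and with complex structures and has strong practical value.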

13.
As the number of data instances increases, data processing operations (e.g. clustering) require an increasing amount of computational resources, and for considerably large datasets such operations often cannot be executed on a single workstation. This requires the use of a server computer to carry out the operations. However, to ensure the privacy of the shared data, a privacy-preserving data processing workflow applies an encoding transformation to the set of data points before the computation. This encoding should ideally meet two objectives: first, it should be difficult to reconstruct the data; second, the results of the operation executed in the encoded space should be as close as possible to the results of the same operation executed on the original data. While standard encoding mechanisms, such as locality sensitive hashing, cater to the first objective, the second objective may not always be adequately satisfied. In this paper, we specifically focus on clustering as the data processing operation. We apply a deep metric learning approach to learn a parameterized encoding transformation function with the objective of maximizing the alignment of the clusters in the encoded space with those in the original data. We conduct experiments on four standard benchmark datasets: MNIST and Fashion-MNIST (each containing 70K grayscale images), CIFAR-10, consisting of 60K color images, and 20-Newsgroups, containing 18K news articles. Our experiments demonstrate that the proposed method yields better clusters than approaches in which the encoding process is agnostic of the clustering objective.
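
Locality sensitive hashing, named in this abstract as a standard encoding, can be illustrated with random-hyperplane (sign random projection) hashing. The sketch below is one common LSH variant chosen for illustration. The function names, bit count and seed are assumptions, and it is not the learned encoder that the abstract proposes.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_code(vector, hyperplanes):
    """Encode a vector as one bit per hyperplane: which side it falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


planes = make_hyperplanes(n_bits=8, dim=3)
code = lsh_code([1.0, 2.0, 3.0], planes)
```

Only the sign of each projection is kept, so the original coordinates are hard to recover, while vectors separated by a small angle tend to share most bits. The encoding is fixed in advance and knows nothing about the clustering objective, which is the gap the abstract targets.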

14.
This paper presents a robust fault prediction method based on subspace identification, which combines optimal tracking control with adaptive neural network compensation, for predicting faults of unknown nonlinear systems. First, a local approximate linear model based on the input-output data of the unknown system is obtained by subspace identification. Optimal tracking control is adopted for the approximate model with unknown uncertainties and external disturbances. An adaptive RBF neural network is added to the tracking control to guarantee the robust tracking ability of the observation system. The effects of the system nonlinearity and of the error caused by subspace modeling are overcome by adaptively tuning the weights of the RBF neural network online, without requiring any constraint or matching conditions. The stability of the designed closed-loop system is proved. A density function estimation method based on state forecasting is then used to judge the fault. The proposed method is applied to fault prediction for the F-8II fighter of the Chinese air force, whose model is unknown, and the simulation results show that the proposed method can not only predict the fault but is also strongly robust against uncertainties and external disturbances.

15.
For new product development, previous segmentation methods based on demographic, psychographic, and purchase behavior information cannot identify a group of customers with unsatisfied needs. Moreover, such segmentation is limited to sales promotions in marketing. Although needs-based segmentation that considers customer sentiments on product features can be conducted to develop a new product concept, it cannot identify commonalities among customers owing to their diverse preferences. Therefore, this paper proposes an interpretable machine learning-based approach to customer segmentation for new product development, based on the importance of product features derived from online product reviews. The technical challenge in determining the importance of product features in each review is identifying and interpreting the nonlinear relations between satisfaction with product features and overall customer satisfaction. In this study, interpretable machine learning is used to identify these nonlinear relations with high performance and transparency. A case study on a wearable device is conducted to validate the proposed approach. Customer segmentation using the proposed importance-based approach is compared with segmentation using a previous sentiment-based approach. The results show that the proposed approach achieves higher clustering performance than the previous approach and offers opportunities to identify new product concepts.

16.
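Research on a library user clustering and grouping method based on the Markov model   Cited by: 1 (self-citations: 0, citations by others: 1)
吴艳玲  孙思阳 《情报科学》2021,39(11):167-172
[Purpose/Significance] To address the instability and high error rate of clustering library user groups, a library user clustering and grouping method based on the Markov model is proposed to improve the accuracy of library user clustering. [Method/Process] A first-order Markov mixture model is used to build a model of user action sequences; user behavior clusters are generated by the model, reflecting the dynamic nature of user actions. An adaptive natural gradient algorithm adjusts its own step size according to the separation state of user behaviors, which addresses automatic model selection during parameter learning and achieves the best clustering of library users. [Results/Conclusion] The experimental results show that when the actual number of clusters is smaller than the value L, the proposed method achieves automatic model selection during parameter learning. The proposed method produces the largest number of groups and can partition the widest value range; its lowest clustering error rate is 0.22%, its clustering performance is relatively stable, and its grouping results are more accurate, meeting the design expectations. [Innovation/Limitations] A first-order Markov mixture model is used to cluster library users. Future work will study higher-order Markov component models that take correlations between user sequences into account, in order to improve the accuracy and stability of the grouping algorithm.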
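
Each component of a first-order Markov mixture model is a single Markov chain over user actions. The sketch below estimates the transition probabilities of one such chain from action sequences and scores a new sequence. The state names, the smoothing constant `alpha` and the function names are hypothetical. The mixture weights, the initial-state distribution and the adaptive natural gradient training described in the abstract are omitted.

```python
import math


def fit_transitions(sequences, states, alpha=1.0):
    """Estimate first-order transition probabilities with additive smoothing."""
    counts = {s: {t: alpha for t in states} for s in states}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }


def log_likelihood(seq, trans):
    """Log-probability of the transitions in seq (initial state ignored)."""
    return sum(math.log(trans[p][n]) for p, n in zip(seq, seq[1:]))


# Hypothetical library actions for one user group.
states = ["search", "borrow"]
trans = fit_transitions([["search", "borrow", "search", "borrow"]], states)
score = log_likelihood(["search", "borrow"], trans)
```

In a mixture, each user sequence would be scored under every component chain in this way and assigned to the component that explains it best.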

17.
In this paper, a robust adaptive control scheme is proposed for the leader-following control of a class of fractional-order multi-agent systems (FMAS). Asymptotic stability is shown by a linear matrix inequality (LMI) approach. The nonlinear dynamics of the agents are assumed to be unknown. Moreover, the communication topology among the agents is assumed to be unknown and time-varying. A deep general type-2 fuzzy system (DGT2FS) using a restricted Boltzmann machine (RBM) and the contrastive divergence (CD) learning algorithm is proposed to estimate the uncertainties. The simulation studies indicate that the proposed control method performs well under time-varying topology, unknown dynamics and external disturbances. The effectiveness of the proposed DGT2FS is also verified on modeling problems with high-dimensional real-world data sets.

18.
With the popularity of social platforms such as Sina Weibo, Tweet, etc., a large number of public events spread rapidly on social networks, and huge amounts of textual data are generated as netizens discuss them. Social text clustering has become one of the most critical methods for helping people find relevant information, and it provides quality data for subsequent timely public opinion analysis. Most existing neural clustering methods rely on manual labeling of training sets and take a long time to learn. Because social media data are explosive and large in scale, it is a challenge for social text clustering to satisfy users' demand for timeliness. This paper proposes a novel unsupervised event-oriented graph clustering framework (EGC), which achieves efficient clustering performance on large-scale datasets with less time overhead and does not require any labeled data. Specifically, EGC first mines the potential relations in social text data and transforms the textual data of social media into an event-oriented graph, taking advantage of the ability of graph structures to represent complex relations. Second, EGC uses a keyword-based local importance method to accurately measure the weights of the relations in the event-oriented graph. Finally, a bidirectional depth-first clustering algorithm based on the interrelations is proposed to cluster the nodes of the event-oriented graph. By projecting the relations of the graph into a smaller domain, EGC achieves fast convergence. The experimental results show that the clustering performance of EGC on the Weibo dataset reaches 0.926 (NMI), 0.926 (AMI) and 0.866 (ARI), which is 13%–30% higher than other clustering methods. In addition, the average query time on EGC-clustered data is 16.7 ms, which is 90% less than on the original data.

19.
In this paper we develop a new framework for time series segmentation based on a Hierarchical Linear Dynamical System (HLDS), and test its performance on monophonic and polyphonic musical note recognition. The centerpiece of our approach is the inclusion of constraints in the filter topology, instead of in the cost function as is normally done in machine learning. Simply by slowing down the dynamics of the top layer of an augmented (multilayer) state model, which remains compatible with the recursive update equation originally proposed by Kalman, the system learns all the musical notes directly from data, without labels, effectively creating a time series clustering algorithm that does not require segmentation. We analyze the properties of the HLDS and show that it provides better classification accuracy than current state-of-the-art approaches.

20.
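Principal component analysis (PCA) is combined with the self-organizing feature map (SOM) network and applied to clustering analysis of gene data. PCA is first applied to the gene data set to extract a small number of characteristic principal components, reducing the dimensionality of the data set. These principal components largely reflect the overall information in the original data set. The SOM network is then applied to the resulting components for clustering analysis, grouping similar genes into the same region. Experimental results show that, compared with clustering using the SOM network alone, this method has higher classification accuracy and clearer class boundaries, making it a very effective clustering analysis method.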
