Similar literature
 Found 20 similar documents; search took 46 ms
1.
Information filtering (IF) systems usually filter data items by correlating a set of terms representing the user's interest (a user profile) with similar sets of terms representing the data items. Many techniques can construct user profiles automatically, but they usually yield large sets of terms. Various dimensionality-reduction techniques can be applied to reduce the number of terms in a user profile. We describe a new term-selection technique that includes a dimensionality-reduction mechanism based on the analysis of a trained artificial neural network (ANN) model. Its novel feature is the identification of an optimal set of terms that can correctly classify the data items relevant to a user. The proposed technique was compared with the classical Rocchio algorithm. We found that when all the distinct terms in the training set are used to train an ANN, the Rocchio algorithm outperforms the ANN-based filtering system; after applying the new dimensionality-reduction technique, which leaves only an optimal set of terms, the improved ANN technique outperforms both the original ANN and the Rocchio algorithm.
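
The Rocchio baseline mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `beta` and `gamma` weights, and the `top_terms` cut-off are assumptions, and the simple top-k cut-off only stands in for the ANN-based term selection, which the abstract does not specify.

```python
from collections import Counter


def rocchio_profile(relevant_docs, nonrelevant_docs, beta=0.75, gamma=0.25):
    """Build a term-weight user profile from tokenised training documents.

    Terms are rewarded by their mean frequency in relevant documents and
    penalised by their mean frequency in non-relevant ones (weights assumed).
    """
    profile = Counter()
    groups = ((relevant_docs, 1, beta), (nonrelevant_docs, -1, gamma))
    for docs, sign, weight in groups:
        if not docs:
            continue
        for doc in docs:
            for term, tf in Counter(doc).items():
                profile[term] += sign * weight * tf / len(docs)
    return dict(profile)


def top_terms(profile, k):
    """Keep the k highest-weighted terms (a stand-in for dimensionality reduction)."""
    ranked = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:k])


def score(profile, doc):
    """Score a tokenised document as the sum of the profile weights of its terms."""
    return sum(profile.get(term, 0.0) for term in doc)
```

A filtering system would then pass a document to the user when `score` exceeds a threshold chosen on held-out data.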

2.
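张景素 (Zhang Jingsu), 魏明珠 (Wei Mingzhu). 《情报科学》 (Information Science), 2022, 40(10): 164-170
[Purpose/Significance] This study investigates how to build sentence-segmentation models for classical Chinese from a small number of annotated samples, reducing the annotation cost of model training and offering a new approach to integrating digital technology with the humanities. [Method/Process] Starting from the uncertainty and diversity of classical Chinese samples, a weighted multi-strategy sample selection method is proposed and combined with sentence-segmentation models such as BERT-BiLSTM-CRF and BERT-CRF. Information entropy and similarity are used to analyze the uncertainty and diversity of ancient texts, and a weighted score estimates how valuable each sample is for model training. The valuable samples selected by the weighted multi-strategy method are annotated manually and added to the training set for iterative training. [Result/Conclusion] In a case study on the History of Song (《宋史》), the proposed method reduces the number of training samples required by 50% for BERT-BiLSTM-CRF and by 55% for BERT-CRF, which further confirms its effectiveness. [Innovation/Limitation] Weighted multi-strategy sample selection offers a new approach to training sentence-segmentation models for classical Chinese; future work will explore whether the method applies to other tasks in the processing of ancient books.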
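
The weighted combination of uncertainty and diversity described in this abstract might look like the sketch below. It is an assumption-laden illustration: the abstract does not give the scoring formula, so the normalised entropy, the cosine-based diversity term, the weight `w_uncertainty`, and the `pool` record layout are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sample_value(probs, vec, labeled_vecs, w_uncertainty=0.5):
    """Weighted score: high when the model is unsure and the sample is unlike labeled data."""
    uncertainty = entropy(probs) / math.log(len(probs))  # normalised to [0, 1]
    nearest = max((cosine(vec, lv) for lv in labeled_vecs), default=0.0)
    diversity = 1.0 - nearest
    return w_uncertainty * uncertainty + (1.0 - w_uncertainty) * diversity


def select_for_annotation(pool, labeled_vecs, budget, w_uncertainty=0.5):
    """Return the ids of the `budget` most valuable unlabeled samples."""
    ranked = sorted(
        pool,
        key=lambda s: -sample_value(s["probs"], s["vec"], labeled_vecs, w_uncertainty),
    )
    return [s["id"] for s in ranked[:budget]]
```

In an iterative loop, the selected samples would be annotated, moved into the training set, and the model retrained before the next selection round.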

3.
With the continuous growth in the amount of data generated in the edge-cloud environment, the security risks of traditional centralized data management platforms have become a concern. Blockchain technology can be applied to guarantee safety and information transparency in data caching and trading processes. Therefore, a blockchain-based secure cost-aware data caching scheme is proposed to optimize the placement of cache data and prevent tampering with it. In this scheme, under constraints on transmission cost and edge cache size, a quantum particle swarm optimization (QPSO) algorithm is used to solve the data cache placement problem with the greatest content caching gain. A blockchain-based secure decentralized data trading model is proposed to solve the trust problem among buyers, sellers, and agent nodes and to increase users' incentives to trade data. A double auction mechanism is used to maximize social welfare. The experimental results reveal that the proposed data caching and trading scheme can reduce the data transmission cost, improve the cache hit ratio, and maximize social welfare.

4.
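To remove redundant and noisy features from network intrusion data sets, reduce the difficulty of data processing, and improve detection performance, an intrusion detection method based on feature selection and support vector machines is proposed. The method uses the proposed feature selection algorithm to select the optimal feature combination, builds a model with a support vector machine as the classifier, and applies it to an intrusion detection system. Simulation results show that the method not only reduces the feature dimensionality and the training and testing time, but also improves the classification accuracy of intrusion detection.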

5.
A hybrid uninterrupted multi-speed transmission (HUMST), based on the integration of a planetary gear set and a 3-speed automated manual transmission (3-AMT), is developed to satisfy the specific performance indexes of mining trucks. The power-split device can alleviate and eliminate the inherent torque interruption of the 3-AMT during gear shifts by implementing the designed cooperative shift control strategy, which is optimized with a quadratic performance index. To achieve fast torque coordination while guaranteeing driving comfort, the torque profiles of the power-split device and the traction motor are optimized by a linear-quadratic regulator (LQR) algorithm. Dynamic programming (DP) is implemented as a benchmark to demonstrate the maximum fuel efficiency of the proposed HUMST. Because of the high computational cost of optimal control strategies such as DP, an improved real-time control strategy (IRTCS) using a modified Gaussian distribution function is proposed to significantly reduce the computing load. Because an efficiency-oriented energy control strategy would result in frequent gear shifts, a multi-objective genetic algorithm (MGA) is integrated to achieve a desirable tradeoff between overall efficiency and shift stability. The detailed mathematical and dynamic model shows that the proposed shifting strategy with LQR can effectively suppress shift jerk, and that the proposed IRTCS with MGA can reduce shift frequency by 70.78% to improve drivability, sacrificing only 4.86% of overall efficiency compared with DP.

6.
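A face detection algorithm based on binary pattern features   Total citations: 1; self-citations: 0; citations by others: 1
An improved face detection algorithm based on multi-block local binary pattern (MB-LBP) features is proposed. To address the distortion of the weight distribution that arises during AdaBoost training, the algorithm adjusts the rule for updating sample weights. Experimental results show that the method effectively shortens training time and avoids weight distortion. The algorithm lowers the false detection rate while maintaining the detection rate.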

7.
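A fast text classification method based on k-nearest neighbors   Total citations: 5; self-citations: 0; citations by others: 5
The k-nearest neighbor method is a simple and effective approach to text classification, but the traditional k-nearest neighbor classifier requires intensive similarity computation when searching for the k nearest neighbors. When the training set is very large, a globally optimal search is nearly impossible. Speeding up the search for the k nearest neighbors is therefore the key to making the method practical. This paper proposes a fast text classification method based on k-nearest neighbors that classifies quickly and effectively on massive data sets. Experimental results show that its performance improves significantly on that of the traditional method.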

8.
Breast cancer is one of the leading causes of death among women worldwide. Accurate and early detection of breast cancer can ensure long-term survival for patients. However, traditional classification algorithms usually aim only to maximize classification accuracy and fail to consider the misclassification costs between different categories, even though the cost of missing a cancer case (false negative) is clearly much higher than that of mislabeling a benign one (false positive). To overcome this drawback and further improve the classification accuracy of breast cancer diagnosis, this work proposes a novel intelligent diagnosis approach that employs an information gain directed simulated annealing genetic algorithm wrapper (IGSAGAW) for feature selection. In this process, features are ranked according to the IG algorithm, and the top m optimal features are extracted using the cost-sensitive support vector machine (CSSVM) learning algorithm. The proposed feature selection approach not only helps to reduce the complexity of the SAGASW algorithm and to extract the optimal feature subset effectively, but also obtains the maximum classification accuracy and the minimum misclassification cost. The efficacy of the approach is tested on the Wisconsin Original Breast Cancer (WBC) and Wisconsin Diagnostic Breast Cancer (WDBC) data sets, and the results demonstrate that the proposed hybrid algorithm outperforms the other methods compared. The main objective of this study is to apply the research in a real clinical diagnostic system and thereby assist clinical physicians in making correct and effective decisions in the future. The proposed method could also be applied to the diagnosis of other illnesses.

9.
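Entrepreneurial strategy is a key means by which new ventures gain competitive advantage. Existing research has begun to explore this issue, but it has shortcomings and needs further refinement. On the one hand, few studies analyze the mechanism of entrepreneurial strategy with new ventures as the object of study; on the other hand, the transition context receives too little attention, that is, the effect of transition-environment characteristics on entrepreneurial strategy is under-researched. Building on existing research and taking into account the characteristics of new ventures and the distinctiveness of the Chinese context, this paper examines the relationships among speculative orientation, entrepreneurial strategy, and the competitive advantage of new ventures. An analysis of data from 226 new ventures shows that speculative orientation and entrepreneurial strategy both have a positive effect on competitive advantage, and that product innovation strategy and marketing differentiation strategy partially mediate the relationship between speculative orientation and competitive advantage. The findings extend the fields of strategic management and entrepreneurship research and offer practical guidance for the development of new ventures.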

10.
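An acceleration solution based on Internet application caching   Total citations: 2; self-citations: 0; citations by others: 2
This article introduces a solution built around application caching and transmission optimization. It helps operators overcome the difficulties they face in building and developing networks for industry users, improves the quality of network service while lowering operating costs, improves the service quality of key businesses, and strengthens the monitoring and management of network traffic.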

11.
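To avoid overly one-sided preventive maintenance decisions, this paper first combines the system degradation state with an energy-efficiency indicator to build an ecology-aware two-dimensional condition-based maintenance decision model: preventive maintenance is performed when either the degradation state or the energy-efficiency indicator exceeds its threshold. The energy consumption cost incurred during operation is then included in the total cost, and an optimization model is built with the average expected cost per unit of useful output as the objective function. Finally, a numerical example is analyzed using Monte Carlo simulation and a simulated annealing algorithm. The results show that, compared with traditional one-dimensional condition-based maintenance decision models based only on the degradation state or only on the energy-efficiency indicator, the proposed two-dimensional model performs better: it lowers maintenance costs and saves energy, meeting the requirements of sustainable development.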

12.
In recent years, distributed algorithms have been increasingly used to solve the economic dispatch (ED) problem of multi-energy systems (MES) because of their high flexibility, strong robustness, and privacy. However, an MES based on a distributed optimization architecture is exposed to higher cyber-attack risks, which threaten its safe and stable operation. To address this issue, an event-triggered fully distributed algorithm is proposed to solve the ED problem; it effectively mitigates the communication burden. On this basis, an attack-resilient strategy against false data injection (FDI) attacks is implemented in the proposed fully distributed algorithm, eliminating the incorrect measurements of incremental cost and power generation data caused by cyber-attacks. In addition, a reputation value protocol embedded in the attack-resilient strategy is designed to reduce the likelihood that a node is isolated directly. Finally, case studies on a 9-bus MES validate the effectiveness of the proposed distributed control scheme.

13.
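A greedy simulated annealing algorithm for the irregular flight recovery model   Total citations: 4; self-citations: 0; citations by others: 4
唐小卫 (Tang Xiaowei), 高强 (Gao Qiang), 朱金福 (Zhu Jinfu). 《预测》 (Forecasting), 2010, 29(1): 66-70
To address the serious impact that recovering from irregular flights has on airlines, this paper studies the irregular flight recovery model and its optimization algorithms, proposes suitable improvements to existing optimization models, and focuses on the design of a greedy simulated annealing algorithm. The algorithm combines the features of GRASP and simulated annealing, which improves the efficiency of selecting neighborhood solutions and lowers the probability of being trapped in a local optimum. A worked example shows that the algorithm can handle large-scale irregular flight recovery problems and strikes a balance between computation time and solution quality.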
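
The combination of a GRASP-style greedy candidate list with simulated annealing can be illustrated generically. The sketch below is not the paper's algorithm: the flight recovery model is not given in the abstract, so `cost` and `neighbours` are hypothetical callables supplied by the caller, and `t0`, `cooling`, `steps`, and `top_k` are assumed parameters.

```python
import math
import random


def accept(delta, temperature, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)


def greedy_simulated_annealing(initial, cost, neighbours,
                               t0=10.0, cooling=0.95, steps=500, top_k=3, seed=0):
    """Minimise `cost`, drawing each move from the top_k cheapest neighbours."""
    rng = random.Random(seed)
    current = best = initial
    temperature = t0
    for _ in range(steps):
        # GRASP-style restricted candidate list: keep only the cheapest neighbours.
        candidates = sorted(neighbours(current), key=cost)[:top_k]
        if not candidates:
            break
        candidate = rng.choice(candidates)
        if accept(cost(candidate) - cost(current), temperature, rng):
            current = candidate
            if cost(current) < cost(best):
                best = current
        temperature = max(temperature * cooling, 1e-9)  # geometric cooling
    return best
```

Restricting the random choice to the cheapest neighbours is what makes neighbourhood selection more efficient, while the temperature-controlled acceptance of worse moves is what lets the search leave local optima.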

14.
Imbalanced sample distribution is usually the main reason for the performance degradation of machine learning algorithms. This study therefore proposes a hybrid framework (RGAN-EL) that combines generative adversarial networks with ensemble learning to improve classification performance on imbalanced data. First, we propose a training sample selection strategy based on roulette wheel selection so that the GAN pays more attention to the class-overlapping area when fitting the sample distribution. Second, we design two kinds of generator training loss and propose a noise sample filtering method to improve the quality of the generated samples. Then, minority class samples are oversampled using the improved RGAN to obtain a balanced training set. Finally, the final training and prediction are carried out with an ensemble learning strategy. We conducted experiments on 41 real imbalanced data sets using two evaluation indexes, F1-score and AUC. Specifically, we compare RGAN-EL with six typical ensemble learning methods, and RGAN with three typical GAN models. The experimental results show that RGAN-EL is significantly better than the six ensemble learning methods, and that RGAN improves greatly on the three classical GAN models.

15.
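After a basic introduction to five modes of cross-border e-commerce logistics, this paper compares their advantages and disadvantages in terms of logistics cost, speed, and reliability. It then proposes a strategy for selecting a cross-border e-commerce logistics mode according to transaction direction, transaction mode, and product category, as a reference for cross-border e-commerce traders choosing a suitable logistics mode.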

16.
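The Gram-Schmidt orthogonalization algorithm is one of the basic algorithms of numerical linear algebra and is mainly used to compute the QR factorization of a matrix. The classical and modified Gram-Schmidt algorithms are built on level 1/2 BLAS operations; these low-level BLAS operations make poor use of the cache, which limits performance. A new block Gram-Schmidt orthogonalization algorithm is proposed. Through reorthogonalization, the new algorithm ensures that the orthogonality of the computed matrix Q reaches machine precision, and it uses level 3 BLAS operations to improve performance. Numerical experiments show that the new algorithm brings the orthogonality of Q to machine precision and improves performance significantly.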
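
The role of reorthogonalization can be seen in a column-by-column sketch of classical Gram-Schmidt with a second projection pass. This is an illustration only, not the paper's block algorithm: a block variant would process several columns at once so that the projections become matrix-matrix (level 3 BLAS) products. The function name and the `passes` parameter are assumptions.

```python
import math


def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt_reorth(columns, passes=2):
    """QR factorization A = QR by classical Gram-Schmidt with reorthogonalization.

    `columns` is a list of the columns of A; Q is returned as a list of columns
    and R as a dense upper-triangular list of rows.
    """
    n = len(columns)
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        v = list(a)
        for _ in range(passes):
            # Classical GS: all projection coefficients are taken from the same v.
            coeffs = [dot(q, v) for q in q_cols]
            for i, c in enumerate(coeffs):
                r[i][j] += c  # the second pass adds a small correction
            v = [x - sum(c * q[k] for q, c in zip(q_cols, coeffs))
                 for k, x in enumerate(v)]
        norm = math.sqrt(dot(v, v))
        if norm == 0.0:
            raise ValueError("columns are linearly dependent")
        r[j][j] = norm
        q_cols.append([x / norm for x in v])
    return q_cols, r
```

With `passes=1` this reduces to plain classical Gram-Schmidt; the second pass removes the component along earlier columns that rounding errors leave behind.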

17.
Optimal sensor allocation can substantially reduce the life cycle maintenance costs of engineering systems. Considerable effort has been devoted to modeling the causal relationship between sensors and faults, but without considering the propagation of fault risk. In this paper, a sensor allocation strategy based on grey relational analysis (GRA) and a quantitative causal diagram (QCD) is proposed that takes the propagation of fault risk into account. The QCD describes both the fault-sensor causal relationship and the fault-to-fault causal relationship. A data-driven GRA is applied in the QCD to calculate the coefficients of the propagation of fault risk. To obtain an accurate relationship between faults and sensors, an improved quantitative analytic hierarchy process is proposed to calculate the coefficients between faults and sensors, which are defined as sensor detectability in this paper. An optimal sensor allocation strategy is then developed using an improved particle swarm optimization (IPSO) algorithm, under a constraint on sensor detectability, to minimize fault unobservability and total cost. The proposed strategy is demonstrated in a case study on a single-phase inverter system. Compared with two other sensor allocation strategies, the proposed strategy obtains the lowest fault unobservability per unit cost (?0.242) for sensor allocation under the propagation of fault risk.

18.
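Building on a study of the traditional DV-Hop3D algorithm, this paper proposes a new localization algorithm for wireless sensor networks. In its first phase the new algorithm sets a hop-count threshold parameter to reduce communication overhead, and in its second phase it replaces the fixed average hop distance with a selectable average hop distance when calculating the distance from an unknown node to an anchor node. The algorithm was simulated in MATLAB 7.1. The simulation results show that the improved algorithm clearly increases node localization accuracy and effectively reduces network traffic.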
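
The two modifications described in this abstract can be sketched as follows. The abstract does not state how the selectable average hop distance is chosen, so this sketch assumes each anchor's own hop size is used for distances to that anchor; the function names, the `max_hops` parameter, and the data layout are hypothetical.

```python
import math


def average_hop_distance(anchor, others, hops, max_hops):
    """Average hop size of one anchor from anchors within the hop-count threshold.

    anchor: (x, y, z) position; others: name -> (x, y, z);
    hops: name -> hop count from this anchor.
    """
    total_dist = 0.0
    total_hops = 0
    for name, pos in others.items():
        h = hops[name]
        if 0 < h <= max_hops:  # phase 1: ignore anchors beyond the threshold
            total_dist += math.dist(anchor, pos)
            total_hops += h
    return total_dist / total_hops if total_hops else None


def estimate_distances(hop_sizes, hops_to_anchors, max_hops):
    """Phase 2: distance to each anchor = that anchor's hop size * hop count."""
    return {
        name: hop_sizes[name] * h
        for name, h in hops_to_anchors.items()
        if h <= max_hops and hop_sizes.get(name) is not None
    }
```

The estimated distances would then feed a standard multilateration step to obtain the unknown node's 3D position.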

19.
Financial decisions are often based on classification models that assign a set of observations to predefined groups. Different data classification models have been developed to foresee the financial crisis of an organization from its historical data. One important step towards an accurate financial crisis prediction (FCP) model is the selection of appropriate variables (features) that are relevant to the problem at hand. This is termed the feature selection problem, and solving it helps to improve classification performance. This paper proposes an Ant Colony Optimization (ACO) based FCP model that incorporates two phases: an ACO-based feature selection (ACO-FS) algorithm and an ACO-based data classification (ACO-DC) algorithm. The proposed ACO-FCP model is validated on five benchmark datasets, both qualitative and quantitative. For feature selection, the ACO-FS method is compared with three existing feature selection algorithms: the genetic algorithm (GA), the Particle Swarm Optimization (PSO) algorithm, and the Grey Wolf Optimization (GWO) algorithm. In addition, the classification results of ACO-DC are compared with those of state-of-the-art methods. Experimental analysis shows that the ACO-FCP ensemble model is superior to and more robust than its counterparts. Consequently, this study concludes that the proposed ACO-FCP model is more competitive than traditional and other artificial intelligence techniques.

20.
Feature selection, which can reduce the dimensionality of the vector space without sacrificing classifier performance, is widely used in text categorization. In this paper, we propose a new feature selection algorithm, named CMFS, which comprehensively measures the significance of a term both between categories and within a category. We evaluated CMFS on three benchmark document collections, 20-Newsgroups, Reuters-21578 and WebKB, using two classification algorithms, Naïve Bayes (NB) and Support Vector Machines (SVMs). The experimental results, comparing CMFS with six well-known feature selection algorithms, show that CMFS is significantly superior to Information Gain (IG), Chi statistic (CHI), Document Frequency (DF), Orthogonal Centroid Feature Selection (OCFS) and DIA association factor (DIA) when the Naïve Bayes classifier is used, and significantly outperforms IG, DF, OCFS and DIA when Support Vector Machines are used.
