Similar literature
 20 similar articles found; search time 578 ms
1.
A fundamental issue for statistical classification models in a streaming environment is that the joint distribution between predictor and response variables changes over time (a phenomenon known as concept drift), so that classification performance deteriorates dramatically. In this paper, we first present a hierarchical hypothesis testing (HHT) framework that can detect and adapt to various concept drift types (e.g., recurrent or irregular, gradual or abrupt), even in the presence of imbalanced data labels. A novel concept drift detector, Hierarchical Linear Four Rates (HLFR), is then implemented under the HHT framework. By replacing a widely used retraining scheme with an adaptive training strategy, we further demonstrate that the concept drift adaptation capability of HLFR can be significantly boosted. A theoretical analysis of the Type-I and Type-II errors of HLFR is also performed. Experiments on both simulated and real-world datasets illustrate that our methods outperform state-of-the-art methods in terms of detection precision, detection delay, and adaptability across different concept drift types.
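
As a rough illustration of rate-based drift monitoring on a label stream, the sketch below tracks four confusion-matrix rates with exponential smoothing and raises a flag when one of them falls too low. This is not the authors' HLFR implementation: the choice of rates (TPR, TNR, PPV, NPV), the class name `FourRatesMonitor`, the `decay` factor, and the fixed `floor` alarm are all assumptions made for illustration, and a real detector would replace the fixed floor with a statistical test at a chosen significance level.

```python
class FourRatesMonitor:
    """Hypothetical monitor: smoothed TPR, TNR, PPV and NPV on a binary stream."""

    def __init__(self, decay=0.9, floor=0.5):
        self.decay = decay    # assumed smoothing factor, not from the abstract
        self.floor = floor    # assumed alarm level, stands in for a real test
        self.rates = {"tpr": 1.0, "tnr": 1.0, "ppv": 1.0, "npv": 1.0}

    def update(self, y_true, y_pred):
        """Consume one (label, prediction) pair; return True if drift is suspected."""
        hit = 1.0 if y_true == y_pred else 0.0
        # Each sample informs one rate conditioned on the true label
        # and one rate conditioned on the predicted label.
        touched = (
            "tpr" if y_true == 1 else "tnr",
            "ppv" if y_pred == 1 else "npv",
        )
        for name in touched:
            self.rates[name] = self.decay * self.rates[name] + (1 - self.decay) * hit
        return any(self.rates[name] < self.floor for name in touched)


if __name__ == "__main__":
    monitor = FourRatesMonitor()
    stream = [(1, 1)] * 20 + [(1, 0)] * 10   # classifier starts missing positives
    for step, (y, y_hat) in enumerate(stream):
        if monitor.update(y, y_hat):
            print("drift suspected at step", step)
            break
```

Monitoring the four rates separately, rather than overall accuracy, is what lets a monitor of this kind react when only the minority class degrades, which is the imbalanced-label situation the abstract mentions.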

2.
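Through systematic research on the content, methods, and technologies of international science and technology (S&T) resource monitoring, this paper builds a visual monitoring and service system for international S&T resources and proposes the concept of "three first-class" international S&T resources. Using scientometrics, data mining, and related methods, and targeting different types of international S&T information databases, it designs an effective monitoring framework for international S&T resources. The framework provides information-based support for understanding the distribution of international S&T resources, finding high-level international partners, and making effective use of international S&T resources.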

3.
Distributed estimation has important research significance in unmanned systems. This paper investigates distributed estimation for an unmanned surface vessel (USV) via multi-sensor collaboration and 3D object recognition, in which a Knowledge Graph (KG) is constructed to store and represent the estimation results. A Kalman-consensus Filter (KCF) is used to estimate the optimal states of objects, and a convolutional neural network (CNN) is used to recognise multiple classes of objects without designing a detector for each class. Recognition efficiency is improved by dividing the data into pixel blocks whose value is the number of detection points, and a point cloud dataset covering different locations and rotations is also provided. Experiments show that our method helps the USV accurately perceive entities in the environment, which validates the effectiveness of the proposed algorithm.

4.
Dialectal Arabic (DA) refers to varieties of everyday spoken languages in the Arab world. These dialects differ according to the country and region of the speaker, and their textual content is constantly growing with the rise of social media networks and web blogs. Although research on Natural Language Processing (NLP) for standard Arabic, namely Modern Standard Arabic (MSA), has witnessed remarkable progress, research efforts on DA are rather limited. This is due to numerous challenges, such as the scarcity of labeled data as well as the nature and structure of DA. While some recent works have reached decent results on several DA sentence classification tasks, other complex tasks, such as sequence labeling, still suffer from weak performance for DA varieties with either a limited amount of labeled data or unlabeled data only. Besides, it has been shown that zero-shot transfer learning from models trained on MSA does not perform well on DA. In this paper, we introduce AdaSL, a new unsupervised domain adaptation framework for Arabic multi-dialectal sequence labeling, leveraging unlabeled DA data, labeled MSA data, and existing multilingual and Arabic Pre-trained Language Models (PLMs). The proposed framework relies on four key components: (1) domain adaptive fine-tuning of multilingual/MSA language models on unlabeled DA data, (2) sub-word embedding pooling, (3) iterative self-training on unlabeled DA data, and (4) iterative DA and MSA distribution alignment. We evaluate our framework on multi-dialectal Named Entity Recognition (NER) and Part-of-Speech (POS) tagging tasks. The overall results show that zero-shot transfer learning with our proposed framework boosts the performance of the multilingual PLMs by 40.87% in macro-F1 score for the NER task and by 6.95% in accuracy for the POS tagging task. For the Arabic PLMs, our proposed framework increases performance by 16.18% macro-F1 for the NER task and 2.22% accuracy for the POS tagging task, thus achieving new state-of-the-art zero-shot transfer learning performance for Arabic multi-dialectal sequence labeling.

5.
6.
7.
Good information and records management is assumed to promote organizational efficiency. Despite established management regimes and available technology, many organizations still consider information and records management challenging. The reason may be cultural factors. This study, based on a literature review, aims to explore the academic discourse on information culture and to discuss its relevance for records management. The findings show that the concept of information culture is used in various ways: as an explanatory framework, as an analytical and evaluative tool, or as a normative standard. The research on information culture addresses several areas: business performance, systems implementation, the manifestation of information culture in different organizations, and, in a few cases, records management practices. The research settings and the objects of study varied, which makes general conclusions difficult to draw, but a positive correlation between culture and performance is often assumed. The focus has been on how information is used, shared, and disseminated, while the production and management of information, the vital object of records management, has with few exceptions been neglected. For information culture to function fully as an analytical framework for records management, a widened and more inclusive conceptualization is required, which will also enrich information culture as a theoretical concept.

8.
This article reports the results of an information and knowledge assessment (IKA) that identifies information and knowledge needs and their coverage by information resources in order to derive recommendations for improvement, and it proposes a contingency framework. The approach is based on a review of audit methods from information science and management, knowledge management, and engineering, and was tested with data (N = 580) collected from the engineering domain within an automotive supplier across six European sites. The integrated assessment uses content needs profiles from two complementary perspectives, the coverage of needs by various internal information sources and data on awareness and usage of these information sources. Employing content categories at a more granular analytical level than information systems sources opens up new possibilities for deriving improvement measures and requirements for the design of information systems within an organization. The brief data-gathering instrument also considerably reduces the resources required to implement this approach, overcoming weaknesses previously identified in case studies and the IA literature. The study contributes to research by bridging the gap between research and practice, and it opens up options for designing contingency frameworks for a specific domain.

9.
The wide spread of false information has detrimental effects on society, and false information detection has received wide attention. When new domains appear, relevant labeled data is scarce, which poses severe challenges for detection. Previous work mainly leverages additional data or domain adaptation technology to assist detection. The former leads to a severe data burden; the latter underutilizes the pre-trained language model because there is a gap between the downstream task and the pre-training task, and it is also inefficient for model storage because a set of parameters must be stored for each domain. To this end, we propose a meta-prompt based learning (MAP) framework for low-resource false information detection. We exploit the potential of pre-trained language models by constructing templates that transform the detection tasks into pre-training tasks. To keep randomly initialized templates from hindering performance, we learn optimal initial parameters by borrowing the benefit of meta learning in fast parameter training. Combining meta learning and prompt learning for detection is non-trivial: constructing meta tasks that yield initial parameters suitable for different domains, and setting up the prompt model’s verbalizer for classification in the noisy low-resource scenario, are both challenging. For the former, we propose a multi-domain meta task construction method to learn domain-invariant meta knowledge. For the latter, we propose a prototype verbalizer to summarize category information and design a noise-resistant prototyping strategy to reduce the influence of noisy data. Extensive experiments on real-world data demonstrate the superiority of MAP in new domains of false information detection.

10.
Information residing in multiple modalities (e.g., text, image) of social media posts can jointly provide more comprehensive and clearer insights into an ongoing emergency. To identify information valuable for humanitarian aid from noisy multimodal data, we first clarify the categories of humanitarian information and define a multi-label multimodal humanitarian information identification task, which can adapt to the label inconsistency issue caused by modality independence while maintaining the correlation between modalities. We propose a Multimodal Humanitarian Information Identification Model that simultaneously captures the Correlation and Independence between modalities (CIMHIM). A tailor-made dataset containing 4,383 annotated text-image pairs was built to evaluate the effectiveness of our model. The experimental results show that CIMHIM outperforms both unimodal and multimodal baseline methods by at least 0.019 in macro-F1 and 0.022 in accuracy. The combination of OCR text, object-level features, and the decision rule based on label correlations enhances the overall performance of CIMHIM. Additional experiments on a similar dataset (CrisisMMD) also demonstrate the robustness of CIMHIM. The task, model, and dataset proposed in this study contribute to the practice of leveraging multimodal social media resources to support effective emergency response.

11.
This paper proposes an improved model-based pipeline leak detection and localization method based on compressed sensing (CS) and an event-triggered (ET) particle filter (ET-PF). First, the state space model of the pipeline system is established based on the characteristic line method. Then, the CS method is used to preprocess the sensor signals to recover leak information that may be lost because of the low sampling frequency of industrial pipeline sensors, and an event-based beetle antennae search (BAS) particle filter (BAS-PF) is proposed to improve the accuracy and efficiency of the pipeline state estimation. Finally, a pipeline leak detection and localization method is developed based on the proposed signal processing and state estimation algorithms, together with a pipeline partition strategy. Experimental results show that the proposed method can accurately detect and locate leaks in the pipeline system, with a localization error of about 1.4%.

12.
In this paper, we propose an optimization framework to retrieve an optimal group of experts to perform a multi-aspect task. Because a diverse set of skills is needed to perform a multi-aspect task, the group of assigned experts should be able to collectively cover all the required skills. We consider three types of multi-aspect expert group formation problems and propose a unified framework to solve them accurately and efficiently. The first problem is concerned with finding the top k experts for a given task when the required skills of the task are implicitly described. In the second problem, the required skills of the tasks are explicitly described using keywords, but each expert has a limited capacity and should therefore be assigned to a limited number of tasks. The third problem is the combination of the first and the second. Our proposed optimization framework is based on facility location analysis, a well-known branch of operations research. In our experiments, we compare the accuracy and efficiency of the proposed framework with state-of-the-art approaches for group formation problems. The experimental results show the effectiveness of our proposed methods in comparison with these approaches.
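
To make the coverage requirement concrete, the sketch below picks up to k experts so that together they cover as many required skills as possible. It uses a plain greedy heuristic as a stand-in; it is not the facility-location formulation the abstract refers to, and the function name `greedy_expert_group` and the sample expert names and skills are hypothetical.

```python
def greedy_expert_group(required, experts, k):
    """Greedily choose at most k experts to cover the required skills.

    required: iterable of skill names
    experts:  dict mapping expert name -> set of skills
    Returns (chosen expert names in pick order, skills left uncovered).
    """
    uncovered = set(required)
    chosen = []
    while uncovered and len(chosen) < k:
        remaining = [name for name in sorted(experts) if name not in chosen]
        if not remaining:
            break
        # Pick the expert who covers the most still-uncovered skills;
        # ties go to the alphabetically first name because max() keeps
        # the first maximal element of the sorted list.
        best = max(remaining, key=lambda name: len(experts[name] & uncovered))
        if not experts[best] & uncovered:
            break   # nobody left can cover anything new
        chosen.append(best)
        uncovered -= experts[best]
    return chosen, uncovered


if __name__ == "__main__":
    pool = {
        "ana": {"nlp", "ir"},
        "bo": {"ir", "db"},
        "cy": {"db", "ml", "nlp"},
    }
    print(greedy_expert_group({"nlp", "ir", "db", "ml"}, pool, k=2))
```

A greedy pass like this is only a heuristic; an exact optimization formulation can also account for per-expert capacity limits, which the second problem in the abstract requires.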

13.
This article investigates the event-triggered (ET) distributed estimation problem for asynchronous sensor networks with randomly occurring unreliable measurements. We propose two ET mechanisms to schedule data transmissions. The first, based on a dual criterion, schedules the transmission of measurements and avoids interference from unreliable measurements. The second schedules the transmission of local estimates. The implicit information in these ET mechanisms is exploited to make full use of the available information. We then provide the corresponding event-triggered asynchronous diffusion estimator based on the diffusion filtering scheme. In the proposed method, a sensor first generates a local estimate from the available asynchronous measurements in each estimation period. It then fuses the available asynchronous local estimates to generate a fused estimate. Simulation results in different cases and an experiment in an optical-electronic detection network verify the validity and feasibility of the proposed method.
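
The idea of a dual-criterion trigger can be sketched as follows: a sensor transmits a measurement only when it is informative (it differs enough from the last transmitted value) and plausible (it does not jump implausibly far). Both criteria, the thresholds `delta` and `gate`, and the function name `schedule` are assumptions made for illustration; the abstract does not state the actual triggering conditions.

```python
def schedule(measurements, delta=0.5, gate=5.0):
    """Return the (index, value) pairs a sensor would actually transmit.

    delta: assumed minimum change that makes a sample worth sending
    gate:  assumed maximum change beyond which a sample is treated as unreliable
    """
    if not measurements:
        return []
    last_sent = measurements[0]
    sent = [(0, last_sent)]            # always send the first sample
    for k, z in enumerate(measurements[1:], start=1):
        change = abs(z - last_sent)
        informative = change > delta   # criterion 1: enough new information
        plausible = change <= gate     # criterion 2: not an obvious outlier
        if informative and plausible:
            sent.append((k, z))
            last_sent = z
    return sent


if __name__ == "__main__":
    readings = [0.0, 0.1, 1.0, 100.0, 1.2, 2.0]
    print(schedule(readings))
```

In this toy run the spike of 100.0 is never transmitted, and small fluctuations are suppressed, which is the communication saving that event-triggered schemes aim for.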

14.
This paper proposes a novel trust-based false data detection method for power systems under false data injection attacks (FDIAs). To eliminate the interference that false data poses to the power system in the state estimation process, a trust model is first established to estimate the reliability of the system buses. An algorithm is then proposed to update the bus trust values: when the trust values that all neighbor buses hold for a given bus are quite low, that bus is diagnosed as a malicious node and the false data are detected. Based on bus trust, this method ensures that the power system can estimate its state accurately under FDIAs. Simulations on the benchmark IEEE 14-bus, IEEE 30-bus, and IEEE 57-bus test systems demonstrate the feasibility and effectiveness of the proposed algorithm.

15.
Detecting events in real time from the Twitter data stream has gained substantial attention in recent years from researchers around the world, and different event detection approaches have been proposed as a result. One of the major challenges in this context is the high computational cost of real-time event detection. We propose TwitterNews+, an event detection system that incorporates specialized inverted indices and an incremental clustering approach to provide a low-cost solution for detecting both major and minor newsworthy events in real time from the Twitter data stream. In addition, we conduct an extensive parameter sensitivity analysis to fine-tune the parameters used in TwitterNews+ for the best performance. Finally, we evaluate the effectiveness of our system using a publicly available corpus as a benchmark dataset. The evaluation shows a significant improvement in recall and precision over the five state-of-the-art baselines we used.
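
The combination of an inverted index with incremental clustering can be sketched as below: the index limits similarity checks to clusters that share at least one term with the incoming text, so most clusters are never touched. This is a minimal sketch and not the TwitterNews+ implementation; the class name `IncrementalClusterer`, whitespace tokenization, Jaccard similarity, and the `threshold` value are all assumptions.

```python
from collections import defaultdict


class IncrementalClusterer:
    """Hypothetical one-pass clusterer backed by a term -> cluster inverted index."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold        # assumed minimum Jaccard similarity
        self.clusters = []                # one term set per cluster
        self.index = defaultdict(set)     # term -> ids of clusters containing it

    def add(self, text):
        """Assign a text to the most similar cluster, or open a new one."""
        terms = set(text.lower().split())
        # Only clusters sharing at least one term are candidates.
        candidates = set()
        for term in terms:
            candidates |= self.index.get(term, set())
        best_id, best_sim = None, 0.0
        for cid in sorted(candidates):
            cluster_terms = self.clusters[cid]
            sim = len(terms & cluster_terms) / len(terms | cluster_terms)
            if sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id is None or best_sim < self.threshold:
            best_id = len(self.clusters)
            self.clusters.append(set())
        self.clusters[best_id] |= terms
        for term in terms:
            self.index[term].add(best_id)
        return best_id


if __name__ == "__main__":
    clusterer = IncrementalClusterer()
    for tweet in ["earthquake hits city center",
                  "strong earthquake hits city",
                  "team wins cup final"]:
        print(clusterer.add(tweet), tweet)
```

A production system would also need to expire old clusters and handle stop words, which this sketch leaves out.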

16.
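This paper reviews the literature on big-data-driven management and decision-making and concludes that sharing mechanisms for big data resources, and the technology for interconnecting information silos, are among the frontier topics of current big data research. It analyzes government data sharing and exchange applications in China and abroad and identifies two problems in sharing and exchanging government data resources: management philosophy, and data barriers created by legacy systems. Based on a cloud platform and the theory of data as a service, it proposes a management framework for the full set of government data resources. Without making any changes to the existing systems, the framework ensures that data are not relocated, data are not copied, and the original management model of the data is unchanged. It defines the rights and obligations of each operating entity with respect to the data, and it addresses the management-philosophy and system-barrier problems facing data sharing and exchange.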

17.
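Development and prospects of resource informatics   Total citations: 2; self-citations: 0; citations by others: 2  
This paper puts forward the basic concept that resource information is an attribute of the essence, characteristics, and laws of motion of resource objects. It argues that systems theory and information theory are the theoretical foundations of resource informatics, that systems engineering and information methods are its important research methods, and that computers and other information technologies are its important tools. Resource information is the object of study of resource informatics. The paper introduces the theoretical, methodological, and technical issues at each stage of the resource information flow, from generation, transmission, and control to application. It proposes that the research methods of resource informatics include the comprehensive information analysis method, the resource-system black-box method, the overall system optimization method, descriptive and inferential methods, qualitative and quantitative analysis, and the multidimensional information environment method. It provisionally defines the disciplinary system of resource informatics as comprising its theoretical foundations, methodology, technical system, applications, and engineering system. Finally, the paper points out that current research hotspots in resource informatics include the mechanism by which resource information is generated, information fusion, information mining, expert systems, data warehouses, visualization, information integration technology, resource and environment models, and the construction of virtual research environments for resources and the environment. Because resource informatics is not yet a mature discipline and awaits further exploration, the views in this paper are offered only for colleagues' reference, and criticism and correction of any shortcomings are welcome.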

18.
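Ethical analysis of labeling and informed choice for genetically modified foods   Total citations: 4; self-citations: 2; citations by others: 4  
Because genetically modified foods (GMF) may pose potential risks, many countries and consumers demand that such foods be labeled so that consumers can make a free choice on an informed basis. However, some countries, such as the United States, and many producers and retailers are unwilling to label them; they hold that genetically modified foods are as safe as conventional foods and that labeling is unnecessary. This paper briefly sets out the conceptual issues of labeling and informed choice for genetically modified foods, analyzes the ethical issues in depth, and reflects on them philosophically. It concludes that genetically modified foods should be labeled in order to respect consumers' right to informed choice.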

19.
In times of major global interconnectedness and environmental change, the pressure to identify, create, and exploit new resources is certain to intensify. Given that there are unavoidable trade-offs, conflicts, and arenas for violence involved when more and more material and immaterial things are turned into resources, we call for explicit research on the very process, which we label resourcification. The concept of resourcification shifts attention from essentialist queries about the nature of resources to a focus on the social processes through which things are turned into resources. In search of a better understanding of resources in the Anthropocene and, in particular, of the way resources emerge and are used, resourcification offers a new conceptual framework that allows for a systematic search for knowledge about the diversity of contexts, conditions, modes, and temporalities of resourcification. This Resourcification Manifesto offers a theoretical and empirical framework for a radical and disruptive approach to innovation, sustainability, and management studies and policies.

20.
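Given the high speed, unbounded continuity, and dynamic uncertainty of data streams, this paper addresses the problem of identifying anomalous data in uncertain data streams from the perspective of improving data management capability for such streams. First, wavelet analysis is used to separate the high-frequency and low-frequency components of continuous data-stream traffic data. Second, a clustering method for uncertain data streams is applied to find the anomalous points in the data. Simulation experiments show that the detection method adapts well to the uncertainty of data streams and can achieve quite good detection results under certain conditions.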
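
The first step, separating low-frequency and high-frequency components, can be illustrated with a single-level Haar wavelet transform. The abstract does not name the wavelet family, so Haar is an assumption here, and the second step below uses a simple median-based threshold on the detail coefficients as a stand-in for the clustering method the abstract describes. The function names `haar_level1` and `anomalous_pairs` and the factor `k` are hypothetical.

```python
import math
from statistics import median


def haar_level1(signal):
    """One level of the orthonormal Haar transform.

    Returns (approx, detail): low-frequency and high-frequency coefficients,
    one of each per consecutive pair of samples.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def anomalous_pairs(signal, k=5.0):
    """Indices i of sample pairs (2i, 2i+1) whose detail magnitude is unusually large.

    k is an assumed multiplier on the median detail magnitude.
    """
    _, detail = haar_level1(signal)
    mags = [abs(d) for d in detail]
    scale = median(mags) or 1e-12     # avoid a zero threshold on flat signals
    return [i for i, m in enumerate(mags) if m > k * scale]


if __name__ == "__main__":
    traffic = [1.0, 1.1, 1.0, 0.9, 1.0, 9.0, 1.1, 1.0, 0.9, 1.0]
    print(anomalous_pairs(traffic))   # index of the pair containing the spike
```

Working on the detail coefficients means slow trends in the traffic, which end up in the approximation coefficients, do not trigger alarms on their own.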
