Similar Articles
20 similar articles found.
1.
The spread of fake news has become a significant social problem, drawing great concern for fake news detection (FND). Pretrained language models (PLMs) such as BERT and RoBERTa can benefit this task greatly, leading to state-of-the-art performance. The common paradigm for utilizing these PLMs is fine-tuning, in which a linear classification layer is built on top of the well-initialized PLM network, resulting in an FND model, and then the full model is tuned on a training corpus. Although great successes have been achieved, this paradigm still involves a significant gap between language model pretraining and target-task fine-tuning. Fortunately, prompt learning, a new alternative for exploiting PLMs, can handle this issue naturally, showing potential for further performance improvements. To this end, we propose knowledgeable prompt learning (KPL) for this task. First, we apply prompt learning to FND by carefully designing a sophisticated prompt template and the corresponding verbalizer words for the task. Second, we incorporate external knowledge into the prompt representation, making the representation more expressive for predicting the verbalizer words. Experimental results on two benchmark datasets demonstrate that prompt learning is better than the baseline fine-tuning use of PLMs for FND and can outperform all previous representative methods. Our final knowledgeable model (i.e., KPL) provides further improvements. In particular, it achieves an average increase of 3.28% in F1 score under low-resource conditions compared with fine-tuning.
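The prompt-learning setup described above can be sketched as a cloze template plus a verbalizer that maps label words to classes. This is an illustrative sketch, not the authors' code: the template wording, label set, and verbalizer words are assumptions, and the masked-LM word scores are stand-in values.

```python
# Hypothetical verbalizer: map each class to candidate words the PLM may
# predict at the [MASK] position (words and labels are illustrative).
VERBALIZER = {
    "real": ["true", "real"],
    "fake": ["false", "fake"],
}

def build_prompt(news_text: str) -> str:
    """Wrap the input in a cloze-style template with a [MASK] slot."""
    return f"{news_text} This news is [MASK]."

def verbalize(mask_word_scores: dict) -> str:
    """Aggregate the PLM's scores for each class's verbalizer words
    and return the highest-scoring class label."""
    label_scores = {
        label: sum(mask_word_scores.get(w, 0.0) for w in words)
        for label, words in VERBALIZER.items()
    }
    return max(label_scores, key=label_scores.get)

# Toy scores standing in for a masked-LM head's output distribution.
scores = {"true": 0.1, "real": 0.2, "false": 0.4, "fake": 0.3}
print(build_prompt("Scientists discover water on Mars."))
print(verbalize(scores))  # "fake": 0.4 + 0.3 > 0.1 + 0.2
```

In the full method, the knowledge-enhanced prompt representation would replace the fixed template string, but the verbalizer step stays conceptually the same.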

2.
In recent years, large-scale Pre-trained Language Models (PLMs) like BERT have achieved state-of-the-art results on many NLP tasks. We explore whether BERT understands deontic logic, which is important for the fields of legal AI and digital government. We measure BERT's understanding of deontic logic through the Deontic Modality Classification (DMC) task. Experiments show that without fine-tuning, or with fine-tuning on only a small amount of data, BERT cannot achieve good performance on the DMC task. Therefore, we propose a new method for BERT fine-tuning and prediction, called DeonticBERT. The method incorporates heuristic knowledge from deontic logic theory as an inductive bias into BERT, through a template function and a mapping between category labels and predicted words, to steer BERT to understand the DMC task. This can also stimulate BERT to recall the deontic logic knowledge learned in pre-training. We conduct experiments on a widely used English dataset as well as a Chinese dataset we constructed. Experimental results show that on the DMC task, DeonticBERT achieves 66.9% and 91% accuracy under zero-shot and few-shot conditions, respectively, far exceeding other baselines. This demonstrates that DeonticBERT does enable BERT to understand deontic logic and to handle related tasks without much fine-tuning data. Our research helps facilitate applying large-scale PLMs like BERT to legal AI and digital government.

3.
Effective learning schemes such as fine-tuning, zero-shot, and few-shot learning have been widely used to obtain considerable performance with only a handful of annotated training data. In this paper, we present a unified benchmark to facilitate zero-shot text classification in Turkish. For this purpose, we evaluate three methods, namely Natural Language Inference (NLI), Next Sentence Prediction, and our proposed model based on Masked Language Modeling and pre-trained word embeddings, on nine Turkish datasets covering three main categories: topic, sentiment, and emotion. We use pre-trained Turkish monolingual and multilingual transformer models, namely BERT, ConvBERT, DistilBERT and mBERT. The results show that ConvBERT with the NLI method yields the best results at 79% and outperforms the previously used multilingual XLM-RoBERTa model by 19.6%. The study contributes to the literature by using previously unattempted transformer models for Turkish and by showing that monolingual models improve zero-shot text classification performance over multilingual models.

4.
Dialectal Arabic (DA) refers to the varieties of everyday spoken language in the Arab world. These dialects differ according to the country and region of the speaker, and their textual content is constantly growing with the rise of social media networks and web blogs. Although research on Natural Language Processing (NLP) for standard Arabic, namely Modern Standard Arabic (MSA), has witnessed remarkable progress, research efforts on DA are rather limited. This is due to numerous challenges, such as the scarcity of labeled data as well as the nature and structure of DA. While some recent works have achieved decent results on several DA sentence classification tasks, more complex tasks, such as sequence labeling, still suffer from weak performance on DA varieties with either a limited amount of labeled data or unlabeled data only. Besides, it has been shown that zero-shot transfer learning from models trained on MSA does not perform well on DA. In this paper, we introduce AdaSL, a new unsupervised domain adaptation framework for Arabic multi-dialectal sequence labeling that leverages unlabeled DA data, labeled MSA data, and existing multilingual and Arabic Pre-trained Language Models (PLMs). The proposed framework relies on four key components: (1) domain-adaptive fine-tuning of multilingual/MSA language models on unlabeled DA data, (2) sub-word embedding pooling, (3) iterative self-training on unlabeled DA data, and (4) iterative DA and MSA distribution alignment. We evaluate our framework on multi-dialectal Named Entity Recognition (NER) and Part-of-Speech (POS) tagging tasks. The overall results show that zero-shot transfer learning using our proposed framework boosts the performance of the multilingual PLMs by 40.87% in macro-F1 for the NER task and by 6.95% in accuracy for the POS tagging task. For the Arabic PLMs, our framework increases performance by 16.18% macro-F1 for NER and 2.22% accuracy for POS tagging, thus achieving new state-of-the-art zero-shot transfer learning performance for Arabic multi-dialectal sequence labeling.

5.
任妮, 鲍彤, 沈耕宇, 郭婷. 《情报科学》 (Information Science), 2021, 39(11): 96–102
[Purpose/Significance] Domain-oriented fine-grained named entity recognition is of great importance for improving the precision of text mining. Taking tomato pest and disease entities as an example, this paper explores a deep-learning approach to domain-oriented fine-grained named entity recognition. [Method/Process] Using e-books, papers, and web pages as data sources, six entity types were annotated: variety, pest/disease, symptom, time, plant part, and control agent. Character vectors pretrained with BERT and with CBOW were separately fed into a BiLSTM-CRF model for training, and rules were applied after recognition to correct entity boundaries. [Results/Conclusions] BERT-pretrained character vectors combined with BiLSTM-CRF, supplemented with boundary-control rules, achieved an F1 score of 81.03%, outperforming the other models and performing well for entity recognition in the tomato pest and disease domain. [Innovation/Limitations] BERT-pretrained character vectors effectively reduce the impact of word-segmentation errors on entities in this domain, and the supplementary rules, tailored to the characteristics of each entity type, effectively control entity boundaries and improve recognition accuracy. However, the rules were applied only at the test stage and were not incorporated into training, so overall accuracy still has room for improvement.
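The rule-based boundary-control step described above can be illustrated with a minimal post-processing pass over BIO tags. This is a sketch of the general idea only; the specific rule (promoting a dangling I- tag to B-) and the entity-type names are assumptions, not the paper's actual rules.

```python
def fix_bio(tags):
    """Repair illegal BIO sequences: an I- tag whose predecessor does not
    continue the same entity type is promoted to B-, so every entity
    span starts with a proper B- boundary."""
    fixed = []
    for i, tag in enumerate(tags):
        if tag.startswith("I-"):
            ent = tag[2:]
            prev = fixed[i - 1] if i > 0 else "O"
            if prev not in (f"B-{ent}", f"I-{ent}"):
                tag = f"B-{ent}"
        fixed.append(tag)
    return fixed

# Entity-type names here (PEST, DRUG) are hypothetical stand-ins.
print(fix_bio(["O", "I-PEST", "I-PEST", "O", "I-DRUG"]))
# ['O', 'B-PEST', 'I-PEST', 'O', 'B-DRUG']
```

A real system would add domain-specific rules (e.g., known agent-name suffixes) on top of such structural repairs.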

6.
In this paper, we analyze the influence of Twitter users in sharing news articles that may affect readers' mood. We collected data from more than 2000 Twitter users who shared news articles from Corriere.it, a daily newspaper that provides mood metadata annotated by readers on a voluntary basis. We automatically annotated the personality types and communication styles of these Twitter users and analyzed the correlations between personality, communication style, Twitter metadata (such as following and followers), and the type of mood associated with the articles they shared. We also ran a feature selection task to find the best predictors of positive and negative mood sharing, and a classification task. We automatically predicted positive and negative mood sharers with 61.7% F1-measure.

7.
Document-level relation extraction (RE) aims to extract the relations of entities that may span sentences. Existing methods mainly rely on two types of techniques: pre-trained language models (PLMs) and reasoning skills. Although various reasoning methods have been proposed, how to elicit learnt factual knowledge from PLMs for better reasoning ability has not yet been explored. In this paper, we propose a novel Collective Prompt Tuning with Relation Inference (CPT-RI) for document-level RE, which improves upon existing models in two respects. First, considering the long input and varied templates, we adopt a collective prompt tuning method following an update-and-reuse strategy: a generic prompt is first encoded and then updated with exact entity pairs to form relation-specific prompts. Second, we introduce a relation inference module that conducts global reasoning over all relation prompts via constrained semantic segmentation. Extensive experiments on publicly available benchmark datasets demonstrate the effectiveness of our proposed CPT-RI compared with the baseline model (ATLOP (Zhou et al., 2021)), which it improves by 0.57% in F1 score on the DocRED dataset, 2.20% on the CDR dataset, and 2.30% on the GDA dataset. In addition, ablation studies further verify the effects of the collective prompt tuning and relation inference.

8.
Sentiment analysis is concerned with automatically identifying the sentiment or opinion expressed in a given piece of text. Most prior work either uses prior lexical knowledge, defined as the sentiment polarity of words, or views the task as a text classification problem and relies on labeled corpora to train a sentiment classifier. While lexicon-based approaches do not adapt well to different domains, corpus-based approaches require expensive manual annotation effort.

9.
Few-shot intent recognition aims to identify a user's intent from an utterance with limited training data. A considerable number of existing methods rely mainly on generic knowledge acquired on the base classes to identify the novel classes. Such methods typically ignore the characteristics of each meta task itself, and so cannot make full use of the limited given samples when classifying unseen classes. To deal with these issues, we propose a Contrastive learning-based Task Adaptation model (CTA) for few-shot intent recognition. In detail, we leverage contrastive learning to achieve task adaptation and make full use of the limited samples of novel classes. First, a self-attention layer is employed in the task adaptation module to establish interactions between samples of different categories, so that the new representations are task-specific rather than relying entirely on the base classes. Then, contrastive loss functions and the semantics of the label names are used, respectively, to reduce the similarity between sample representations in different categories while increasing it within the same categories. Experimental results on the public OOS dataset verify the effectiveness of our proposal, which beats the competitive baselines in terms of accuracy. Besides, we conduct cross-domain experiments on three datasets: OOS, SNIPS and ATIS. We find that CTA gains clear accuracy improvements in all cross-domain experiments, indicating that it generalizes better than other competitive baselines in both cross-domain and single-domain settings.
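The contrastive objective described above, pulling same-class representations together while pushing different-class ones apart, can be sketched in a few lines. This is a generic supervised contrastive loss under simplifying assumptions (dot-product similarity, toy 2-D embeddings), not the paper's exact formulation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sup_contrastive_loss(embs, labels, temp=1.0):
    """Average over (anchor, positive) pairs of
    -log( exp(sim(anchor, pos)/temp) / sum over non-anchors exp(sim/temp) ).
    Lower loss means same-class embeddings are relatively closer."""
    total, count = 0.0, 0
    for i, (e_i, y_i) in enumerate(zip(embs, labels)):
        pos = [j for j in range(len(embs)) if j != i and labels[j] == y_i]
        if not pos:
            continue
        denom = sum(math.exp(dot(e_i, embs[j]) / temp)
                    for j in range(len(embs)) if j != i)
        for j in pos:
            total += -math.log(math.exp(dot(e_i, embs[j]) / temp) / denom)
            count += 1
    return total / count

# Toy embeddings: the first two are one class, the last two another.
embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(sup_contrastive_loss(embs, [0, 0, 1, 1]))  # low: classes well separated
print(sup_contrastive_loss(embs, [0, 1, 0, 1]))  # higher: labels cut across clusters
```

The real model computes this over task-adapted representations produced by the self-attention layer, with label-name semantics contributing a second loss term.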

10.
Lexical Analysis in Chinese Natural-Language Retrieval
耿骞, 毛瑞. 《情报科学》 (Information Science), 2004, 22(4): 466–469
This paper discusses lexical analysis in natural-language retrieval. It first reviews the types of natural-language retrieval processing based on lexical analysis, such as weighted statistical methods, N-gram methods, and statistical learning methods. It then discusses the methods and process of lexical analysis, focusing on word segmentation and part-of-speech tagging and the procedures involved, with particular attention to probabilistic and statistical approaches. Finally, it examines open problems in lexical analysis.
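As a concrete example of the word-segmentation step discussed above, here is a classic forward-maximum-matching segmenter, one of the simplest dictionary-based approaches (the paper surveys probabilistic methods as well; this sketch and its tiny dictionary are illustrative only).

```python
# Hypothetical mini-dictionary; a real system would load a large lexicon.
DICT = {"自然", "语言", "自然语言", "检索", "处理"}
MAX_LEN = 4  # longest dictionary entry, in characters

def fmm_segment(text):
    """Forward maximum matching: at each position, greedily take the
    longest dictionary word; fall back to a single character."""
    words, i = [], 0
    while i < len(text):
        for l in range(min(MAX_LEN, len(text) - i), 0, -1):
            cand = text[i:i + l]
            if cand in DICT or l == 1:
                words.append(cand)
                i += l
                break
    return words

print(fmm_segment("自然语言检索处理"))  # ['自然语言', '检索', '处理']
```

Probabilistic segmenters replace the greedy longest-match rule with a search for the segmentation maximizing corpus-estimated word probabilities.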

11.
In spite of the vast amount of work on subjectivity and sentiment analysis (SSA), it is not yet particularly clear how lexical information can best be modeled for a morphologically rich language. To bridge this gap, we report successful models targeting lexical input in Arabic, a language of very complex morphology. Namely, we measure the impact of both gold and automatic segmentation on the task and build effective models that achieve results significantly higher than our baselines. Our models exploiting predicted segments improve subjectivity classification by 6.02% F1-measure and sentiment classification by 4.50% F1-measure over the majority-class baseline with surface word forms. We also perform in-depth (error) analyses of the models' behavior and provide detailed explanations of subjectivity and sentiment expression in Arabic against the background of the morphological richness in which the work is situated.

12.
Language modeling (LM), which provides a principled mechanism for associating quantitative scores with sequences of words or tokens, has long been an interesting yet challenging problem in the field of speech and language processing. The n-gram model is still the predominant method, while a number of disparate LM methods, exploring either lexical co-occurrence or topic cues, have been developed to complement the n-gram model with some success. In this paper, we explore a novel language modeling framework for speech recognition built on the notion of relevance, where the relationship between a search history and the word being predicted is discovered through different granularities of semantic context for relevance modeling. Empirical experiments on a large vocabulary continuous speech recognition (LVCSR) task demonstrate that the various language models derived from our framework are very competitive with existing language models in terms of both perplexity and recognition error rate reduction.
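The n-gram baseline mentioned above can be made concrete with a bare-bones bigram model evaluated by perplexity, the intrinsic metric used in the paper's comparisons. This is a minimal sketch with a made-up two-sentence corpus and add-one smoothing, not the paper's model.

```python
import math
from collections import Counter

# Toy training corpus with sentence-boundary markers.
corpus = [["<s>", "the", "cat", "sat", "</s>"],
          ["<s>", "the", "dog", "sat", "</s>"]]

unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter((s[i], s[i + 1]) for s in corpus for i in range(len(s) - 1))
V = len(unigrams)  # vocabulary size for Laplace smoothing

def bigram_prob(prev, word):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

def perplexity(sentence):
    """exp of the average negative log bigram probability."""
    logp = sum(math.log(bigram_prob(sentence[i], sentence[i + 1]))
               for i in range(len(sentence) - 1))
    return math.exp(-logp / (len(sentence) - 1))

seen = ["<s>", "the", "cat", "sat", "</s>"]
shuffled = ["<s>", "sat", "dog", "the", "</s>"]
print(perplexity(seen) < perplexity(shuffled))  # True: familiar order scores better
```

Relevance-modeling and topic-based LMs are typically interpolated with such an n-gram backbone rather than replacing it outright.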

13.
This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be performed successfully. We analyze whether applying semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system, TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches based principally on morphosyntax. The results obtained for English show that semantic knowledge aids temporal expression and event recognition, achieving error reductions of 59% and 21%, respectively, while its contribution to classification is limited. From the analysis of the results, we conclude that semantic knowledge leads to more general models and aids the recognition of temporal entities that are ambiguous at shallower levels of language analysis. We also found that lexical semantics and semantic roles have complementary advantages and that it is useful to combine them. Finally, we carried out the same analysis for Spanish, obtaining comparable advantages. This supports the hypothesis that the proposed semantic knowledge may be useful across different languages.

14.
Aspect-based sentiment analysis aims to determine sentiment polarities toward specific aspect terms within the same sentence or document. Most recent studies adopt attention-based neural network models to implicitly connect aspect terms with context words. However, these studies are limited by insufficient interaction between aspect terms and opinion words, leading to poor performance on robustness test sets. In addition, we found that robustness test sets create new sentences that interfere with the original information of a sentence, which often makes the text too long and causes long-distance dependence problems. At the same time, these new sentences introduce more non-target aspect terms, which mislead the model in the absence of relevant knowledge guidance. This study proposes a knowledge-guided multi-granularity graph convolutional neural network (KMGCN) to solve these problems. The multi-granularity attention mechanism is designed to enhance the interaction between aspect terms and opinion words. To address long-distance dependence, KMGCN uses a graph convolutional network built on a semantic map derived from fine-tuned pre-trained models. In particular, KMGCN uses a mask mechanism guided by conceptual knowledge to handle more aspect terms (both target and non-target). Experiments are conducted on 12 SemEval-2014 variant benchmark datasets, and the results demonstrate the effectiveness of the proposed framework.

15.
Using lexical chains for keyword extraction
Keywords can be considered condensed versions of documents and short forms of their summaries. In this paper, the problem of automatic extraction of keywords from documents is treated as a supervised learning task. A lexical chain holds a set of semantically related words of a text, and a lexical chain can be said to represent the semantic content of a portion of the text. Although lexical chains have been used extensively in text summarization, their use for the keyword extraction problem has not been fully investigated. In this paper, a keyword extraction technique that uses lexical chains is described, and encouraging results are obtained.
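The lexical-chain idea above can be sketched very simply: group words that share a semantic relation into chains, then propose members of the strongest chain as keywords. This is a heavily simplified illustration; the lookup table below stands in for WordNet relations, and chain scoring by raw length is an assumption (real systems also weight relation types).

```python
# Hypothetical synonym/hypernym table mapping words to a shared concept key.
SYNSETS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "road": "way", "highway": "way",
}

def build_chains(words):
    """Group words of the text by shared concept into lexical chains."""
    chains = {}
    for w in words:
        key = SYNSETS.get(w)
        if key is not None:
            chains.setdefault(key, []).append(w)
    return chains

def keywords(words):
    """Propose the members of the longest (strongest) chain as keywords."""
    chains = build_chains(words)
    if not chains:
        return []
    return max(chains.values(), key=len)

text = "the car passed a truck while an automobile waited on the highway".split()
print(keywords(text))  # ['car', 'truck', 'automobile']
```

In the paper's supervised setting, chain-derived scores would feed a classifier as features rather than being used as the final ranking directly.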

16.
钟小丹, 冯宗祥. 《科教文汇》 (The Science Education Article Collects), 2013, (19): 125–127
Based on a self-built micro-corpus of speeches by U.S. First Lady Michelle Obama and a VOA news corpus, drawing on 李文中's keyword-extraction method and the public-speaking theories of 方乐 and 狄安娜, this paper uses the lexical retrieval software WordSmith to analyze the functions of different types of keywords in Michelle Obama's speeches: first- and second-person pronouns establish a direct connection with the audience; nouns articulate shared values and outlooks on life; and positive adjectives inspire and encourage the audience. The paper offers a reference for research on English public speaking.

17.
Cross-genre author profiling aims to build generalized models for predicting the profile traits of authors that remain useful across different text genres, for computer forensics, marketing, and other applications. The cross-genre author profiling task becomes challenging when dealing with low-resourced languages, owing to the lack of standard corpora and methods, and even more so when the data is code-switched, informal, and unstructured. In previous studies, cross-genre author profiling has mainly been explored for monolingual texts in highly resourced languages (English, Spanish, etc.). However, it has not been thoroughly explored for code-switched text, which is widely used for communication over social media. To fill this gap, we propose a transfer learning-based solution for the cross-genre author profiling task on code-switched (English–RomanUrdu) text using three widely known genres: Facebook comments/posts, Tweets, and SMS messages. In this article, we first experiment with traditional machine learning, deep learning and pre-trained transfer learning models (MBERT, XLMRoBERTa, ULMFiT, and XLNET) for the same-genre and cross-genre gender identification task. We then propose a novel Trans-Switch approach that focuses on the code-switching nature of the text and trains on specialized language models. In addition, we developed three RomanUrdu-to-English translated corpora to study the impact of translation on author profiling tasks. The results show that the proposed Trans-Switch model outperforms the baseline deep learning and pre-trained transfer learning models for cross-genre author profiling on code-switched text. Further, the experiments show that translating the RomanUrdu text does not improve results.

18.
Few-Shot Event Classification (FSEC) aims at assigning event labels to unlabeled sentences when only limited annotated samples are available. Existing works mainly focus on using meta-learning to overcome the low-resource problem, which still requires abundant held-out classes for model learning and selection. We therefore propose to deal with the low-resource problem by utilizing prompts. Further, existing methods suffer from severe trigger biases that may result in ignoring the context: correct classifications are obtained by looking only at the triggers, which hurts the model's generalization ability. Thus, we propose a knowledgeable augmented-trigger prompt FSEC framework (AugPrompt), which can overcome the bias issues and alleviate the classification bottleneck caused by insufficient data. In detail, we first design an External Knowledge Injection (EKI) module that incorporates an external knowledge base (Related Words) for trigger augmentation. Then, we propose an Event Prompt Generation (EPG) module to generate appropriate discrete prompts for initializing the continuous prompts. After that, we propose an Event Prompt Tuning (EPT) module that automatically searches for prompts in the continuous space for FSEC and finally predicts the corresponding event types of the inputs. We conduct extensive experiments on two public English datasets for FSEC, FewEvent and RAMS. The experimental results show the superiority of our proposal over the competitive baselines, with a maximum accuracy increase of 10.8% over the strongest baseline.

19.
This paper provides the first broad overview of the relation between different interpretation methods and human eye-movement behaviour across different tasks and architectures. The interpretation methods of neural networks reveal the information the machine considers important, while human eye gaze is believed to be a proxy for the human cognitive process. Comparing them therefore explains machine behaviour in terms of human behaviour, which can lead to improved machine performance through minimising their difference. We consider three types of natural language processing (NLP) tasks: sentiment analysis, relation classification and question answering; four interpretation methods, based on simple gradient, integrated gradient, input perturbation and attention; and three architectures: LSTM, CNN and Transformer. We leverage two corpora annotated with eye-gaze information: the Zuco dataset and the MQA-RC dataset. This research sets up two research questions. First, we investigate whether the saliency (importance) of input words conforms to human eye-gaze features. To this end, we compute a saliency distance (SD) between the input-word saliency given by an interpretation method and an eye-gaze feature, where SD is defined as the KL-divergence between the saliency distribution over input words and the eye-gaze feature. We found that SD scores vary depending on the combination of task, interpretation method and architecture. Second, we investigate whether models whose saliency conforms well to human eye-gaze behaviour have better prediction performance. To this end, we propose a novel evaluation device called the "SD-performance curve" (SDPC), which represents cumulative model performance against SD scores. SDPC enables us to analyse underlying phenomena that are overlooked when using only the macroscopic metrics, such as average SD scores and rank correlations, typically employed in past studies. We observe that the impact of good saliency conformity between humans and machines on task performance varies among the combinations of tasks, interpretation methods and architectures. Our findings should be considered when introducing eye-gaze information into model training to improve model performance.
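The saliency-distance (SD) computation described above reduces to a KL-divergence between two distributions over the same word positions. The sketch below illustrates that computation with made-up saliency and gaze values; the smoothing constant is an assumption for numerical safety, not part of the paper's definition.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions over the same word positions;
    eps guards against log(0) for zero-probability positions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy, normalized distributions over a 4-word sentence.
saliency = [0.5, 0.3, 0.1, 0.1]  # model word importances (interpretation method)
gaze     = [0.4, 0.4, 0.1, 0.1]  # human fixation proportions (eye-gaze feature)
uniform  = [0.25] * 4

# A lower SD means the model's saliency is closer to human gaze.
print(kl_divergence(saliency, gaze) < kl_divergence(saliency, uniform))  # True
```

An SD-performance curve would then plot cumulative task performance of models sorted by such SD scores.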

20.
Conceptual metaphor detection is a well-researched topic in Natural Language Processing. At the same time, analysis of conceptual metaphor use yields unique insight into individual psychological processes and characteristics, as demonstrated by research in cognitive psychology. Although state-of-the-art language models allow highly effective automatic detection of conceptual metaphor in benchmark datasets, these models have never been applied to psychological tasks. The benchmark datasets differ greatly from experimental texts recorded or produced in a psychological setting, in their domain, genre, and the scope of metaphoric expressions covered. We present the first experiment to apply NLP metaphor detection methods to a psychological task, specifically, analyzing individual differences. For that, we annotate MetPersonality, a dataset of Russian texts written in a psychological experiment setting, with conceptual metaphor. With a widely used conceptual metaphor annotation procedure, we obtain low annotation quality, which arises from dataset characteristics uncommon in typical automatic metaphor detection tasks. We suggest a novel conceptual metaphor annotation procedure that mitigates these issues, increasing inter-annotator agreement to a moderately high level. We leverage the annotated dataset and existing Russian metaphor datasets to select, train and evaluate state-of-the-art metaphor detection models, obtaining acceptable results on the metaphor detection task. In turn, the most effective model is used to detect conceptual metaphor automatically in RusPersonality, a larger dataset containing meta-information on the psychological traits of the participant authors. Finally, we analyze correlations between automatically detected metaphor use and psychological traits encoded in the Freiburg Personality Inventory (FPI). Our pioneering work on automatically detected metaphor use and individual differences demonstrates the possibility of unprecedented large-scale research on the relation between metaphor use and personality traits, dispositions, and cognitive and emotional processing.
