Similar literature
20 similar documents found (search time: 515 ms)
1.
Nonlinear system identification and prediction is a complex task, and non-parametric models such as neural networks are often used in place of intricate mathematics. To that end, an improved approach to nonlinear system identification using neural networks was recently presented in Gupta and Sinha (J. Franklin Inst. 336 (1999) 721). Therein a learning algorithm was proposed in which both the slope of the activation function at a neuron, β, and the learning rate, η, were made adaptive. The proposed algorithm assumes that η and β are independent variables. Here, we show that the slope and the learning rate are not independent in a general dynamical neural network, and this should be taken into account when designing a learning algorithm. Further, relationships between η and β are developed which help reduce the number of degrees of freedom and the computational complexity in the optimisation task of training a fully adaptive neural network. Simulation results based on Gupta and Sinha (1999) and the proposed approach support the analysis.
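
The dependence between η and β can be illustrated with a small numerical sketch. It assumes a single neuron with a tanh activation trained by plain gradient descent on squared error, which is a simplification and not the dynamical network or the derivation of the paper; the function name `train` and the sample data are hypothetical. Under these assumptions, a neuron with slope β, learning rate η and weights w follows the same trajectory as a neuron with slope 1, learning rate β²η and weights βw.

```python
import math

def train(w, x_samples, targets, beta, eta, epochs=50):
    """Gradient descent for one neuron y = tanh(beta * w.x) on E = 0.5 * e^2."""
    w = list(w)
    for _ in range(epochs):
        for x, d in zip(x_samples, targets):
            y = math.tanh(beta * sum(wi * xi for wi, xi in zip(w, x)))
            e = d - y
            # dE/dw = -e * (1 - y^2) * beta * x, so the update adds eta times its negative
            w = [wi + eta * e * (1.0 - y * y) * beta * xi for wi, xi in zip(w, x)]
    return w

# Hypothetical toy data, chosen only for illustration
xs = [(0.5, -1.0), (1.0, 0.3), (-0.7, 0.8)]
ds = [0.2, -0.4, 0.6]
beta, eta = 2.0, 0.05
w0 = [0.1, -0.2]

# Network A: slope beta, learning rate eta, initial weights w0
w_a = train(w0, xs, ds, beta=beta, eta=eta)
# Network B: slope 1, learning rate beta^2 * eta, initial weights beta * w0
w_b = train([beta * wi for wi in w0], xs, ds, beta=1.0, eta=beta ** 2 * eta)

# beta * w_a should coincide with w_b up to rounding
scaled_w_a = [beta * wi for wi in w_a]
```

In this toy setting, adapting β and η separately adds no freedom beyond adapting a single effective learning rate, which is the kind of redundancy the abstract refers to.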

2.
This paper presents a semantically rich document representation model for automatically classifying financial documents into predefined categories utilizing deep learning. The model architecture consists of two main modules: document representation and document classification. In the first module, a document is enriched with semantics using background knowledge provided by an ontology and through the acquisition of its relevant terminology. Acquisition of terminology integrated into the ontology extends the capabilities of semantically rich document representations with in-depth coverage of concepts, thereby capturing the whole conceptualization involved in documents. Semantically rich representations obtained from the first module serve as input to the document classification module, which aims at finding the most appropriate category for that document through deep learning. Three different deep learning networks, each belonging to a different category of machine learning techniques for ontological document classification using a real-life ontology, are used. Multiple simulations are carried out with various deep neural network configurations, and our findings reveal that a feedforward network with three hidden layers and 1024 neurons obtains the highest document classification performance on the INFUSE dataset. The performance in terms of F1 score is further increased by almost five percentage points to 78.10% for the same network configuration when the relevant terminology integrated into the ontology is applied to enrich document representation. Furthermore, we conducted a comparative performance evaluation using various state-of-the-art document representation approaches and classification techniques including shallow and conventional machine learning classifiers.

3.
Deep multi-view clustering (MVC) mines and exploits the complex relationships among views to learn compact data clusters with deep neural networks in an unsupervised manner. Recent deep contrastive learning (CL) methods have shown promising performance in MVC by learning cluster-oriented deep feature representations, which is realized by contrasting positive and negative sample pairs. However, most existing deep contrastive MVC methods focus only on one-sided contrastive learning, such as feature-level or cluster-level contrast, failing to integrate the two sides or to bring in other important aspects of contrast. Additionally, most of them work in a separate two-stage manner, i.e., first feature learning and then data clustering, so the two stages fail to benefit each other. To address these challenges, in this paper we propose a novel joint contrastive triple-learning framework to learn multi-view discriminative feature representations for deep clustering, which is threefold, i.e., feature-level alignment-oriented and commonality-oriented CL, and cluster-level consistency-oriented CL. The former two submodules aim to contrast the encoded feature representations of data samples at different feature levels, while the last contrasts the data samples in the cluster-level representations. Benefiting from the triple contrast, more discriminative representations of views can be obtained. Meanwhile, a view weight learning module is designed to learn and exploit the quantitative complementary information across the learned discriminative features of each view. Thus, the contrastive triple-learning module, the view weight learning module and the data clustering module with these fused features are jointly performed, so that these modules are mutually beneficial.
Extensive experiments on several challenging multi-view datasets show the superiority of the proposed method over many state-of-the-art methods, especially the large improvements of 15.5% and 8.1% on Caltech-4V and CCV in terms of accuracy. Due to its promising performance on visual datasets, the proposed method can be applied to many practical visual applications such as visual recognition and analysis. The source code of the proposed method is provided at https://github.com/ShizheHu/Joint-Contrastive-Triple-learning.

4.
Edge computing has recently gained momentum as it provides computing services for mobile devices through high-speed networks. In edge computing system optimization, deep reinforcement learning (DRL) enhances the quality of service (QoS) and shortens the age of information (AoI). However, loosely coupled edge servers saturate a noisy data space for DRL exploration, and learning a reasonable solution is enormously costly. Most existing works assume that the edge is an exact observation system and harvests well-labeled data for the pretraining of DRL neural networks. However, this assumption stands in opposition to the motivation of driving DRL to explore unknown information and increases the scheduling and computing costs in large-scale dynamic systems. This article leverages DRL with a distillation module to drive learning efficiency for edge computing with partial observation. We formulate the deadline-aware offloading problem as a decentralized partially observable Markov decision process (Dec-POMDP) with distillation, called fast decentralized reinforcement distillation (Fast-DRD). Each edge server makes offloading decisions in accordance with its own observations and learning strategies in a decentralized manner. By defining trajectory observation history (TOH) distillation and trust distillation to avoid overfitting, Fast-DRD learns a suitable offloading model in a noisy partially observed edge system and reduces the cost of communication among servers. Finally, experimental simulations are presented to evaluate and compare the effectiveness and complexity of Fast-DRD.

5.
Irony as a literary technique is widely used in online texts such as Twitter posts. Accurate irony detection is crucial for tasks such as effective sentiment analysis. A text’s ironic intent is defined by its context incongruity. For example, in the phrase “I love being ignored”, the irony is defined by the incongruity between the positive word “love” and the negative context of “being ignored”. Existing studies mostly formulate irony detection as a standard supervised learning text categorization task, relying on explicit expressions for detecting context incongruity. In this paper we formulate irony detection instead as a transfer learning task where supervised learning on irony labeled text is enriched with knowledge transferred from external sentiment analysis resources. Importantly, we focus on identifying the hidden, implicit incongruity without relying on explicit incongruity expressions, as in “I like to think of myself as a broken down Justin Bieber – my philosophy professor.” We propose three transfer learning-based approaches to using sentiment knowledge to improve the attention mechanism of recurrent neural models for capturing hidden patterns for incongruity. Our main findings are: (1) Using sentiment knowledge from external resources is a very effective approach to improving irony detection; (2) For detecting implicit incongruity, transferring deep sentiment features seems to be the most effective way. Experiments show that our proposed models outperform state-of-the-art neural models for irony detection.

6.
This study uses data mining techniques to examine the effect of various demographic, cognitive and psychographic factors on Egyptian citizens’ use of e-government services. Data mining uses a broad family of computationally intensive methods that include decision trees, neural networks, rule induction, machine learning and graphic visualization. Three artificial neural network models (multi-layer perceptron neural network [MLP], probabilistic neural network [PNN] and self-organizing maps neural network [SOM]) and three machine learning techniques (classification and regression trees [CART], multivariate adaptive regression splines [MARS], and support vector machines [SVM]) are compared to a standard statistical method (linear discriminant analysis [LDA]). The variable sets considered are sex, age, educational level, e-government services perceived usefulness, ease of use, compatibility, subjective norms, trust, civic mindedness, and attitudes. The study shows how it is possible to identify various dimensions of e-government services usage behavior by uncovering complex patterns in the dataset, and also shows the classification abilities of data mining techniques.

7.
Automated legal text classification is a prominent research topic in the legal field. It lays the foundation for building an intelligent legal system. Current literature focuses on international legal texts, such as Chinese cases, European cases, and Australian cases. Little attention is paid to text classification for U.S. legal texts. Deep learning has been applied to improving text classification performance. Its effectiveness needs further exploration in domains such as the legal field. This paper investigates legal text classification with a large collection of labeled U.S. case documents by comparing the effectiveness of different text classification techniques. We propose a machine learning algorithm using domain concepts as features and random forests as the classifier. Our experimental results on 30,000 full U.S. case documents in 50 categories demonstrate that our approach significantly outperforms a deep learning system built on multiple pre-trained word embeddings and deep neural networks. In addition, applying only the top 400 domain concepts as features for building the random forests achieves the best performance. This study provides a reference for selecting machine learning techniques for building high-performance text classification systems in the legal domain or other fields.

8.
With the emergence and development of deep generative models, such as variational auto-encoders (VAEs), research on topic modeling has successfully extended to a new area: neural topic modeling, which aims to learn disentangled topics to understand the data better. However, the original VAE framework has been shown to be limited in disentanglement performance, passing its inherent defects on to a neural topic model (NTM). In this paper, we put forward that the optimization objectives of contrastive learning are consistent with two important goals (alignment and uniformity) of well-disentangled topic learning. Also, the optimization objectives of contrastive learning are consistent with two key evaluation measures for topic models, topic coherence and topic diversity. So, we come to the important conclusion that alignment and uniformity of disentangled topic learning can be quantified with topic coherence and topic diversity. Accordingly, we are inspired to propose the Contrastive Disentangled Neural Topic Model (CNTM). By representing both words and topics as low-dimensional vectors in the same embedding space, we apply contrastive learning to neural topic modeling to produce factorized and disentangled topics in an interpretable manner. We compare our proposed CNTM with strong baseline models on widely used metrics. Our model achieves the best topic coherence scores under the most general evaluation setting (100% proportion of topics selected) with 25.0%, 10.9%, 24.6%, and 51.3% improvements above the second-best models’ scores reported on the four datasets 20 Newsgroups, Web Snippets, Tag My News, and Reuters, respectively. Our method also gets the second-best topic diversity scores on the 20 Newsgroups and Web Snippets datasets. Our experimental results show that CNTM can effectively leverage the disentanglement ability of contrastive learning to solve the inherent defect of neural topic modeling and obtain better topic quality.
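
The two goals named above, alignment and uniformity, have a commonly used formulation in the contrastive learning literature: alignment is the mean squared distance between embeddings of positive pairs, and uniformity is the log of the mean Gaussian potential over all pairs. The sketch below computes both on toy 2-D embeddings. Whether CNTM uses exactly these definitions is an assumption, and the function names, the temperature `t=2.0` and the sample points are illustrative only.

```python
import math
from itertools import combinations

def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def alignment(pos_pairs):
    """Mean squared distance over positive pairs; lower means better aligned."""
    return sum(sq_dist(u, v) for u, v in pos_pairs) / len(pos_pairs)

def uniformity(points, t=2.0):
    """Log of the mean Gaussian potential over all pairs; lower means more spread out."""
    pairs = list(combinations(points, 2))
    return math.log(sum(math.exp(-t * sq_dist(u, v)) for u, v in pairs) / len(pairs))

# Toy embeddings on the unit circle (hypothetical data)
spread = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
clustered = [(1.0, 0.0), (0.99, 0.14), (0.98, -0.2), (0.995, 0.1)]

perfect_alignment = alignment([((1.0, 0.0), (1.0, 0.0))])
u_spread = uniformity(spread)
u_clustered = uniformity(clustered)
```

Embeddings spread around the circle score lower (better) on uniformity than embeddings bunched in one region, which matches the intuition that diverse topics occupy distinct directions in the embedding space.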

9.
Most existing state-of-the-art neural network models for math word problems use the Goal-driven Tree-Structured decoder (GTS) to generate expression trees. However, we found that GTS does not provide good predictions for longer expressions, mainly because it does not capture the relationships among the goal vectors of each node in the expression tree and ignores the position order of the nodes before and after the operator. In this paper, we propose a novel Recursive tree-structured neural network with Goal Forgetting and information aggregation (RGFNet) to address these limitations. The goal forgetting and information aggregation module is based on ordinary differential equations (ODEs), and we use it to build a sub-goal information feedback neural network (SGIFNet). Unlike GTS, which uses two-layer gated-feedforward networks to generate goal vectors, we introduce a novel sub-goal generation module. The sub-goal generation module can capture the relationships among related nodes (e.g., parent nodes and sibling nodes) using an attention mechanism. Experimental results on two large public datasets, i.e., Math23K and Ape-clean, show that our tree-structured model outperforms the state-of-the-art models and obtains answer accuracy over 86%. Furthermore, the performance on long-expression problems is promising.

10.
In this paper we study the problem of classification of textual web reports. We are specifically focused on situations in which structured information extracted from the reports is used for classification. We present an experimental classification system based on usage of third party linguistic analyzers, our previous work on web information extraction, and fuzzy inductive logic programming (fuzzy ILP). A detailed study of the so-called ‘Fuzzy ILP Classifier’ is the main contribution of the paper. The study includes formal models, prototype implementation, extensive evaluation experiments and comparison of the classifier with other alternatives like decision trees, support vector machines, neural networks, etc.

11.
Big data generated by social media is a valuable source of information, which offers an excellent opportunity to mine valuable insights. In particular, user-generated content such as reviews, recommendations, and users’ behavior data is useful for supporting the marketing activities of many companies. Knowing what users are saying about the products they bought or the services they used through reviews in social media represents a key factor for making decisions. Sentiment analysis is one of the fundamental tasks in Natural Language Processing. Deep learning for sentiment analysis has achieved great success and has allowed several firms to analyze and extract relevant information from their textual data. However, as the volume of data grows, a model that runs in a traditional environment cannot be effective, which implies the importance of efficient distributed deep learning models for social big data analytics. Besides, it is known that social media analysis is a complex process, which involves a set of complex tasks. Therefore, it is important to address the challenges and issues of social big data analytics and enhance the performance of deep learning techniques in terms of classification accuracy to obtain better decisions. In this paper, we propose an approach for sentiment analysis that adopts fastText with recurrent neural network variants to represent textual data efficiently. Then, it employs the new representations to perform the classification task. Its main objective is to enhance the performance of well-known Recurrent Neural Network (RNN) variants in terms of classification accuracy and to handle large-scale data. In addition, we propose a distributed intelligent system for real-time social big data analytics. It is designed to ingest, store, process, index, and visualize huge amounts of information in real time.
The proposed system adopts distributed machine learning with our proposed method for enhancing decision-making processes. Extensive experiments conducted on two benchmark data sets demonstrate that our proposal for sentiment analysis outperforms well-known distributed recurrent neural network variants (i.e., Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and Gated Recurrent Unit (GRU)). Specifically, we tested the efficiency of our approach using the three different deep learning models. The results show that our proposed approach is able to enhance the performance of the three models. The current work can provide several benefits for researchers and practitioners who want to collect, handle, analyze and visualize several sources of information in real time. Also, it can contribute to a better understanding of public opinion and user behaviors using our proposed system with the improved variants of the most powerful distributed deep learning and machine learning algorithms. Furthermore, it is able to increase the classification accuracy of several existing works based on RNN models for sentiment analysis.

12.
Graph neural networks have been frequently applied in recommender systems due to their powerful representation abilities for irregular data. However, these methods still suffer from difficulties such as inflexible graph structure, sparse and highly imbalanced data, and relatively shallow networks, which limit rating prediction ability for recommendations. This paper presents a novel deep dynamic graph attention framework based on influence and preference relationship reconstruction (DGA-IPR) for recommender systems to learn optimal latent representations of users and items. The entire framework involves a user branch and an item branch. An influence-based dynamic graph attention (IDGA) module, a preference-based dynamic graph attention (PDGA) module, and an adaptive fine feature extraction (AFFE) module are respectively constructed for each branch. Concretely, the first two attention modules concentrate on reconstructing influence and preference relationship graphs, breaking imbalanced and fixed constraints of graph structures. Then a deep feature aggregation block and an adaptive feature fusion operation are built, improving the network depth and capturing potential high-order information expressions. Besides, AFFE is designed to acquire finer latent features for users and items. The DGA-IPR architecture is formed by integrating IDGA, PDGA, and AFFE for users and items, respectively. Experiments reveal the superiority of DGA-IPR over existing recommendation models.

13.
This paper introduces an alternative method, based on artificial neural networks (ANNs), for obtaining numerical solutions of mathematical models of dynamic systems represented by ordinary differential equations (ODEs) and partial differential equations (PDEs). The proposed trial solution of differential equations (DEs) consists of two parts. The first part satisfies the initial and boundary conditions (BCs). The second part is not affected by the initial conditions and BCs and only tries to satisfy the DE. This part involves a feedforward ANN containing adjustable parameters (weights and biases). The proposed solution satisfying the boundary and initial conditions uses a feedforward ANN with one hidden layer, varying the number of neurons in the hidden layer according to the complexity of the considered problem. The ANN with an appropriate architecture has been trained with the backpropagation algorithm using an adaptive learning rate to satisfy the DE. Moreover, we have first developed the general formula for the numerical solutions of nth-order initial-value problems by using ANNs. For numerical applications, the ODEs that are the mathematical models of linear and non-linear mass-damper-spring systems and the second- and fourth-order PDEs that are the mathematical models of the control of longitudinal vibrations of rods and lateral vibrations of beams have been considered. Finally, the responses of the controlled and non-controlled systems have been obtained. The obtained results are presented graphically and some concluding remarks are given.
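
The two-part trial solution can be sketched for the simplest case, a first-order initial-value problem y'(x) = f(x, y) with y(0) = y0. The form y_trial(x) = y0 + x·N(x) used below is a common construction in the ANN-for-DE literature and is assumed here; the abstract does not give the paper's exact formula. The names `net`, `trial` and `residual`, the sigmoid hidden layer, and the parameter values are all hypothetical, and training is omitted.

```python
import math

def net(x, params):
    """One-hidden-layer feedforward network N(x); params is a list of (w, b, v)."""
    return sum(v / (1.0 + math.exp(-(w * x + b))) for w, b, v in params)

def trial(x, params, y0):
    """Trial solution: y0 fixes the initial condition, x * N(x) is the adjustable part."""
    return y0 + x * net(x, params)

def residual(x, params, y0, f, h=1e-5):
    """DE residual y'(x) - f(x, y) at x, using a central-difference derivative."""
    dy = (trial(x + h, params, y0) - trial(x - h, params, y0)) / (2.0 * h)
    return dy - f(x, trial(x, params, y0))

# Untrained, arbitrary parameters: the initial condition holds regardless
params = [(1.0, 0.0, 0.5), (-2.0, 0.3, 1.5)]
y_at_zero = trial(0.0, params, y0=1.0)

# Residual for the example ODE y' = -y; training would drive this toward zero
r = residual(0.5, params, y0=1.0, f=lambda x, y: -y)
```

Because the factor x vanishes at x = 0, the initial condition is satisfied for any network parameters, so training only has to minimize the squared residual over a set of collocation points.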

14.
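University mathematics is a public foundational course with a very broad audience in universities; its core is to cultivate students' higher-order thinking abilities, such as logical thinking, creative thinking, and modeling awareness. The goal of deep learning is to develop learners' higher-order thinking abilities, while metacognitive ability refers to an individual's ability to regulate their own cognitive processes; many researchers hold that deep learning and metacognitive ability reinforce each other. From the perspective of the learning sciences, this paper combines metacognitive training with a general model of the deep learning process in university mathematics, with the aim of achieving deep learning in university mathematics by improving university students' metacognitive ability.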

15.
Augmented reality is very useful in medical education because of the difficulty of providing body organs in a regular classroom. In this paper, we propose to apply augmented reality to improve teaching in medical schools and institutes. We propose a novel convolutional neural network (CNN) for gesture recognition, which recognizes a human's gestures as specific instructions. We use augmented reality technology for anatomy learning, which simulates scenarios where students can learn anatomy with HoloLens instead of rare specimens. We have used mesh reconstruction to reconstruct the 3D specimens. A user interface featuring augmented reality has been designed which fits the common process of anatomy learning. To improve the interaction services, we have applied gestures as an input source and improved the accuracy of gesture recognition with an updated deep convolutional neural network. Our proposed learning method includes many separate training procedures using cloud computing. Each training model and its related inputs are sent to our cloud and the results are returned to the server. The suggested cloud includes Windows and Android devices, which are able to install deep convolutional learning libraries. Compared with previous gesture recognition approaches, our approach is not only more accurate but also has more potential for adding new gestures. Furthermore, we have shown that neural networks can be combined with augmented reality as a rising field, and that augmented reality and neural networks have great potential to be employed in medical learning and education systems.

16.
Representation learning has recently been used to remove sensitive information from data and improve the fairness of machine learning algorithms in social applications. However, previous works that used neural networks are opaque and poorly interpretable, as it is difficult to intuitively determine the independence between representations and sensitive information. The internal correlation among data features has not been fully discussed, and it may be the key to improving the interpretability of neural networks. A novel fair representation algorithm referred to as FRC is proposed from this conjecture. It indicates how representations independent of multiple sensitive attributes can be learned by applying specific correlation constraints on representation dimensions. Specifically, dimensions of the representation and sensitive attributes are treated as statistical variables. The representation variables are divided into two parts related to and unrelated to the sensitive variables by adjusting their absolute correlation coefficient with sensitive variables. The potential impact of sensitive information on representations is concentrated in the related part. The unrelated part of the representation can be used in downstream tasks to yield fair results. FRC takes the correlation between dimensions as the key to solving the problem of fair representation. Empirical results show that our representations enhance the ability of neural networks to show fairness and achieve better fairness-accuracy tradeoffs than state-of-the-art works.
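
The idea of splitting representation dimensions by their absolute correlation with sensitive variables can be illustrated with a post-hoc check. This is a minimal sketch, assuming the Pearson correlation coefficient and a fixed cut-off; FRC itself imposes correlation constraints during training, which is not shown. The function names, the `threshold=0.3` value and the toy data are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either variable is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (std_a * std_b) if std_a > 0 and std_b > 0 else 0.0

def split_dimensions(rep_columns, sensitive_columns, threshold=0.3):
    """Assign each representation dimension to the related or unrelated part."""
    related, unrelated = [], []
    for j, col in enumerate(rep_columns):
        # Strongest absolute correlation with any sensitive attribute
        strongest = max(abs(pearson(col, s)) for s in sensitive_columns)
        (related if strongest >= threshold else unrelated).append(j)
    return related, unrelated

# Toy data: one binary sensitive attribute, two representation dimensions
sensitive = [[0, 1, 0, 1, 0, 1]]
rep = [
    [0.1, 0.9, 0.2, 0.8, 0.0, 1.0],    # tracks the sensitive attribute closely
    [1.0, 1.0, -1.0, -1.0, 0.0, 0.0],  # uncorrelated with it by construction
]
related, unrelated = split_dimensions(rep, sensitive)
```

Only the dimensions listed in `unrelated` would be passed to a downstream task, mirroring the abstract's statement that the unrelated part yields fair results.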

17.
This paper studies the problem of adaptive neural network (NN) output-feedback control for a group of uncertain nonlinear multi-agent systems (MASs) from the viewpoint of cooperative learning. It is assumed that all MASs have identical unknown nonlinear dynamic models but carry out different periodic control tasks, i.e., each agent system has its own periodic reference trajectory. By establishing a network topology among systems, we propose a new consensus-based distributed cooperative learning (DCL) law for the unknown weights of radial basis function (RBF) neural networks appearing in output-feedback control laws. The main advantage of such a learning scheme is that all estimated weights converge to a small neighborhood of the optimal value over the union of all system estimated state orbits. Thus, the learned NN weights have better generalization ability than those obtained by traditional NN learning laws. Our control approach also guarantees the convergence of tracking errors and the stability of the closed-loop system. Under the assumption that the network topology is undirected and connected, we give a rigorous proof by verifying the cooperative persistent excitation condition of RBF regression vectors. This condition is defined in our recent work and plays a key role in analyzing the convergence of adaptive parameters. Finally, two simulation examples are provided to verify the effectiveness and advantages of the control scheme proposed in this paper.
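
The consensus ingredient of such a learning law can be sketched in isolation: each agent repeatedly pulls its weight estimate toward those of its neighbors on an undirected, connected graph. The sketch shows only this consensus term, in discrete time, and omits the adaptive (tracking-error-driven) term of the paper's DCL law. The update rule, step size `gamma`, the three-agent path graph and the initial weights are illustrative assumptions.

```python
def consensus_step(weights, neighbors, gamma):
    """One synchronous step: W_i <- W_i - gamma * sum_j (W_i - W_j) over neighbors j."""
    new_weights = []
    for i, w_i in enumerate(weights):
        pull = [sum(w_i[k] - weights[j][k] for j in neighbors[i])
                for k in range(len(w_i))]
        new_weights.append([w_i[k] - gamma * pull[k] for k in range(len(w_i))])
    return new_weights

# Undirected, connected path graph 0 - 1 - 2 (hypothetical topology)
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# Each agent starts from a different weight estimate; the average is [0.0, 2.0]
weights = [[1.0, 0.0], [0.0, 2.0], [-1.0, 4.0]]

for _ in range(200):
    weights = consensus_step(weights, neighbors, gamma=0.3)
```

With symmetric links the average of the estimates is preserved at every step, and connectivity makes every agent converge to it, which is why the abstract's assumption of an undirected and connected topology matters.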

18.
Machine learning algorithms enable advanced decision making in contemporary intelligent systems. Research indicates that there is a tradeoff between their model performance and explainability. Machine learning models with higher performance are often based on more complex algorithms and therefore lack explainability and vice versa. However, there is little to no empirical evidence of this tradeoff from an end user perspective. We aim to provide empirical evidence by conducting two user experiments. Using two distinct datasets, we first measure the tradeoff for five common classes of machine learning algorithms. Second, we address the problem of end user perceptions of explainable artificial intelligence augmentations aimed at increasing the understanding of the decision logic of high-performing complex models. Our results diverge from the widespread assumption of a tradeoff curve and indicate that the tradeoff between model performance and explainability is much less gradual in the end user’s perception. This is a stark contrast to assumed inherent model interpretability. Further, we found the tradeoff to be situational for example due to data complexity. Results of our second experiment show that while explainable artificial intelligence augmentations can be used to increase explainability, the type of explanation plays an essential role in end user perception.

19.
Detecting sentiments in natural language is tricky even for humans, making its automated detection more complicated. This research proffers a hybrid deep learning model for fine-grained sentiment prediction in real-time multimodal data. It combines the strengths of deep learning nets with machine learning to deal with two specific semiotic systems, namely the textual (written text) and the visual (still images), and their combination within online content using decision-level multimodal fusion. The proposed contextual ConvNet-SVMBoVW model has four modules, namely the discretization, text analytics, image analytics, and decision modules. The input to the model is multimodal text, m ∈ {text, image, info-graphic}. The discretization module uses Google Lens to separate the text from the image; the two are then processed as discrete entities and sent to the respective text analytics and image analytics modules. The text analytics module determines the sentiment using a hybrid of a convolutional neural network (ConvNet) enriched with the contextual semantics of SentiCircle. An aggregation scheme is introduced to compute the hybrid polarity. A support vector machine (SVM) classifier is trained using bag-of-visual-words (BoVW) for predicting the sentiment of the visual content. A Boolean decision module with a logical OR operation is added to the architecture, which validates and categorizes the output on the basis of five fine-grained sentiment categories (truth values), namely ‘highly positive,’ ‘positive,’ ‘neutral,’ ‘negative’ and ‘highly negative.’ The accuracy achieved by the proposed model is nearly 91%, which is an improvement over the accuracy obtained by the text and image modules individually.

20.
Although deep learning breakthroughs in NLP are based on learning distributed word representations by neural language models, these methods suffer from a classic drawback of unsupervised learning techniques. Furthermore, the performance of general word embeddings has been shown to be heavily task-dependent. To tackle this issue, recent research has proposed learning sentiment-enhanced word vectors for sentiment analysis. However, the common limitation of these approaches is that they require external sentiment lexicon sources, and the construction and maintenance of these resources involve a set of complex, time-consuming, and error-prone tasks. In this regard, this paper proposes a method of sentiment lexicon embedding that represents the semantic relationships of sentiment words better than existing word embedding techniques, without a manually annotated sentiment corpus. The major distinguishing factor of the proposed framework is that it jointly encodes morphemes and their POS tags and trains only important lexical morphemes in the embedding space. To verify the effectiveness of the proposed method, we conducted experiments comparing it with two baseline models. As a result, the revised embedding approach mitigated the problem of the conventional context-based word embedding method and, in turn, improved the performance of sentiment classification.
