20 similar articles were found (search time: 31 ms).
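[Purpose/Significance] To help job seekers with an information science background understand the market's specific demand for information science talent, and to guide educators in designing information science curricula and setting talent training objectives. [Method/Process] Job postings for information science positions were collected from major Chinese recruitment websites to build an information science recruitment corpus. Using the CRF machine learning model and the Bi-LSTM-CRF, BERT, and BERT-Bi-LSTM-CRF deep learning models, five types of information science recruitment entities were extracted from the corpus for mining and analysis. [Results/Conclusions] In comparative experiments on automatic extraction of recruitment entities, run on an existing corpus of 2,000 annotated job postings, the CRF model performed best, with an overall F-score of 85.07% and an F-score of 91.67% on the "major requirement" entity. The BERT model achieved an even higher F-score of 92.10% on the "major requirement" entity. The CRF model was then used to extract entities from all 5,287 qualifying job postings, a social network of information science recruitment entities was constructed, and implicit knowledge was mined through informetric analysis and social network analysis.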
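The F-scores quoted above are per-entity measures. As a rough illustration of how such a score is commonly computed, the sketch below derives an entity-level F1 from BIO tag sequences. The abstract does not say which tagging scheme or matching rule the authors used, so the BIO scheme, the exact-match rule, and the labels `MAJOR` and `DEGREE` are assumptions made only for this example.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1: a span counts only if boundaries and type agree."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: MAJOR = "major requirement", DEGREE = "degree requirement".
gold_tags = ["B-MAJOR", "I-MAJOR", "O", "B-DEGREE"]
pred_tags = ["B-MAJOR", "I-MAJOR", "O", "O"]
score = entity_f1(gold_tags, pred_tags)  # precision 1.0, recall 0.5
```

Exact-match scoring is the stricter of the usual options; a partial-overlap rule would give higher numbers on the same predictions.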

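[Purpose/Significance] Given the close relationship between data science and information science, mining the knowledge contained in data science job requirements in depth helps clarify society's demand for talent in fields related to information science, which in turn helps improve information science training programs and align university education with social needs. [Method/Process] The article collected job postings for data science positions from mainstream Chinese recruitment websites, cleaned the data by parsing and deduplication, manually annotated the job requirement entities in the postings, and compared three deep learning models, LSTM, BiLSTM-CRF, and BERT, on the entity recognition task. [Results/Conclusions] The results show that the BiLSTM-CRF model recognized job requirement entities best and had a certain advantage over the other two deep learning models. Based on the extracted entities, the article summarizes the skills and qualities that information science professionals should currently have, covering practical ability, educational requirements, scripting languages, data processing, and general competence, and proposes a corresponding talent training program for information science education.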

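Compared with traditional recruitment, online recruitment overcomes the limits of time and space and offers low cost, a large volume of information, speed, and personalized service. The recruitment website is built on the ASP.NET platform and developed in C#. The system gives job seekers and employers a convenient and fast platform that makes job hunting and recruiting easy. This paper describes the system's development background, its design, and the implementation code of some modules.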

The number of patent documents is currently rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual categorization. Because accurate patent classification is crucial for finding relevant existing patents in a given field, patent categorization is a very important and useful field. As patent documents are structured documents with their own characteristics that distinguish them from general documents, these unique traits should be considered in the patent categorization process. In this paper, we categorize Japanese patent documents automatically, focusing on their characteristics: patents are structured by claims, purposes, effects, embodiments of the invention, and so on. We propose a patent document categorization method that uses the k-NN (k-Nearest Neighbour) approach. In order to retrieve similar documents from a training document set, some specific components that denote the so-called semantic elements, such as claim, purpose, and application field, are compared instead of the whole texts. Because those specific components are identified by various user-defined tags, all of the components are first clustered into several semantic elements. Such semantically clustered structural components are the basic features of patent categorization. We can achieve a 74% improvement of categorization performance over a baseline system that does not use the structural information of the patent.
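The idea of comparing semantic elements rather than whole texts can be sketched as follows. This is a simplified illustration, not the authors' system: it assumes the semantic elements have already been identified, uses whitespace-tokenized bag-of-words cosine similarity per element, averages the element scores with equal weight, and uses similarity-weighted voting. The element names, the sample texts, and the category labels are made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Doc = Dict[str, str]  # semantic element name -> text of that element


def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized bag-of-words vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def doc_similarity(x: Doc, y: Doc, elements: List[str]) -> float:
    """Average the per-element similarities (equal weights are an assumption)."""
    return sum(cosine_bow(x.get(e, ""), y.get(e, "")) for e in elements) / len(elements)


def knn_classify(query: Doc, training: List[Tuple[Doc, str]],
                 elements: List[str], k: int = 3) -> str:
    """Return the category with the largest similarity-weighted vote among the k nearest."""
    scored = [(doc_similarity(query, doc, elements), label) for doc, label in training]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes: Counter = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim
    return votes.most_common(1)[0][0]


# Illustrative training data; labels are used here only as example category names.
training_set = [
    ({"claim": "lithium battery electrode", "purpose": "increase battery capacity"}, "H01M"),
    ({"claim": "battery cell separator", "purpose": "improve battery safety"}, "H01M"),
    ({"claim": "image sensor pixel array", "purpose": "reduce image noise"}, "H04N"),
]
query_doc = {"claim": "battery electrode coating", "purpose": "increase capacity"}
predicted = knn_classify(query_doc, training_set, ["claim", "purpose"], k=2)
```

Keeping the elements separate means a match in the claim text cannot be diluted by unrelated wording elsewhere in the document, which is the intuition the abstract describes.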

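This paper analyzes the current employment situation in China and, in view of the coexistence of "difficulty hiring" and "difficulty finding jobs", proposes a two-way recruitment and employment recommendation model based on vector similarity. The model first filters the candidate recommendation information by given conditions. It then converts recruitment and job-seeking information into vectors, establishing a quantification rule for each component so that the components can be computed. Finally, it uses the cosine formula to compute the similarity between vectors and takes this as the criterion for two-way recommendation. The model achieved good runtime efficiency and high accuracy on both a test dataset and a real dataset, achieved optimized recommendation, eased to some extent the current recruitment and employment difficulties in Jiangmen City, and produced good social benefits.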
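The cosine step of such a model can be sketched in a few lines. The abstract does not give the quantification rules, so the three vector components below (education level, years of experience, salary band) and all numeric values are hypothetical; only the cosine formula itself comes from the text.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank(query: Sequence[float], candidates: Dict[str, Sequence[float]],
         top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank candidates by cosine similarity to the query, highest first."""
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical quantified components: (education level, years of experience, salary band).
jobs = {"job_a": [3, 2, 4], "job_b": [1, 5, 1]}
seekers = {"seeker_x": [3, 2, 5], "seeker_y": [1, 6, 1]}

# Two-way recommendation: the same function ranks jobs for a seeker and seekers for a job.
jobs_for_x = rank(seekers["seeker_x"], jobs)
seekers_for_b = rank(jobs["job_b"], seekers)
```

Because cosine similarity ignores vector length, the quantification rules matter: components on very different scales would need to be normalized before comparison.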

With the rapid growth of information delivery on the Internet, document classification has become indispensable and is expected to be handled by automatic text categorization. This paper presents a text categorization system to solve the multi-class categorization problem. The system consists of two modules: the processing module and the classifying module. In the first module, ICF and Uni are used as the indicators to extract the relevant terms. In the classifying module, fuzzy set theory is incorporated into the OAA-SVM; specifically, we propose an OAA-FSVM classifier to implement a multi-class classification system. The performance of OAA-SVM and OAA-FSVM is evaluated by a macro-average performance index.
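A macro-average gives every class equal weight regardless of how many documents it contains. The abstract does not say which per-class measure is averaged, so the sketch below assumes F1; the class names and predictions are invented for illustration.

```python
from typing import List


def macro_f1(gold: List[str], pred: List[str]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in gold or pred."""
    classes = sorted(set(gold) | set(pred))
    if not classes:
        return 0.0
    scores = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Per-class F1 here is 2/3 for "a", 2/3 for "b", and 1 for "c".
gold_labels = ["a", "a", "b", "c"]
pred_labels = ["a", "b", "b", "c"]
result = macro_f1(gold_labels, pred_labels)
```

A micro-average would instead pool the counts across classes, which lets large classes dominate the score.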

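张志明, 杨爱元. 《科教文汇》, 2020, (9): 102-103, 116
At the theoretical level, job analysis comprises three main elements: selection of a job analysis system, collection of data and information, and statistical analysis. Innovative thinking is integrated into the job analysis process: training in interdisciplinary, multi-perspective innovative thinking enriches the means of data collection, ensures the reliability of job analysis data, and improves the validity of job analysis results. At the practical level, case study experiments and simulation training are used to continually strengthen the ability to solve real problems through job analysis.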

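朱永武. 《现代情报》, 2013, 33(2): 117-120
Building on research in China and abroad, the article proposes a research framework. A pretest was used to form a formal questionnaire containing 30 items. Knowledge workers in selected enterprises and public institutions in Nanjing, Jiangyin, Lianyungang, and other places were surveyed, and 171 valid questionnaires were returned. Using SPSS 13.0, correlation analysis and regression analysis were applied to examine the relationship between job performance and information literacy, both overall and in its dimensions of information knowledge, information awareness, information ability, and information ethics. The findings offer some reference value for cultivating the information literacy of knowledge workers in China and improving their job performance.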

A challenge for sentence categorization and novelty mining is to detect not only when text is relevant to the user’s information need, but also when it contains something new which the user has not seen before. It involves two tasks that need to be solved. The first is identifying relevant sentences (categorization) and the second is identifying new information from those relevant sentences (novelty mining). Many previous studies of relevant sentence retrieval and novelty mining have been conducted on the English language, but few papers have addressed the problem of multilingual sentence categorization and novelty mining. This is an important issue in global business environments, where mining knowledge from text in a single language is not sufficient. In this paper, we perform the first task by categorizing Malay and Chinese sentences, then comparing their performances with that of English. Thereafter, we conduct novelty mining to identify the sentences with new information. Experimental results on TREC 2004 Novelty Track data show similar categorization performance on Malay and English sentences, which greatly outperform Chinese. In the second task, it is observed that we can achieve similar novelty mining results for all three languages, which indicates that our algorithm is suitable for novelty mining of multilingual sentences. In addition, after benchmarking our results with novelty mining without categorization, it is learnt that categorization is necessary for the successful performance of novelty mining.
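The second task, deciding whether a relevant sentence adds something new, is often approached by comparing each sentence with everything seen so far. The sketch below is one such generic scheme and is not taken from the paper: a sentence is kept as novel when its highest cosine similarity to any earlier sentence falls below a threshold. The threshold value of 0.6, the whitespace tokenization, and the sample sentences are assumptions.

```python
import math
from collections import Counter
from typing import List


def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized bag-of-words vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def novel_sentences(sentences: List[str], threshold: float = 0.6) -> List[str]:
    """Keep a sentence if it is not too similar to any sentence that came before it."""
    seen: List[str] = []
    novel: List[str] = []
    for s in sentences:
        max_sim = max((cosine_bow(s, h) for h in seen), default=0.0)
        if max_sim < threshold:
            novel.append(s)
        seen.append(s)  # every sentence joins the history, novel or not
    return novel


stream = [
    "the bank raised interest rates",
    "the bank raised interest rates today",
    "a storm hit the coast",
]
kept = novel_sentences(stream)
```

Whitespace tokenization would not work for unsegmented Chinese text, so a multilingual setting like the one described would need a language-specific tokenizer in front of this step.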

A new dictionary-based text categorization approach is proposed to classify chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-related information more exactly from web pages. After automatic segmentation of the documents to find dictionary terms for document expansion, the approach adopts latent semantic indexing (LSI) to produce the final document vectors, and the relevant categories are finally assigned to the test document by using the k-NN text categorization algorithm. The effects of the characteristics of the chemistry dictionary and the test collection on the categorization efficiency are discussed in this paper, and a new voting method based on the collection characteristics is also introduced to further improve the categorization performance. The experimental results show that the proposed approach outperforms the traditional categorization method and is applicable to the classification of chemical web pages.

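Based on an analysis of the basic knowledge and skill requirements of the information management profession, a random sample of corporate job postings on 51job.com was surveyed and analyzed in detail from three angles: overall knowledge and skill requirements, requirements by type of enterprise, and requirements by level of economic development. The results show that enterprises, regardless of their type or level of economic development, have broadly consistent overall expectations of information management graduates. The main professional knowledge and skills required are information analysis (including intelligence analysis), information systems development and management, management statistics, and knowledge of enterprise systems.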

Most previous work on feature selection emphasized only the reduction of the high dimensionality of the feature space. But in cases where many features are highly redundant with each other, we must utilize other means, for example, more complex dependence models such as Bayesian network classifiers. In this paper, we introduce a new information gain and divergence-based feature selection method for statistical machine learning-based text categorization without relying on more complex dependence models. Our feature selection method strives to reduce redundancy between features while maintaining information gain in selecting appropriate features for text categorization. Empirical results are given on a number of datasets, showing that our feature selection method is more effective than Koller and Sahami’s method [Koller, D., & Sahami, M. (1996). Toward optimal feature selection. In Proceedings of ICML-96, 13th international conference on machine learning], which is a greedy feature selection method, and conventional information gain, which is commonly used in feature selection for text categorization. Moreover, our feature selection method sometimes produces more improvements of conventional machine learning algorithms over support vector machines, which are known to give the best classification accuracy.
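For reference, the conventional information gain that the abstract uses as a baseline can be computed as the drop in class entropy once a term's presence or absence is known. The sketch below shows only that baseline, not the authors' divergence-based method, and it says nothing about redundancy between terms. The toy documents and labels are invented.

```python
import math
from collections import Counter
from typing import List, Set


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs: List[Set[str]], labels: List[str], term: str) -> float:
    """IG(term) = H(C) - [P(t) * H(C | t present) + P(not t) * H(C | t absent)]."""
    n = len(labels)
    if n == 0:
        return 0.0
    with_t = [y for d, y in zip(docs, labels) if term in d]
    without_t = [y for d, y in zip(docs, labels) if term not in d]
    conditional = (len(with_t) / n) * entropy(with_t) \
        + (len(without_t) / n) * entropy(without_t)
    return entropy(labels) - conditional


# Toy corpus: each document is reduced to its set of terms.
toy_docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"stock", "team"}]
toy_labels = ["sport", "sport", "finance", "finance"]

ig_goal = information_gain(toy_docs, toy_labels, "goal")  # separates the classes perfectly
ig_team = information_gain(toy_docs, toy_labels, "team")  # tells us nothing about the class
```

In this toy corpus "goal" and "stock" each score the maximum of one bit even though they carry the same information, which is the kind of redundancy a plain information gain ranking cannot see.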

Text categorization is an important research area and has been receiving much attention due to the growth of on-line information and of the Internet. Automated text categorization is generally cast as a multi-class classification problem. Much previous work focused on binary document classification problems. Support vector machines (SVMs) excel in binary classification, but the elegant theory behind large-margin hyperplanes cannot be easily extended to multi-class text classification. In addition, the training time and scaling are also important concerns. On the other hand, other techniques that extend naturally to multi-class classification are generally not as accurate as SVM. This paper presents a simple and efficient solution to multi-class text categorization. Classification problems are first formulated as optimization via discriminant analysis. Text categorization is then cast as the problem of finding coordinate transformations that reflect the inherent similarity from the data. While most of the previous approaches decompose a multi-class classification problem into multiple independent binary classification tasks, the proposed approach enables direct multi-class classification. By using generalized singular value decomposition (GSVD), a coordinate transformation that reflects the inherent class structure indicated by the generalized singular values is identified. Extensive experiments demonstrate the efficiency and effectiveness of the proposed approach.

Since job characteristics in areas such as Industry 4.0 change rapidly, a fast tool for analyzing job advertisements is needed. Current knowledge about the competencies required in Industry 4.0 is scarce. The goal of this paper is to develop a profile of Industry 4.0 job advertisements, using text mining on publicly available job advertisements, which are often used as a channel for collecting relevant information about the required knowledge and skills in rapidly changing industries. We searched a website that publishes job advertisements for postings related to Industry 4.0, and performed text mining analysis on the data collected from those job advertisements. Analysis of the job advertisements revealed that most of them were for full-time entry, associate, and mid-senior level management positions and mainly came from the United States and Germany. Text mining analysis resulted in two groups of job profiles. The first group of job profiles was focused solely on the knowledge related to Industry 4.0: cyberphysical systems and the Internet of things for robotized production; and smart production design and production control. The second group of job profiles was focused on more general knowledge areas, which are adapted to Industry 4.0: supply chain management, customer satisfaction, and enterprise software. Topic mining was conducted on the extracted phrases, generating various multidisciplinary job profiles. Higher educational institutions, human resources professionals, as well as experts that are already employed or aspire to be employed in Industry 4.0 organizations, would benefit from the results of our analysis.

The chronic shortage of doctors in rural India seriously impacts the quality of health care available to villagers. In recent years, there has been considerable excitement about digital diagnostics as a possible answer to this situation by allowing non-doctors to diagnose and treat patients. In this article, the author focuses on one such diagnostic tool that has gained serious traction among transnational health foundations and state governments alike. The focus is on the customization and localization of this software through a pilot study in the central Himalayas. A baseline survey and extensive interviews are conducted for categorization and population of health data content. This entailed analyzing the segmentation and transfer of health information on disease history and symptoms from the patient to the software as well as situating this study in the larger understanding of the healthcare system in this community. In doing so, the author argues that much of such health information is difficult to categorize and sufficiently vague to not provide for a confident diagnosis. Further, the data population of the treatment segment is deeply political and sociocultural. This article thereby problematizes the innate assumption underlying the design of such software, that it is possible to diagnose and treat patients based on pure information.

In text categorization, it is often the case that the numbers of documents in different categories differ, i.e., the class distribution is imbalanced. We propose a unique approach to improve text categorization under class imbalance by exploiting the semantic context in text documents. Specifically, we generate new samples of rare classes (categories with relatively small amounts of training data) by using global semantic information of classes represented by probabilistic topic models. In this way, the numbers of samples in different categories can become more balanced and the performance of text categorization can be improved using this transformed data set. Indeed, the proposed method is different from traditional re-sampling methods, which try to balance the number of documents in different classes by re-sampling the documents in rare classes. Such re-sampling methods can cause overfitting. Another benefit of our approach is the effective handling of noisy samples. Since all the new samples are generated by topic models, the impact of noisy samples is dramatically reduced. Finally, as demonstrated by the experimental results, the proposed method can achieve better performance under class imbalance and is more tolerant to noisy samples.

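Automatic text classification is a basic task in text information processing. This work applies case-based reasoning to text classification: it uses word co-occurrence information to extract topic words and frequent co-occurring word itemsets from texts, indexes the case base with a clustering algorithm, and thereby implements a case-based automatic text classification system. Experiments show that, compared with TFIDF-based text representation and the nearest-neighbor classification algorithm, text representation based on word co-occurrence information together with clustering-based indexing of the case base effectively improves classification accuracy and efficiency, thereby broadening the range of applications of case-based reasoning.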

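To improve the productivity of researchers and teachers working on Chinese text classification, and in view of the current state of development of Chinese text classification systems in China, this paper builds a management platform that covers the whole text classification process, including preprocessing, feature selection, weight calculation, automatic classification, and evaluation of classification results. During development, system integration ideas and methods were used to combine the authors' own code with relevant open-source code. Testing shows that the system implements all functions of the automatic text classification process.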

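Application of the Analytic Hierarchy Process in Constructing a Competency Model for University Faculty Recruitment   Total citations: 1 (self-citations: 0; citations by others: 1)
Applying a competency model helps, when recruiting university faculty, to assess as a whole the competency characteristics that candidates need, especially deeper characteristics such as professional personality and job-seeking motivation. The competency indicators carry different weights, so using the analytic hierarchy process to analyze the weight of each indicator is a prerequisite for constructing a competency evaluation model for university faculty recruitment. The results of an applied example evaluation show that the resulting model is workable in practice.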
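The weighting step of the analytic hierarchy process can be sketched as follows. The abstract gives neither the comparison matrix nor the solution method, so this example uses the common geometric-mean approximation to the principal eigenvector together with Saaty's consistency ratio. The three criteria and every pairwise judgment are hypothetical; only "professional personality" and "job-seeking motivation" are named in the text.

```python
import math
from typing import List

# Saaty's random consistency index for matrices of order 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix: List[List[float]]) -> List[float]:
    """Approximate priority weights by normalizing the geometric mean of each row."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix: List[List[float]], weights: List[float]) -> float:
    """CR = CI / RI, with lambda_max estimated as the mean of (M w)_i / w_i."""
    n = len(matrix)
    if n < 3:
        return 0.0  # matrices of order 1 or 2 are always consistent
    lambda_max = sum(
        sum(matrix[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical criteria and judgments; entry [i][j] says how much more
# important criterion i is than criterion j.
criteria = ["teaching_ability", "professional_personality", "job_seeking_motivation"]
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
```

A consistency ratio below roughly 0.1 is the usual rule of thumb for accepting a set of judgments; the matrix above was built to be perfectly consistent, so its ratio is effectively zero.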

The rapid expansion of Big Data Analytics is forcing companies to rethink their Human Resource (HR) needs. However, at the same time, it is unclear which types of job roles and skills constitute this area. To this end, this study seeks to clarify the heterogeneous nature of the skills required in Big Data professions, by analyzing a large amount of real-world job posts published online. More precisely we: 1) identify four Big Data ‘job families’; 2) recognize nine homogeneous groups of Big Data skills (skill sets) that are being demanded by companies; 3) characterize each job family with the appropriate level of competence required within each Big Data skill set. We propose a novel, semi-automated, fully replicable, analytical methodology based on a combination of machine learning algorithms and expert judgement. Our analysis leverages a significant amount of online job posts, obtained through web scraping, to generate an intelligible classification of job roles and skill sets. The results can support business leaders and HR managers in establishing clear strategies for the acquisition and the development of the right skills needed to make the most of Big Data. Moreover, the structured classification of job families and skill sets will help establish a common dictionary to be used by HR recruiters and education providers, so that supply and demand can more effectively meet in the job marketplace.
