Found 20 similar documents (search time: 31 ms)
This study tests the assumption of Markovity in library circulation data. The data were obtained from The Ohio State University Libraries for the years 1975-1978. The test instrument was Rodda's "An Analytic System for Statistical Analysis of Markov Chains." Tests of the time independence of the data revealed a time dependence in the data sets. Tests of the Markovity of the circulation, where second order Markovity is assumed and first order Markovity is tested, revealed all fourteen data groups to be without Markovity. It appears from this study that further tests of the Markovity of library circulation should be conducted before models based on the same assumptions of state and time periods are constructed.

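Based on an analysis of cloud storage security models for digital libraries in a cloud computing environment, this article describes the security problems that may arise in cloud data storage and proposes a secure data storage scheme for digital libraries in a cloud computing environment. The scheme provides effective security protection from data transmission through to storage.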

This paper responds to two observations about current government service delivery. First, despite reasonable efforts to improve the design of forms and to establish single points of contact in one-stop shops, citizens still perceive forms as cumbersome. Second, citizens expect governments to act proactively by initiating appropriate government services themselves, instead of relying on requests for services from citizens. To address these two issues, this paper proposes a transition from a one-stop shop to a no-stop shop, where the citizen does not have to perform any action or fill in any forms to receive government services. The contribution of this paper is an e-government stage model that extends existing models. Stage models are suitable tools with which to inspire future developments, and ours extends previous models that guide progress toward the one-stop shop by describing two further stages: the limited no-stop shop and the no-stop shop. We define three dimensions along which to progress: integration of data collection, integration of data storage, and purpose of data use. We provide a first test of the model's validity through three case studies: the e-government practices in Austria, Estonia, and an Australian state government. Our work complements existing research on e-government stage models and proactive government service delivery.

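A Comparative Study of Information Literacy Training Models in Chinese and American Universities in the Big Data Environment   Total citations: 7, self-citations: 3, citations by others: 4
[Purpose/Significance] Through a comparative analysis of information literacy training models at Chinese and American universities in the big data environment, this study aims to guide Chinese universities in cultivating specialized and innovative talent that meets the needs of the big data era. [Method/Process] Using literature collection, web-based surveys, and comparative analysis, the study selects four universities in the United States and China as its sample and compares the characteristics of their information literacy training models in the big data environment, providing a reference for information literacy training at Chinese universities. [Result/Conclusion] The general-education model emphasizes cultivating students' data awareness and information ethics in the big data environment; the model based on software service platforms emphasizes improving skills in analyzing and processing data and information; and the discipline-embedded model emphasizes cultivating students' practical abilities.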

This paper is concerned with Markov processes for computing page importance. Page importance is a key factor in Web search. Many algorithms such as PageRank and its variations have been proposed for computing the quantity in different scenarios, using different data sources, and with different assumptions. The question then arises whether these algorithms can be explained in a unified way, and whether there is a general guideline for designing new algorithms for new scenarios. To answer these questions, we introduce a General Markov Framework in this paper. Under the framework, a Web Markov Skeleton Process is used to model the random walk conducted by the web surfer on a given graph. Page importance is then defined as the product of two factors: page reachability, the average probability that the surfer arrives at the page, and page utility, the average value that the page gives to the surfer in a single visit. These two factors can be computed, respectively, as the stationary probability distribution of the corresponding embedded Markov chain and the mean staying time on each page of the Web Markov Skeleton Process. We show that this general framework covers many existing algorithms, including PageRank, TrustRank, and BrowseRank, as special cases. We also show that the framework can help us design new algorithms to handle more complex problems, by constructing graphs from new data sources, employing new family members of the Web Markov Skeleton Process, and using new methods to estimate these two factors. In particular, we demonstrate the use of the framework with a new process, named the Mirror Semi-Markov Process. In the new process, the staying time on a page, as a random variable, is assumed to depend on both the current page and its inlink pages. Our experimental results on both the user browsing graph and the mobile web graph show that the Mirror Semi-Markov Process is more effective than previous models in several tasks, even when web spam is present and when the assumption of preferential attachment does not hold.
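
The definition of page importance as reachability times utility can be illustrated with a small sketch. This is not the authors' implementation: the three-page transition matrix, the mean staying times, the helper names, and the final normalisation step are all assumptions made for illustration. Reachability is approximated by power iteration on the embedded chain, and utility is taken to be the given mean staying time.

```python
def stationary_distribution(P, iterations=1000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration, starting from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def page_importance(P, mean_stay):
    """Multiply reachability (stationary probability) by utility (mean
    staying time) and normalise so the scores sum to 1 (an assumption)."""
    reach = stationary_distribution(P)
    raw = [r * t for r, t in zip(reach, mean_stay)]
    total = sum(raw)
    return [x / total for x in raw]


# Hypothetical embedded Markov chain over three pages (each row sums to 1).
P = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]
# Hypothetical mean staying time on each page, in arbitrary units.
mean_stay = [1.0, 2.0, 3.0]

importance = page_importance(P, mean_stay)
print(importance)
```

In this toy chain the first page is the most reachable, but the longer staying times on the other two pages raise their importance above it, which is the kind of effect a staying-time-aware model such as BrowseRank is meant to capture.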

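Research on Early Warning of Online Public Opinion in Unconventional Crisis Events Based on Bayesian Network Modeling   Total citations: 1, self-citations: 0, citations by others: 1
As a main indicator of social conditions and public sentiment, the state of online public opinion is especially important in management practice and academic research. Online public opinion involves complex and diverse actors, relationships that are hard to foresee, and effects that are hard to quantify. In view of these characteristics, this study combines Bayesian network modeling with the assessment of online public opinion and, drawing on three key strengths of Bayesian networks (the ability to represent complex associations, the ability to represent probabilistic uncertainty, and the ability to perform causal reasoning), proposes an assessment method for online public opinion based on Bayesian network modeling. By simulating and learning from data on key indicators, an assessment model is built that can effectively assess and predict the state of online public opinion.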

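Liang Xingkun, 《图书情报工作》 (Library and Information Service), 2022, 66(20): 148-161
[Purpose/Significance] Taking the field of library and information science (LIS) in China as an example, this study measures the novelty and conventionality of papers and examines their effect on the papers' academic impact, in order to reveal patterns of academic innovation. [Method/Process] Using a method based on Markov chain Monte Carlo (MCMC), the study measures the novelty and conventionality of 70,207 LIS research papers indexed in the Chinese Social Sciences Citation Index (CSSCI) over the 20 years from 2000 to 2019, and analyzes how novelty and conventionality affect a paper's impact within the discipline. [Result/Conclusion] With other factors held constant, a one-unit increase in novelty raises the odds of a paper becoming highly cited by 11%, while a one-unit increase in conventionality raises those odds by 33%. Marginal effects analysis shows that papers with both high novelty and high conventionality are more likely than other types of papers to become highly cited. Over time, the effect of novelty on the probability of becoming highly cited gradually weakens, while the effect of conventionality gradually strengthens. In addition, the size of the author team has a significant effect on a paper's novelty, and this effect grows over time. These findings highlight that LIS research in China innovates while upholding tradition, and they provide a new empirical basis for understanding academic innovation in the field. The study also proposes a new method based on Bayesian statistics that differs from traditional informetrics.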
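
The reported odds ratios can be turned into probabilities with a short worked example. This is only a sketch under stated assumptions: it assumes a logistic-regression-style model, which the odds ratios suggest but the abstract does not confirm, and the 10% baseline probability of being highly cited is hypothetical. Only the two odds ratios (1.11 and 1.33) come from the abstract.

```python
import math

# Odds ratios reported in the abstract for a one-unit increase.
OR_NOVELTY = 1.11
OR_CONVENTIONALITY = 1.33

# Under a logistic model (an assumption), the coefficient is the log odds ratio.
beta_novelty = math.log(OR_NOVELTY)
beta_conventionality = math.log(OR_CONVENTIONALITY)


def prob_to_odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)


def shifted_probability(baseline_prob, odds_ratio, units=1.0):
    """Probability after raising a predictor by `units`, other factors fixed."""
    return odds_to_prob(prob_to_odds(baseline_prob) * odds_ratio ** units)


# Hypothetical baseline: 10% of papers become highly cited.
baseline = 0.10
print(shifted_probability(baseline, OR_NOVELTY))
print(shifted_probability(baseline, OR_CONVENTIONALITY))
```

Note that an 11% increase in odds is not an 11% increase in probability: the size of the probability change depends on the baseline, which is why the abstract reports marginal effects separately.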

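Cyberspace has become an important domain and battleground over which countries compete. Building on an analysis of national cyberspace security strategies, this article analyzes the impact of big data on national security strategy and proposes a big-data-based cyberspace security strategy, together with the key areas that require attention.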

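Within the broad research context of knowledge services, this study examines knowledge organization through the classification schedule, a tool for classifying knowledge. It analyzes the knowledge organization structure of the classification schedule in detail from the perspective of revealing the internal relationships within knowledge. To address the shortcomings of relational database storage when updating, deleting, and adding knowledge in a classification schedule, it presents a graph database storage approach for the classification schedule along with concrete retrieval examples.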

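This study adopts the CWM specification as the reference for multi-source data integration, builds a unified CWM-based metadata repository, analyzes the data conflicts in that repository, and proposes solutions. Taking the electric vehicle industry as an example and starting from an analysis of the business requirements of decision support in that field, it focuses on describing the platform's data types and the main logic of business information processing, and builds a metadata management model and repository.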

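This article collects data through an online questionnaire survey and analyzes the data in detail with SPSS, in order to understand how disorientation on the Web occurs among different groups of people, what causes it, and a range of related questions. Based on the results of the analysis, it offers suggestions from several angles for mitigating Web disorientation, providing a reference for efforts to reduce it among users.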

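As the volume of patent data continues to grow, in-depth mining of patent data is becoming increasingly important; information contained in patent data, such as technical effects, is especially valuable. This paper proposes a method for recognizing patent effect words based on a hidden Markov model. Candidate effect words are selected through lexical and syntactic analysis; on this basis, an effect word recognition algorithm is designed that uses a hidden Markov model combined with features of patent invention improvements to filter the candidates. The effectiveness of the proposed method is verified on patent data sets from different fields, such as new energy vehicles, using precision and recall as evaluation criteria. Experimental results show that the method effectively improves both precision and recall.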

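Research on Storage Strategies for XML Data   Total citations: 4, self-citations: 0, citations by others: 4
How to store large volumes of XML data efficiently is an important research problem in data management. Based on the semi-structured nature of XML data, this paper analyzes the four main storage technologies currently used for XML data and proposes practical strategies for implementing storage.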

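Palm-leaf manuscripts are held in many countries around the world, and the institutions that hold this precious documentary heritage differ in how they store and protect it. Over the past few decades, China, Nepal, and Thailand have each carried out work on protecting palm-leaf manuscripts, rescuing them through microfilm reproduction, and making them available in digital form, and each has developed a distinctive approach to protection. This paper analyzes palm-leaf manuscript protection projects in the three countries that attracted considerable attention worldwide, compares their tasks, content, and outcomes, and identifies three protection models. After analyzing the characteristics of the three models (local protection, local preservation with transnational cooperative protection, and centralized preservation and protection), the paper constructs a use-oriented protection system for palm-leaf manuscripts. The three models are products of different times, environments, and cultural backgrounds. Digital rescue, stronger development of palm-leaf manuscript information resources, and the construction of a shared platform for use are important measures for giving equal weight to "preservation" and "use".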

Word embeddings and convolutional neural networks (CNN) have attracted extensive attention in various classification tasks for Twitter, e.g. sentiment classification. However, the effect of the configuration used to generate the word embeddings on the classification performance has not been studied in the existing literature. In this paper, using a Twitter election classification task that aims to detect election-related tweets, we investigate the impact of the background dataset used to train the embedding models, as well as the parameters of the word embedding training process, namely the context window size, the dimensionality and the number of negative samples, on the attained classification performance. By comparing the classification results of word embedding models that have been trained using different background corpora (e.g. Wikipedia articles and Twitter microposts), we show that the background data should align with the Twitter classification dataset both in data type and time period to achieve significantly better performance compared to baselines such as SVM with TF-IDF. Moreover, by evaluating the results of word embedding models trained using various context window sizes and dimensionalities, we find that larger context windows and dimensionalities are preferable for improving performance. However, the number of negative samples does not significantly affect the performance of the CNN classifiers. Our experimental results also show that choosing the correct word embedding model for use with CNN leads to statistically significant improvements over various baselines such as random, SVM with TF-IDF and SVM with word embeddings. Finally, for out-of-vocabulary (OOV) words that are not available in the learned word embedding models, we show that a simple OOV strategy that randomly initialises the OOV words without any prior knowledge is sufficient to attain good classification performance compared with current OOV strategies (e.g. a random initialisation using statistics of the pre-trained word embedding models).
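
The simple OOV strategy described above can be sketched as follows. This is an illustration only, not the authors' code: the function name, the uniform sampling range `scale`, the fixed seed, and the toy two-dimensional vectors are all assumptions. Words found in the pre-trained model keep their vectors, and every other word receives a random vector drawn without reference to the pre-trained statistics.

```python
import random


def build_embedding_table(vocabulary, pretrained, dim, seed=0, scale=0.25):
    """Return (table, oov_words).

    Words present in `pretrained` keep their vectors. OOV words get a vector
    sampled uniformly from [-scale, scale] in each dimension, using no prior
    knowledge of the pre-trained model.
    """
    rng = random.Random(seed)
    table = {}
    oov_words = []
    for word in vocabulary:
        if word in pretrained:
            table[word] = list(pretrained[word])
        else:
            table[word] = [rng.uniform(-scale, scale) for _ in range(dim)]
            oov_words.append(word)
    return table, oov_words


# Hypothetical pre-trained vectors (real models use far more dimensions).
pretrained = {"vote": [0.1, 0.2], "election": [0.3, 0.4]}
vocabulary = ["vote", "election", "#ge2015"]

table, oov_words = build_embedding_table(vocabulary, pretrained, dim=2)
print(oov_words)
```

The alternative mentioned in the abstract would instead derive the sampling range from statistics of the pre-trained vectors; the abstract reports that the simpler approach is sufficient.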

Most recent document standards like XML rely on structured representations. On the other hand, current information retrieval systems have been developed for flat document representations and cannot be easily extended to cope with more complex document types. The design of such systems is still an open problem. We present a new model for structured document retrieval which allows computing scores of document parts. This model is based on Bayesian networks whose conditional probabilities are learnt from a labelled collection of structured documents, which is composed of documents, queries and their associated assessments. Training these models is a complex, non-standard machine learning task. This is the focus of the paper: we propose to train the structured Bayesian network model using a cross-entropy training criterion. Results are presented on the INEX corpus of XML documents.

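Tian Ji, Tian Peng, 《情报学报》 (Journal of the China Society for Scientific and Technical Information), 2003, 22(5): 577-583
The storage and processing of massive spatial information is currently one of the research hotspots in information processing technology and the digital earth. Traditional ways of organizing spatial data have serious shortcomings and cannot meet the requirements for storing and processing massive spatial information. This paper proposes building a large-scale, single-precision spatial database (SDSDB) for a multi-scale spatial information system on an extensible object-relational database system, and presents the system framework and a prototype implementation.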

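Research on the Application of FC SAN and IP SAN Architectures in Digital Libraries   Total citations: 6, self-citations: 0, citations by others: 6
Given the current state of, and demand for, mass data storage in digital libraries, building a secure and efficient storage back end has become an important task in digital library construction. Of the three commonly used network storage models, DAS, NAS, and SAN, each has its own characteristics, and the SAN architecture offers centralized management and high-performance storage. Drawing on concrete application examples, this paper gives a comprehensive analysis of two SAN variants, FC SAN built on Fibre Channel and IP SAN built on iSCSI, and proposes an integrated network storage scheme based on both FC SAN and IP SAN architectures.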

Communication Monographs, 2012, 79(3): 208-214

This study focused on patterns of verbal behavior embedded in the interview process rather than on external factors as predictors of outcome. A Markov model was used to map the relationships between interviewer styles, time, and patterns of communication. Each interview was conceptualized as a system, and the categories of verbal behavior were treated as the finite number of states the system could occupy. Thirteen-state and three-state models of the interview systems were constructed which displayed both state probabilities and transition probabilities of the system's states. The basic finding was that although each interview system had different probability structures, the structure of any one system was quite stable over time.
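
The state and transition probabilities mentioned above can be estimated from a coded sequence of verbal behaviors by simple counting. The sketch below is illustrative only: the three state names and the sample sequence are hypothetical and are not the study's coding scheme, and the function name is an assumption.

```python
from collections import Counter


def estimate_markov_model(sequence, states):
    """Estimate state probabilities and first-order transition probabilities
    from one observed sequence of states using relative frequencies."""
    state_counts = Counter(sequence)
    pair_counts = Counter(zip(sequence, sequence[1:]))
    n = len(sequence)
    state_probs = {s: state_counts[s] / n for s in states}
    transitions = {}
    for a in states:
        row_total = sum(pair_counts[(a, b)] for b in states)
        transitions[a] = {
            b: (pair_counts[(a, b)] / row_total if row_total else 0.0)
            for b in states
        }
    return state_probs, transitions


# Hypothetical three-state coding of one interview.
states = ["question", "answer", "silence"]
sequence = [
    "question", "answer", "question", "answer",
    "silence", "question", "answer", "answer",
]

state_probs, transitions = estimate_markov_model(sequence, states)
print(state_probs)
print(transitions)
```

One way to probe the stability over time that the study reports would be to estimate the matrix separately for the first and second halves of an interview and compare the two; with sequences as short as this toy example, such a comparison would not be reliable.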
