1.

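Li Huayun 《现代情报》 2006, 26(11): 205-206

Latent Semantic Analysis (LSA) uses Singular Value Decomposition (SVD) to analyze the relationships among the texts in a collection, and is a method for deriving mapping rules between keywords and semantics. Probabilistic Latent Semantic Analysis (PLSA), which appeared later, reinterprets the SVD-based LSA in statistical terms through maximum likelihood estimation. LSA was first applied to text information retrieval. As its range of applications grew, it came to be widely used in information filtering, cross-language retrieval, and the understanding, judgment, and prediction of information in cognitive science and data mining.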
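As a rough illustration of the SVD step described in this abstract, the following is a minimal sketch rather than the paper's own procedure. It assumes NumPy is available; the toy term-document matrix, the helper names `lsa` and `cosine`, and the choice of k = 2 are all made up for the example.

```python
import numpy as np

def lsa(term_doc, k):
    """Rank-k LSA: truncated SVD of a term-by-document count matrix."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    term_vecs = U[:, :k] * s[:k]     # one k-dimensional row per term
    doc_vecs = Vt[:k, :].T * s[:k]   # one k-dimensional row per document
    return term_vecs, doc_vecs

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy corpus: rows are terms, columns are four documents.
terms = ["car", "auto", "engine", "flower", "petal"]
X = np.array([
    [1, 0, 0, 0],   # car     (only in document 0)
    [0, 1, 0, 0],   # auto    (only in document 1)
    [1, 1, 0, 0],   # engine  (in documents 0 and 1)
    [0, 0, 1, 1],   # flower
    [0, 0, 1, 1],   # petal
], dtype=float)

term_vecs, doc_vecs = lsa(X, k=2)

# "car" and "auto" never share a document, so their raw rows are orthogonal,
# but both co-occur with "engine", which the latent space picks up.
print("raw    car~auto:", cosine(X[0], X[1]))
print("latent car~auto:", cosine(term_vecs[0], term_vecs[1]))
print("latent car~flower:", cosine(term_vecs[0], term_vecs[3]))
```

In this toy case the two terms that never co-occur end up close together in the reduced space, which is the kind of keyword-to-semantics mapping the abstract refers to.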

2.

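Application of PLSA to Identifying Expert Expertise in Library and Information Science

Zhang Xiaojuan, Lu Wei, Cheng Qikai 《现代图书情报技术》 2012, (2): 76-81

Using a dataset of papers from authoritative journals in library and information science, the probabilistic latent semantic analysis (PLSA) algorithm is applied to documents that characterize experts' expertise, in order to locate the research areas of experts in the field. Experimental results show that the method is feasible and performs well.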

3.

QPLSA: Utilizing quad-tuples for aspect identification and rating

Wenjuan Luo, Fuzhen Zhuang, Weizhong Zhao, Qing He, Zhongzhi Shi 《Information Processing & Management》 2015

Aspect level sentiment analysis is important for numerous opinion mining and market analysis applications. In this paper, we study the problem of identifying and rating review aspects, which is the fundamental task in aspect level sentiment analysis. Previous review aspect analysis methods seldom consider entity or rating but only 2-tuples, i.e., head and modifier pairs; e.g., in the phrase “nice room”, “room” is the head and “nice” is the modifier. To solve this problem, we present a novel Quad-tuple Probability Latent Semantic Analysis (QPLSA) model, which incorporates the entity and its rating together with the 2-tuples into the PLSA model. Specifically, QPLSA not only generates fine-granularity aspects, but also captures the correlations between words and ratings. We also develop two novel prediction approaches, the Quad-tuple Prediction (from the global perspective) and the Expectation Prediction (from the local perspective). In evaluation, systematic experiments show that Quad-tuple PLSA significantly outperforms 2-tuple PLSA on both aspect identification and aspect rating prediction for publication datasets. Moreover, for aspect rating prediction, QPLSA shows significant superiority over state-of-the-art baseline methods. In addition, the Quad-tuple Prediction and the Expectation Prediction also show strong ability in aspect rating on different datasets.

4.

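Research on a Probabilistic Latent Semantic Analysis Algorithm Based on Parallel Computing

Zhao Wei 《安徽职业技术学院学报》 2014, (3): 1-3

Probabilistic Latent Semantic Analysis (PLSA) ranks, filters, and classifies documents by turning the document-word relationship into a document-topic-word relationship, which is computationally very expensive. This paper designs an efficient parallel scheme for PLSA based on MPI (Message Passing Interface), optimizes the model system, the handling of training data, and the parallel algorithm, and proposes a parallel PLSA algorithm for big-data settings. The algorithm addresses the earlier problem that data sets were too large to compute on. Training is considerably faster than before the optimization, and the algorithm is scalable and feasible.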
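To make the document-topic-word decomposition and its cost concrete, here is a minimal serial sketch of the standard EM updates for PLSA in plain Python. It is not the MPI-based parallel scheme of the paper. The function names, the toy count matrix, the iteration count, and the random initialization are assumptions chosen for illustration.

```python
import math
import random

def _normalize(row):
    """Scale a non-negative row to sum to 1 (uniform if it is all zeros)."""
    total = sum(row)
    if total == 0.0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]

def plsa(counts, n_topics, n_iter=30, seed=0):
    """EM for PLSA on a document-by-word count matrix (list of lists).

    Returns P(z|d), P(w|z), and the log-likelihood measured before each update.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_z_d = [_normalize([rng.random() + 0.01 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [_normalize([rng.random() + 0.01 for _ in range(n_words)])
             for _ in range(n_topics)]
    trace = []
    for _ in range(n_iter):
        acc_zd = [[0.0] * n_topics for _ in range(n_docs)]
        acc_wz = [[0.0] * n_words for _ in range(n_topics)]
        loglik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)  # P(w|d) under the current parameters
                if total == 0.0:
                    continue
                loglik += n_dw * math.log(total)
                for z in range(n_topics):
                    resp = n_dw * joint[z] / total
                    acc_zd[d][z] += resp
                    acc_wz[z][w] += resp
        trace.append(loglik)
        # M-step: renormalize the expected counts.
        p_z_d = [_normalize(row) for row in acc_zd]
        p_w_z = [_normalize(row) for row in acc_wz]
    return p_z_d, p_w_z, trace

# Hypothetical toy data: documents 0-1 use words 0-1, documents 2-3 use words 2-3.
counts = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 3, 2], [0, 0, 2, 3]]
p_z_d, p_w_z, trace = plsa(counts, n_topics=2)
for d, row in enumerate(p_z_d):
    print("doc", d, [round(p, 3) for p in row])
```

Each iteration visits every nonzero document-word cell once per topic, which is the cost that grows with corpus size. The accumulators over documents are independent sums, so one plausible way to parallelize (again an assumption, not a description of the paper's design) is to split documents across processes and combine the per-topic word counts afterwards.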

5.

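Research on a Web Page Semantic Annotation Algorithm Based on the PLSA Model

Wang Yunying 《情报杂志》 2013, (1): 141-144

Efficient semantic annotation of Web pages is key to improving the utilization of Web information resources and to knowledge innovation. To address the problems of current Web page semantic annotation methods, and drawing on the structural and textual features of Web pages and the regularities of their topic distributions, a Web page semantic annotation algorithm based on the PLSA topic model is designed. The algorithm builds separate PLSA topic models for the structural features and the textual features of Web pages, integrates and optimizes these independent models with an adaptive asymmetric learning algorithm, and finally forms a new combined PLSA topic model that automatically annotates unseen Web pages. Experimental results show that the algorithm markedly improves the accuracy and efficiency of Web page semantic annotation and can effectively handle large-scale Web page annotation.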

6.

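User-Oriented Web Search Based on PLSA

Yu Fang, Chen Dongling, Wang Daling, Yu Ge, Bao Yubin 《东南大学学报》 2007, 23(3): 347-351

Current search engines provide query-oriented rather than user-oriented services and therefore cannot meet users' personalized needs. To address this problem, a new PLSA-based method is proposed that turns query-term-oriented search into user-oriented search. First, a user interest vector representing the user model is built by analyzing the user's query history and browsing records. When the user issues a query, the query terms are mapped onto interest categories according to the user interest vector. Finally, the returned result list is re-ranked by a user-oriented ranking algorithm. Experiments show that the user-oriented search system takes user preferences fully into account and so better meets the information needs of different users.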
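The re-ranking step can be pictured with a small sketch. The abstract does not specify the ranking function, so the cosine-similarity scoring below is an assumption, as are the three topic categories, the hand-written topic vectors (which in a PLSA-based system would come from the fitted model), and the function names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rerank(results, user_interest):
    """Sort (title, topic_vector) pairs by similarity to the user's interest vector."""
    return sorted(results, key=lambda r: cosine(r[1], user_interest), reverse=True)

# Hypothetical topic order: [programming, zoology, comedy].
user_interest = [0.7, 0.2, 0.1]

# Hypothetical results for the ambiguous query "python", in engine order.
results = [
    ("Monty Python sketches", [0.05, 0.05, 0.90]),
    ("Python (snake) care guide", [0.05, 0.90, 0.05]),
    ("Python 3 language tutorial", [0.90, 0.05, 0.05]),
]

for title, _ in rerank(results, user_interest):
    print(title)
```

For a user whose history is dominated by programming, the tutorial moves to the top even though the search engine returned it last. A different interest vector would reorder the same result list differently.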

7.

Test Data Likelihood for PLSA Models

Thorsten Brants 《Information Retrieval》 2005, 8(2): 181-196

Probabilistic Latent Semantic Analysis (PLSA) is a statistical latent class model that has recently received considerable attention. In its usual formulation it cannot assign likelihoods to unseen documents. Furthermore, it assigns a probability of zero to unseen documents during training. We point out that one of the two existing alternative formulations of the Expectation-Maximization algorithms for PLSA does not require this assumption. However, even that formulation does not allow calculation of the actual likelihood values. We therefore derive a new test-data likelihood substitute for PLSA and compare it to three existing likelihood substitutes. An empirical evaluation shows that our new likelihood substitute produces the best predictions about accuracies in two different IR tasks and is therefore best suited to determine the number of EM steps when training PLSA models. The new likelihood measure and its evaluation also suggest that PLSA is not very sensitive to overfitting for the two tasks considered. This renders additions like tempered EM that especially address overfitting unnecessary. The work reported here was carried out while the author was at the Palo Alto Research Center (PARC).