An efficient page ranking approach based on vector norms using sNorm(p) algorithm |
| |
Authors: | Shubham Goel Ravinder Kumar Munish Kumar Vikram Chopra |
| |
Institution: | 1. Computer Science & Engineering Department, Thapar Institute of Engineering & Technology, Patiala, India;2. Department of Computational Sciences, Maharaja Ranjit Singh Punjab Technical University, Bathinda, India;3. Electrical & Instrumentation Engineering Department, Thapar Institute of Engineering & Technology, Patiala, India |
| |
Abstract: | In the whole world, the internet is exercised by millions of people every day for information retrieval. Even for a small to smaller task like fixing a fan, to cook food or even to iron clothes persons opt to search the web. To fulfill the information needs of people, there are billions of web pages, each having a different degree of relevance to the topic of interest (TOI), scattered throughout the web but this huge size makes manual information retrieval impossible. The page ranking algorithm is an integral part of search engines as it arranges web pages associated with a queried TOI in order of their relevance level. It, therefore, plays an important role in regulating the search quality and user experience for information retrieval. PageRank, HITS, and SALSA are well-known page ranking algorithm based on link structure analysis of a seed set, but ranking given by them has not yet been efficient. In this paper, we propose a variant of SALSA to give sNorm(p) for the efficient ranking of web pages. Our approach relies on a p-Norm from Vector Norm family in a novel way for the ranking of web pages as Vector Norms can reduce the impact of low authority weight in hub weight calculation in an efficient way. Our study, then compares the rankings given by PageRank, HITS, SALSA, and sNorm(p) to the same pages in the same query. The effectiveness of the proposed approach over state of the art methods has been shown using performance measurement technique, Mean Reciprocal Rank (MRR), Precision, Mean Average Precision (MAP), Discounted Cumulative Gain (DCG) and Normalized DCG (NDCG). The experimentation is performed on a dataset acquired after pre-processing of the results collected from initial few pages retrieved for a query by the Google search engine. Based on the type and amount of in-hand domain expertise 30 queries are designed. The extensive evaluation and result analysis are performed using MRR, email protected], MAP, DCG, and NDCG as the performance measuring statistical metrics. Furthermore, results are statistically verified using a significance test. Findings show that our approach outperforms state of the art methods by attaining 0.8666 as MRR value, 0.7957 as MAP value. Thus contributing to the improvement in the ranking of web pages more efficiently as compared to its counterparts. |
| |
Keywords: | Corresponding author Page ranking HITS SALSA Vector Norm Mean reciprocal rank p-Norm |
本文献已被 ScienceDirect 等数据库收录! |
|