Rank hash similarity for fast similarity search

Authors:	Min Lu, YaLou Huang, MaoQiang Xie, Jie Liu

Institution:	1. College of Information Technical Science, Nankai University, Tianjin, China; 2. College of Software, Nankai University, Tianjin, China; 3. Information Technology Research Base of Civil Aviation Administration of China, Civil Aviation University of China, China

Abstract:	This paper is concerned with large-scale similarity search, which efficiently and effectively finds data points similar to a query data point. An efficient way to accelerate similarity search is to learn hash functions. Existing approaches to learning hash functions aim to obtain low Hamming distances for similar pairs. However, these methods ignore the ranking order of those Hamming distances, which leads to poor accuracy in finding similar items for a query data point. In this paper, an algorithm referred to as top-k RHS (Rank Hash Similarity) is proposed, in which a ranking loss function is designed for learning a hash function. The hash function is hypothesized to be made up of l binary classifiers, so learning a hash function can be formulated as the task of learning l binary classifiers. The algorithm runs for l rounds and learns one binary classifier in each round. The proposed method has the same order of computational complexity as existing approaches, yet experimental results on three text datasets show that it obtains higher accuracy than the baselines.
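
The idea of a hash function made up of l binary classifiers, with candidates ranked by Hamming distance to the query, can be illustrated with a minimal Python sketch. This is not the top-k RHS learning procedure: the abstract does not specify the ranking loss or the classifier form, so the sketch assumes linear classifiers with random placeholder weights, and the names make_hash_function, hamming, and top_k are hypothetical.

```python
import random


def make_hash_function(dim, l, seed=0):
    """Return a hash function built from l linear binary classifiers.

    Assumption: the weights are random placeholders. In the paper they
    would be learned, one classifier per round, using a ranking loss.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(l)]

    def hash_fn(x):
        # Each classifier contributes one bit: 1 if w . x > 0, else 0.
        return tuple(
            int(sum(w_i * x_i for w_i, x_i in zip(w, x)) > 0) for w in weights
        )

    return hash_fn


def hamming(a, b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))


def top_k(query, database, hash_fn, k):
    """Indices of the k database points whose codes are closest to the query's."""
    q_code = hash_fn(query)
    codes = [hash_fn(x) for x in database]
    # sorted() is stable, so ties keep their original database order.
    order = sorted(range(len(database)), key=lambda i: hamming(q_code, codes[i]))
    return order[:k]


hash_fn = make_hash_function(dim=3, l=8)
database = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
print(top_k([1.0, 0.0, 0.0], database, hash_fn, k=2))
```

In this toy example the third point is the negation of the query, so every classifier assigns it the opposite bit and it lands at the maximum Hamming distance. The order in which the remaining candidates come back is exactly what a ranking-aware loss is meant to improve.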

Keywords:	Similarity search; Hash function; Similarity ranking
This article is indexed in ScienceDirect and other databases.
