Interpolative coding of integer sequences supporting log-time random access期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

按检索

Interpolative coding of integer sequences supporting log-time random access

Authors:	J Teuhola

Institution:	Department of Information Technology, University of Turku, Finland

Abstract:	Sequences of integers are common data types, occurring either as primary data or ancillary structures. The sizes of sequences can be large, making compression an interesting option. Effective compression presupposes variable-length coding, which destroys the regular alignment of values. Yet it would often be desirable to access only a small subset of the entries, either by position (ordinal number) or by content (element value), without having to decode most of the sequence from the start. Here such a random access technique for compressed integers is described, with the special feature that no auxiliary index is needed. The solution applies a method called interpolative coding, which is one of the most efficient non-statistical codes for integers. Indexing is avoided by address calculation guaranteeing sufficient space for codes even in the worst case. The additional redundancy, compared to regular interpolative coding, is only about 1 bit per source integer for uniform distribution. The time complexity of random access is logarithmic with respect to the source size for both position-based and content-based retrieval. According to experiments, random access is faster than full decoding when the number of accessed integers is not more than approximately 0.75 · n/log₂n for sequence length n. The tests also confirm that the method is quite competitive with other approaches to random access coding, suggested in the literature.

Keywords:	Source coding Data compression Random access Interpolative coding Inverted index
本文献已被 ScienceDirect 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏