CNR Institutional Research Information System

Fast dictionary-based compression for inverted indexes

Pibiri G. E.; Petri M.; Moffat A.

2019

Abstract

Dictionary-based compression schemes provide fast decoding, typically at the expense of reduced compression effectiveness compared to statistical or probability-based approaches. In this work, we apply dictionary-based techniques to the compression of inverted lists, showing that the high degree of regularity that these integer sequences exhibit is a good match for certain types of dictionary methods, and that an important new trade-off between compression effectiveness and compression efficiency can be achieved. Our observations are supported by experiments using the document-level inverted index data for two large text collections, with a wide range of other index compression implementations as reference points. Those experiments demonstrate that the gap between efficiency and effectiveness can be substantially narrowed.


Year: 2019
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: Compression; Decoding; Efficiency; Inverted index
Appears in types: 04.01 Conference proceedings contribution

Files in this item:

File	Access	Description	Type	Size	Format
prod_402784-doc_140201.pdf	Open access	Fast dictionary-based compression for inverted indexes	Publisher's version (PDF)	579.43 kB	Adobe PDF
prod_402784-doc_164452.pdf	Not available	Fast dictionary-based compression for inverted indexes	Publisher's version (PDF)	1.13 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/365126
