Since the seminal work by Shannon, theoreticians have focused on designing compressors targeted at minimizing the output size without sacrificing much of the compression/decompression efficiency. On the other hand, software engineers have deployed several heuristics to implement compressors aimed at trading compressed space versus compression/decompression efficiency in order to match their application needs. In this paper we fill this gap by introducing the bicriteria data-compression problem that seeks to determine the shortest compressed file that can be decompressed in a given time bound. Then, inspired by modern data-storage applications, we instantiate the problem onto the family of Lempel--Ziv-based compressors (such as Snappy and LZ4) and solve it by combining in a novel and efficient way optimization techniques, string-matching data structures, and shortest path algorithms over properly (bi-)weighted graphs derived from the data-compression problem at hand. An extensive set of experiments complements our theoretical achievements by showing that the proposed algorithmic solution is very competitive with respect to state-of-the-art highly engineered compressors. Read More: https://epubs.siam.org/doi/abs/10.1137/17M1121457

Bicriteria Data Compression

Paolo Ferragina;
2019-01-01

Abstract

Since the seminal work by Shannon, theoreticians have focused on designing compressors targeted at minimizing the output size without sacrificing much of the compression/decompression efficiency. On the other hand, software engineers have deployed several heuristics to implement compressors aimed at trading compressed space versus compression/decompression efficiency in order to match their application needs. In this paper we fill this gap by introducing the bicriteria data-compression problem that seeks to determine the shortest compressed file that can be decompressed in a given time bound. Then, inspired by modern data-storage applications, we instantiate the problem onto the family of Lempel--Ziv-based compressors (such as Snappy and LZ4) and solve it by combining in a novel and efficient way optimization techniques, string-matching data structures, and shortest path algorithms over properly (bi-)weighted graphs derived from the data-compression problem at hand. An extensive set of experiments complements our theoretical achievements by showing that the proposed algorithmic solution is very competitive with respect to state-of-the-art highly engineered compressors. Read More: https://epubs.siam.org/doi/abs/10.1137/17M1121457
2019
File in questo prodotto:
File Dimensione Formato  
paper - versione stampata.pdf

non disponibili

Licenza: Copyright dell'editore
Dimensione 890.11 kB
Formato Adobe PDF
890.11 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11382/566824
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 3
social impact