Big Data Compression


QueryIO supports server-side compression of big data. When compression is enabled, data that you write to the server is compressed before it is saved on the cluster. When you fetch the data, it is decompressed before it is returned to you.
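The write and read paths described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not QueryIO's implementation: the class CompressedStoreSketch and its put, get, and storedSize methods are hypothetical names, an in-memory map stands in for the cluster, and gzip from the standard java.util.zip package is used as the codec.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical in-memory stand-in for a server with compression enabled.
// This is not QueryIO code; the class and method names are illustrative.
class CompressedStoreSketch {
    private final Map<String, byte[]> stored = new HashMap<>();

    // Write path: compress the payload first, then save the compressed bytes.
    void put(String key, byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        stored.put(key, buffer.toByteArray());
    }

    // Read path: decompress the saved bytes first, then return the original payload.
    byte[] get(String key) {
        byte[] compressed = stored.get(key);
        if (compressed == null) {
            return null; // nothing was written under this key
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Number of bytes actually held for a key (the compressed size).
    int storedSize(String key) {
        return stored.get(key).length;
    }
}
```

A caller only ever sees the original bytes: get returns exactly what was passed to put, while storedSize reports the smaller compressed footprint that is actually kept.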

Compression is useful because it reduces resource usage, such as data storage space and transmission capacity.

QueryIO supports the following compression algorithms: Snappy, gzip, and LZ4.

You can select the compression type during data import.

See the REST API server documentation [Put Object Action] to learn how to enable compression for files.

Snappy

Snappy is a lossless data compression algorithm. It aims for very high speed and reasonable compression.

Snappy encoding is byte-oriented (only whole bytes are emitted or consumed from a stream).

The classes org.xerial.snappy.SnappyOutputStream and org.xerial.snappy.SnappyInputStream are used for compression and decompression of streams. SnappyOutputStream wraps a target output stream and compresses the data written to it. SnappyInputStream wraps a compressed input stream and decompresses the data read from it.

Click here to see how to use SNAPPY compression with S3 compatible REST API.

gzip

gzip is a lossless data compression algorithm. gzip produces files with a .gz extension. gunzip can decompress files created by gzip, compress, or pack, and it detects the input format automatically.

gzip compression works by finding repeated strings within a file and replacing the later occurrences with short references to an earlier one, which makes the overall file smaller. Decompression reverses the substitution, so no data is lost. This form of compression is particularly well suited to the web because HTML and CSS files usually contain plenty of repeated strings, such as whitespace, tags, and style definitions.
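The effect of repeated strings can be observed with a small Java sketch that uses only the standard java.util.zip package. It compares the gzip output size for markup full of repeated tags against pseudo-random bytes of the same length. The class GzipRepetitionSketch, its helper methods, and the sample markup are assumptions made for illustration; they are not part of QueryIO.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

// Illustrative helper, not QueryIO code: measures how well gzip handles repetition.
class GzipRepetitionSketch {
    // Returns the number of bytes gzip produces for the given input.
    static int gzipSize(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Markup with many repeated tags and attributes, as in a typical HTML list.
    static byte[] repetitiveHtml(int items) {
        StringBuilder html = new StringBuilder("<ul>\n");
        for (int i = 0; i < items; i++) {
            html.append("  <li class=\"item\">entry</li>\n");
        }
        html.append("</ul>\n");
        return html.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Pseudo-random bytes with a fixed seed: almost no repeated strings to find.
    static byte[] randomBytes(int length) {
        byte[] data = new byte[length];
        new Random(42).nextBytes(data);
        return data;
    }

    public static void main(String[] args) {
        byte[] html = repetitiveHtml(500);
        byte[] noise = randomBytes(html.length);
        System.out.println("html:  " + html.length + " -> " + gzipSize(html) + " bytes");
        System.out.println("noise: " + noise.length + " -> " + gzipSize(noise) + " bytes");
    }
}
```

Running main prints the original and compressed sizes for both inputs. The repetitive markup shrinks to a small fraction of its size, while the random bytes stay at roughly their original length because there are no repeated strings to replace.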

Click here to see how to use gzip compression with S3 compatible REST API.

LZ4

LZ4 is a very fast lossless compression algorithm. Its format is simple, which makes it easy to implement and fast to execute. It achieves a good compression ratio on text files and reaches very high decompression speeds.

Click here to see how to use LZ4 compression with S3 compatible REST API.



Copyright 2017 QueryIO Corporation. All Rights Reserved.

QueryIO, "Big Data Intelligence" and the QueryIO Logo are trademarks of QueryIO Corporation. Apache, Hadoop and HDFS are trademarks of The Apache Software Foundation.