Google Zopfli - Nice Compression Shame About The Speed
Written by Harry Fairhead   
Friday, 01 March 2013

Google is offering us a new lossless compression tool called Zopfli. it can produce zip files that are up to 5% smaller with no negative effects on the end user. There is a catch - Zopfli is much slower than your average zip.

 

Compression algorithms are often asymmetric in the workload they place on the compressor and the decompressor. If you can save a few more bytes by working the compressor harder then it can be a great benefit to the decompressor with no extra work to do.  So it is with the LZ77-based Deflate algorithm, which is used by a number of compression programs - Pkzip, and so on - to produce zip format files.

LZ77 or Liv-Zempel 77 is a dictionary-based compression method that scans the original file and creates a dictionary of bit patterns that are repeated in the file. Compression is achieved by replacing the bit patterns in the file with a fixed size dictionary reference. This works well in practice and it is the basis for the Deflate method, which first uses LZ77 compression which it follows up with a Huffman encoding.

How effective LZ77 is depends on how large and how frequent the dictionary entries are. If you replace a lot of substrings with dictionary references then how much storage you save depends on how big they are and how many times they occur in the original file. The quality of the compression depends on how you select the dictionary entries.

Notice that the compression achieved is lossless - that is the decompressed file is identical to the original and the work that the decompressing program has to do is virtually independent of the quality of the dictionary.

If you spend longer on the compression you might able to achieve higher compression ratios without having to modify the decompression tools in any way.

In practice Deflate is even more complicated because it uses two compression methods, LZ77 and Huffman encoding, and both can be tweaked to modify how much compression is achieved.

 

zopf

 

Zopfli is named after a Swiss bread recipe - why I'm not sure. It is an open source C program that you can use to implement your own version of the compression algorithm. It achieves a 3-8% smaller compressed file size by implementing  an exhaustive search method for the best dictionary. It is based on iterating entropy modeling and shortest path search to find a low bit cost path through the graph of all possible deflate data representations.

Once you have compressed a file using Zopfli you can decompress is using standard gzip, Zip, PNG or HTTP decompression. In other words, everyone wins and smaller files mean faster downloads, less bandwidth and longer battery life on mobile devices.

Now to the downside.

Zopfli takes up to 100 time longer to compress the file in the first place. This means that it certainly isn't worth using it for on-the-fly compression. However, for static resources that can be compressed once and then served many many times, it might be worth the extra server time.

More Information

Zopfli Code
Google Code Blog

Paper detailing performance

Related Articles

Data compression the dictionary way

Coding Theory

Network Coding Speeds Up Wireless by 1000%

Information Theory

Zip for the Genome - G-SQueeZ

Fractal Image Compression

 

To be informed about new articles on I Programmer, install the I Programmer Toolbar, subscribe to the RSS feed, follow us on, Twitter, Facebook, Google+ or Linkedin,  or sign up for our weekly newsletter.

 

espbook

 

Comments




or email your comment to: comments@i-programmer.info

 

Banner


Ursina - A Game Engine Powered by Python
08/11/2024

Ursina is a new open source game engine in which you can code any type of game in Python, be it 2-D, 3-D, an application, a visualization, you name it.



52nd Mersenne Prime Found
27/10/2024

It has been nearly six years since the last Mersenne prime was discovered. Now, at last, we have Mersenne prime number 52 and it has 41,024,320 digits!


More News

Last Updated ( Friday, 01 March 2013 )