gzip — Fast File Compression

Practical guide to gzip: compress, decompress and test files with DEFLATE (LZ77 plus Huffman coding), plus zcat, zgrep, zless and pigz for everyday use.

gzip is the classic Unix compression tool. It compresses one file at a time using DEFLATE, a combination of LZ77 and Huffman coding that favors speed over maximum compression ratio. You reach for it daily on logs and database dumps, or as the compression backend of tar. One important gotcha: by default gzip replaces the original file with the .gz version, so without -k (available since gzip 1.6) the original is gone afterwards. This guide walks you through compressing, decompressing and testing, plus the helpers zcat, zgrep, zless and zdiff and the parallel pigz.

Compress

gzip <file> — Compress a file (replaces original with .gz version).

gzip access.log

gzip -k <file> — Compress and keep the original file.

gzip -k access.log
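To make the difference between the default behavior and -k visible, here is a minimal sketch that works in a throwaway directory. The file name demo.log is a made-up example, and -k assumes gzip 1.6 or newer.

```shell
# Work in a temporary directory so nothing real gets replaced.
workdir=$(mktemp -d)
printf 'line one\nline two\n' > "$workdir/demo.log"

# Default: demo.log is replaced by demo.log.gz.
gzip "$workdir/demo.log"
ls "$workdir"    # only demo.log.gz is left

# Restore the file, then compress again while keeping the original.
gzip -d "$workdir/demo.log.gz"
gzip -k "$workdir/demo.log"
ls "$workdir"    # demo.log and demo.log.gz side by side
```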

gzip -<level> <file> — Compress with a specific level (1=fastest, 9=best compression; the default is 6).

gzip -9 access.log
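Whether a higher level is worth the extra CPU time depends on the data, so it can help to measure on a sample. The sketch below generates a hypothetical, repetitive log-like file and prints the compressed size for three levels. It writes to stdout with -c, so the sample itself is never replaced.

```shell
# Build a repetitive sample file (made-up log lines).
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 20000; i++)
               printf "2024-01-01 GET /index.html 200 request-%d\n", i }' > "$sample"

orig=$(wc -c < "$sample")
echo "uncompressed: $orig bytes"

# Compress to stdout at each level and count the resulting bytes.
for level in 1 6 9; do
  size=$(gzip -"$level" -c "$sample" | wc -c)
  echo "level $level: $size bytes"
done
```

On typical text the gap between level 6 and level 9 tends to be small, which is one reason the default is rarely changed.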

gzip -r <directory> — Recursively compress every file under a directory. Each file gets its own .gz; no single archive is created.

gzip -r /var/log/old/
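A common misreading of -r is that it produces one archive for the whole directory. This small sketch, using a temporary directory and made-up file names, shows that every file is compressed individually and the directory layout stays as it was.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/sub"
echo "a" > "$dir/a.log"
echo "b" > "$dir/sub/b.log"

# -r walks the tree and compresses each regular file on its own.
gzip -r "$dir"

# Lists a.log.gz and sub/b.log.gz; there is no combined archive.
find "$dir" -type f | sort
```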

gzip -c <file> > <output.gz> — Compress to stdout so you can choose the output filename. The input file is left untouched.

gzip -c data.json > data.json.gz
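Writing to stdout also makes gzip usable inside pipelines: when no input file is given, it reads from stdin. The following sketch assumes a made-up CSV stream standing in for the output of some export command.

```shell
out=$(mktemp -d)

# No input file: gzip reads stdin and writes the compressed stream to stdout.
printf 'id,name\n1,alice\n2,bob\n' | gzip > "$out/export.csv.gz"

# The reverse direction: decompress to stdout and feed the next command.
gzip -dc "$out/export.csv.gz" | wc -l    # counts 3 lines
```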

Decompress

gzip -d <file.gz> — Decompress a gzipped file.

gzip -d access.log.gz

gunzip <file.gz> — Decompress a gzipped file (same as gzip -d).

gunzip access.log.gz

gzip -dk <file.gz> — Decompress and keep the compressed file.

gzip -dk access.log.gz

gunzip -r <directory> — Recursively decompress all .gz files in a directory.

gunzip -r /var/log/old/

gzip -dc <file.gz> > <output> — Decompress to stdout with custom output filename.

gzip -dc backup.sql.gz > backup.sql
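Because -dc leaves the .gz file in place, it is a convenient way to check a round trip before deleting anything. A minimal sketch, assuming a tiny stand-in for a real dump file:

```shell
tmp=$(mktemp -d)
printf 'CREATE TABLE t (id INT);\n' > "$tmp/backup.sql"

# Compress via stdout so the original stays available for comparison.
gzip -c "$tmp/backup.sql" > "$tmp/backup.sql.gz"

# Decompress to stdout and compare byte for byte with the original.
if gzip -dc "$tmp/backup.sql.gz" | cmp -s - "$tmp/backup.sql"; then
  echo "round trip OK"
else
  echo "round trip FAILED" >&2
fi
```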

Information & Testing

gzip -l <file.gz> — List compression info: compressed/uncompressed size, ratio, name.

gzip -l access.log.gz

gzip -t <file.gz> — Test the integrity of a compressed file.

gzip -t backup.sql.gz
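In scripts, the useful part of gzip -t is its exit status: zero when the file decompresses cleanly, non-zero otherwise. The sketch below simulates a damaged file by truncating a valid one; the file names are made up for illustration.

```shell
tmp=$(mktemp -d)
printf 'some backup data\n' | gzip > "$tmp/good.gz"

# Simulate an interrupted download: keep only the first 10 bytes.
head -c 10 "$tmp/good.gz" > "$tmp/truncated.gz"

for f in "$tmp/good.gz" "$tmp/truncated.gz"; do
  # gzip -t prints nothing on success and exits non-zero on failure.
  if gzip -t "$f" 2>/dev/null; then
    echo "OK:      ${f##*/}"
  else
    echo "CORRUPT: ${f##*/}"
  fi
done
```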

gzip -v <file> — Verbose mode. Show filename and compression ratio.

gzip -v access.log

Batch Operations

gzip *.log — Compress all .log files in the current directory.

gzip *.log

gunzip *.gz — Decompress all .gz files in the current directory.

gunzip *.gz

find . -name '*.log' -exec gzip {} \; — Find and compress all .log files recursively.

find /var/log -name '*.log' -exec gzip {} \;
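A slightly more defensive variant of the command above, sketched against a temporary directory instead of /var/log: -type f restricts the match to regular files, and ending -exec with + hands many files to a single gzip call instead of starting one process per file. File names with spaces are handled correctly either way.

```shell
logdir=$(mktemp -d)    # stand-in for a real log directory
mkdir -p "$logdir/app"
echo "a" > "$logdir/one.log"
echo "b" > "$logdir/app/two.log"
echo "c" > "$logdir/app/name with spaces.log"

# -type f skips directories; "+" batches the file names into few gzip calls.
find "$logdir" -type f -name '*.log' -exec gzip {} +

find "$logdir" -type f | sort
```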

find . -name '*.gz' -mtime +30 -delete — Delete compressed files older than 30 days.

find /var/log -name '*.gz' -mtime +30 -delete
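Since -delete cannot be undone, it is worth running the same expression with -print first. The sketch below backdates one made-up file with touch -t so that the 30-day rule has something to match.

```shell
logdir=$(mktemp -d)
printf 'old\n' | gzip > "$logdir/old.log.gz"
printf 'new\n' | gzip > "$logdir/new.log.gz"

# Backdate one file so it counts as older than 30 days.
touch -t 202001010000 "$logdir/old.log.gz"

# Dry run: -print shows what would match, nothing is removed yet.
find "$logdir" -type f -name '*.gz' -mtime +30 -print

# Same expression with -delete once the list looks right.
find "$logdir" -type f -name '*.gz' -mtime +30 -delete
```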

Helper Tools

zcat <file.gz> — Print the decompressed contents to stdout. The .gz file on disk stays as it is.

zcat access.log.gz

zgrep <pattern> <file.gz> — Search inside a gzipped file. The data is decompressed on the fly and the file on disk stays compressed.

zgrep "404" access.log.gz
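zgrep hands its options through to grep, so the usual flags work on compressed logs too. The sketch below uses two made-up rotated log files and counts 404 responses, first per file and then as a single total.

```shell
tmp=$(mktemp -d)
printf 'GET /a 200\nGET /b 404\n' | gzip > "$tmp/access.log.1.gz"
printf 'GET /c 404\nGET /d 404\n' | gzip > "$tmp/access.log.2.gz"

# -c is passed on to grep and counts matching lines per file.
zgrep -c ' 404' "$tmp"/access.log.*.gz

# For one total, decompress everything into a single stream and count there.
zcat "$tmp"/access.log.*.gz | grep -c ' 404'    # 3 matching lines in total
```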

zless <file.gz> — Page through a gzipped file with less.

zless access.log.gz

zdiff <file1.gz> <file2.gz> — Compare the decompressed contents of two gzipped files without unpacking them on disk.

zdiff old.log.gz new.log.gz

pigz <file> — Parallel gzip compression using multiple CPU cores.

pigz largefile.sql

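pigz -d <file.gz> — Decompress with pigz. Decompression itself runs on a single thread (pigz only adds helper threads for reading, writing and checksums), so expect a much smaller speedup than for compression.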

pigz -d largefile.sql.gz
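pigz is not installed everywhere, so scripts often fall back to gzip when it is missing. A minimal sketch of that pattern follows; the thread count of 4 is an arbitrary choice and the input is generated test data.

```shell
big=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 50000; i++) print "row", i }' > "$big"

# Use pigz when it is available, otherwise plain gzip.
if command -v pigz >/dev/null 2>&1; then
  pigz -p 4 -c "$big" > "$big.gz"    # -p sets the number of threads
  echo "compressed with pigz"
else
  gzip -c "$big" > "$big.gz"
  echo "compressed with gzip"
fi

# Both tools write the same .gz format, so gzip can verify either result.
gzip -t "$big.gz" && echo "archive OK"
```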

Conclusion

gzip is fast, available everywhere and ideal for single files such as logs or dumps. Always remember that it replaces the original – use -k when you want to keep it, and gzip -t before you rely on an archive. gzip has no container for multiple files: to pack a whole directory into one file, combine it with tar (tar czf), and for large data sets the parallel pigz pays off.
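As a sketch of the tar combination mentioned above, the following packs a made-up logs directory into one .tar.gz, checks it and lists its contents. The paths are temporary and only serve as an illustration.

```shell
src=$(mktemp -d)
mkdir -p "$src/logs"
echo "a" > "$src/logs/one.log"
echo "b" > "$src/logs/two.log"

# c = create, z = filter through gzip, f = archive file name.
# -C changes into $src first so the archive stores relative paths.
tar -czf "$src/logs.tar.gz" -C "$src" logs

# The result is an ordinary gzip stream, so gzip -t can check it.
gzip -t "$src/logs.tar.gz"

# List the archive contents without extracting anything.
tar -tzf "$src/logs.tar.gz"
```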

Further Reading

  • 7z – high-ratio archiver with its own .7z format
  • tar – bundles many files into one archive, often paired with gzip
  • zip – cross-platform archive format with compression in one step