gzip — Fast File Compression
Practical guide to gzip: compress, decompress and test files, plus zcat, zgrep, zless and pigz for everyday use.
gzip is the classic Unix compression tool. It compresses single files with DEFLATE, a fast combination of LZ77 and Huffman coding. You reach for it daily on logs and database dumps, or as the compression backend of tar. One important gotcha: by default gzip replaces the original file with the .gz version, so without -k the original is gone afterwards. This guide walks you through compressing, decompressing and testing, plus the helpers zcat, zgrep and the parallel pigz.
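The replace-by-default behaviour is easiest to see on a throwaway file. The following is a minimal sketch, assuming a scratch directory created with mktemp and a made-up file name demo.log; neither refers to a real system.

```shell
#!/bin/sh
# Scratch directory so nothing real is touched (demo.log is a hypothetical name).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'line one\nline two\n' > demo.log

# Default behaviour: demo.log is replaced by demo.log.gz.
gzip demo.log
ls    # only demo.log.gz remains

# Restore the original, then compress again with -k to keep both files.
gzip -d demo.log.gz
gzip -k demo.log
ls    # demo.log and demo.log.gz
```

Very old gzip releases do not know -k; on such systems gzip -c demo.log > demo.log.gz has the same effect.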
Compress
gzip <file> — Compress a file (replaces original with .gz version).
gzip access.log
gzip -k <file> — Compress and keep the original file.
gzip -k access.log
gzip -<level> <file> — Compress with a specific level (1=fastest, 9=best compression).
gzip -9 access.log
gzip -r <directory> — Recursively compress all files in a directory.
gzip -r /var/log/old/
gzip -c <file> > <output.gz> — Compress to stdout, allowing custom output filename.
gzip -c data.json > data.json.gz
Decompress
gzip -d <file.gz> — Decompress a gzipped file.
gzip -d access.log.gz
gunzip <file.gz> — Decompress a gzipped file (same as gzip -d).
gunzip access.log.gz
gzip -dk <file.gz> — Decompress and keep the compressed file.
gzip -dk access.log.gz
gunzip -r <directory> — Recursively decompress all .gz files in a directory.
gunzip -r /var/log/old/
gzip -dc <file.gz> > <output> — Decompress to stdout with custom output filename.
gzip -dc backup.sql.gz > backup.sql
Information & Testing
gzip -l <file.gz> — List compression info: compressed/uncompressed size, ratio, name.
gzip -l access.log.gz
gzip -t <file.gz> — Test the integrity of a compressed file.
gzip -t backup.sql.gz
gzip -v <file> — Verbose mode. Show filename and compression ratio.
gzip -v access.log
Batch Operations
gzip *.log — Compress all .log files in the current directory.
gzip *.log
gunzip *.gz — Decompress all .gz files in the current directory.
gunzip *.gz
find . -name '*.log' -exec gzip {} \; — Find and compress all .log files recursively.
find /var/log -name '*.log' -exec gzip {} \;
find . -name '*.gz' -mtime +30 -delete — Delete compressed files older than 30 days.
find /var/log -name '*.gz' -mtime +30 -delete
Related Tools
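The helpers in this section decompress on the fly and hand the stream to a familiar tool, so no uncompressed copy is written to disk. As a rough sketch of that idea, assuming a made-up sample.log in a scratch directory, the explicit pipeline below should report the same match count as zgrep.

```shell
#!/bin/sh
# Scratch file with invented log lines; sample.log is a hypothetical name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'GET /a 200\nGET /b 404\nGET /c 404\n' > sample.log
gzip sample.log    # leaves only sample.log.gz

# Decompress to stdout and search the stream by hand ...
piped=$(gzip -dc sample.log.gz | grep -c '404')

# ... which is roughly what zgrep does in a single step.
direct=$(zgrep -c '404' sample.log.gz)

echo "pipeline: $piped, zgrep: $direct"
```

The same pattern applies to the other helpers: zcat corresponds to gzip -dc, and zless to piping that stream into less.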
zcat <file.gz> — View contents of a gzipped file without decompressing.
zcat access.log.gz
zgrep <pattern> <file.gz> — Search inside a gzipped file without decompressing.
zgrep "404" access.log.gz
zless <file.gz> — Page through a gzipped file with less.
zless access.log.gz
zdiff <file1.gz> <file2.gz> — Compare two gzipped files without decompressing.
zdiff old.log.gz new.log.gz
pigz <file> — Parallel gzip compression using multiple CPU cores.
pigz largefile.sql
pigz -d <file.gz> — Decompress with pigz (the decompression itself runs mostly on a single thread).
pigz -d largefile.sql.gz
Conclusion
gzip is fast, available everywhere and ideal for single files such as logs or dumps. Always remember that it replaces the original: use -k when you want to keep it, and run gzip -t before you rely on an archive. gzip has no container format for multiple files, so to pack a whole directory into one file, combine it with tar (tar czf). For large data sets the parallel pigz pays off.
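The advice above can be sketched as a short script: pack a directory with tar czf, then check the result with gzip -t before relying on it. The directory name project and the archive name project.tar.gz are hypothetical placeholders created in a scratch directory.

```shell
#!/bin/sh
# Scratch setup; "project" and its files are made-up names for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir project
printf 'hello\n' > project/a.txt
printf 'world\n' > project/b.txt

# gzip alone cannot bundle a directory, so tar does the packing
# and the z option sends the result through gzip.
tar czf project.tar.gz project

# gzip -t exits non-zero if the compressed stream is damaged.
if gzip -t project.tar.gz; then
    echo "archive OK"
    tar tzf project.tar.gz    # list the packed files
else
    echo "archive is corrupt" >&2
fi
```

On large data sets, one way to swap in pigz (assuming it is installed) is tar cf - project | pigz > project.tar.gz, which produces the same kind of archive.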