split — Break Files into Smaller Pieces
Practical guide to split — break large files into pieces by line count, byte size or number of chunks. Useful for processing, transfer and parallel jobs.
split breaks a file into several smaller pieces – by line count (-l), byte size (-b) or a fixed number of chunks (-n). That is handy for making huge log files manageable, chopping large archives for transfer, or splitting data for parallel processing. By default the pieces get an alphabetic suffix (xaa, xab, …), which you can tailor with a custom prefix as well as numeric suffixes and extensions. You reassemble the pieces with a simple cat, as long as you preserve their order.
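The default naming and the cat-based reassembly described above can be tried out with a small sketch. It assumes GNU coreutils; the scratch directory and the file names sample.txt and rebuilt.txt are made up for illustration.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: 2500 numbered lines.
seq 1 2500 > sample.txt

# Pieces of 1000 lines each; with no prefix given, split names them xaa, xab, xac.
split -l 1000 sample.txt
ls x*

# The shell expands x* in sorted order, so cat restores the original sequence.
cat x* > rebuilt.txt
cmp sample.txt rebuilt.txt && echo "identical"
```

The last piece (xac) holds only the remaining 500 lines; split never pads pieces.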
Split by Lines
split -l <n> <file> — Split file into pieces of n lines each.
split -l 1000 largefile.txt
split -l <n> <file> <prefix> — Split with a custom output prefix.
split -l 500 data.csv part_
split -l 1 <file> — Split each line into its own file.
split -l 1 urls.txt url_
Split by Size
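As a quick illustration of how the two size-based modes in this section differ, the sketch below runs -b and -C with the same limit on a tiny file. It assumes GNU coreutils; the 25-byte limit and the file and prefix names are chosen only to keep the result easy to read.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: 10 lines of exactly 10 bytes each (9 characters plus newline).
seq -f 'line-%04g' 0 9 > sample.txt

# -b cuts strictly every 25 bytes, so pieces end in the middle of a line.
split -b 25 sample.txt bytes_

# -C packs as many whole lines as fit into 25 bytes, here two lines per piece.
split -C 25 sample.txt lines_

# Compare piece sizes in bytes.
wc -c bytes_* lines_*

# Both variants reassemble to the original file.
cat bytes_* | cmp - sample.txt && cat lines_* | cmp - sample.txt && echo "both identical"
```

For text that will be processed line by line, -C is usually the safer choice; -b suits binary data such as archives, where line boundaries mean nothing.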
split -b <size> <file> — Split into pieces of specified byte size (K, M, G suffixes).
split -b 10M largefile.tar.gz chunk_
split -C <size> <file> — Split at line boundaries, keeping each piece at or under the size limit.
split -C 1M logfile.txt log_
split -b 100K <file> <prefix> — Split into 100 KiB chunks (K means 1024 bytes) with a custom prefix.
split -b 100K backup.sql sql_
Split by Count
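The three forms of -n covered in this section can be compared side by side on a small input. This is a minimal sketch assuming GNU coreutils; tasks.txt and the prefixes by_, ln_ and rr_ are illustrative names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: nine one-digit lines (18 bytes in total).
seq 1 9 > tasks.txt

# Plain -n 3: three pieces of equal byte size; lines may be cut in general.
split -n 3 tasks.txt by_

# l/3: three pieces, each ending on a line boundary.
split -n l/3 tasks.txt ln_

# r/3: lines are dealt out in turn, so rr_aa gets lines 1, 4, 7,
# rr_ab gets 2, 5, 8 and rr_ac gets 3, 6, 9.
split -n r/3 tasks.txt rr_

head rr_*
```

Note that concatenating round-robin pieces with cat does not restore the original line order, so r/N fits work distribution better than transfer.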
split -n <n> <file> — Split into exactly n files of roughly equal size.
split -n 5 largefile.txt part_
split -n l/<n> <file> — Split into n files without breaking lines.
split -n l/4 data.csv quarter_
split -n r/<n> <file> — Distribute lines round-robin across n files.
split -n r/3 tasks.txt worker_
Output Options
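The naming options in this section combine freely. The following sketch, assuming GNU coreutils and an installed gzip, shows numeric suffixes, a custom suffix length and an extension in one call, followed by --filter. The data and the prefixes part_ and gz_ are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: 2500 numbered lines standing in for CSV rows.
seq 1 2500 > data.csv

# Numeric 3-character suffixes plus an extension:
# part_000.csv, part_001.csv, part_002.csv. --verbose reports each file.
split -d -a 3 -l 1000 --additional-suffix='.csv' --verbose data.csv part_

# --filter runs the command once per piece; split sets $FILE to the name the
# piece would have had. Single quotes stop the calling shell from expanding
# $FILE before split sees it.
split -l 1000 --filter='gzip > "$FILE.gz"' data.csv gz_

ls
```

With --filter, no uncompressed pieces are written to disk; only gz_aa.gz, gz_ab.gz and gz_ac.gz appear.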
split -d <file> — Use numeric suffixes (00, 01, 02...) instead of alphabetic (aa, ab, ac...).
split -d -l 1000 data.csv part_
split -a <n> <file> — Set the suffix length (default is 2).
split -a 4 -l 100 huge.txt piece_
split --additional-suffix='.txt' <file> — Add a file extension to output files.
split -l 500 --additional-suffix='.csv' data.csv part_
split --verbose <file> — Print a message for each output file created.
split --verbose -l 1000 data.txt chunk_
split --filter='<cmd>' <file> — Pipe each piece through a command instead of writing to files.
split -l 1000 --filter='gzip > $FILE.gz' data.txt part_
Common Patterns
split -b 25M file.tar.gz part_ && cat part_* > file.tar.gz — Split a large file for transfer, then reassemble.
split -b 25M backup.tar.gz upload_ && cat upload_* > backup_restored.tar.gz
split -l 1000 data.csv batch_ && for f in batch_*; do process "$f"; done — Split data and process each batch.
split -l 1000 users.csv batch_ && for f in batch_*; do ./import.sh "$f"; done
wc -l <file> && split -n l/<n> <file> — Check line count, then split evenly for parallel processing.
wc -l data.csv && split -n l/$(nproc) data.csv worker_
Conclusion
split is the go-to when a file is too big for a tool, an upload limit or the available memory. Remember the difference between the modes: -b cuts strictly by bytes (and may break mid-line), -C and -n l/… respect line boundaries (-C only breaks a line that is itself longer than the size limit), and -n r/… distributes lines round-robin. Reassembly depends on order alone: cat prefix_* works because the shell expands the pattern in sorted order, and split's fixed-width suffixes sort in the order the pieces were written. Choose a suffix length (-a) large enough for the expected number of pieces so that this ordering holds. After reassembling large binary files, it is wise to verify the checksum.
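The checksum advice can be sketched as a complete round trip. This example assumes GNU coreutils (split, sha256sum, head -c); archive.bin is a hypothetical stand-in for a real archive, and the restored directory plays the role of the receiving machine.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for a large binary: 1 MiB of random bytes.
head -c 1048576 /dev/urandom > archive.bin

# Record the checksum before splitting; this file travels with the pieces.
sha256sum archive.bin > archive.bin.sha256

# 100 KiB pieces with 3-digit numeric suffixes, so the glob order stays correct.
split -b 100K -d -a 3 archive.bin upload_

# "Receiving side": reassemble under the original name, then verify.
mkdir restored
cat upload_* > restored/archive.bin
(cd restored && sha256sum -c ../archive.bin.sha256)
```

sha256sum -c exits with a non-zero status if the reassembled file differs, which makes the check easy to use in scripts.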
Further Reading
- GNU coreutils: split – official reference with all options
- split(1) man page – the Linux manual page for split