# split — Break Files into Smaller Pieces

> Practical guide to split — break large files into pieces by line count, byte size or number of chunks. Useful for processing, transfer and parallel jobs.

Source: https://www.jpkc.com/db/en/cheatsheets/files-text/split/

<!-- PROSE:intro -->
split breaks a file into several smaller pieces – by line count (`-l`), byte size (`-b`) or a fixed number of chunks (`-n`). That is handy for making huge log files manageable, chopping large archives for transfer, or splitting data for parallel processing. By default the pieces get an alphabetic suffix (`xaa`, `xab`, …), which you can tailor with a custom prefix as well as numeric suffixes and extensions. You reassemble the pieces with a simple `cat`, as long as you preserve their order.
<!-- PROSE:intro:end -->

## Split by Lines

`split -l <n> <file>` — Split file into pieces of n lines each (the last piece may be shorter).

```bash
split -l 1000 largefile.txt
```

`split -l <n> <file> <prefix>` — Split with a custom output prefix.

```bash
split -l 500 data.csv part_
```

`split -l 1 <file>` — Split each line into its own file.

```bash
split -l 1 urls.txt url_
```
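
To see what a line-based split produces, the following sketch builds a throwaway input and counts the lines in each piece. The temporary directory, the `sample.txt` name and the 2500-line input are assumptions made for illustration only.

```bash
# Scratch directory so the pieces do not clutter the current one.
workdir=$(mktemp -d)

# Hypothetical sample input: 2500 numbered lines.
seq 1 2500 > "$workdir/sample.txt"

# 1000 lines per piece; the last piece holds the remaining 500.
split -l 1000 "$workdir/sample.txt" "$workdir/part_"

# Show the line count of every piece (part_aa, part_ab, part_ac).
wc -l "$workdir"/part_*
```

The pieces are named in the order they are written, so `part_aa` holds lines 1 to 1000 and `part_ac` holds the short tail.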

## Split by Size

`split -b <size> <file>` — Split into pieces of specified byte size (K, M, G suffixes).

```bash
split -b 10M largefile.tar.gz chunk_
```

`split -C <size> <file>` — Split at line boundaries: each piece holds as many complete lines as fit within the size limit (a single line longer than the limit is still split).

```bash
split -C 1M logfile.txt log_
```

`split -b 100K <file> <prefix>` — Split into 100 KiB chunks (`K` means 1024 bytes) with a custom prefix.

```bash
split -b 100K backup.sql sql_
```
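
The difference between `-b` and `-C` is easiest to see on a tiny input. This is a minimal sketch assuming GNU split; the file name and the five 10-byte lines are made up for the demonstration.

```bash
workdir=$(mktemp -d)

# Hypothetical input: five lines of exactly 10 bytes each (9 characters + newline).
for i in 1 2 3 4 5; do
  printf 'line-%03d!\n' "$i"
done > "$workdir/lines.txt"

# -b cuts after every 25 bytes, so the first piece ends in the middle of line 3.
split -b 25 "$workdir/lines.txt" "$workdir/bytes_"

# -C keeps lines whole: at most 25 bytes of complete lines per piece.
split -C 25 "$workdir/lines.txt" "$workdir/lines_"

# Compare the resulting piece sizes in bytes.
wc -c "$workdir"/bytes_* "$workdir"/lines_*
```

With this input, `-b` yields two 25-byte pieces, while `-C` yields three pieces of 20, 20 and 10 bytes because only two whole lines fit under the limit.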

## Split by Count

`split -n <n> <file>` — Split into exactly n files of roughly equal byte size (lines may be cut in the middle).

```bash
split -n 5 largefile.txt part_
```

`split -n l/<n> <file>` — Split into n files without breaking lines.

```bash
split -n l/4 data.csv quarter_
```

`split -n r/<n> <file>` — Distribute lines round-robin across n files.

```bash
split -n r/3 tasks.txt worker_
```
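
The two line-aware chunk modes distribute lines differently. The sketch below, assuming GNU split and a made-up six-line task list, puts them side by side.

```bash
workdir=$(mktemp -d)

# Hypothetical task list: the numbers 1 to 6, one per line.
seq 1 6 > "$workdir/tasks.txt"

# l/3: three contiguous blocks of whole lines.
split -n l/3 "$workdir/tasks.txt" "$workdir/block_"

# r/3: lines are dealt out in turn, like cards.
split -n r/3 "$workdir/tasks.txt" "$workdir/worker_"

# Print every piece with a header showing its file name.
head "$workdir"/block_* "$workdir"/worker_*
```

With round-robin, `worker_aa` receives lines 1 and 4, `worker_ab` lines 2 and 5, and `worker_ac` lines 3 and 6. Concatenating round-robin pieces with `cat` therefore does not restore the original line order.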

## Output Options

`split -d <file>` — Use numeric suffixes (00, 01, 02...) instead of alphabetic (aa, ab, ac...).

```bash
split -d -l 1000 data.csv part_
```

`split -a <n> <file>` — Set the suffix length (default is 2).

```bash
split -a 4 -l 100 huge.txt piece_
```

`split --additional-suffix='.txt' <file>` — Add a file extension to output files.

```bash
split -l 500 --additional-suffix='.csv' data.csv part_
```

`split --verbose <file>` — Print a message for each output file created.

```bash
split --verbose -l 1000 data.txt chunk_
```

`split --filter='<cmd>' <file>` — Pipe each piece through a command instead of writing to files.

```bash
split -l 1000 --filter='gzip > "$FILE.gz"' data.txt part_
```
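
The output options can be combined. The following sketch assumes GNU split (`--additional-suffix` and `--filter` are GNU extensions) and uses a made-up 250-line input.

```bash
workdir=$(mktemp -d)

# Hypothetical input: 250 numbered lines.
seq 1 250 > "$workdir/data.txt"

# Numeric three-digit suffixes plus an extension:
# part_000.txt, part_001.txt, part_002.txt
split -d -a 3 -l 100 --additional-suffix=.txt "$workdir/data.txt" "$workdir/part_"

# Compress each piece on the fly. The single quotes keep $FILE unexpanded
# so that split can set it to the name of each piece.
split -l 100 --filter='gzip > "$FILE.gz"' "$workdir/data.txt" "$workdir/zipped_"

ls "$workdir"
```

With `--filter`, split itself writes no piece files; only what the command creates (here `zipped_aa.gz` and so on) ends up on disk.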

## Common Patterns

`split -b 25M file.tar.gz part_ && cat part_* > restored.tar.gz` — Split a large file for transfer, then reassemble it under a new name so the original is not overwritten.

```bash
split -b 25M backup.tar.gz upload_ && cat upload_* > backup_restored.tar.gz
```

`split -l 1000 data.csv batch_ && for f in batch_*; do process "$f"; done` — Split data and process each batch.

```bash
split -l 1000 users.csv batch_ && for f in batch_*; do ./import.sh "$f"; done
```

`wc -l <file> && split -n l/<n> <file>` — Check line count, then split evenly for parallel processing.

```bash
wc -l data.csv && split -n l/$(nproc) data.csv worker_
```
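
One way to actually run the pieces in parallel is `xargs -P`, which is not part of the patterns above and is assumed here to be available (GNU and BSD xargs both support it). In this sketch `wc -l` merely stands in for the real per-piece job, and the input file is made up.

```bash
workdir=$(mktemp -d)

# Hypothetical input: 1000 numbered lines.
seq 1 1000 > "$workdir/data.csv"

# One piece per CPU core, never cutting a line in half.
workers=$(nproc)
split -n l/"$workers" "$workdir/data.csv" "$workdir/worker_"

# Run one job per piece, up to $workers at a time.
# NUL-separated names keep unusual characters in paths safe.
printf '%s\0' "$workdir"/worker_* | xargs -0 -n 1 -P "$workers" wc -l
```

The jobs finish in arbitrary order, so the output lines may appear in a different sequence on each run.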

<!-- PROSE:outro -->
## Conclusion

split is the go-to when a file is too big for a tool, an upload or a chunk of memory. Remember the difference between the modes: `-b` and plain `-n <n>` cut strictly by bytes (and may break mid-line), `-C` and `-n l/…` respect line boundaries, and `-n r/…` distributes lines round-robin. Reassembly comes down to order alone: `cat prefix_*` works because the shell expands the glob in sorted order and split's fixed-width suffixes sort in the order the pieces were written. When you expect many pieces, set a longer suffix with `-a` so split does not run out of names. After reassembling large binary files, it is wise to verify the checksum.
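
A checksum comparison after reassembly might look like the following sketch. It assumes `sha256sum` from GNU coreutils is available; the file names and the random 1 MiB payload are made up for the demonstration.

```bash
workdir=$(mktemp -d)

# Hypothetical binary payload: 1 MiB of random bytes.
head -c 1048576 /dev/urandom > "$workdir/backup.bin"

# Split into 300 KiB pieces, then reassemble in glob (sorted) order.
split -b 300K "$workdir/backup.bin" "$workdir/upload_"
cat "$workdir"/upload_* > "$workdir/restored.bin"

# Compare the digests of the original and the reassembled file.
original_sum=$(sha256sum < "$workdir/backup.bin")
restored_sum=$(sha256sum < "$workdir/restored.bin")

if [ "$original_sum" = "$restored_sum" ]; then
  echo "checksums match"
else
  echo "checksum mismatch" >&2
fi
```

In a real transfer the original digest would be recorded on the sending side and compared on the receiving side after `cat`.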

## Further Reading

- [GNU coreutils: split](https://www.gnu.org/software/coreutils/manual/html_node/split-invocation.html) – official reference with all options
- [split(1) man page](https://man7.org/linux/man-pages/man1/split.1.html) – the Linux manual page for split
<!-- PROSE:outro:end -->

## Related Commands

- [cut](https://www.jpkc.com/db/en/cheatsheets/files-text/cut/) – extract fields, characters or byte ranges from lines
- [head](https://www.jpkc.com/db/en/cheatsheets/files-text/head/) – show the first lines or bytes of a file
- [tail](https://www.jpkc.com/db/en/cheatsheets/files-text/tail/) – show the last lines of a file or follow it live