# sort — Sort Lines of Text Files

> Practical guide to sort: order lines alphabetically, numerically or by version — with keys, fields, locale control and tuning for large files.

Source: https://www.jpkc.com/db/en/cheatsheets/files-text/sort/

<!-- PROSE:intro -->
sort orders the lines of a file or a data stream – alphabetically by default, but on demand also numerically, by version, by month name or by human-readable sizes such as 2K or 1G. With keys (`-k`) and a freely chosen field separator (`-t`) you sort by specific columns, for example in CSV files or logs. In pipelines, sort teamed up with `uniq` is the classic duo for frequency analysis. For huge files you can control the memory buffer, the temp directory and even parallel processing. This guide walks you through the options you actually reach for day to day.
<!-- PROSE:intro:end -->

## Basic Usage

`sort <file>` — Sort lines in ascending order. The collation, including how upper and lower case compare, follows the current locale.

```bash
sort names.txt
```

`sort -r <file>` — Sort in reverse (descending) order.

```bash
sort -r names.txt
```

`sort -o <output> <file>` — Write result to a file. Safe to use same file as input and output.

```bash
sort -o sorted.txt data.txt
```
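
A minimal sketch of sorting a file in place, assuming a throwaway file created with `mktemp` (the contents are illustrative). `-o` works here because sort reads all of its input before it opens the output file, whereas with `sort file > file` the shell truncates the file before sort starts.

```bash
# Illustrative data in a temporary file (hypothetical contents).
f=$(mktemp)
printf 'cherry\napple\nbanana\n' > "$f"

# Input and output are the same path; -o makes this safe.
LC_ALL=C sort -o "$f" "$f"

in_place=$(cat "$f")
echo "$in_place"

rm -f "$f"
```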

`sort -u <file>` — Sort and remove duplicate lines (unique).

```bash
sort -u emails.txt
```

`sort -c <file>` — Check if a file is already sorted. Reports the first out-of-order line on stderr and exits with a non-zero status; prints nothing and exits 0 if the file is sorted.

```bash
sort -c data.txt
```
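
In scripts, the exit status is usually what matters. A small sketch, assuming an illustrative temporary file, that uses `-C` (the quiet variant of `-c`) in a conditional:

```bash
# Illustrative, deliberately unsorted input.
f=$(mktemp)
printf 'a\nc\nb\n' > "$f"

# -C only sets the exit status; -c would also print a diagnostic on stderr.
if LC_ALL=C sort -C "$f"; then
  state=sorted
else
  state=unsorted
fi
echo "$state"

rm -f "$f"
```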

## Sorting Modes

`sort -n <file>` — Sort numerically instead of alphabetically.

```bash
sort -n scores.txt
```

`sort -h <file>` — Sort by human-readable numbers (e.g. 2K, 1G, 3M).

```bash
du -sh * | sort -h
```

`sort -V <file>` — Sort version numbers naturally (e.g. 1.2 < 1.10).

```bash
sort -V versions.txt
```
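
To see why `-V` matters, the following sketch compares the default byte-wise order with version order on three made-up version strings. It assumes a sort that supports `-V`, such as GNU sort.

```bash
# Hypothetical version strings.
versions=$(printf '1.10\n1.2\n1.9\n')

# Plain sort compares character by character, so 1.10 lands before 1.2.
lexical=$(printf '%s\n' "$versions" | LC_ALL=C sort)

# -V compares the numeric components, so 1.10 comes last.
natural=$(printf '%s\n' "$versions" | LC_ALL=C sort -V)

printf 'lexical:\n%s\n\nversion:\n%s\n' "$lexical" "$natural"
```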

`sort -M <file>` — Sort by month name (Jan < Feb < Mar ...).

```bash
sort -M months.txt
```

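`sort -R <file>` — Sort by a random hash of the keys. Lines come out in random order, but identical lines stay grouped together; for a true shuffle use `shuf`.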

```bash
sort -R playlist.txt
```
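
A quick way to observe the grouping behaviour of `-R`, assuming GNU sort: four input lines with only two distinct values always collapse to two lines after `uniq`, because equal lines hash to the same value and end up adjacent.

```bash
# Two distinct values, each appearing twice, interleaved.
groups=$(printf 'a\nb\na\nb\n' | sort -R | uniq | wc -l | tr -d ' ')

# The order of the two groups is random, but duplicates are always adjacent.
echo "$groups"
```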

`sort -g <file>` — Sort by general numeric value. Supports scientific notation (e.g. 1.5e3).

```bash
sort -g measurements.txt
```

## Key & Field Selection

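`sort -k <field> <file>` — Sort by a key that starts at the given field (1-based) and runs to the end of the line. By default, fields are separated by the transition from non-blank to blank characters.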

```bash
sort -k 2 data.txt
```

`sort -k <start>,<end> <file>` — Sort by field range from start to end (inclusive).

```bash
sort -k 2,2 data.txt
```
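
The difference between an open key (`-k 2`) and a closed key (`-k 2,2`) is easiest to see on data where field 2 ties. A sketch with made-up three-column lines:

```bash
# Hypothetical input: field 2 is "b" on two lines.
data=$(printf 'x b 2\ny b 1\nz a 3\n')

# -k 2: the key runs from field 2 to end of line, so "b 1" sorts before "b 2".
open_key=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2)

# -k 2,2: only field 2 is compared; the tie falls back to whole-line order.
closed_key=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2,2)

printf -- '-k 2:\n%s\n\n-k 2,2:\n%s\n' "$open_key" "$closed_key"
```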

`sort -t '<sep>' -k <field> <file>` — Set field separator and sort by a specific field.

```bash
sort -t ',' -k 3 data.csv
```

`sort -t ':' -k 3 -n <file>` — Sort by numeric value of a specific field with custom delimiter.

```bash
sort -t ':' -k 3 -n /etc/passwd
```

`sort -k <field>n <file>` — Sort a specific field numerically. Modifier is appended to the key.

```bash
sort -k 2n scores.txt
```

`sort -k <f1>,<f1> -k <f2>,<f2>n <file>` — Sort by multiple keys. First key is primary, second is secondary.

```bash
sort -k 1,1 -k 2,2n students.txt
```
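
A worked example of the two-key sort, using invented name/score pairs: names are ordered alphabetically, and within each name the scores are ordered numerically rather than as text.

```bash
# Hypothetical name/score data.
students=$(printf 'bob 90\nalice 85\nbob 72\nalice 100\n')

# Primary key: field 1 as text. Secondary key: field 2 as a number.
ranked=$(printf '%s\n' "$students" | LC_ALL=C sort -k 1,1 -k 2,2n)

echo "$ranked"
```

Without the `n` modifier on the second key, `alice 100` would sort before `alice 85`, because `1` precedes `8` as a character.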

## Case & Locale

`sort -f <file>` — Fold lower case to upper case (case-insensitive sorting).

```bash
sort -f mixed-case.txt
```

`sort -d <file>` — Dictionary order. Consider only blanks and alphanumeric characters.

```bash
sort -d words.txt
```

`sort -i <file>` — Ignore non-printable characters when sorting.

```bash
sort -i data.txt
```

`LC_ALL=C sort <file>` — Sort using byte values (C locale). Faster and consistent across systems.

```bash
LC_ALL=C sort large-file.txt
```
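
The effect of the locale is visible even on four letters. In this sketch, the C-locale result is fixed (all uppercase letters precede all lowercase ones), while the result under the ambient locale depends on the system and is therefore only printed, not assumed.

```bash
letters=$(printf 'b\nA\na\nB\n')

# Byte order: uppercase ASCII letters have lower byte values than lowercase.
c_order=$(printf '%s\n' "$letters" | LC_ALL=C sort)

# Whatever locale the shell currently uses; output varies between systems.
locale_order=$(printf '%s\n' "$letters" | sort)

printf 'C locale:\n%s\n\ncurrent locale:\n%s\n' "$c_order" "$locale_order"
```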

## Whitespace & Stability

`sort -b <file>` — Ignore leading blanks when determining sort keys.

```bash
sort -b indented.txt
```

`sort -s <file>` — Stable sort. Preserve original order of lines with equal keys.

```bash
sort -s -k 1,1 data.txt
```
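
What `-s` changes is the handling of ties. Without it, sort breaks ties by comparing the whole line as a last resort; with it, tied lines keep their input order. A sketch on made-up data:

```bash
# Hypothetical input: field 1 ties within the "a" and "b" groups.
data=$(printf 'b 1\na 2\nb 0\na 1\n')

# Stable: within each group the original order (2 before 1, 1 before 0) survives.
stable=$(printf '%s\n' "$data" | LC_ALL=C sort -s -k 1,1)

# Default: ties are resolved by comparing the entire line.
unstable=$(printf '%s\n' "$data" | LC_ALL=C sort -k 1,1)

printf 'with -s:\n%s\n\nwithout -s:\n%s\n' "$stable" "$unstable"
```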

`sort -z <file>` — Use NUL as line delimiter instead of newline. Useful with `find -print0`.

```bash
find . -print0 | sort -z
```

## Merging & Large Files

`sort -m <file1> <file2>` — Merge already sorted files without re-sorting.

```bash
sort -m sorted1.txt sorted2.txt
```
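
A small sketch of merging, assuming two illustrative temporary files that are each already sorted. The inputs must use the same ordering (here the C locale) that the merge uses.

```bash
# Two pre-sorted inputs (hypothetical contents).
a=$(mktemp)
b=$(mktemp)
printf 'a\nc\ne\n' > "$a"
printf 'b\nd\n' > "$b"

# -m interleaves the inputs without sorting them again.
merged=$(LC_ALL=C sort -m "$a" "$b")
echo "$merged"

rm -f "$a" "$b"
```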

`sort -S <size> <file>` — Use specified amount of memory for sorting buffer.

```bash
sort -S 2G huge-file.txt
```

`sort -T <dir> <file>` — Use the specified directory for temporary files instead of the default (`$TMPDIR`, or `/tmp` if unset).

```bash
sort -T /data/tmp huge-file.txt
```

`sort --parallel=<n> <file>` — Run up to N sorts concurrently (GNU sort).

```bash
sort --parallel=4 huge-file.txt
```
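
The tuning options combine freely. The following sketch assumes GNU sort (for `--parallel`) and uses a tiny stand-in file; the buffer size, thread count and directory are illustrative values, not recommendations.

```bash
# A scratch directory that doubles as the temp-file location.
workdir=$(mktemp -d)
printf '3\n1\n2\n' > "$workdir/input.txt"

# 64M buffer, temp files in $workdir, up to 2 threads, byte-wise collation.
LC_ALL=C sort -n -S 64M -T "$workdir" --parallel=2 \
  -o "$workdir/sorted.txt" "$workdir/input.txt"

tuned=$(cat "$workdir/sorted.txt")
echo "$tuned"

rm -rf "$workdir"
```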

## Pipelines

`<command> | sort` — Sort the output of any command.

```bash
ls -1 | sort
```

`<command> | sort -n` — Sort command output numerically.

```bash
wc -l *.txt | sort -n
```

`<command> | sort | uniq -c | sort -rn` — Count occurrences and show most frequent first.

```bash
awk '{print $1}' access.log | sort | uniq -c | sort -rn
```
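
Here is the same frequency pipeline on a handful of made-up addresses, so the intermediate logic is visible. The trailing `awk` is only an addition for this sketch: it normalizes the padding that `uniq -c` puts in front of the counts.

```bash
# Hypothetical client addresses, one per request.
top=$(printf '10.0.0.1\n10.0.0.2\n10.0.0.1\n10.0.0.3\n10.0.0.1\n10.0.0.2\n' |
  LC_ALL=C sort |          # bring identical lines together
  uniq -c |                # prefix each distinct line with its count
  sort -rn |               # highest count first
  awk '{print $1, $2}')    # strip the count padding

echo "$top"
```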

`<command> | sort -u` — Sort and deduplicate output in one step.

```bash
cat file1.txt file2.txt | sort -u
```

`<command> | sort -rn | head -n <count>` — Get the top N entries by leading numeric value. Add `-t '<sep>' -k <field>,<field>` when the number sits in another column.

```bash
du -s */ | sort -rn | head -n 10
```

## Common Patterns

`sort -t ',' -k 2,2 -k 3,3n <file>` — Sort CSV by column 2 (alphabetic), then column 3 (numeric).

```bash
sort -t ',' -k 2,2 -k 3,3n employees.csv
```

`sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n <file>` — Sort IP addresses numerically by each octet.

```bash
sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n ips.txt
```

`sort -rn <file> | head -n 1` — Find the largest numeric value in a file.

```bash
sort -rn numbers.txt | head -n 1
```

`sort <file1> <file2> | uniq -d` — Find common lines between two files. Only reliable if neither file contains duplicate lines of its own.

```bash
sort users1.txt users2.txt | uniq -d
```

`sort <file1> <file2> | uniq -u` — Find lines unique to either file (not in both). Lines repeated within a single file are dropped as well.

```bash
sort old-list.txt new-list.txt | uniq -u
```
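
When a file may contain its own duplicates, the `uniq -d` pattern reports them as "common" even if the other file lacks them. One possible workaround, sketched here with illustrative temporary files, is to deduplicate each file before combining:

```bash
# Hypothetical inputs: "bob" is repeated inside the first file only.
a=$(mktemp)
b=$(mktemp)
printf 'alice\nbob\nbob\n' > "$a"
printf 'carol\nalice\n' > "$b"

# Naive form: "bob" shows up although it is absent from the second file.
naive=$(LC_ALL=C sort "$a" "$b" | uniq -d)

# Deduplicate per file first, then look for lines that still occur twice.
strict=$( { LC_ALL=C sort -u "$a"; LC_ALL=C sort -u "$b"; } | LC_ALL=C sort | uniq -d )

printf 'naive:\n%s\n\nstrict:\n%s\n' "$naive" "$strict"

rm -f "$a" "$b"
```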

`tail -n +2 <file> | sort -t ',' -k <field>n` — Sort a CSV file by a numeric column, skipping the header row. The header is dropped from the output.

```bash
tail -n +2 sales.csv | sort -t ',' -k 3n
```
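
If the header should stay at the top of the output, one option is to print it separately and sort only the remaining rows. A sketch with an invented three-column CSV:

```bash
# Hypothetical CSV with a header row.
f=$(mktemp)
printf 'name,region,amount\nx,eu,30\ny,us,4\nz,eu,12\n' > "$f"

# Emit the header untouched, then the body sorted numerically by column 3.
with_header=$( { head -n 1 "$f"; tail -n +2 "$f" | LC_ALL=C sort -t ',' -k 3,3n; } )
echo "$with_header"

rm -f "$f"
```

This simple field splitting assumes no quoted fields containing commas; sort has no notion of CSV quoting.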

<!-- PROSE:outro -->
## Conclusion

sort is one of the most versatile text tools in the shell and combines seamlessly with `uniq`, `head` and `cut`. Mind the locale: in a UTF-8 environment sort behaves differently than with `LC_ALL=C`, which can produce surprising orderings and noticeable speed differences – for reproducible, fast results prefix `LC_ALL=C`. With `-o` you can safely write back to the same file; a plain `>` would instead truncate the input before sort reads it. And remember: a downstream `uniq` only removes immediately adjacent duplicates, which is why a sort almost always belongs in front of it.

## Further Reading

- [GNU Coreutils manual: sort](https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html) – complete reference for every option
- [man7.org: sort(1)](https://man7.org/linux/man-pages/man1/sort.1.html) – the Linux manual page
<!-- PROSE:outro:end -->

## Related Commands

- [uniq](https://www.jpkc.com/db/en/cheatsheets/files-text/uniq/) – filter and count adjacent duplicate lines
- [wc](https://www.jpkc.com/db/en/cheatsheets/files-text/wc/) – count lines, words and bytes
- [cut](https://www.jpkc.com/db/en/cheatsheets/files-text/cut/) – extract columns and fields from lines

