sort — Sort Lines of Text Files
Practical guide to sort: order lines alphabetically, numerically or by version — with keys, fields, locale control and tuning for large files.
sort orders the lines of a file or a data stream – alphabetically by default, but on demand also numerically, by version, by month name or by human-readable sizes such as 2K or 1G. With keys (-k) and a freely chosen field separator (-t) you sort by specific columns, for example in CSV files or logs. In pipelines, sort teamed up with uniq is the classic duo for frequency analysis. For huge files you can control the memory buffer, the temp directory and even parallel processing. This guide walks you through the options you actually reach for day to day.
Basic Usage
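Before going through the individual options, here is a minimal sketch that runs the basic ones on a throwaway file. The file name names.txt and its contents are made-up sample data, and LC_ALL=C is set only so that the ordering is the same on every system.

```shell
# Hypothetical sample data in a temporary directory.
tmp=$(mktemp -d)
printf 'carol\nalice\nbob\nalice\n' > "$tmp/names.txt"

# LC_ALL=C pins the collation so the result does not depend on the locale.
ascending=$(LC_ALL=C sort "$tmp/names.txt")
descending=$(LC_ALL=C sort -r "$tmp/names.txt")
unique=$(LC_ALL=C sort -u "$tmp/names.txt")

# -c only checks the order; a non-zero exit status means "not sorted".
if LC_ALL=C sort -c "$tmp/names.txt" 2>/dev/null; then
  check=sorted
else
  check=unsorted
fi

printf 'unique lines:\n%s\ncheck: %s\n' "$unique" "$check"
rm -rf "$tmp"
```

Because the sample file starts with carol followed by alice, the -c check reports it as unsorted.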
sort <file>: Sort lines in ascending order. Comparison follows the current locale's collation rules; set LC_ALL=C for strict byte-wise, case-sensitive ordering.
sort names.txt

sort -r <file>: Sort in reverse (descending) order.
sort -r names.txt

sort -o <output> <file>: Write the result to a file. It is safe to use the same file as input and output.
sort -o sorted.txt data.txt

sort -u <file>: Sort and remove duplicate lines (unique).
sort -u emails.txt

sort -c <file>: Check whether a file is already sorted. Reports the first out-of-order line and exits with a non-zero status.
sort -c data.txt

Sorting Modes
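The modes in this section differ only in how two lines are compared. The sketch below feeds the same kind of made-up input through several of them; -V and -h are assumed to be available, as they are in GNU sort.

```shell
# Join lines with spaces so each result prints on one line.
oneline() { paste -sd ' ' -; }

# Lexical (default) versus numeric comparison of the same input.
lexical=$(printf '10\n9\n100\n' | LC_ALL=C sort | oneline)
numeric=$(printf '10\n9\n100\n' | LC_ALL=C sort -n | oneline)

# -V compares embedded numbers as numbers, so 1.10 sorts after 1.9.
versions=$(printf '1.10\n1.2\n1.9\n' | LC_ALL=C sort -V | oneline)

# -h understands size suffixes such as K, M and G.
sizes=$(printf '1G\n2K\n3M\n' | LC_ALL=C sort -h | oneline)

printf 'lexical:  %s\n' "$lexical"
printf 'numeric:  %s\n' "$numeric"
printf 'versions: %s\n' "$versions"
printf 'sizes:    %s\n' "$sizes"
```

The lexical result (10 100 9) is the classic surprise: without -n, sort compares character by character, and "1" sorts before "9".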
sort -n <file>: Sort numerically instead of alphabetically.
sort -n scores.txt

sort -h <file>: Sort by human-readable numbers (e.g. 2K, 3M, 1G).
du -sh * | sort -h

sort -V <file>: Sort version numbers naturally (e.g. 1.2 < 1.10).
sort -V versions.txt

sort -M <file>: Sort by month name (Jan < Feb < Mar ...).
sort -M months.txt

sort -R <file>: Sort in random order. Lines are ordered by a random hash of their keys, so identical lines stay together; use shuf for a true shuffle.
sort -R playlist.txt

sort -g <file>: Sort by general numeric value. Supports scientific notation (e.g. 1.5e3), but is slower than -n.
sort -g measurements.txt

Key & Field Selection
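The most common stumbling block with keys is where a key ends. As a rough sketch on made-up input: -k 2 uses everything from field 2 to the end of the line, while -k 2,2 stops at the end of field 2. The sample records (students, passwd-style lines) are hypothetical.

```shell
# Join lines with " | " so each result prints on one line.
oneline() { paste -sd '|' - | sed 's/|/ | /g'; }

# Key runs to end of line: " a 1" sorts before " a 2", so "y a 1" comes first.
to_eol=$(printf 'x a 2\ny a 1\n' | LC_ALL=C sort -k 2 | oneline)

# Key is field 2 only: both keys are equal, so the whole line breaks the tie.
field_only=$(printf 'x a 2\ny a 1\n' | LC_ALL=C sort -k 2,2 | oneline)

# Two keys: name first (text), then score (numeric).
students=$(printf 'bob 10\nalice 9\nbob 9\nalice 10\n' \
  | LC_ALL=C sort -k 1,1 -k 2,2n | oneline)

# Custom separator: sort passwd-style records by the numeric third field.
by_uid=$(printf 'user:x:1000\nroot:x:0\ndaemon:x:1\n' \
  | LC_ALL=C sort -t ':' -k 3,3n | oneline)

printf '%s\n' "$to_eol" "$field_only" "$students" "$by_uid"
```

Writing the key as start,end is a good habit whenever a single column is meant, because it keeps later columns from influencing the order.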
sort -k <field> <file>: Sort by a key that starts at the given field (1-based) and runs to the end of the line. Without -t, fields are separated by runs of blanks.
sort -k 2 data.txt

sort -k <start>,<end> <file>: Sort by the field range from start to end (inclusive). Use the same number twice to restrict the key to a single field.
sort -k 2,2 data.txt

sort -t '<sep>' -k <field> <file>: Set the field separator and sort starting at a specific field.
sort -t ',' -k 3 data.csv

sort -t ':' -k 3 -n <file>: Sort by the numeric value of a specific field with a custom delimiter.
sort -t ':' -k 3 -n /etc/passwd

sort -k <field>n <file>: Sort a specific field numerically. The modifier is appended to the key and applies to that key only.
sort -k 2n scores.txt

sort -k <f1>,<f1> -k <f2>,<f2>n <file>: Sort by multiple keys. The first key is primary; the second breaks ties.
sort -k 1,1 -k 2,2n students.txt

Case & Locale
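How case is handled depends on both the locale and the options in this section. A small sketch with made-up words shows the two extremes in the C locale: plain byte order, where all upper-case letters sort before lower-case ones, and folded order with -f.

```shell
# Join lines with spaces so each result prints on one line.
oneline() { paste -sd ' ' -; }

# Byte order: "A" and "C" (65, 67) sort before "b" (98).
bytes=$(printf 'banana\nApple\nCherry\n' | LC_ALL=C sort | oneline)

# -f folds lower case to upper case before comparing.
folded=$(printf 'banana\nApple\nCherry\n' | LC_ALL=C sort -f | oneline)

printf 'byte order: %s\n' "$bytes"
printf 'folded:     %s\n' "$folded"
```

In a UTF-8 locale the unmodified sort may already interleave upper and lower case, which is why pinning LC_ALL=C makes scripts reproducible.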
sort -f <file>: Fold lower case to upper case (case-insensitive sorting).
sort -f mixed-case.txt

sort -d <file>: Dictionary order. Consider only blanks and alphanumeric characters.
sort -d words.txt

sort -i <file>: Ignore non-printable characters when sorting.
sort -i data.txt

LC_ALL=C sort <file>: Sort using byte values (C locale). Faster and consistent across systems.
LC_ALL=C sort large-file.txt

Whitespace & Stability
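Stability is easiest to understand by comparison. In the sketch below (made-up input), both commands sort on field 1 only. Without -s, sort falls back to comparing whole lines when keys are equal; with -s, lines with equal keys keep their input order. The last command shows NUL-delimited records, assuming a sort that supports -z.

```shell
# Join lines with " | " so each result prints on one line.
oneline() { paste -sd '|' - | sed 's/|/ | /g'; }

input='b 1\na 2\nb 0\na 1\n'

# Equal keys are ordered by a last-resort comparison of the whole line.
default_order=$(printf "$input" | LC_ALL=C sort -k 1,1 | oneline)

# -s keeps equal-key lines in their original order.
stable_order=$(printf "$input" | LC_ALL=C sort -s -k 1,1 | oneline)

# -z treats NUL as the record terminator; tr converts it back for display.
nul_sorted=$(printf 'b\0a\0' | LC_ALL=C sort -z | tr '\0' '\n' | oneline)

printf 'default: %s\nstable:  %s\nnul:     %s\n' \
  "$default_order" "$stable_order" "$nul_sorted"
```

A stable sort is what makes multi-pass sorting possible: sort by the secondary criterion first, then stably by the primary one.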
sort -b <file>: Ignore leading blanks when determining sort keys.
sort -b indented.txt

sort -s <file>: Stable sort. Preserve original order of lines with equal keys.
sort -s -k 1,1 data.txt

sort -z <file>: Use NUL as line delimiter instead of newline. Useful with find -print0.
find . -print0 | sort -z

Merging & Large Files
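As a rough illustration of this section, the sketch below merges two pre-sorted files and then runs a regular sort with the tuning options set explicitly. The file names and the values 16M and 2 are arbitrary choices for the demo, and -S, -T and --parallel are assumed to behave as in GNU sort. On inputs this small the tuning has no visible effect; it only shows that the options combine.

```shell
tmp=$(mktemp -d)
printf 'a\nc\ne\n' > "$tmp/sorted1.txt"
printf 'b\nd\n'    > "$tmp/sorted2.txt"

# -m interleaves inputs that are already sorted; no full sort is performed.
merged=$(LC_ALL=C sort -m "$tmp/sorted1.txt" "$tmp/sorted2.txt" | paste -sd ' ' -)

# Buffer size, temp directory and thread count set explicitly (GNU sort).
tuned=$(LC_ALL=C sort -S 16M -T "$tmp" --parallel=2 \
  "$tmp/sorted2.txt" "$tmp/sorted1.txt" | paste -sd ' ' -)

printf 'merged: %s\ntuned:  %s\n' "$merged" "$tuned"
rm -rf "$tmp"
```

Note that -m trusts its inputs: if a file is not actually sorted, the merged output will not be sorted either.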
sort -m <file1> <file2>: Merge already sorted files without re-sorting.
sort -m sorted1.txt sorted2.txt

sort -S <size> <file>: Use the specified amount of memory for the sorting buffer.
sort -S 2G huge-file.txt

sort -T <dir> <file>: Use the specified directory for temporary files instead of $TMPDIR or /tmp.
sort -T /data/tmp huge-file.txt

sort --parallel=<n> <file>: Run up to N sorts concurrently (GNU sort).
sort --parallel=4 huge-file.txt

Pipelines
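The frequency-count pipeline from this section can be tried on a tiny, made-up access log. The log format (client address in the first column) is an assumption for the demo; the final awk only strips the padding that uniq -c puts in front of each count.

```shell
tmp=$(mktemp -d)
cat > "$tmp/access.log" <<'EOF'
10.0.0.1 GET /
10.0.0.3 GET /about
10.0.0.1 GET /
10.0.0.2 GET /
10.0.0.3 GET /
10.0.0.1 GET /contact
EOF

# 1) extract the address, 2) group identical lines, 3) count, 4) largest first.
counts=$(awk '{print $1}' "$tmp/access.log" \
  | LC_ALL=C sort | uniq -c | LC_ALL=C sort -rn \
  | awk '{print $1, $2}')

# The most frequent client is the first line of the result.
top=$(printf '%s\n' "$counts" | head -n 1)

printf '%s\n' "$counts"
rm -rf "$tmp"
```

The first sort is what makes uniq -c count correctly, since uniq only merges neighbouring lines.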
<command> | sort: Sort the output of any command.
ls -1 | sort

<command> | sort -n: Sort command output numerically.
wc -l *.txt | sort -n

<command> | sort | uniq -c | sort -rn: Count occurrences and show most frequent first.
awk '{print $1}' access.log | sort | uniq -c | sort -rn

<command> | sort -u: Sort and deduplicate output in one step.
cat file1.txt file2.txt | sort -u

<command> | sort -t '<sep>' -k <field> -rn | head -n <count>: Get the top N entries sorted by a numeric field.
du -s */ | sort -rn | head -n 10

Common Patterns
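A few of the patterns in this section, sketched on made-up data. All file names and contents are hypothetical. The last command is a variation on the header-skipping pattern that prints the header first instead of dropping it.

```shell
tmp=$(mktemp -d)
oneline() { paste -sd ' ' -; }

# Per-octet numeric sort; a lexical sort would put 10.0.0.10 before 10.0.0.9.
printf '10.0.0.10\n192.168.1.1\n10.0.0.9\n9.1.1.1\n' > "$tmp/ips.txt"
ips=$(LC_ALL=C sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n "$tmp/ips.txt" | oneline)

# Set operations; each input file must be free of internal duplicates.
printf 'alice\nbob\n' > "$tmp/users1.txt"
printf 'bob\ncarol\n' > "$tmp/users2.txt"
common=$(LC_ALL=C sort "$tmp/users1.txt" "$tmp/users2.txt" | uniq -d | oneline)
only_one=$(LC_ALL=C sort "$tmp/users1.txt" "$tmp/users2.txt" | uniq -u | oneline)

# Keep the CSV header on top and sort only the data rows by column 3.
printf 'name,region,amount\na,eu,30\nb,us,5\nc,eu,12\n' > "$tmp/sales.csv"
with_header=$({ head -n 1 "$tmp/sales.csv"
  tail -n +2 "$tmp/sales.csv" | LC_ALL=C sort -t ',' -k 3,3n; } | oneline)

printf '%s\n' "$ips" "$common" "$only_one" "$with_header"
rm -rf "$tmp"
```

If an input file may contain repeated lines, running sort -u on each file first keeps uniq -d and uniq -u from reporting false matches.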
sort -t ',' -k 2,2 -k 3,3n <file>: Sort CSV by column 2 (alphabetic), then column 3 (numeric).
sort -t ',' -k 2,2 -k 3,3n employees.csv

sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n <file>: Sort IP addresses numerically by each octet.
sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n ips.txt

sort -rn <file> | head -n 1: Find the largest numeric value in a file.
sort -rn numbers.txt | head -n 1

sort <file1> <file2> | uniq -d: Find lines common to two files. This assumes neither file contains duplicate lines of its own.
sort users1.txt users2.txt | uniq -d

sort <file1> <file2> | uniq -u: Find lines that appear in only one of the two files. The same assumption applies.
sort old-list.txt new-list.txt | uniq -u

tail -n +2 <file> | sort -t ',' -k <field>n: Sort a CSV file by a numeric column, skipping the header row. The header is dropped from the output.
tail -n +2 sales.csv | sort -t ',' -k 3n

Conclusion
sort is one of the most versatile text tools in the shell and combines seamlessly with uniq, head and cut. Mind the locale: in a UTF-8 environment sort behaves differently than with LC_ALL=C, which can produce surprising orderings and noticeable speed differences – for reproducible, fast results prefix LC_ALL=C. With -o you can safely write back to the same file; a plain > would instead truncate the input before sort reads it. And remember: a downstream uniq only removes immediately adjacent duplicates, which is why a sort almost always belongs in front of it.
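The two pitfalls above are easy to reproduce. The following sketch uses a throwaway file with made-up contents: it shows that uniq alone misses non-adjacent duplicates, that redirecting into the input file empties it, and that -o sorts the same file in place without data loss.

```shell
tmp=$(mktemp -d)
printf 'b\na\nb\n' > "$tmp/data.txt"
cp "$tmp/data.txt" "$tmp/copy.txt"

# uniq alone: the two "b" lines are not adjacent, so all 3 lines survive.
without_sort=$(uniq "$tmp/data.txt" | wc -l | tr -d ' ')

# sort first: duplicates become neighbours and collapse to 2 lines.
with_sort=$(LC_ALL=C sort "$tmp/data.txt" | uniq | wc -l | tr -d ' ')

# The shell truncates copy.txt before sort starts, so sort reads nothing.
LC_ALL=C sort "$tmp/copy.txt" > "$tmp/copy.txt"
truncated_size=$(wc -c < "$tmp/copy.txt" | tr -d ' ')

# -o reads all input before opening the output, so in-place sorting is safe.
LC_ALL=C sort -o "$tmp/data.txt" "$tmp/data.txt"
in_place=$(paste -sd ' ' "$tmp/data.txt")

printf 'uniq only: %s lines, sort|uniq: %s lines\n' "$without_sort" "$with_sort"
printf 'after ">": %s bytes, after -o: %s\n' "$truncated_size" "$in_place"
rm -rf "$tmp"
```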
Further Reading
- GNU Coreutils manual: sort – complete reference for every option
- man7.org: sort(1) – the Linux manual page