At Reverse Australia, we do a lot of work with large data files. It’s not uncommon for us to want to check that the data we parse from a file matches the raw data in that file. The following grep command extracts phone numbers from a raw file:
grep '\(\(([0-9]\{2\})\)\|[0-9]\{4\}\) [0-9]\{3,4\} [0-9]\{3,4\}' -o data.txt
This matches numbers in the forms:
(02) 1234 1234
0412 123 123
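For readers who find the backslash-heavy basic syntax hard to scan, here is a minimal sketch of what should be the same pattern in extended syntax (`grep -E`), where only the literal parentheses need escaping. The sample file and its contents are made up for illustration, and `-o` is assumed to be available (it is in GNU and BSD grep, but it is not part of POSIX).

```shell
# Hypothetical sample file standing in for data.txt.
sample=$(mktemp)
printf '%s\n' \
  'Call (02) 1234 1234 during business hours' \
  'Mobile: 0412 123 123' \
  'Order 12345 has no phone number' > "$sample"

# Extended-syntax form of the pattern: grouping and alternation are
# unescaped, while \( and \) match literal parentheses.
matches=$(grep -oE '(\([0-9]{2}\)|[0-9]{4}) [0-9]{3,4} [0-9]{3,4}' "$sample")

# Print one matched number per line.
printf '%s\n' "$matches"

rm -f "$sample"
```

Only the first two sample lines contain text in either of the two forms, so the third line contributes nothing to the output.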
You can easily get all the unique numbers like so:
grep '\(\(([0-9]\{2\})\)\|[0-9]\{4\}\) [0-9]\{3,4\} [0-9]\{3,4\}' -o data.txt | sort -u
Or just get a count for comparison:
grep '\(\(([0-9]\{2\})\)\|[0-9]\{4\}\) [0-9]\{3,4\} [0-9]\{3,4\}' -o data.txt | sort -u | wc -l
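If the counts differ, it helps to see which numbers are missing. The following is a rough sketch of one way to do that with `comm`, assuming the parsed numbers have been exported one per line to a file. The file name `parsed.txt`, the sample contents, and the temporary directory are all hypothetical.

```shell
# Use one collation order for both sort and comm.
export LC_ALL=C

# Hypothetical inputs: the raw text and the parser's output.
workdir=$(mktemp -d)
printf '%s\n' \
  'Call (02) 1234 1234' \
  'Mobile: 0412 123 123' \
  'Again: 0412 123 123' > "$workdir/data.txt"
printf '%s\n' '(02) 1234 1234' > "$workdir/parsed.txt"

# Same pattern as in the commands above.
pattern='\(\(([0-9]\{2\})\)\|[0-9]\{4\}\) [0-9]\{3,4\} [0-9]\{3,4\}'

# Build sorted, de-duplicated lists from both sources.
grep -o "$pattern" "$workdir/data.txt" | sort -u > "$workdir/raw_numbers.txt"
sort -u "$workdir/parsed.txt" > "$workdir/parsed_numbers.txt"

# comm -23 keeps lines that appear only in the first file,
# i.e. numbers present in the raw data but absent from the parsed output.
missing=$(comm -23 "$workdir/raw_numbers.txt" "$workdir/parsed_numbers.txt")
printf 'Missing from parsed output:\n%s\n' "$missing"

rm -rf "$workdir"
```

Swapping `-23` for `-13` would instead list numbers that appear in the parsed output but not in the raw file.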
