andychu edited this page Feb 7, 2018 · 33 revisions

The shell interacts with a set of Unix tools in /bin and so forth. However, in many cases, those tools have grown functionality that overlaps with the shell.

Unix Tools ...

Related: Ad Hoc Protocols in Unix

That Start Processes (in parallel)

  • make and other build tools. make -j for parallel builds.
  • xargs, -P for parallel execution, -I {} for substitution
    • Also GNU Parallel, which is mentioned in the bash manual
  • find -exec and -exec +
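As a rough illustration of the list above, the sketch below starts processes three ways. It assumes an xargs that supports -P (GNU and BSD do; POSIX does not require it). The scratch directory and the variable names are made up for the example.

```shell
# Scratch directory with three files (hypothetical names, for illustration only).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

# xargs -P 2 runs up to two processes at a time;
# -I {} substitutes one input line into each command.
count_xargs=$(printf '%s\n' a b c | xargs -P 2 -I {} echo "item {}" | wc -l)

# find -exec ... {} + batches many paths into as few invocations as possible,
# so the inner sh sees all three paths as arguments and prints their count.
count_batched=$(find "$demo_dir" -name '*.txt' -exec sh -c 'echo "$#"' sh {} +)

# find -exec ... \; starts one process per matching path.
count_each=$(find "$demo_dir" -name '*.txt' -exec echo x \; | wc -l)

rm -r "$demo_dir"
```

With -P the order of the output lines is not deterministic, which is why the sketch only counts them.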

That Have Expression Languages

Expression languages must be fully recursive to count here.

With no lexer:

  • find -- -a -o ! ( )
  • test -- -a -o ! ( )
  • expr -- arithmetic, subsumed by $(())
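"No lexer" here means each operator, operand, and parenthesis arrives as its own argv entry, so the shell's word splitting does the tokenizing. The sketch below shows that for all three tools. The function name, file names, and variables are hypothetical.

```shell
# test: parentheses are separate arguments and must be escaped from the shell.
is_small_positive() {
  test \( "$1" -gt 0 \) -a \( "$1" -lt 10 \)
}

# expr: same style; '*' is quoted so the shell does not glob it.
product_expr=$(expr 6 '*' 7)
# The equivalent with shell arithmetic, which has a real lexer.
product_arith=$((6 * 7))

# find: -o, !, and the escaped parentheses are again one argument each.
d=$(mktemp -d)
touch "$d/main.c" "$d/main.h" "$d/test_main.c" "$d/notes.txt"
matches=$(find "$d" -type f \( -name '*.c' -o -name '*.h' \) ! -name 'test_*' | wc -l)
rm -r "$d"
```

The find expression selects main.c and main.h only: the alternation is grouped first, then the ! -name test excludes test_main.c.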

Languages with lexers:

  • awk
  • dtrace -- modelled after awk.

Honorable mention:

  • strace also has a little expression language, but it's not fully recursive

That Use Regexes

  • grep, grep -E
  • sed, sed --regexp-extended in GNU sed
  • awk (extended only)
  • expr
  • find -regex
  • bash itself
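To make the overlap concrete, here is a small sketch that extracts pieces of the same line with four of the tools above. The sample line and variable names are invented for the example; sed -E is assumed to be available (it is in GNU and BSD sed).

```shell
line='user=alice id=42'

# grep -E: extended regex; -c counts matching lines.
grep_hits=$(printf '%s\n' "$line" | grep -E -c 'id=[0-9]+')

# sed -E: extended regex with a capture group, referenced as \1.
sed_id=$(printf '%s\n' "$line" | sed -E 's/.*id=([0-9]+).*/\1/')

# expr: basic regex, implicitly anchored at the start; prints the \( \) group.
expr_id=$(expr "$line" : '.*id=\([0-9]*\)')

# awk: extended regex; match() sets RSTART and RLENGTH.
# Skipping 5 characters drops the "user=" prefix.
awk_user=$(printf '%s\n' "$line" |
  awk 'match($0, /user=[a-z]+/) { print substr($0, RSTART + 5, RLENGTH - 5) }')
```

Four tools, three ways of writing a group and three ways of getting at the captured text.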

That Receive Code Snippets (Remote Evaluation)

  • tar has a --sed option

That Have Printf-Style Formatting

See Appendix A: How to Quickly and Correctly* Generate a Git Log in HTML

  • find -printf (arbitrary filenames)
  • stat -c (arbitrary filenames)
  • curl --write-out %{response_code} -- URLs can't have arbitrary characters?
  • printf itself (coreutils)
  • time (/usr/bin/time) -- mostly numbers
  • date -- mostly numbers
  • bash
    • the printf builtin
    • the time builtin and the TIMEFORMAT string -- mostly numbers
    • the prompt string: \h \W
  • ps --format
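A minimal sketch of how these format languages diverge: the same sentence produced by printf, find -printf, and stat -c. It assumes GNU findutils and GNU coreutils, since -printf and -c are not POSIX. The file name is hypothetical.

```shell
d=$(mktemp -d)
printf 'hello' > "$d/f.txt"   # exactly 5 bytes, no trailing newline

# printf: %s and %d consume the following arguments.
via_printf=$(printf '%s is %d bytes\n' f.txt 5)

# find -printf (GNU): %f is the basename, %s is the size in bytes.
via_find=$(find "$d" -type f -printf '%f is %s bytes\n')

# stat -c (GNU coreutils): %n is the file name, %s is the size in bytes.
via_stat=$(cd "$d" && stat -c '%n is %s bytes' f.txt)

rm -r "$d"
```

All three produce "f.txt is 5 bytes", but %s means "string argument" in one tool and "file size" in the other two.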

Non-standard tools:

NOTE: grep should have a syntax for captures, like $1 $2 name: $name age: $age. sed just has & for the whole match and \1 through \9 for numbered groups.

With Quoting/Escaping Algorithms

  • ls -q -b for unprintable chars in filenames
  • printf %q for spaces in args
  • ${var@Q} which is different than printf %q!!! See help-bash@ thread.
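The difference between the two bash quoting algorithms can be seen on a string with a single space. This sketch assumes bash 4.4 or newer is installed (for ${var@Q}); it calls bash -c explicitly so that it also works when the surrounding script runs under a plain POSIX sh.

```shell
# printf %q escapes with backslashes.
q_printf=$(bash -c 'printf "%q" "$1"' _ "a b")

# ${var@Q} wraps the value in single quotes instead.
q_at=$(bash -c 'printf "%s" "${1@Q}"' _ "a b")

# Both forms are meant to be safe to feed back to the shell.
eval "roundtrip_printf=$q_printf"
eval "roundtrip_at=$q_at"
```

Here q_printf is a\ b and q_at is 'a b': different spellings that evaluate back to the same string.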

With Single Field Substitution

These should be replaced with $_ or @_ ("it").

  • xargs -I {} -- echo {}
  • find -exec {} +

With Tabular Output

  • find / ls
  • ps
  • df (has -h and -H human-readable options, --output[=FIELD_LIST] but no format string)
  • du -- has -0 for NUL output
  • TODO: look at netstat, iostat, lsof, etc. Brendan Gregg's pages.

With File System Path Matching

  • du --exclude
  • rsync --include --exclude
  • find -name, -regex, -wholename, etc.

Misc Expression Languages

  • getopts builtin spec, and /usr/bin/getopt
    • a leading : switches getopts to silent error reporting, whereas a : after a letter means that option takes an argument. Gah.
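A small sketch of the getopts spec language, showing both roles of the colon. The function name parse and the options -a and -b are made up for the example.

```shell
parse() {
  # Spec ':ab:' reads as: leading ':' = silent error reporting,
  # 'a' = flag, 'b:' = option that takes an argument.
  OPTIND=1
  result=''
  while getopts ':ab:' opt "$@"; do
    case $opt in
      a)  result="${result}a;" ;;
      b)  result="${result}b=$OPTARG;" ;;
      :)  result="${result}missing-arg:$OPTARG;" ;;   # silent mode only
      \?) result="${result}unknown:$OPTARG;" ;;
    esac
  done
  printf '%s\n' "$result"
}
```

In silent mode getopts prints no diagnostics of its own; it reports a missing argument as ':' and an unknown option as '?', with the offending option letter in OPTARG. For example, parse -a -b x prints a;b=x; and parse -z prints unknown:z;.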

The Worst Offender

find starts processes (in parallel), it has a recursive boolean expression language, it has regexes (and globs), and it has field substitution. It should be part of the shell!

It also doesn't give good parse error messages. Sometimes it just says "find: invalid expression" with no location information.

Wow this is crazy too:

The regular expressions understood by find are by default Emacs Regular Expressions, but this can be changed with the -regextype option.

$ find -regextype -help
find: Unknown regular expression type ‘-help’; valid types are ‘findutils-default’, ‘awk’, ‘egrep’, ‘ed’, ‘emacs’, ‘gnu-awk’, ‘grep’, ‘posix-awk’, ‘posix-basic’, ‘posix-egrep’, ‘posix-extended’, ‘posix-minimal-basic’, ‘sed’.

I didn't know there were that many regex types! And emacs is a really bad default!
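To show what the default costs in practice, this sketch matches the same files under two regex types. It assumes GNU find, since -regextype is a GNU extension. The file names are hypothetical.

```shell
d=$(mktemp -d)
touch "$d/a.c" "$d/a.h" "$d/a.txt"

# Default (emacs) syntax: groups are \( \) and alternation is \|.
n_emacs=$(find "$d" -regex '.*\.\(c\|h\)' | wc -l)

# posix-extended: the usual ( | ) spelling. -regextype must precede -regex.
n_ere=$(find "$d" -regextype posix-extended -regex '.*\.(c|h)' | wc -l)

# The ERE spelling under the default syntax: ( | ) are literal characters,
# so no path matches and find reports no error.
n_wrong=$(find "$d" -regex '.*\.(c|h)' | wc -l)

rm -r "$d"
```

The last case is the trap: a pattern written in the familiar extended syntax silently matches nothing.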

Families of Unix Tools

Misc Problems
